Extract data using Regex function


I need help with an issue I'm struggling with.

I have two columns, name and brand, in a DataFrame, and I'm trying to extract the model name from the name column (a Series) and store it in a new column.

I've used a regex to clean the data and extract the model name from the name column.

But I'm stuck, because it returns a list of values for each row of the Series, and I only want the model name.

The DataFrame I have looks like the sample data further down.

Code:

import re
test = autos["name"].apply(lambda x: re.findall(r"[A-Za-z0-9]+", x)).tail(10)

Output: each row holds a list of every alphanumeric token in the name rather than a single model name.
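For context, `re.findall` always returns a list of all matches, which is why every row ends up holding a list, whereas `re.search` stops at the first match. A minimal sketch on one of the sample names, using only the standard `re` module (the variable names are just for illustration):

```python
import re

name = "____Astra_h_kombi____"

# findall collects every alphanumeric run into a list
tokens = re.findall(r"[A-Za-z0-9]+", name)
print(tokens)  # ['Astra', 'h', 'kombi']

# search returns only the first match (or None if nothing matches)
first = re.search(r"[A-Za-z0-9]+", name)
print(first.group(0) if first else None)  # Astra
```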

The second approach I tried is below.

Code:

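import re

model_names = []

for name in autos["name"]:
    tokens = re.findall(r"[A-Za-z0-9]+", name)
    if not tokens:
        # No alphanumeric token at all: keep a placeholder so rows stay aligned
        model_names.append(None)
    elif len(tokens) == 1:
        model_names.append(tokens[0])
    else:
        # Otherwise take the second token
        model_names.append(tokens[1])

print(model_names)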

Again, this returns a list of values.

Sample data
name                                                               Brand
__SCIROCCO_GT_II__                                                 volkswagen
____Astra_h_kombi____                                              opel
?Golf_5_TSI1.4_?HU_2017_#bis_Mitte_April_noch!!#                   volkswagen
!!!!!_Montag_Angebot_!!!!_Nissan_Micra_1.2_Acenta_Pure_Drive       nissan
??BMW_320d_TÜV_10/2017_TOP_??                                      bmw
??????mercedes_benz_s_klasse??????                                 mercedes_benz
****_Festpreis****__Top_Rover__45_mit_Vollausstattung_oder_Tausch  rover
***_SMART_forTwo_cabrio_softouch_passion___super_Ausstattung_***   smart
!!!_TOP_!!!_gepflegter_Colt_zu_verkaufen                           mitsubishi
???Vw_Passat_3BG_Highline_TDI_4Motion???                           volkswagen
!_Volkswagen_Golf_4_IV___1.6___100_PS_!                            volkswagen
Kleinwagen                                                         renault
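
One possible direction, sketched under a strong assumption: when the brand appears in the name, the model is the token right after it, and otherwise the first token is used. The helper `extract_model` is hypothetical, not part of pandas or `re`, and the heuristic is known to miss noisy rows (for example, names that start with "TOP" or that abbreviate the brand as "Vw").

```python
import re

def extract_model(name, brand):
    """Heuristic: return the token that follows the brand, else the first token."""
    tokens = re.findall(r"[A-Za-z0-9]+", name)
    lowered = [t.lower() for t in tokens]
    # A brand such as "mercedes_benz" spans several tokens
    brand_parts = brand.lower().split("_")
    n = len(brand_parts)
    for i in range(len(lowered) - n + 1):
        if lowered[i:i + n] == brand_parts:
            rest = tokens[i + n:]
            return rest[0] if rest else None
    # Brand not found in the name: fall back to the first token (assumption)
    return tokens[0] if tokens else None

print(extract_model("!_Volkswagen_Golf_4_IV___1.6___100_PS_!", "volkswagen"))  # Golf
print(extract_model("____Astra_h_kombi____", "opel"))  # Astra
```

Assuming the columns are called name and Brand as in the sample, this could be applied row by row with something like `autos.apply(lambda row: extract_model(row["name"], row["Brand"]), axis=1)` and assigned to a new column.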
