mecab python extract company name


I'm trying to process a column of data and extract only the company names using the MeCab library, then list them in a new column. The target column is a comment column that contains employee names, company names, invoice numbers, etc., either all together or on their own depending on the transaction. Below is my code for extracting only the company names. Please note that it is still a work in progress, but I wanted to post something to start with. Sorry in advance for the messy code...

Thank you,

import MeCab  # the module name; the pip package is called mecab-python3
import ipadic
import pandas as pd

df = pd.read_csv("")
m = MeCab.Tagger(ipadic.MECAB_ARGS)

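def kaiseki(column):
    # Avoid shadowing the built-in `list`; astype(str) guards against NaN or numeric cells
    texts = df[column].astype(str).values.tolist()
    new_list = []   # raw MeCab output for each row
    new_list2 = []  # company names found in each row

    for text in texts:
        parsed = m.parse(text)
        new_list.append(parsed)

        names = []
        # Each token line looks like "surface\tfeature1,feature2,..."
        for line in parsed.split('\n'):
            if '\t' not in line:
                continue  # skip the trailing "EOS" marker and blank lines
            surface, features = line.split('\t', 1)

            # 組織 means "organization" in Japanese. Assumption: with the IPA
            # dictionary, company names carry the features 名詞,固有名詞,組織
            if '組織' in features.split(','):
                names.append(surface)
        new_list2.append(names)

    # As in the original, this overwrites the source column with the raw MeCab output
    df[column] = new_list
    df["column2"] = new_list2
    return df["column2"]

columns = ['column']
for column in columns:
    kaiseki(column)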
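
The tab-and-comma splitting step can be tried on its own, without MeCab installed. The sketch below is only an illustration under stated assumptions: `extract_organizations` and `SAMPLE_OUTPUT` are hypothetical names, the sample text is hand-written in the layout MeCab prints with the IPA dictionary (it is not captured from a real run), and the `組織` tag is assumed to be how that dictionary marks organizations.

```python
# Hand-written sample in the "surface<TAB>feature1,feature2,..." layout,
# terminated by an "EOS" line. Illustrative only, not real MeCab output.
SAMPLE_OUTPUT = (
    "トヨタ\t名詞,固有名詞,組織,*,*,*,トヨタ,トヨタ,トヨタ\n"
    "の\t助詞,連体化,*,*,*,*,の,ノ,ノ\n"
    "請求\t名詞,サ変接続,*,*,*,*,請求,セイキュウ,セイキュー\n"
    "EOS\n"
)


def extract_organizations(parsed_text, tag="組織"):
    """Return the surface forms whose feature list contains `tag`."""
    names = []
    for line in parsed_text.splitlines():
        # Lines without a tab ("EOS", blank lines) carry no token
        if "\t" not in line:
            continue
        surface, features = line.split("\t", 1)
        if tag in features.split(","):
            names.append(surface)
    return names


print(extract_organizations(SAMPLE_OUTPUT))
```

With a real tagger, the string returned by `m.parse(text)` could be passed in place of `SAMPLE_OUTPUT`. Whether a given company name is actually tagged `組織` depends on the dictionary in use, so names missing from the dictionary may not be picked up.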