Why does the SQLite function executemany run slowly when updating a large amount of data?


I'm new to SQLite and I ran into a problem while trying to update two columns of an SQLite database for sentiment analysis on tweets using the TextBlob library. I have to update 6.5 million rows and I want to do it as efficiently as possible. I used the following code for this.

import sqlite3
from textblob import TextBlob

conn = sqlite3.connect('tweets.db')
c = conn.cursor()

english_query = """
    SELECT tweet_id, text
    FROM tweetInfo
    WHERE lang = 'en'
"""
c.execute(english_query)
e_tweets = c.fetchall()
conn.commit()

data_list = []

for i in range(len(e_tweets)):
    small_list = [TextBlob(e_tweets[i][1]).sentiment.polarity, TextBlob(e_tweets[i][1]).sentiment.subjectivity, 
                    e_tweets[i][0]]
    data_list.append(small_list)

update_query = '''
    UPDATE tweetInfo
    SET polarity = ?, subjectivity = ?
    WHERE tweet_id = ?
'''
data = data_list 
c.executemany(update_query, data)
conn.commit()

I used the executemany function because I found online that it was supposed to be fast at handling large amounts of data. The code seems to work fine, but it takes multiple hours to finish, so I'm wondering if I did anything wrong here with the executemany function or with the code in general. Does anyone have a solution for this?
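For context, this is the batched variant I was thinking of trying next: it builds each TextBlob only once per tweet and commits in chunks instead of all at once. The batch size of 50,000 is just a guess I haven't benchmarked, and I'm assuming tweet_id is indexed.

import sqlite3
from textblob import TextBlob

BATCH_SIZE = 50_000  # guessed batch size, not benchmarked

conn = sqlite3.connect('tweets.db')
c = conn.cursor()

c.execute("""
    SELECT tweet_id, text
    FROM tweetInfo
    WHERE lang = 'en'
""")

update_query = """
    UPDATE tweetInfo
    SET polarity = ?, subjectivity = ?
    WHERE tweet_id = ?
"""

batch = []
for tweet_id, text in c.fetchall():
    # Build the TextBlob once per tweet and reuse its sentiment result.
    sentiment = TextBlob(text).sentiment
    batch.append((sentiment.polarity, sentiment.subjectivity, tweet_id))

    # Flush a chunk of updates and commit, then start a new batch.
    if len(batch) >= BATCH_SIZE:
        c.executemany(update_query, batch)
        conn.commit()
        batch.clear()

# Flush whatever is left over.
if batch:
    c.executemany(update_query, batch)
    conn.commit()

conn.close()

I don't know yet whether the chunked commits make any real difference, so feel free to point out if this approach is misguided.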
