Currently I am calculating TA indicators as shown below. Is there any way I can make this faster or more efficient for a large amount of data?

import pandas as pd
import talib as ta
import datetime 

start=datetime.datetime.now()
a = ["ACC.csv", "ADANIPORTS.csv", "AMBUJACEM.csv", 'ASHOKLEY.csv', 'ASIANPAINT.csv']

dataframes = {}  

for file_name in a:
    df = pd.read_csv(file_name)
    df['SMA'] = ta.SMA(df['close'], timeperiod=14)
    macd, signal, macd_histogram = ta.MACD(df['close'], fastperiod=12, slowperiod=26)
    df['MACD'] = macd
    df['MACD_Signal'] = signal
    df['RSI'] = ta.RSI(df['close'], timeperiod=14)
    file_key = file_name.replace(".csv", "")  
    dataframes[file_key] = df  

for key, df in dataframes.items():
    print(f"File: {key}")
    print(df)
    print("\n")
print(datetime.datetime.now()-start)


Answer by Rathan:

You can use something like the code below to apply multithreading:

import concurrent.futures

# process_file should contain the per-file logic from the question
# (pd.read_csv plus the talib indicator calls), returning or storing
# the resulting DataFrame.

# Use a ThreadPoolExecutor to parallelize the processing of files
with concurrent.futures.ThreadPoolExecutor() as executor:
    # Submit each file for processing
    futures = [executor.submit(process_file, file_name) for file_name in a]
    # Wait for all futures to complete
    concurrent.futures.wait(futures)
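Putting it together, a minimal runnable sketch of the threaded approach might look like this. Pandas' rolling mean stands in for ta.SMA so the example runs without TA-Lib installed, and synthetic data replaces the CSV files, which are assumed not to be available here; in the real code, process_file would call pd.read_csv and the talib functions from the question.

```python
import concurrent.futures

import numpy as np
import pandas as pd


def process_file(file_name):
    # Real code: df = pd.read_csv(file_name); synthetic data keeps
    # this sketch self-contained.
    df = pd.DataFrame({"close": np.random.rand(100)})
    # Stand-in for ta.SMA(df['close'], timeperiod=14); the other
    # indicators (MACD, RSI) would be computed here the same way.
    df["SMA"] = df["close"].rolling(14).mean()
    return file_name.replace(".csv", ""), df


a = ["ACC.csv", "ADANIPORTS.csv"]

# executor.map returns results in input order, so the dict keys
# line up with the file list.
with concurrent.futures.ThreadPoolExecutor() as executor:
    dataframes = dict(executor.map(process_file, a))
```

Note that threads mainly help with the I/O-bound part (reading the CSV files); for heavy CPU-bound indicator calculations, swapping ThreadPoolExecutor for ProcessPoolExecutor can give better speedups since it sidesteps the GIL.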