I am unexperienced in coding, need help in this simple code.
I can’t split Index, expirydate, strike and opt type.
I am stuck in splitting expiry date.
**Sample 1**
data = ["NIFTY2431322000PE", "NIFTY2441522000PE"]
df = pd.DataFrame({'details': data})
# Regular expression pattern to match the desired format
pattern = r'([A-Z]+)(\d{2})(\d{1})(\d{2})(\d{5})([A-Z]+)'
df[['Instrument', 'Year', 'month', 'Date', 'strike', 'CEorPE']] = df['details'].str.extract(pattern)
RESULT
details Instrument Year month Date strike CEorPE
0 NIFTY2431322000PE NIFTY 24 3 13 22000 PE
1 NIFTY2441522000PE NIFTY 24 4 15 22000 PE
SAMPLE 2
data = ["NIFTY24MAR22000PE", "NIFTY24DEC22000PE"]
df = pd.DataFrame({'details': data})
pattern = r'([A-Z]+)(\d{2})([A-Z]+)(\d{5})([A-Z]+)'
df[['Instrument', 'Year','month', 'strike', 'CEorPE']] = df['details'].str.extract(pattern)
df['date'] = '30'
Result
details Instrument Year month strike CEorPE date
0 NIFTY24MAR22000PE NIFTY 24 MAR 22000 PE 30
1 NIFTY24DEC22000PE NIFTY 24 DEC 22000 PE 30
I can’t figure out how to run on dataframe which have multiple variation
data = ["NIFTY2431322000PE", "NIFTY24DEC22000PE"]
If possible I want expiry date in 1 column..
thank you in advance
You could use:
Same with named capturing groups:
Output:
regex demo