Extract multiple values from a string at once


I'm trying to clean up data stored as lists of strings in this format:

'Ti': ['88.115', '199.2', '44.4', '39.0', '1.89', '89', '0.870']

I want to extract the values at indices [0], [1] and [-2]. Doing this manually on a single list works fine; in my code, however, the result contains value [1] twice instead of value [-2].

Input Data:

corrected Dataframes: ('Peak'{'Al': ['90.569', '533.3', '32.6', '28.8', '1.02', '115', '9.588'], 'Ca': ['107.254', '5759.7', '57.5', '53.0', '0.30', '83', '102.945'], 'Cr': ['73.359', '-0.2', '89.8', '76.0', '100.00', '', '100', '0.000'], 'Fe': ['134.750', '581.0', '20.3', '16.8', '0.96', '164', '7.775'], 'K': ['119.624', '-4.7', '37.4', '30.0', '100.00', '', '91', '0.000'], 'Mg': ['107.507', '5699.9', '40.7', '33.5', '0.30', '51', '31.063'], 'Mn': ['146.274', '20.3', '12.3', '13.5', '7.49', '143', '0.256'], 'Na': ['129.500', '64.2', '17.6', '12.0', '4.77', '71', '2.252'], 'Si': ['77.272', '7183.2', '92.6', '56.8', '0.27', '196', '75.399'], 'Ti': ['88.115', '199.2', '44.4', '39.0', '1.89', '89', '0.870']})

Code, where element_data is a dataframe and 'Peak' is a key in that dataframe:

if 'Peak' in element_data:
    for key in element_data['Peak'].keys():
        element_data['Peak'][key] = [element_data['Peak'][key][0], element_data['Peak'][key][1], element_data['Peak'][key][-2]]

Output:

'Peak' data Al: ['90.569', '533.3', '533.3']
'Peak' data Ca: ['107.254', '5759.7', '5759.7']
'Peak' data Cr: ['73.359', '-0.2', '-0.2']
'Peak' data Fe: ['134.750', '581.0', '581.0']
'Peak' data K: ['119.624', '-4.7', '-4.7']
'Peak' data Mg: ['107.507', '5699.9', '5699.9']
'Peak' data Mn: ['146.274', '20.3', '20.3']
'Peak' data Na: ['129.500', '64.2', '64.2']
'Peak' data Si: ['77.272', '7183.2', '7183.2']
'Peak' data Ti: ['88.115', '199.2', '199.2']

Wanted Output:

'Peak' data Al: ['90.569', '533.3', '115']
'Peak' data Ca: ['107.254', '5759.7', '83']
'Peak' data Cr: ['73.359', '-0.2', '100']
'Peak' data Fe: ['134.750', '581.0', '164']
'Peak' data K: ['119.624', '-4.7', '91']
'Peak' data Mg: ['107.507', '5699.9', '51']
'Peak' data Mn: ['146.274', '20.3', '143']
'Peak' data Na: ['129.500', '64.2', '71']
'Peak' data Si: ['77.272', '7183.2', '196']
'Peak' data Ti: ['88.115', '199.2', '89']

There are 2 best solutions below

Aria Nova On

I could not reproduce that. Perhaps the problem is somewhere else?

import pandas as pd

element_data = {
    "Peak": {
        "Al": ["90.569", "533.3", "32.6", "28.8", "1.02", "115", "9.588"],
        "Ca": ["107.254", "5759.7", "57.5", "53.0", "0.30", "83", "102.945"],
        "Cr": ["73.359", "-0.2", "89.8", "76.0", "100.00", "", "100", "0.000"],
        "Fe": ["134.750", "581.0", "20.3", "16.8", "0.96", "164", "7.775"],
        "K": ["119.624", "-4.7", "37.4", "30.0", "100.00", "", "91", "0.000"],
        "Mg": ["107.507", "5699.9", "40.7", "33.5", "0.30", "51", "31.063"],
        "Mn": ["146.274", "20.3", "12.3", "13.5", "7.49", "143", "0.256"],
        "Na": ["129.500", "64.2", "17.6", "12.0", "4.77", "71", "2.252"],
        "Si": ["77.272", "7183.2", "92.6", "56.8", "0.27", "196", "75.399"],
        "Ti": ["88.115", "199.2", "44.4", "39.0", "1.89", "89", "0.870"],
    }
}

if "Peak" in element_data:
    for key in element_data["Peak"].keys():
        element_data["Peak"][key] = [
            element_data["Peak"][key][0],
            element_data["Peak"][key][1],
            element_data["Peak"][key][-2],
        ]

for k, v in element_data["Peak"].items():
    print(k, v)

Output:

Al ['90.569', '533.3', '115']
Ca ['107.254', '5759.7', '83']
Cr ['73.359', '-0.2', '100']
Fe ['134.750', '581.0', '164']
K ['119.624', '-4.7', '91']
Mg ['107.507', '5699.9', '51']
Mn ['146.274', '20.3', '143']
Na ['129.500', '64.2', '71']
Si ['77.272', '7183.2', '196']
Ti ['88.115', '199.2', '89']

Edit: you might have a redundant repeat in your code. Suppose this runs after the code above:

...
# A redundant repeat of the above code
if "Peak" in element_data:
    for key in element_data["Peak"].keys():
        element_data["Peak"][key] = [
            element_data["Peak"][key][0],
            element_data["Peak"][key][1],
            element_data["Peak"][key][-2],
        ]

for k, v in element_data["Peak"].items():
    print(k, v)

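Then you get your output, because after the first pass each list has only three elements, so [-2] now refers to the same element as [1]: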

Al ['90.569', '533.3', '533.3']
Ca ['107.254', '5759.7', '5759.7']
Cr ['73.359', '-0.2', '-0.2']
Fe ['134.750', '581.0', '581.0']
K ['119.624', '-4.7', '-4.7']
Mg ['107.507', '5699.9', '5699.9']
Mn ['146.274', '20.3', '20.3']
Na ['129.500', '64.2', '64.2']
Si ['77.272', '7183.2', '7183.2']
Ti ['88.115', '199.2', '199.2']
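
To make the effect of running the selection twice easier to see, here is a minimal sketch on a single row from the question. The helper name pick is hypothetical and only used for illustration; whether the asker's real code actually runs the loop twice is an assumption, not something the question confirms.

```python
# One row copied from the question's data (the 'Ti' entry).
row = ["88.115", "199.2", "44.4", "39.0", "1.89", "89", "0.870"]


def pick(values):
    """Return the elements at indices 0, 1 and -2 (hypothetical helper)."""
    return [values[0], values[1], values[-2]]


# First pass: the list has 7 elements, so -2 is the same position as index 5.
first_pass = pick(row)

# Second pass: the list now has 3 elements, so -2 is the same position as index 1.
second_pass = pick(first_pass)

print(first_pass)   # ['88.115', '199.2', '89']
print(second_pass)  # ['88.115', '199.2', '199.2']
```

If this is what happens, checking how often the loop is reached (for example, inside another loop or a function called per file) would be the first thing to look at.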
SIGHUP On

I think you have a Python dictionary rather than a pandas DataFrame. If that's the case then:

data = {
    "Peak": {
        "Al": ["90.569", "533.3", "32.6", "28.8", "1.02", "115", "9.588"],
        "Ca": ["107.254", "5759.7", "57.5", "53.0", "0.30", "83", "102.945"],
        "Cr": ["73.359", "-0.2", "89.8", "76.0", "100.00", "", "100", "0.000"],
        "Fe": ["134.750", "581.0", "20.3", "16.8", "0.96", "164", "7.775"],
        "K": ["119.624", "-4.7", "37.4", "30.0", "100.00", "", "91", "0.000"],
        "Mg": ["107.507", "5699.9", "40.7", "33.5", "0.30", "51", "31.063"],
        "Mn": ["146.274", "20.3", "12.3", "13.5", "7.49", "143", "0.256"],
        "Na": ["129.500", "64.2", "17.6", "12.0", "4.77", "71", "2.252"],
        "Si": ["77.272", "7183.2", "92.6", "56.8", "0.27", "196", "75.399"],
        "Ti": ["88.115", "199.2", "44.4", "39.0", "1.89", "89", "0.870"],
    }
}

if (d := data.get("Peak")) is not None:
    for k, v in d.items():
        s = [v[0], v[1], v[-2]]
        print(f"'Peak' data {k}: {s}")

Output:

'Peak' data Al: ['90.569', '533.3', '115']
'Peak' data Ca: ['107.254', '5759.7', '83']
'Peak' data Cr: ['73.359', '-0.2', '100']
'Peak' data Fe: ['134.750', '581.0', '164']
'Peak' data K: ['119.624', '-4.7', '91']
'Peak' data Mg: ['107.507', '5699.9', '51']
'Peak' data Mn: ['146.274', '20.3', '143']
'Peak' data Na: ['129.500', '64.2', '71']
'Peak' data Si: ['77.272', '7183.2', '196']
'Peak' data Ti: ['88.115', '199.2', '89']
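
If the reduced lists need to be kept rather than only printed, one option is to build a new dictionary and leave the original untouched, so that running the selection more than once cannot change the result. This is a sketch under that assumption; select_fields and its indices parameter are hypothetical names, and only two rows from the question are shown to keep it short.

```python
def select_fields(peaks, indices=(0, 1, -2)):
    """Return a new dict keeping only the given positions of each list.

    The input dict is not modified (hypothetical helper for illustration).
    """
    return {key: [values[i] for i in indices] for key, values in peaks.items()}


data = {
    "Peak": {
        # 'Cr' has 8 entries (one is empty); -2 still lands on '100'.
        "Cr": ["73.359", "-0.2", "89.8", "76.0", "100.00", "", "100", "0.000"],
        "Ti": ["88.115", "199.2", "44.4", "39.0", "1.89", "89", "0.870"],
    }
}

raw = data.get("Peak", {})
selected = select_fields(raw)
selected_again = select_fields(raw)  # same result, because raw is unchanged

for k, v in selected.items():
    print(f"'Peak' data {k}: {v}")
```

Because the index is counted from the end, the rows with an extra empty field ('Cr' and 'K' in the question) still yield the second-to-last value.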