convert string which contains sub string to dictionary

Question

78 Views Asked by param At 16 February 2023 at 03:15

I am trying to convert strings in a particular format to a Python dictionary. The string format is as follows:

st1 = 'key1 key2=value2 key3="key3.1, key3.2=value3.2 , key3.3 = value3.3, key3.4" key4'

I want to parse it and convert it to a dictionary like this:

dict1 = {
    'key1': None,
    'key2': 'value2',
    'key3': {
        'key3.1': None,
        'key3.2': 'value3.2',
        'key3.3': 'value3.3',
        'key3.4': None
    },
    'key4': None
}

I tried using Python's re package and the string split function, but was not able to achieve the result. I have thousands of strings in the same format and am trying to automate the conversion. Could someone help?

There are 2 best solutions below

HALF9000 On 16 February 2023 at 09:40

Consider using a parsing tool such as lark. A simple example for your case:

from lark import Lark

_grammar = r'''
    ?start: value
    
    ?value: object
           | NON_SEPARATOR_STRING?

    object : "\"" [pair (_SEPARATOR pair)*] "\""
    pair : NON_SEPARATOR_STRING [_PAIRTOR] value

    
    NON_SEPARATOR_STRING: /[a-zA-Z0-9\.]+/
    _SEPARATOR: /[,  ]+/
            | ","
    _PAIRTOR: " = "
            | "="
'''

parser = Lark(_grammar)

st1 = 'key1 key2=value2 key3="key3.1, key3.2=value3.2 , key3.3 = value3.3, key3.4" key4'

tree = parser.parse(f'"{st1}"')
print(tree.pretty())

"""
object
  pair
    key1
    value
  pair
    key2
    value2
  pair
    key3
    object
      pair
        key3.1
        value
      pair
        key3.2
        value3.2
      pair
        key3.3
        value3.3
      pair
        key3.4
        value
  pair
    key4
    value

"""

Then you can write your own Transformer to convert this tree into your desired data type.

**Nolan Walker** · Accepted Answer · 2023-02-16T05:38:03.993000

If all your strings are consistent and have only one level of nested dict, the code below should do the trick. You may need to tweak it for your data.

import json

st1 = 'key1 key2=item2 key3="key3.1, key3.2=item3.2 , key3.3 = item3.3, key3.4" key4'
st1 = st1.replace(' = ', '=')
st1 = st1.replace(' ,', ',')
new_dict = {}
no_keys=False

while not no_keys:
    st1 = st1.lstrip()
    
    if " " in st1:
        item = st1.split(" ")[0]
    else:
        item = st1
    
    if '=' in item:
        if '="' in item:
            item = item.split('=')[0]
            new_dict[item] = {}     
            
            st1 = st1.replace(f'{item}=','')
            sub_items = st1.split('"')[1]
            sub_values = sub_items.split(',')

            for sub_item in sub_values:
                if "=" in sub_item:
                    sub_key, sub_value = sub_item.split('=')
                    new_dict[item].update({sub_key.strip():sub_value.strip()})
                else:
                    new_dict[item].update({sub_item.strip(): None})
            
            st1 = st1.replace(f'"{sub_items}"', '')
        else:
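            key, value = item.split('=')
            new_dict.update({key:value})
            # Remove only the leading item; lstrip() above drops the leftover space.
            # (Requiring a trailing space here would loop forever on a final key=value.)
            st1 = st1.replace(item, "", 1)
    else:
        new_dict.update({item: None})
        # Remove only the first occurrence so later keys sharing the prefix are untouched
        st1 = st1.replace(item, "", 1)
        
    # Treat trailing whitespace as "nothing left to parse"
    if st1.strip() == "":
        no_keys=True    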
    
print(json.dumps(new_dict, indent=4))
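The same idea can also be sketched more compactly with a single regular expression from the standard library. This is only an illustration under the same assumptions as above (one level of nesting, nested values wrapped in double quotes, no escaped quotes); the names TOKEN, parse_flat and parse_line are hypothetical:

```python
import json
import re

# key, optionally followed by = and either a "quoted group" or a bare value
TOKEN = re.compile(r'([\w.]+)(?:\s*=\s*(?:"([^"]*)"|([^\s,"]+)))?')


def parse_flat(text):
    """Parse the comma-separated contents of a quoted group into a dict."""
    result = {}
    for part in text.split(','):
        key, sep, value = part.partition('=')
        key = key.strip()
        if key:
            # sep is empty when the part has no "=", i.e. a bare key
            result[key] = value.strip() if sep else None
    return result


def parse_line(text):
    """Parse one top-level string into a dict with at most one nested level."""
    result = {}
    for match in TOKEN.finditer(text):
        key, quoted, bare = match.groups()
        if quoted is not None:
            result[key] = parse_flat(quoted)
        else:
            result[key] = bare  # None when the key has no value
    return result


st1 = 'key1 key2=value2 key3="key3.1, key3.2=value3.2 , key3.3 = value3.3, key3.4" key4'
parsed = parse_line(st1)
print(json.dumps(parsed, indent=4))
```

Because each match consumes exactly one key (and its value, if any), there is no need to mutate the input string, which avoids the replace-related edge cases of the loop above.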