I have a situation where a user can enter commands with optional key/value pairs, and a value may contain spaces.
Here are four different forms of user input, where key and value are separated by an = sign and values contain spaces:
"cmd=create-folder name=SelfServe - Test ride"
"cmd=create-folder name=SelfServe - Test ride server=prd"
"cmd=create-folder name=cert - Test ride server=dev site=Service"
"cmd=create-folder name=cert - Test ride server=dev site=Service permission=locked"
Requirement: I am trying to parse this string and split it into a dictionary based on the keys and values present in the string.
If the user enters the first form of the statement, that would produce a dictionary like:
query_dict = {
'cmd' : 'create-folder',
'name' : 'SelfServe - Test ride'
}
If the user enters the second form of the statement, that would add the additional key/value pair:
query_dict = {
'cmd' : 'create-folder',
'name' : 'SelfServe - Test ride',
'server' : 'prd'
}
If the user enters the third form of the statement, that would produce:
query_dict ={
'cmd' : 'create-folder',
'name' : 'cert - Test ride',
'server' : 'dev',
'site': 'Service'
}
The fourth form produces the dictionary with the key/value pairs split like below:
query_dict ={
'cmd' : 'create-folder',
'name' : 'cert - Test ride',
'server' : 'dev',
'site': 'Service',
'permission' : 'locked' }
The idea is to parse a string where key and value are separated by the = symbol, where values can contain one or more spaces, and extract the matching key/value pairs.
I tried multiple methods but was unable to figure out a single generic regular expression pattern that can match/extract from any string following this kind of pattern.
I'd appreciate your help.
I tried several patterns mapped to the different possible user inputs, but that is not a scalable approach. Example:
I created three patterns to match three varieties of user input, but it would be nice to have one generic pattern that can match any combination of key=value pairs in a string (I am hard-coding the keys in the pattern, which is not ideal):
'(cmd=create-folder).*(name=.*).*' ,
'(cmd=create-folder).*(name=.*).*(server=.*).*',
'(cmd=create-folder).*(name=.*).*(server=.*).*(site=.*)'
I would suggest using `split`, and then `zip` to feed the `dict` constructor:
Example runs:
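A minimal sketch of that idea, run on the four inputs from the question. The pattern `(?:^|\s+)(\w+)=` and the names `KEY_PATTERN` and `to_dict` are assumptions chosen for illustration; other patterns would work too, and this one assumes keys consist of word characters only.

```python
import re

# Assumed pattern: a key is a run of word characters followed by "=",
# found either at the start of the string or after whitespace.
KEY_PATTERN = re.compile(r"(?:^|\s+)(\w+)=")

def to_dict(s):
    # Because the pattern has a capture group, split() returns the keys
    # interleaved with the values: ['', key1, value1, key2, value2, ...]
    parts = KEY_PATTERN.split(s)
    # Keys sit at odd indices, values at the even indices after index 0.
    return dict(zip(parts[1::2], parts[2::2]))

inputs = [
    "cmd=create-folder name=SelfServe - Test ride",
    "cmd=create-folder name=SelfServe - Test ride server=prd",
    "cmd=create-folder name=cert - Test ride server=dev site=Service",
    "cmd=create-folder name=cert - Test ride server=dev site=Service permission=locked",
]

for s in inputs:
    print(to_dict(s))
```

For the first input this should print `{'cmd': 'create-folder', 'name': 'SelfServe - Test ride'}`. Note that a value containing something like ` foo=bar` after a space would be read as a new key, which seems unavoidable with this input format.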
Outputs:
Explanation
Using this input as an example:
The `split` regex identifies these parts:
The strings that are not matched by it will end up as results, so we have these:
The first string is empty, because it is what precedes the first match.
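To make the matched and unmatched pieces visible, here is a small sketch. It assumes the second form from the question as input and a separator pattern `(?:^|\s+)\w+=` without a capture group; both are my choices for illustration.

```python
import re

# Assumed example input: the second form from the question.
s = "cmd=create-folder name=SelfServe - Test ride server=prd"

# Assumed separator pattern, deliberately without a capture group,
# so that split() returns only the strings between the matches.
separator = re.compile(r"(?:^|\s+)\w+=")

# The parts the regex matches (these act as delimiters).
matched = [m.group(0) for m in separator.finditer(s)]

# The strings that are NOT matched end up in the split result.
unmatched = separator.split(s)

print(matched)    # ['cmd=', ' name=', ' server=']
print(unmatched)  # ['', 'create-folder', 'SelfServe - Test ride', 'prd']
```

The leading `''` is there because the first match starts at index 0, so nothing precedes it.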
Now, as the regex has a capture group, the string that is captured by that group is also returned in the result list, at odd indices. So `parts` ends up like this:
The keys we are interested in occur at odd indices. We can get those with `parts[1::2]`, where `1` is the starting index and `2` is the step.
The corresponding values for those keys occur at even indices, ignoring the empty string at index 0. So we get those with `parts[2::2]`. With the call to `zip`, we pair those keys and values together as we want them.
Finally, the `dict` constructor can take an argument with key/value pairs, which is exactly what that `zip` call provides.
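The slicing and pairing steps can be traced one at a time. This sketch again assumes the second form from the question as input and the pattern `(?:^|\s+)(\w+)=`, neither of which is necessarily what was used above.

```python
import re

# Assumed example input: the second form from the question.
s = "cmd=create-folder name=SelfServe - Test ride server=prd"

# Assumed pattern with one capture group around the key.
parts = re.split(r"(?:^|\s+)(\w+)=", s)
# parts == ['', 'cmd', 'create-folder', 'name', 'SelfServe - Test ride', 'server', 'prd']

keys = parts[1::2]    # start at index 1, step 2 -> ['cmd', 'name', 'server']
values = parts[2::2]  # start at index 2, step 2 -> ['create-folder', 'SelfServe - Test ride', 'prd']

# zip pairs each key with its value; dict consumes those pairs.
query_dict = dict(zip(keys, values))
print(query_dict)
```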