Python: search for the errors in malformed JSON


Premise: the dictionaries below are strings containing malformed JSON. I want to find the errors inside them and print them.

Example 1:

Input:

{
    "ced": {
        "CED_PH2": "_VOID_",
        "CED_PH": "_VOID_",
        "CED_IR": "_VOID_"
    },
    "shh": {
        "An": {
            "name": "c",
            "ines": "Sam " "ples",
            "in": "bu"   
        },
        "Ar"l": {
            "name": "uu",
            "i": "aa",   
        },
        "At": {
            "name": "Ru" "tp",
            "inute": "Ae",
            "intColor": "Dlor",
            "tal": "tal"
        },
        "We": {
            "name": "Wue",
            "ior": "Pour",
            "iss": "Wus"
        }
    }
}

I want it to report these as errors:

"ines": "Sam " "ples"

"Ru" "tp"

"Ar"l": {"name": "uu" or "Ar"l"

Similarly, for this input:

{
       "CED_PHS": "_VOID_",
       "CED_PH": "_VOI""D_",
       "CED_IR": "_VO"ID_",
       "name": "Wue",
           "ior": "Pour",
           "iss": "Wus"
}

it should report these as errors:

"CED_PH": "_VOI""D_",

"CED_IR": "_VO"ID_",

My code is:

import re

def writeProblems(content):
    print(content)
    start_index = content.find('{') + 1  
    end_index = content.rfind('}') 
    if start_index > 0 and end_index > 0:
        content = content[start_index:end_index]  
    lines = content.split(',\n')
    counter = 0
    pattern = r'^\s*"[^"]*"\s*:\s*"[^"]*"\s*$'
    pattern2 =  r'\s*"[^"]+"(?=:)\s*:\s*{(?:\s*"[^"]+"\s*:\s*{(?:\s*"[^"]+"\s*:\s*{(?:\s*"[^"]+"\s*:\s*"[^"]*"\s*,?\s*)*}\s*,?\s*)*}\s*,?\s*)*}\s*'


    for line in lines:
        counter += 1
        if not re.match(pattern, line) and not re.match(pattern2, line):
            print('line %s: has a " in the middle --->%s<---' % (counter, line))
    return False

Answer by Diogo Correia:

Have you tried using a grammar with ANTLR4?
There is an example grammar for JSON in an official ANTLR GitHub repository.
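
If pulling in a parser generator is more than you need, a lighter option might be to check each line against the few shapes a line of pretty-printed JSON can take. This is not ANTLR, only a rough sketch using the standard `re` module. It assumes one key/value pair per line, as in both of your inputs, and the names `STRING`, `SCALAR`, `VALID_LINE` and `find_bad_lines` are made up for illustration.

```python
import re

# One well-formed JSON string: no unescaped double quote inside it.
STRING = r'"(?:[^"\\]|\\.)*"'
# Scalar values accepted by this sketch: strings, numbers and literals.
SCALAR = rf'(?:{STRING}|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?|true|false|null)'
# Shapes a line may take when every key/value pair sits on its own line
# (an assumption that holds for the inputs in the question).
VALID_LINE = re.compile(
    rf'^(?:'
    rf'[\[\]{{}}],?'                    # lone bracket or brace, optional comma
    rf'|{STRING}\s*:\s*[\[{{]'          # "key": {   or   "key": [
    rf'|{STRING}\s*:\s*{SCALAR}\s*,?'   # "key": value
    rf'|{SCALAR}\s*,?'                  # bare array element
    rf')$'
)


def find_bad_lines(text):
    """Return (line_number, stripped_line) for each line matching no valid shape."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if line and not VALID_LINE.match(line):
            problems.append((number, line))
    return problems


sample = '''{
    "CED_PHS": "_VOID_",
    "CED_PH": "_VOI""D_",
    "CED_IR": "_VO"ID_",
    "name": "Wue"
}'''

for number, line in find_bad_lines(sample):
    print(f"line {number}: {line}")
```

On the second input from the question this flags the `CED_PH` and `CED_IR` lines. On the first input it should flag the `"ines"`, `"Ar"l"` and `"Ru" "tp"` lines. It deliberately ignores problems that span several lines, such as a trailing comma before a closing brace, so a real grammar is still the more complete route.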