I am writing a compiler with the ANTLR4 C++ runtime. When I run the compiler on big programs, it throws std::bad_alloc.
I have reduced it to a minimal test case that only performs loop unrolling of a for loop. Here is the C++ code for the compiler:
#include <fstream>
#include <sstream>
#include <string>

#include "antlr4-runtime.h"
#include "qasm3Lexer.h"
#include "qasm3Parser.h"
#include "ForUnrollPass.h" // my listener pass (header name illustrative)

using namespace antlr4;
using namespace antlr4::tree;
using namespace std;

int main(int argc, const char* argv[]) {
    ANTLRInputStream input("");
    qasm3Lexer lexer(&input);
    CommonTokenStream tokens(&lexer);
    qasm3Parser parser(&tokens);

    // Read the whole source file into a string.
    fstream input_stream;
    ofstream output_stream;
    input_stream.open(argv[1]);
    stringstream string_stream;
    string_stream << input_stream.rdbuf();
    string compiled_text = string_stream.str();

    // The pass queues rewrite operations over the token stream.
    ForUnrollPass for_unroll_pass(&tokens);

    // Point the pipeline at the real input and parse it.
    input.reset();
    input.load(compiled_text);
    lexer.setInputStream(&input);
    tokens.setTokenSource(&lexer);
    parser.setTokenStream(&tokens);
    ParseTreeWalker::DEFAULT.walk(&for_unroll_pass, parser.program());

    // getText() is where the queued rewrite operations are applied.
    compiled_text = for_unroll_pass.getText();
    output_stream.open(argv[2]);
    output_stream << compiled_text;
    output_stream.close();
    return 0;
}
After tracking it down with the debugger: my ForUnrollPass queues 1700 rewrite operations (I am currently rewriting every token, which is very inefficient). ANTLR4 applies the rewrites lazily, so the operations are not executed until getText() is called. Inside getText() it applies 1602 operations and then crashes on this instruction:
rop->text = iop->text + (!rop->text.empty() ? rop->text : "");
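
For reference, the pattern my pass uses looks roughly like this. This is a simplified sketch with hypothetical names (the real pass unrolls for loops), but like the real one it queues one replace operation per token through a TokenStreamRewriter:

// Simplified sketch (names hypothetical): a listener that queues one
// rewrite per token, the same pattern my real ForUnrollPass uses.
#include "antlr4-runtime.h"
#include "qasm3ParserBaseListener.h"

class RewriteEveryTokenPass : public qasm3ParserBaseListener {
public:
    explicit RewriteEveryTokenPass(antlr4::CommonTokenStream* tokens)
        : rewriter(tokens) {}

    void visitTerminal(antlr4::tree::TerminalNode* node) override {
        antlr4::Token* tok = node->getSymbol();
        // This only queues a replace operation; nothing is applied here.
        rewriter.replace(tok, tok->getText());
    }

    // All queued operations are applied inside this call.
    std::string getText() { return rewriter.getText(); }

private:
    antlr4::TokenStreamRewriter rewriter;
};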
After profiling with valgrind, I see about 800,000 bytes of allocated memory at the crash point, although that does not seem to be the problem, as the minimal case fails at the same point.
ANTLR4 uses an augmented transition network (ATN) under the hood that may be consuming memory, but I do not fully understand how it works, or whether 800,000 bytes is something to worry about.
ulimit -a reports an 8192 KB stack size limit on Linux; maybe I'm running out of stack space?
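
To check that, my plan is something like the following Linux-specific sketch (getrlimit/setrlimit are POSIX calls; I have not actually ruled this out yet): print the soft stack limit at startup, raise it to the hard cap, and see whether the crash moves.

// Linux-specific sketch: print and raise the soft stack limit to test
// whether the crash is stack-related.
#include <sys/resource.h>
#include <cstdio>

void raise_stack_limit() {
    rlimit rl{};
    if (getrlimit(RLIMIT_STACK, &rl) == 0) {
        printf("stack soft limit: %llu bytes (hard: %llu)\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);
        rl.rlim_cur = rl.rlim_max; // raise soft limit to the hard cap
        setrlimit(RLIMIT_STACK, &rl);
    }
}

As far as I know, though, stack exhaustion usually shows up as a SIGSEGV rather than a bad_alloc, which makes me suspect the heap instead.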
I am not an expert in memory management, so I'm not sure where the problem is.
EDIT
After analyzing the call stack at the operation that throws the bad_alloc, iop->text appears to reside at a corrupted memory address.
