Question
How can one, in Java, read an arbitrary Python file, build an abstract syntax tree (AST) from it, modify that tree, and then write the modified AST back to a file? (Small note: for a concrete syntax tree, which also includes spacing, comments, etc., one could call this pip package from Java.)
Approach
I first tried the following class, which reads the Python code and generates its abstract syntax tree (AST):
package com.doctestbot.cli;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.python.core.Py;
import org.python.core.PyObject;
import org.python.core.PyString;
import org.python.util.PythonInterpreter;

/**
 * A class to retrieve the Python abstract syntax tree using Jython. This is a utility class,
 * meaning one only calls its method, and one does not instantiate the object.
 */
public final class PythonAstRetriever {

  /**
   * Retrieves the Python abstract syntax tree for the given Python code.
   *
   * @param pythonCode The Python code for which to retrieve the AST.
   * @return The Python abstract syntax tree as a PyObject.
   */
  @SuppressWarnings({"PMD.LawOfDemeter"})
  public static PyObject getPythonAst(String pythonCode) {
    // Create a PythonInterpreter
    PythonInterpreter interpreter = new PythonInterpreter();

    // Access the "ast" module from Python
    PyObject astModule = interpreter.get("ast");

    // Parse the Python code and generate the AST
    PyObject invokeArg = new PyString(pythonCode);
    return astModule.invoke("parse", invokeArg, Py.None, Py.None);
  }

  /**
   * Reads the content of a Python code file from the specified file path.
   *
   * @param filePath The path to the Python code file to read.
   * @return The content of the Python code file as a string.
   * @throws IOException If an I/O error occurs while reading the file.
   */
  public static String readPythonCodeFromFile(String filePath) throws IOException {
    Path path = Paths.get(filePath);
    return Files.readString(path);
  }

  // Private constructor to prevent instantiation of the utility class.
  private PythonAstRetriever() {
    throw new AssertionError("PythonAstRetriever class should not be instantiated.");
  }
}
However, when I run it with:
String pythonCode =
"\"\"\"Example python file with a function.\"\"\"\n" +
"\n" +
"from typeguard import typechecked\n" +
"\n" +
"@typechecked\n" +
"def add_two(*, x: int) -> int:\n" +
" \"\"\"Adds a value to an incoming number.\"\"\"\n" +
" return x + 2";
PyObject astTree = PythonAstRetriever.getPythonAst(pythonCode);
This fails to compile with the following error:
Error
PythonAstRetriever.java:34: error: incompatible types: PyObject cannot be converted to PyObject[]
return astModule.invoke("parse", invokeArg, Py.None, Py.None);
^
Note: Some messages have been simplified; recompile with -Xdiags:verbose to get full output
Full Compiler Output
In response to the comments, below is the full compiler output (this is a compile-time error, so there is no runtime stack trace):
PythonAstRetriever.java:34: error: no suitable method found for invoke(String,PyObject,PyObject,PyObject)
return astModule.invoke("parse", invokeArg, Py.None, Py.None);
^
method PyObject.invoke(String,PyObject[],String[]) is not applicable
(actual and formal argument lists differ in length)
method PyObject.invoke(String,PyObject[]) is not applicable
(actual and formal argument lists differ in length)
method PyObject.invoke(String) is not applicable
(actual and formal argument lists differ in length)
method PyObject.invoke(String,PyObject) is not applicable
(actual and formal argument lists differ in length)
method PyObject.invoke(String,PyObject,PyObject) is not applicable
(actual and formal argument lists differ in length)
method PyObject.invoke(String,PyObject,PyObject[],String[]) is not applicable
(argument mismatch; PyObject cannot be converted to PyObject[])
1 error
FAILURE: Build failed with an exception.
XY-problem
In response to the comments: the underlying goal (the XY-problem) is a bot that modifies code. It changes or writes docstrings, function documentation and/or function comments, and writes tests for those functions. I would like to perform a separate modification or creation step per modular component of the code in a file. So instead of writing a regex or a manual Python parser, I assumed that using the AST could be an effective strategy to obtain the code components in a hierarchical and modular fashion.
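As a hedged aside on that goal: the sample function uses Python 3 syntax (annotations, keyword-only arguments), so one option is to let a CPython subprocess do the parsing and regenerate the code. The sketch below is an assumption, not something from the original post: the class name PythonAstRoundTrip is hypothetical, it assumes a CPython 3.9+ executable is installed (ast.unparse was added in 3.9), and the regenerated source loses comments and original formatting.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper: round-trips Python source through CPython's ast module. */
final class PythonAstRoundTrip {

  // Python one-liner: parse stdin into an AST, then turn the AST back into source.
  // ast.unparse needs CPython 3.9 or newer and drops comments and formatting.
  static final String SCRIPT =
      "import ast, sys; print(ast.unparse(ast.parse(sys.stdin.read())))";

  /** Builds the command line; the executable name is an assumption about the host. */
  static List<String> buildCommand(String pythonExecutable) {
    // "-X utf8" (Python 3.7+) makes stdin and stdout use UTF-8 regardless of locale.
    return List.of(pythonExecutable, "-X", "utf8", "-c", SCRIPT);
  }

  /** Sends the source to the subprocess and returns the regenerated code. */
  static String roundTrip(String pythonExecutable, String source)
      throws IOException, InterruptedException {
    Process process = new ProcessBuilder(buildCommand(pythonExecutable))
        .redirectError(ProcessBuilder.Redirect.INHERIT)
        .start();
    // The script reads all of stdin before printing, so write first, then read.
    try (OutputStream stdin = process.getOutputStream()) {
      stdin.write(source.getBytes(StandardCharsets.UTF_8));
    }
    String output =
        new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
    if (process.waitFor() != 0) {
      throw new IOException("Python exited with a non-zero status");
    }
    return output;
  }

  private PythonAstRoundTrip() {}
}
```

A call such as PythonAstRoundTrip.roundTrip("python3", source) would return the regenerated code. A modification step, for example an ast.NodeTransformer subclass, would go between ast.parse and ast.unparse in the script.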
Scope
The syntax error on the Py.None argument was resolved. However, it seems to me that converting an AST back into Python code is non-trivial. Hence, this is not an answer to the XY-problem.
Syntax Error Solution
This code resolves the syntax error:
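A minimal sketch of such a fix, assuming jython-standalone 2.7 is on the classpath: the compiler output above lists a two-argument overload invoke(String, PyObject), so dropping the two Py.None arguments is enough to make the call compile. The class name PythonAstRetrieverSketch and the explicit "import ast" statement are assumptions for illustration, not necessarily the author's code.

```java
import org.python.core.PyObject;
import org.python.core.PyString;
import org.python.util.PythonInterpreter;

/** Hypothetical variant of PythonAstRetriever that compiles against Jython 2.7. */
final class PythonAstRetrieverSketch {

  /** Parses Python source with Jython's ast module and returns the tree. */
  static PyObject getPythonAst(String pythonCode) {
    PythonInterpreter interpreter = new PythonInterpreter();
    // Assumption: the module must be imported before it can be looked up by name.
    interpreter.exec("import ast");
    PyObject astModule = interpreter.get("ast");
    // invoke(String, PyObject) is one of the overloads listed in the compiler output.
    return astModule.invoke("parse", new PyString(pythonCode));
  }

  private PythonAstRetrieverSketch() {}
}
```

Note that Jython 2.7 implements the Python 2.7 language, so the example input with type annotations and keyword-only arguments would likely raise a SyntaxError at runtime even though the Java code compiles. Python 2-compatible input such as "x = 1" should parse.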
Test File
Which was tested with the following test file: