I was thinking of developing a function for AGE that returns a graph's adjacency matrix. An adjacency matrix stores the number of edges between two nodes. The complexity to check if an edge exists within an adjacency matrix is O(1), while the space complexity is O(v^2) with v = number of vertices. AGE stores the vertices of each graph in graph_name._ag_label_vertex, while the edges are stored in graph_name._ag_label_edge.
The vertex table contains two columns: id and properties. The id column is of integer type, while the properties column is a JSON like format. The edge table contains the id of the edge, the id of the vertex which it comes from, the id of the vertex that it goes to, and the edge's properties.
Example:
SELECT * FROM "graph_name"."Vertex";
id | properties
------------------+-------------
844424930131971 | {}
1407374883553281 | {}
SELECT * FROM "graph_name"."Edges";
id | start_id | end_id | properties
------------------+-----------------+------------------+------------
1125899906842625 | 844424930131971 | 1407374883553281 | {}
(1 row)
I was thinking on traversing the vertex column first, so that it is possible to store the vertices ids initially:
[0, 1, 2, 3]
[1, 0, 0, 0]
[2, 0, 0, 0]
[3, 0, 0, 0]
After that, loop through the edges table and store 1 if the edge between two nodes exists (if there are multiple edges between a pair of nodes, add 1 to the already set value for each edge). But I don't know exactly how to do this in C for a PostgreSQL function. Which function should I call to traverse the tables and get the values from the id column?
I thought of using SPI twice, first for the vertices and the second time for the edges, but I don't know if this is the best approach.
In the age--1.3.0.sql file, I declared the function as
CREATE FUNCTION ag_catalog.get_adjacency_matrix(graph_name name)
RETURNS integer[][]
LANGUAGE c
AS 'MODULE_PATHNAME';
And its definition in src/backend/commands/graph_commands.c:
PG_FUNCTION_INFO_V1(get_adjacency_matrix);
/*
* Function for returning the graph's adjacency matrix. This adjacency matrix
* stores the number of edges between two nodes. The complexity to check if an
* edge exists within an adjacency matrix is O(1), while the space complexity
* is O(v^2) with v = number of vertices.
*/
Datum get_adjacency_matrix(PG_FUNCTION_ARGS)
{
PG_RETURN_VOID();
}
Apparently, the best way to traverse a table utilizing a C function for PostgreSQL is with the Server Programming Interface (SPI). Here's a code example:
From PostgreSQL's Documentation:
When we call
SPI_execute(const char * command, bool read_only, long count), it will execute the specified SQL command forcountrows, and, ifread_onlyis true, the command must be read only. Ifcountis zero then the command is executed for all rows. This function returns a value depending on what query was executed. In this case, it returnsSPI_OK_SELECTif the execution was successful.The actual number of rows for which the (last) command was executed is returned in the global variable
SPI_processed. If the return value of the function is indeedSPI_OK_SELECT, then we can use the global pointerSPITupleTable *SPI_tuptableto access the result rows.The
SPITupleTablestruct is declared as:We can access the corresponding row from the
SPITupleTableby doingSPI_tuptable->vals[i]whereiis the number of the row. To get the value of a certain column from the row, in this case, we need to callSPI_getvalue(HeapTuple row, TupleDesc rowdesc, int colnumber). This function will return achar*to the corresponding data.If we want to convert it to
int64, we callDatumGetInt64(DirectFunctionCall1(int8in, CStringGetDatum(result_string))).