Update PageRank of Existing Dataset in Janus / Nebula Graph Database

78 Views Asked by Bradford Griggs At 17 May 2022 at 17:27

I'm using JanusGraph / Nebula Graph to calculate the PageRank of a very large dataset (hundreds of billions of pages, trillions of edges). Every day, tens of millions of new pages are indexed, and I want to add them to the graph and update the PageRank of all existing pages (new pages can contain links to previously indexed pages, and vice versa). However, I don't want to recompute the PageRank of all existing pages from scratch. I only want to feed the new data into the system and update the PageRank of existing pages based on it. In other words, I don't want to repeat the same computation every day.

Is there a way to save the existing PageRank model so that I only have to compute the PageRank of the newly indexed pages without starting the process from scratch?


There is 1 best solution below

HadoopMarc On 18 May 2022 at 05:54

Sure, the following paper should give relevant links: https://www.researchgate.net/publication/340281398_DiffPageRank_an_efficient_differential_PageRank_approach_in_MapReduce

As for the implementation, Apache TinkerPop lets you run a custom VertexProgram.
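To make the "don't start from scratch" idea concrete, here is a minimal sketch of a warm-started PageRank in plain Java. It is not the DiffPageRank algorithm from the paper and it does not use the TinkerPop API. It only illustrates one simple option: store yesterday's ranks, seed today's power iteration with them, and give new pages a default value. The class name `IncrementalPageRank`, the method `run`, and the in-memory adjacency map are all hypothetical and chosen for illustration. At the scale described in the question, the same logic would have to run as a distributed job (for example inside a custom VertexProgram).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of JanusGraph, Nebula Graph, or TinkerPop.
public class IncrementalPageRank {

    /** Ranks after the run, plus the number of iterations that were needed. */
    public static final class Result {
        public final Map<String, Double> ranks;
        public final int iterations;

        Result(Map<String, Double> ranks, int iterations) {
            this.ranks = ranks;
            this.iterations = iterations;
        }
    }

    /**
     * Power iteration seeded with previously saved ranks (warm start).
     * Pages missing from previousRanks start at the uniform value 1/n.
     */
    public static Result run(Map<String, List<String>> outLinks,
                             Map<String, Double> previousRanks,
                             double damping, double tolerance, int maxIterations) {
        // Collect every page that appears as a link source or a link target.
        Set<String> nodes = new TreeSet<>(outLinks.keySet());
        for (List<String> targets : outLinks.values()) {
            nodes.addAll(targets);
        }
        int n = nodes.size();

        // Seed with saved ranks where available, then renormalise to sum to 1.
        Map<String, Double> ranks = new HashMap<>();
        double total = 0.0;
        for (String v : nodes) {
            double r = previousRanks.getOrDefault(v, 1.0 / n);
            ranks.put(v, r);
            total += r;
        }
        for (String v : nodes) {
            ranks.put(v, ranks.get(v) / total);
        }

        int iterations = 0;
        while (iterations < maxIterations) {
            iterations++;
            Map<String, Double> next = new HashMap<>();
            for (String v : nodes) {
                next.put(v, 0.0);
            }
            // Push each page's rank along its out-links; pages without
            // out-links (dangling pages) spread their rank over all pages.
            double dangling = 0.0;
            for (String u : nodes) {
                List<String> targets = outLinks.getOrDefault(u, Collections.emptyList());
                if (targets.isEmpty()) {
                    dangling += ranks.get(u);
                    continue;
                }
                double share = ranks.get(u) / targets.size();
                for (String t : targets) {
                    next.merge(t, share, Double::sum);
                }
            }
            // Apply damping and measure the L1 change against the previous ranks.
            double delta = 0.0;
            for (String v : nodes) {
                double r = (1.0 - damping) / n + damping * (next.get(v) + dangling / n);
                delta += Math.abs(r - ranks.get(v));
                next.put(v, r);
            }
            ranks = next;
            if (delta < tolerance) {
                break;
            }
        }
        return new Result(ranks, iterations);
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("a", List.of("b", "c"));
        graph.put("b", List.of("c"));
        graph.put("c", List.of("a"));
        Result day1 = run(graph, new HashMap<>(), 0.85, 1e-10, 1000);

        // Next day: a new page "d" that links to an existing page.
        graph.put("d", List.of("c"));
        Result cold = run(graph, new HashMap<>(), 0.85, 1e-10, 1000);
        Result warm = run(graph, day1.ranks, 0.85, 1e-10, 1000);

        System.out.println("cold start iterations: " + cold.iterations);
        System.out.println("warm start iterations: " + warm.iterations);
        System.out.println("warm start ranks: " + warm.ranks);
    }
}
```

Both starts converge to the same ranks, because the damped iteration has a single fixed point. The warm start only changes how far the first iterate is from it. On a large graph where the daily change is small relative to the whole, fewer iterations would be expected, though how many are saved depends on the data. Every iteration still touches all edges, which is the cost that differential approaches such as the one in the linked paper aim to reduce.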
