AWS Neptune Gremlin query gives MemoryLimitExceededException


We have 3.7M nodes and 11.2M relationships in AWS Neptune. The relevant node labels are Organization, Member, Proposal and Vote, and the relevant relationships are Member-[:IN]->Organization, Vote-[:BELONGS_TO]->Proposal and Vote-[:VOTED_BY]->Member. The queries below also traverse IN edges from Vote and Proposal to Organization.

The goal is to build a query that finds pairs of members in an organization and, for each pair, counts the proposals on which both members voted with the same vote.choice. Here is the query:

g.V().has('Organization','id','${id}').as('d').
  match(
    as('d').in('IN').hasLabel('Vote').as('v1').out('BELONGS_TO').hasLabel('Proposal').as('p'),
    as('d').in('IN').hasLabel('Vote').as('v2').out('BELONGS_TO').hasLabel('Proposal').as('p1').
      select('p1','p').where('p1',eq('p')),
    as('v1').out('VOTED_BY').hasLabel('Member').as('m1'),
    as('v2').out('VOTED_BY').hasLabel('Member').as('m2').
      select('v1','v2').by('choice').where('v1',eq('v2'))
  ).
  select('m1','m2').by('address').where('m1',lt('m2')).
  group().
    by(select('m1','m2')).
    by(select('p').count()).
  order(local).by(values, desc).
  limit(local, 20)

The issue is that the query returns this error for Organizations with a lot of Members and Votes:

{"code":"MemoryLimitExceededException","detailedMessage":"Query cannot be completed due to memory limitations.","requestId":"e8f8a361-40c4-4db9-8da4-a618d0e20d92"}

Can the memory limit be adjusted on AWS Neptune, or does the query need to be optimized? If so, what are the ways to optimize it?

UPDATE:

The query without match:

g.V().has('Dao','organizationId','${id}').as('d').
  in('IN').hasLabel('Vote').as('v1').
  out('BELONGS_TO').hasLabel('Proposal').as('p').
  select('d').in('IN').hasLabel('Vote').as('v2').
  out('BELONGS_TO').hasLabel('Proposal').as('p1').where(eq('p')).
  select('v1').out('VOTED_BY').hasLabel('Member').as('m1').
  select('v2').out('VOTED_BY').hasLabel('Member').as('m2').
  where('v1',eq('v2')).by('choice').
  select('m1','m2').by('address').where('m1',lt('m2')).
  group().
    by(select('m1','m2')).
    by(select('p').count()).
  order(local).by(values, desc).
  limit(local,20)

Original Cypher query:

CALL {
    MATCH (o:Organization { id: '${id}' })<-[:IN]-(p:Proposal)
    RETURN COUNT(p) AS total_proposals_count
}
WITH total_proposals_count
MATCH (o:Organization { id: '${id}' })<-[:IN]-(v1:Vote)-[:BELONGS_TO]->(p:Proposal)
MATCH (o)<-[:IN]-(v2:Vote)-[:BELONGS_TO]->(p)
MATCH (v1)-[:VOTED_BY]->(m1:Member)
MATCH (v2)-[:VOTED_BY]->(m2:Member)
WHERE m1.address < m2.address AND v1.choice = v2.choice
RETURN [m1, m2] AS members, COUNT(p) AS voted_together, total_proposals_count
ORDER BY voted_together DESC
LIMIT 20
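
To make the intended result concrete, here is a minimal plain-Python sketch of what the queries above are meant to compute. It is not a Neptune query. The function name co_vote_counts and the sample data are hypothetical, and it assumes each member casts at most one vote per proposal. It buckets voters by (proposal, choice) first, so only votes that already match are ever paired, instead of joining every vote in the organization with every other vote.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_vote_counts(votes, top=20):
    """votes: iterable of (member_address, proposal_id, choice) tuples.

    Returns up to `top` ((m1, m2), count) entries, highest count first.
    """
    # Bucket voters by (proposal, choice): members in one bucket voted alike.
    buckets = defaultdict(set)
    for address, proposal, choice in votes:
        buckets[(proposal, choice)].add(address)

    pairs = Counter()
    for voters in buckets.values():
        # sorted() guarantees m1 < m2, mirroring where('m1', lt('m2')).
        for m1, m2 in combinations(sorted(voters), 2):
            pairs[(m1, m2)] += 1
    return pairs.most_common(top)


# Hypothetical sample data: three members, two proposals.
sample_votes = [
    ("0xa", "p1", "yes"), ("0xb", "p1", "yes"), ("0xc", "p1", "no"),
    ("0xa", "p2", "no"), ("0xb", "p2", "no"), ("0xc", "p2", "no"),
]
result = co_vote_counts(sample_votes)
print(result)
```

In this sample, 0xa and 0xb agree on both proposals, while each of the other two pairs agrees only on p2. Whether the same bucketing idea can be expressed in Gremlin in a way that stays within Neptune's memory limits is not something this sketch demonstrates.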