We have 3.7M nodes and 11.2M relationships in AWS Neptune. The relevant node labels are Organization, Member, Proposal, and Vote, and the relevant relationships are:
Member-[:IN]->Organization, Vote-[:IN]->Organization, Vote-[:BELONGS_TO]->Proposal, Vote-[:VOTED_BY]->Member.
The goal is to build a query that finds pairs of members in an organization and counts the proposals on which both members voted with the same vote.choice. Here is the query:
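To make the expected result concrete, here is a minimal Python sketch of the same computation over in-memory data. It is only an illustration of the intended output, not something we run against Neptune: the function name, the tuple layout, and the sample votes are made up, and it assumes each member casts at most one vote per proposal.

```python
from collections import Counter, defaultdict
from itertools import combinations


def count_same_choice_pairs(votes, top=20):
    """votes: iterable of (member_address, proposal_id, choice) tuples (hypothetical layout)."""
    # Bucket voters by (proposal, choice): only members in the same bucket agreed.
    buckets = defaultdict(set)
    for address, proposal_id, choice in votes:
        buckets[(proposal_id, choice)].add(address)

    pair_counts = Counter()
    for voters in buckets.values():
        # sorted() yields each pair once as (smaller, larger), mirroring m1 < m2.
        for m1, m2 in combinations(sorted(voters), 2):
            pair_counts[(m1, m2)] += 1

    # Highest counts first, capped like "order ... desc" followed by "limit 20".
    return pair_counts.most_common(top)


# Made-up sample data: three members, two proposals.
votes = [
    ("0xa", "p1", "yes"), ("0xb", "p1", "yes"), ("0xc", "p1", "no"),
    ("0xa", "p2", "no"), ("0xb", "p2", "no"), ("0xc", "p2", "no"),
]
print(count_same_choice_pairs(votes))
```

With this sample, 0xa and 0xb agree on both proposals, while every other pair agrees only on p2.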
g.V().has('Organization','id','${id}').as('d').
  match(
    as('d').in('IN').hasLabel('Vote').as('v1').
      out('BELONGS_TO').hasLabel('Proposal').as('p'),
    as('d').in('IN').hasLabel('Vote').as('v2').
      out('BELONGS_TO').hasLabel('Proposal').as('p1').
      select('p1','p').where('p1',eq('p')),
    as('v1').out('VOTED_BY').hasLabel('Member').as('m1'),
    as('v2').out('VOTED_BY').hasLabel('Member').as('m2').
      select('v1','v2').by('choice').where('v1',eq('v2'))
  ).
  select('m1','m2').by('address').where('m1',lt('m2')).
  group().
    by(select('m1','m2')).
    by(select('p').count()).
  order(local).by(values, desc).
  limit(local, 20)
The problem is that, for Organizations with many Members and Votes, the query fails with this error:
{"code":"MemoryLimitExceededException","detailedMessage":"Query cannot be completed due to memory limitations.","requestId":"e8f8a361-40c4-4db9-8da4-a618d0e20d92"}
Can the memory limit be adjusted on AWS Neptune? Or does the query need to be optimized, and if so, how?
UPDATE:
The query rewritten without match():
g.V().has('Dao','organizationId','${id}').as('d').
  in('IN').hasLabel('Vote').as('v1').
  out('BELONGS_TO').hasLabel('Proposal').as('p').
  select('d').
  in('IN').hasLabel('Vote').as('v2').
  out('BELONGS_TO').hasLabel('Proposal').as('p1').
  where(eq('p')).
  select('v1').out('VOTED_BY').hasLabel('Member').as('m1').
  select('v2').out('VOTED_BY').hasLabel('Member').as('m2').
  where('v1',eq('v2')).by('choice').
  select('m1','m2').by('address').where('m1',lt('m2')).
  group().
    by(select('m1','m2')).
    by(select('p').count()).
  order(local).by(values, desc).
  limit(local,20)
Original Cypher query:
CALL {
  MATCH (o:Organization { id: '${id}' })<-[:IN]-(p:Proposal)
  RETURN COUNT(p) AS total_proposals_count
}
WITH total_proposals_count
MATCH (o:Organization { id: '${id}' })<-[:IN]-(v1:Vote)-[:BELONGS_TO]->(p:Proposal)
MATCH (o)<-[:IN]-(v2:Vote)-[:BELONGS_TO]->(p)
MATCH (v1)-[:VOTED_BY]->(m1:Member)
MATCH (v2)-[:VOTED_BY]->(m2:Member)
WHERE m1.address < m2.address AND v1.choice = v2.choice
RETURN [m1, m2] AS members, COUNT(p) AS voted_together, total_proposals_count
ORDER BY voted_together DESC
LIMIT 20