I have multiple entities and properties of the following kind:
- LinkedIn company
  - name
  - phone
- Facebook
  - facebook_url
  - name
  - website_url
  - phone
- Website
  - linkedin_url
  - facebook_url
  - phone
Not all entities have all their properties filled.
I want to create a unified dataset built by matching corresponding property values across all the entities.
I'm considering using a graph database, Neo4j in particular.
But if each entity is a node, then I will have to create each relationship by programmatically checking each property for equality against the corresponding property in all other entities.
I have also considered using some kind of SQL join, but that seems hard to maintain as the data model widens.
What is the right approach to solve this problem?
Which technology is best for this?
Here is one approach for doing that in Neo4j. (Stack Overflow is not the right place to ask about the "best" technology for doing something, as that tends to be very subjective.)
You can create unique `URL`, `Phone`, `Person`, and `Account` nodes, and have each `Account` connected to the appropriate `URL`, `Phone`, and `Person` nodes.

For example, assuming your 3 sample accounts are related to the same person, here is how you could represent that in the DB:
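A minimal sketch of such a model in Cypher follows. The relationship types (`HAS_URL`, `HAS_PHONE`, `BELONGS_TO`), the property names (`value`, `number`, `name`, `type`, `id`), and all sample values are assumptions made up for illustration; adapt them to your own data.

```cypher
// Hypothetical sample data: one person, one shared phone, three URLs.
// Labels come from the text above; everything else is an assumed name.
MERGE (p:Person {name: 'Acme Example'})
MERGE (ph:Phone {number: '555-0100'})
MERGE (fb:URL {value: 'https://facebook.example/acme'})
MERGE (li:URL {value: 'https://linkedin.example/company/acme'})
MERGE (web:URL {value: 'https://acme.example'})

// One Account node per source record.
MERGE (a1:Account {type: 'linkedin', id: 'li-1'})
MERGE (a2:Account {type: 'facebook', id: 'fb-1'})
MERGE (a3:Account {type: 'website', id: 'web-1'})

// LinkedIn record: name and phone.
MERGE (a1)-[:BELONGS_TO]->(p)
MERGE (a1)-[:HAS_PHONE]->(ph)

// Facebook record: its own URL, name, website URL, and phone.
MERGE (a2)-[:BELONGS_TO]->(p)
MERGE (a2)-[:HAS_URL]->(fb)
MERGE (a2)-[:HAS_URL]->(web)
MERGE (a2)-[:HAS_PHONE]->(ph)

// Website record: LinkedIn URL, Facebook URL, and phone.
// It is linked to the same person per the assumption stated above.
MERGE (a3)-[:BELONGS_TO]->(p)
MERGE (a3)-[:HAS_URL]->(li)
MERGE (a3)-[:HAS_URL]->(fb)
MERGE (a3)-[:HAS_PHONE]->(ph);
```

Because `MERGE` matches an existing node before creating a new one, each distinct URL or phone value is stored once. Accounts that share a value end up attached to the same node, so no pairwise property comparison between accounts is needed.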
Then, if you wanted to find all the `Account` and `Person` combinations related to a specific URL, you could do this:
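One way such a lookup might be written, assuming `URL` nodes carry a `value` property and that accounts are linked through hypothetical `HAS_URL` and `BELONGS_TO` relationship types (none of these names are fixed by Neo4j):

```cypher
// $url is a query parameter supplied by the caller,
// for example 'https://facebook.example/acme' (an invented value).
MATCH (u:URL {value: $url})<-[:HAS_URL]-(a:Account)-[:BELONGS_TO]->(p:Person)
RETURN a, p;
```

Passing the URL as a parameter rather than embedding it in the query text lets Neo4j reuse the query plan across different URLs.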