I really need help with code to create a weighted adjacency matrix from a dataset; some rows contain 1 or 2 ingredients, but others have more (up to 8). The resulting matrix will likely be upwards of 16x16 based on the number of unique ingredients in the dataset.
My data currently looks like the example below (but with different information). What column an ingredient shows up in is not important for the purposes of this network analysis but the co-occurrences and weighting are.
| name1 | name2 | name3 | name4 | name5 | name6 | name7 | name8 |
|---|---|---|---|---|---|---|---|
| pineapple | sugar | mango | water | salt | blueberry | | |
| pineapple | asca | | | | | | |
| sugar | pineapple | water | lime | | | | |
| lime | asca | pepper | salt | water | | | |
| blueberry | pineapple | water | salt | strawberry | banana | asca | sugar |
| mango | | | | | | | |
How do I write the code so that it will find all the co-occurrences/edges from all the columns, and not just the first two columns? That's one issue I'm having with trying to do the adjacency matrix from this data directly in R. I also need to preserve the names for the nodes (ingredients) so that when I create my network graph, the names will show up and not numbers, another issue I've had.
I have solid code that creates the network graph from an adjacency matrix for this new project, but previously I manually calculated the weighted adjacency matrix for a sample set as I was on a tight deadline.
If the row-wise incidents are desired, you can modify the answer by @ThomsIsCoding:
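A minimal base R sketch of one row-wise approach. It illustrates the general idea and is not the code from the referenced answer. It builds a row-by-ingredient incidence matrix from all columns, then takes its cross-product. The toy `df` holds three rows of the question's sample data padded with `NA`; the names `m`, `keep`, `inc`, and `adj` are illustrative only.

```r
# Toy data (assumption): three rows from the question's table, padded with NA
df <- data.frame(
  name1 = c("pineapple", "sugar", "lime"),
  name2 = c("asca", "pineapple", "asca"),
  name3 = c(NA, "water", "pepper"),
  name4 = c(NA, "lime", "salt"),
  name5 = c(NA, NA, "water"),
  stringsAsFactors = FALSE
)

m <- as.matrix(df)             # character matrix; column position is ignored below
keep <- !is.na(m) & m != ""    # drop missing and blank cells

# Row-by-ingredient table: one row per data row, one column per unique ingredient
inc <- table(row(m)[keep], m[keep])
inc <- (unclass(inc) > 0) * 1  # count an ingredient at most once per row

# Cross-product: entry [i, j] = number of rows containing both i and j
adj <- crossprod(inc)
adj
```

The row and column names of `adj` are the ingredient names, so a graph built from it should keep its node labels. The diagonal holds the number of rows in which each ingredient appears.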
Set the main diagonal to 0, if you want.

Data:
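A sketch of the question's sample table as an R data frame, assuming empty cells are stored as `NA`. The names `rows`, `padded`, and `df` are illustrative and not taken from the original post.

```r
# Each element is one row of the question's sample table (non-empty cells only)
rows <- list(
  c("pineapple", "sugar", "mango", "water", "salt", "blueberry"),
  c("pineapple", "asca"),
  c("sugar", "pineapple", "water", "lime"),
  c("lime", "asca", "pepper", "salt", "water"),
  c("blueberry", "pineapple", "water", "salt",
    "strawberry", "banana", "asca", "sugar"),
  c("mango")
)

# Pad every row to 8 entries with NA, then bind the rows into a data frame
padded <- lapply(rows, function(x) c(x, rep(NA_character_, 8 - length(x))))
df <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(df) <- paste0("name", 1:8)
df
```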