I have two data frames. The first one contains a gene-gene correlation matrix, 1484 x 1484 (each cell holds the correlation between genes i and j). The second one contains a key -> value mapping (complex -> protein), and it looks like this:
    Complex                                Protein_ID
1   BCL6-HDAC4 complex                     Bcl6
125 BCL6-HDAC5 complex                     Hdac5
249 BCL6-HDAC7 complex                     Bcl6
373 Multisubunit ACTR coactivator complex  Ep300
497 Condensin I complex                    Smc2
621 BLOC-3                                 Hps4
I am interested in extracting, from my matrix, the correlations between genes that belong to the same complex and storing them in a new data frame that holds, per complex, the gene-gene correlation values. It would ideally look like this:
# this is a simulated data.frame
Complex                                Correlation values
BCL6-HDAC4 complex                     0.64
BCL6-HDAC4 complex                     -0.25
Multisubunit ACTR coactivator complex  0.31
Multisubunit ACTR coactivator complex  0.30
Any ideas on how I can get there?
Example data (10 genes, 3 groups, only showing first 6 cols of correlation matrix):
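A minimal sketch of one way to build such example data and do the extraction in base R. The gene names (`g1` to `g10`), complex labels (`A`, `B`, `C`), and object names (`cor_mat`, `complexes`, `result`) are made up for illustration, and it assumes the values in `Protein_ID` match the row and column names of the correlation matrix exactly:

```r
# Hypothetical example data: 10 genes (g1..g10) assigned to 3 complexes (A, B, C).
set.seed(1)
genes <- paste0("g", 1:10)
expr <- matrix(rnorm(50 * 10), ncol = 10, dimnames = list(NULL, genes))
cor_mat <- cor(expr)  # 10 x 10 gene-gene correlation matrix

complexes <- data.frame(
  Complex    = rep(c("A", "B", "C"), times = c(3, 3, 4)),
  Protein_ID = genes,
  stringsAsFactors = FALSE
)

# For each complex: keep the genes that are present in the matrix, subset the
# matrix to those genes, and read the upper triangle so each pair appears once.
pairs_by_complex <- lapply(
  split(complexes$Protein_ID, complexes$Complex),
  function(ids) {
    ids <- intersect(unique(ids), rownames(cor_mat))
    if (length(ids) < 2) return(NULL)  # fewer than two genes: no pair to report
    sub <- cor_mat[ids, ids, drop = FALSE]
    idx <- which(upper.tri(sub), arr.ind = TRUE)
    data.frame(
      Gene1       = rownames(sub)[idx[, "row"]],
      Gene2       = colnames(sub)[idx[, "col"]],
      Correlation = sub[idx],
      stringsAsFactors = FALSE
    )
  }
)

# Drop complexes without any pair, then stack everything into one data frame.
pairs_by_complex <- Filter(Negate(is.null), pairs_by_complex)
result <- do.call(rbind, Map(
  function(name, df) data.frame(Complex = name, df, stringsAsFactors = FALSE),
  names(pairs_by_complex), pairs_by_complex
))
rownames(result) <- NULL

print(cor_mat[, 1:6])  # first 6 columns of the correlation matrix
print(result)
```

`result` has one row per gene pair within a complex, so a complex with n matched genes contributes n * (n - 1) / 2 rows. Keeping the `Gene1` and `Gene2` columns is a design choice that makes each correlation traceable; drop them if only the `Complex` and `Correlation` columns are wanted.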
Result: