Loess regression on genomic data


I am struggling with the loess function in R. I have a dataframe on which I would like to run a locally weighted polynomial regression. Each ‘Gene’ has an associated ‘Count’ (log10-transformed), which measures the gene's expression. Each Gene also has an ‘Integrity’ measurement (range 0-100), which indicates the quality of the ‘Count’ measurement for that gene. As a general principle, the higher the ‘Integrity’, the more reliable the ‘Count’ for that gene. A chunk of the dataframe is shown below.

Sample dataframe:

Gene Integrity Count
ENSG00000198786.2 96.6937 3.55279
ENSG00000210194.1 96.68682 1.39794
ENSG00000212907.2 94.81709 2.396199
ENSG00000198886.2 93.87207 3.61595
ENSG00000198727.2 89.08319 3.238548
ENSG00000198804.2 88.82048 3.78326

I would like to use loess to predict the ‘true’ Count of genes with low ‘Integrity’ values, since their measured Counts are less reliable.

I) Should I pre-process my dataframe in order to apply loess correctly? In the plethora of examples I have seen, the points follow a sinusoidal distribution (A), while my dataset seems to be distributed in a ‘rollercoaster’-like fashion (B).

II) How should I run loess? I cannot work out the correct syntax for weighting observations differently. Should it be:

-1 loess(Count ~ Integrity, weights = NULL)

-2 loess(Count ~ I(1:nrow(dataframe)), weights = Integrity)
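
The two options above can be sketched as follows. This is only an illustrative sketch on synthetic data: the simulated `df`, the helper column `Rank`, and `span = 0.5` are assumptions made for the example, not values from my real dataset. In `stats::loess`, the `weights` argument takes per-observation weights and is evaluated inside `data`, so the column is passed unquoted.

```r
# Synthetic stand-in for the real dataframe (assumption: values are made up).
set.seed(1)
n <- 200
df <- data.frame(Integrity = sort(runif(n, 0, 100), decreasing = TRUE))
df$Count <- 2 + 0.01 * df$Integrity + rnorm(n, sd = 0.5)

# Option 1: Integrity is the predictor, no per-observation weights.
fit_unweighted <- loess(Count ~ Integrity, data = df, degree = 2, span = 0.5)

# Option 2: row rank is the predictor, Integrity is the per-observation weight.
# Rank is a hypothetical helper column; it avoids writing 1:nrow(df) in a formula,
# where ':' would be read as an interaction operator.
df$Rank <- seq_len(nrow(df))
fit_weighted <- loess(Count ~ Rank, data = df, weights = Integrity,
                      degree = 2, span = 0.5)

# Smoothed Count for every gene, from the weighted fit.
df$Fitted <- predict(fit_weighted)
```

Which option is appropriate depends on whether Integrity is meant to drive the trend (option 1) or only to down-weight unreliable genes (option 2).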

I performed several tests. Fig. C-D used loess (stats); Fig. E-F used weightedLowess (limma). I used two different packages because, as I read the loess docs, observations are weighted according to their distance in x from the point being fitted, whereas the weightedLowess function allows the user to supply prior weights for the regression. The basic syntax I used to perform the regressions and generate the images is reported below.

C) loess(Count ~ Integrity, degree=2, span=0.1)
D) loess(Count ~ I(1:nrow(df)), weights=Integrity, degree=2, span=0.1)
E) weightedLowess(x=1:nrow(df), y=Count, weights=Integrity, span=0.1)
F) weightedLowess(x=1:nrow(df), y=order(Count), weights=Integrity, span=0.1)
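
For reference, a self-contained version of call E might look like the sketch below. It assumes the limma package is installed (the `requireNamespace` guard skips the fit otherwise), uses synthetic vectors `integrity` and `count` in place of my real columns, and picks `span = 0.3` arbitrarily. As far as I understand, `weightedLowess` returns a list whose `fitted` component holds the smoothed values.

```r
# Synthetic stand-ins for the real Integrity and Count columns (assumption).
set.seed(1)
n <- 200
integrity <- sort(runif(n, 0, 100), decreasing = TRUE)
count <- 2 + 0.01 * integrity + rnorm(n, sd = 0.5)

fitted_counts <- NULL
if (requireNamespace("limma", quietly = TRUE)) {
  # x is the row rank, y the log10 count, and Integrity the prior weight.
  fit <- limma::weightedLowess(x = as.numeric(seq_len(n)), y = count,
                               weights = integrity, span = 0.3)
  fitted_counts <- fit$fitted
}
```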

Please find enclosed images cited in the question.

Sample Images
