How to display only rows with values in each Category plot in ggplot2?

21 Views Asked by Marta López At 11 March 2024 at 11:20

I am using ggplot2 in R to create scatterplots of GO categories. However, I want each plot (BP,MF,CC) to only display the rows that have values in the data. Currently, all rows are shown in each plot, even if they don't have a value. How can I modify my code to achieve this?

A simplify example of my data, and the code I'm using is here:

df <- data.frame(
  'Seq ratio' = runif(15, min = 0, max = 0.5),
  'feature' = stringi::stri_rand_strings(15, 1, pattern = "[A-F]"),
  'Seq count' = c(8, 10, 8, 8, 8, 4, 4, 4, 5, 4, 4, 7, 15, 15, 15),
  'adj. P value' = c(0.012609566, 0.012609566, 0.033278332, 0.021875357, 
  0.021875357,0.003216359, 0.003216359, 0.003216359, 0.030076544, 0.017188404, 
  0.003216359, 0.047018661, 0.002584020, 0.002584020, 0.002584020),
  'cluster' = c('Cluster 1', 'Cluster 1', 'Cluster 1', 'Cluster 1', 
  'Cluster 1', 'Cluster 2', 'Cluster 2', 'Cluster 2', 'Cluster 2', 
  'Cluster 3', 'Cluster 3', 'Cluster 4', 'Cluster 4', 'Cluster 4', 'Cluster 4'), 
  'Category' = c('MF', 'MF', 'CC', 'MF', 'CC', 'BP', 'BP', 'BP',
   'BP', 'MF', 'MF', 'BP', 'CC', 'CC', 'CC'))

library(ggplot2)
ggplot(df, aes(`Seq ratio`, feature, size =`Seq count`, colour=`adj. P value`))+
  facet_wrap(~ Category, ncol = 1, ) +
  geom_point() +
  ylab(NULL) +
  coord_cartesian(xlim = c(0, 0.25))+
  scale_x_continuous(n.breaks = 5)+
  scale_size(name = "Seq count", limits = c(0, 15))

For example, in plot BP, not show letter B, in plot CC not E and C and in MF not show F and B.

Thanks in advance!

Original Q&A

There are 1 best solutions below

Carl On 11 March 2024 at 11:32

With scales = "free_y":

library(tidyverse)
library(stringi)
library(janitor)

set.seed(123)

df <- data.frame(
  'Seq ratio' = runif(15, min = 0, max = 0.5),
  'feature' = stri_rand_strings(15, 1, pattern = "[A-F]"),
  'Seq count' = c(8, 10, 8, 8, 8, 4, 4, 4, 5, 4, 4, 7, 15, 15, 15),
  'adj. P value' = c(0.012609566, 0.012609566, 0.033278332, 0.021875357, 
                     0.021875357,0.003216359, 0.003216359, 0.003216359, 0.030076544, 0.017188404, 
                     0.003216359, 0.047018661, 0.002584020, 0.002584020, 0.002584020),
  'cluster' = c('Cluster 1', 'Cluster 1', 'Cluster 1', 'Cluster 1', 
                'Cluster 1', 'Cluster 2', 'Cluster 2', 'Cluster 2', 'Cluster 2', 
                'Cluster 3', 'Cluster 3', 'Cluster 4', 'Cluster 4', 'Cluster 4', 'Cluster 4'), 
  'Category' = c('MF', 'MF', 'CC', 'MF', 'CC', 'BP', 'BP', 'BP',
                 'BP', 'MF', 'MF', 'BP', 'CC', 'CC', 'CC'))

df |> 
  clean_names() |> 
  ggplot(aes(seq_ratio, feature, size = seq_count, colour = adj_p_value)) +
  facet_wrap(~category, ncol = 1, scales = "free_y") +
  geom_point() +
  ylab(NULL) +
  scale_x_continuous(n.breaks = 5) +
  scale_size(name = "Seq count", limits = c(0, 15))

^{Created on 2024-03-11 with reprex v2.1.0}

How to display only rows with values in each Category plot in ggplot2?

There are 1 best solutions below

Related Questions in R

Related Questions in GGPLOT2

Related Questions in CATEGORIES

Related Questions in SCATTER-PLOT

Trending Questions

Popular # Hahtags

Popular Questions