How can I reorder a ggplot2 graph to group a character column by a different character vector/column?


I have a set of patient samples for which I have gene expression data. I want to create a bar chart where the x axis shows the individual samples, the y axis shows the counts for a given gene, and the fill is the patient's diagnosis. I then want the bars ordered so that samples with the same diagnosis are plotted next to each other (without faceting).

Imagine mtcars where the rownames are stored as an alphabetically factored 'car' column and there is a column of 'car_model' separately too. The plotted x-axis featuring each car plotted against its weight would then be arranged/grouped by car_model.

p <- ggplot(mtcars, aes(x= car, y= wt, fill= car_model)) + 
    geom_bar(stat = 'identity') +  theme_bw() +
    theme(axis.text.x = element_text(angle = 90, hjust = 1)) + 
    xlab('car')+ ylab('wt')+
    ggtitle('cars')
print(p)

I've tried/looked at factor/level, fct_reorder/2, order/reordering the data, str_sort, arrange, grouping... both inside and outside of the ggplot2 code. I thought the code below (or fct_reorder/2 within ggplot code) would work but it hasn't:

mtcars$car_model <- relevel(mtcars$car_model, "Toyota") #Reorder group so that the Toyota group is specifically first and everything else remains as it was

mtcars <- mtcars[order(mtcars$car, mtcars$car_model),] # Then plot

I've considered mutating the car_model column to a numerical vector and trying to order that way but I also think I'm missing something stupidly obvious... any tips please?

Thanks!

There is 1 best solution below

Paco Herrera On

A minimal reproducible example would help the community understand your problem. Either way, I'm guessing you are looking for something like this:

library(ggplot2)

# Create a sample dataset
gene_data <- data.frame(
  sample = c("Sample1", "Sample2", "Sample3", "Sample4", "Sample5", 
             "Sample6"),
  gene_count = c(10, 15, 8, 12, 9, 11),
  diagnosis = c("Diagnosis1", "Diagnosis2", "Diagnosis2", "Diagnosis1", 
                "Diagnosis3", "Diagnosis1")
)

# Reorder the levels of the diagnosis factor to match the desired grouping order
gene_data$diagnosis <- factor(gene_data$diagnosis, 
                       levels = c("Diagnosis1", "Diagnosis2", "Diagnosis3"))

# Create the bar chart
p <- ggplot(gene_data, aes(x = sample, y = gene_count, fill = diagnosis)) +
  geom_bar(stat = 'identity') +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab('Sample') +
  ylab('Gene Count') +
  ggtitle('Gene Expression by Diagnosis')

# Set the x-axis order so that samples with the same diagnosis sit next to each other
p <- p + scale_x_discrete(limits = 
              unique(gene_data$sample[order(gene_data$diagnosis)]))

p
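
For the mtcars example in the question, the same idea could be sketched roughly as below. This is only an illustration under some assumptions: mtcars has no 'car' or 'car_model' column, so here 'car' is built from the row names and 'car_model' is taken to be the first word of each name (both are hypothetical stand-ins for your real columns). The point it tries to show is that sorting the rows of the data frame does not move the bars; ggplot2 orders a discrete axis by the factor levels of the x variable, so the levels of 'car' are what need to follow 'car_model'.

```r
library(ggplot2)

# Build the data frame described in the question.
# Assumption: 'car' comes from the row names of mtcars.
cars_df <- data.frame(
  car = rownames(mtcars),
  wt  = mtcars$wt,
  stringsAsFactors = FALSE
)

# Assumption: 'car_model' is the first word of the car name (hypothetical column).
cars_df$car_model <- factor(sub(" .*", "", cars_df$car))

# Put the Toyota group first; the remaining levels keep their existing order.
cars_df$car_model <- relevel(cars_df$car_model, ref = "Toyota")

# order() sorts a factor by level position, so Toyota rows come first,
# then the other groups; ties within a group are broken by car name.
car_order <- cars_df$car[order(cars_df$car_model, cars_df$car)]

# The axis follows the levels of 'car', not the row order of the data frame.
cars_df$car <- factor(cars_df$car, levels = car_order)

p <- ggplot(cars_df, aes(x = car, y = wt, fill = car_model)) +
  geom_col() +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(x = "car", y = "wt", title = "cars")

print(p)
```

Setting the levels on the column itself, rather than passing limits to scale_x_discrete, keeps the order attached to the data, so it carries over to any other plot made from the same data frame. Either approach should give the same bar order here.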