How to make the y-axis of geom_histogram a log scale without changing the count numbers

log2_breaks <- c(1, 2, 4, 8, 16, 32, 64, 128, 256)

p <- ggplot(plot_data, aes(Median, fill = Median < 0)) +
  geom_histogram(bins = 9, color = "black") +  # Adjust the number of bins as needed
  scale_fill_manual(values = c("skyblue", "red"), 
                    name = "", 
                    labels = c("X", "Y")) +  # Custom labels for fill colors
  labs(title = "", x = "X", y = "Y") +
  theme_classic() +
  theme(legend.text = element_text(size = 14),
        axis.text = element_text(size = 14),
        axis.title = element_text(size = 14)) +
  scale_y_continuous(breaks = log2_breaks, labels = log2_breaks)

I want to put the y-axis on a log2 scale (1, 2, 4, 8, etc.) to better visualize the distribution of the data. I tried specifying the breaks with scale_y_continuous(breaks = log2_breaks, labels = log2_breaks), but this did not work: it only placed ticks at those values, and the axis itself stayed linear.

Answer by lisa:

Using + scale_y_continuous(transform = "log2") should work (the argument is called transform in ggplot2 3.5.0 and later, and trans in earlier versions), but here the fill aesthetic makes the result look strange. If you still want the colors, a workaround is to add breaks = seq(-150, 150, by = 50) inside geom_histogram so that one bin edge falls exactly on zero:

library(ggplot2)

set.seed(3)

# Define synthetic data
x <- rnorm(1000, sd = 50)
plot_data <- data.frame(Median = x)

ggplot(plot_data, aes(Median, fill = Median < 0)) +
  geom_histogram(color = "black",
                 breaks = seq(-150, 150, by = 50)) + # manual breaks (they override bins) with one edge exactly at zero
  scale_fill_manual(values = c("skyblue", "red"),
                    name = "",
                    labels = c("X", "Y")) +  # Custom labels for fill colors
  labs(title = "", x = "X", y = "Y") +
  theme_classic() +
  theme(legend.text = element_text(size = 14),
        axis.text = element_text(size = 14),
        axis.title = element_text(size = 14)) +
  scale_y_continuous(transform = "log2")

[Figure: Break on zero]

Note that this produces a warning: for some bins the count for one of the fill colors is zero, and the log transformation cannot handle zero.
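To see where the warning comes from, here is a rough sketch in base R that tabulates the counts per bin and per fill group for the same synthetic data. The use of cut() and table() is my own way of approximating what the histogram computes, not what ggplot2 does internally.

```r
set.seed(3)

# Same synthetic data as in the plot
x <- rnorm(1000, sd = 50)

# Assign each value to a bin, using the same edges as the manual breaks
bins <- cut(x, breaks = seq(-150, 150, by = 50))

# Count values per fill group (negative or not) and per bin
counts <- table(negative = x < 0, bin = bins)
print(counts)

# Bins entirely below zero contain no non-negative values (and vice versa),
# so those cells are 0, and log2(0) is -Inf
print(log2(counts))
```

Every -Inf cell in the second table corresponds to a bar that the log2 scale cannot draw, which is what the warning is about.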

Without the manual breaks, the plot looks like this:

ggplot(plot_data, aes(Median, fill = Median < 0)) +
  geom_histogram(bins = 9, color = "black") + 
  scale_fill_manual(values = c("skyblue", "red"),
                    name = "",
                    labels = c("X", "Y")) +  # Custom labels for fill colors
  labs(title = "", x = "X", y = "Y") +
  theme_classic() +
  theme(legend.text = element_text(size = 14),
        axis.text = element_text(size = 14),
        axis.title = element_text(size = 14)) +
  scale_y_continuous(transform = "log2")

[Figure: No break on zero]

Here, the two colors are stacked in the middle bin, and I'm not sure that's what you want.
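If you also want the ticks to sit at exactly 1, 2, 4, 8, ..., a minimal sketch is to pass your log2_breaks together with the transformation. This assumes ggplot2 3.5.0 or later (older versions use trans instead of transform), and it reuses the synthetic data from above, not your real plot_data.

```r
library(ggplot2)

set.seed(3)

# Synthetic data, same as above (stand-in for the real plot_data)
plot_data <- data.frame(Median = rnorm(1000, sd = 50))

# Tick positions given as counts: 1, 2, 4, ..., 256
log2_breaks <- 2^(0:8)

p <- ggplot(plot_data, aes(Median, fill = Median < 0)) +
  geom_histogram(color = "black",
                 breaks = seq(-150, 150, by = 50)) +  # bin edge exactly at zero
  scale_fill_manual(values = c("skyblue", "red"),
                    name = "",
                    labels = c("X", "Y")) +
  labs(title = "", x = "X", y = "Y") +
  theme_classic() +
  # transform sets the spacing of the axis; breaks only chooses which counts get a tick
  scale_y_continuous(transform = "log2", breaks = log2_breaks)

p
```

The breaks are given in the original count units, so the labels still show the counts themselves (1, 2, 4, ...) and only the spacing of the axis changes.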