Vertical lines do not appear as intended in ggplot


I am simulating the distribution of the means of 40 i.i.d. exponentials. I plot the distribution of the sample means along with (A) the mean of that distribution and (B) the theoretical mean of the exponential with lambda = 0.2. The two means should appear as differently colored vertical lines, with the colors explained in a legend. However, my code produces only one vertical line and ignores the color scheme I define in the code.

The code is the following:

library(data.table)
library(ggplot2)
library(ggthemes)  # provides theme_economist()

n <- 40
lambda <- 0.2

simulation <- data.table(sample_mean = numeric())

for (i in 1:1000) {
  simulation <- rbind(simulation, data.table(sample_mean = mean(rexp(n, lambda))))
}

#==============================================================================================
# Show the sample mean and compare it to the theoretical mean of the distribution.
#==============================================================================================

sample_mean <- mean(simulation$sample_mean)
# 4.981267

theoretical_mean <- 1/lambda
# 5

#------------------------------------------------------------------------------------------
# Plot of the Empirical and Theoretical Distributions and their respective means
#------------------------------------------------------------------------------------------

ggplot(simulation, aes(x = sample_mean) ) +
  geom_histogram(aes(y = ..density..), position = "identity", alpha = 0.4, fill = "red", bins = 100) +
  geom_density(colour = "red" , size = 2, alpha = 0.5) +
  geom_vline(xintercept = sample_mean, aes(colour = "Empirical"), size = 1.5, alpha =0.3) +
  geom_vline(xintercept = lambda, aes(colour = "Theoretical"), size = 1.5, alpha =0.3) +
  theme_economist() + ggtitle("Distribution of Sample Means.  Mean of the Empirical Distribution 
  and Mean of the Theoretical Exponential (1,000 simulations) ") +
  scale_colour_manual("Distributions", values = c("blue", "red")) +
  scale_y_continuous(name = "Density") +   
  scale_x_continuous(name = "Sample Means", breaks = seq(2, 8, .5), limits=c(2, 8))

The plot is the following:

[image of the resulting plot not available]

Your advice will be appreciated.

==================================

EDIT

@mkt: Thank you for your contribution. I still need to label the vertical lines in the plot, which is why I set the colour inside aes() using character strings that are mapped to colors later in my code. So I still need a way to do that.

Answer by mkt

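You had three problems:

1) You were plotting lambda instead of 1/lambda.
2) "Empirical" and "Theoretical" are not colours that R will recognise, so they cannot be used as fixed colour values.
3) With xintercept passed as a fixed argument, the colour should be set outside aes() as well; geom_vline() ignores the aes() mapping in that case.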

This works:

ggplot(simulation, aes(x = sample_mean) ) +
  geom_histogram(aes(y=..density..), position="identity",alpha = 0.4, fill = "red", bins=100) +
  geom_density(colour = "red" , size = 2, alpha = 0.5) +
  geom_vline(xintercept = sample_mean, colour = "blue", size = 1.5, alpha = 0.3) +
  geom_vline(xintercept = theoretical_mean, colour = "green", size = 1.5, alpha = 0.5) +
  ggtitle("Distribution of Sample Means.  Mean of the Empirical Distribution 
  and Mean of the Theoretical Exponential (1,000 simulations) ") +
  scale_colour_manual("Distributions", values = c("blue", "red")) +
  scale_y_continuous(name = "Density") +   
  scale_x_continuous(name = "Sample Means", breaks = seq(2, 8, .5), limits=c(2, 8))
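
For the legend requested in the question's edit, one possible approach is to map both xintercept and colour inside aes(), taking them from a small data frame with one row per line. This is a minimal sketch, assuming ggplot2 3.4.0 or later (for after_stat() and linewidth; on older versions, use ..density.. and size as in the code above). The data frame `vlines`, its column names, the fixed seed, and replicate() in place of the rbind loop are illustrative choices, not part of the original code:

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
n <- 40
lambda <- 0.2

# 1,000 means of n i.i.d. exponentials, as in the question
simulation <- data.frame(sample_mean = replicate(1000, mean(rexp(n, lambda))))

# One row per vertical line; `vlines` and its column names are hypothetical
vlines <- data.frame(
  distribution = c("Empirical", "Theoretical"),
  value        = c(mean(simulation$sample_mean), 1 / lambda)
)

p <- ggplot(simulation, aes(x = sample_mean)) +
  geom_histogram(aes(y = after_stat(density)), bins = 100,
                 fill = "red", alpha = 0.4) +
  geom_density(colour = "red", linewidth = 2) +
  # xintercept AND colour are both mapped inside aes(), so a legend is drawn
  geom_vline(data = vlines,
             aes(xintercept = value, colour = distribution),
             linewidth = 1.5, alpha = 0.5) +
  # Named values tie each legend label to a specific colour
  scale_colour_manual("Distributions",
                      values = c(Empirical = "blue", Theoretical = "green")) +
  scale_y_continuous(name = "Density") +
  scale_x_continuous(name = "Sample Means", breaks = seq(2, 8, 0.5), limits = c(2, 8))

p  # draw the plot
```

Because the two means are very close (about 4.98 and 5), the lines will nearly overlap; the legend still distinguishes them by colour.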
