Issue with geom_function shifting values

I am trying to create a plot showing the distribution of the second digit of vote counts. I am using geom_function to draw the expected distribution under the second-digit Benford's law (2BL). The function itself returns the intended values, but I cannot get geom_function to draw them where I want.
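
For reference, the expected 2BL values can be checked outside of ggplot2. This is a minimal base-R sketch of the same formula used in the function further down; the names first_digits, second_digit_prob, and probs are only illustrative and do not appear in my code.

```r
# Expected second-digit probabilities under Benford's law (2BL).
# Illustrative names only; the formula matches `expected` in digit_dist().
first_digits <- 1:9

second_digit_prob <- function(d) {
  # For each second digit d, sum over all possible first digits 1..9.
  sapply(d, function(x) sum(log10(1 + 1 / (10 * first_digits + x))))
}

probs <- second_digit_prob(0:9)
print(round(probs, 4))  # decreases from about 0.1197 (digit 0) to about 0.0850 (digit 9)
print(sum(probs))       # the ten probabilities should sum to 1
```

The values decrease slowly from digit 0 to digit 9, so the expected line should be fairly flat compared with the first-digit curve.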

The function line is shifted to the left and does not start at 0 like the other lines. Has anyone run into this problem, or does anyone have ideas on how to shift the function line over? The current graph and the code I used are below. The data has state, county name, party, and vote count variables, along with variables for the first and second digits of the vote counts.

I do not have this problem with the first-digit distribution (graph also provided below).

Second-digit distribution graph: [image]

First-digit distribution graph: [image]

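library(dplyr)    # filter(), rename(), group_by(), summarise(), mutate(), %>%
library(ggplot2)  # ggplot(), geom_line(), geom_function(), facet_wrap()

digit_dist <- function(data, first = TRUE, states = "all") {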
  
  if(first) {
    data <- data %>%
      filter(first_digit != 0 & first_digit != '') %>%
      rename(digit = first_digit) 
    
    expected <- function(x) log10(1 + (1/x))
    digit <- "First"
  }
  
  else{
    data <- data %>%
      filter(second_digit != '') %>%
      rename(digit = second_digit) 
    
    digit_1 <- seq(1, 9)
    expected <- function(x) sum(log10(1 + 1/(10*digit_1+x)))
    expected <- Vectorize(expected)
    digit <- "Second"
  }
  
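  # Filter by state only when specific states are requested; is.vector("all")
  # is TRUE, so the default would otherwise filter out every row.
  if(!identical(states, "all")) {
    data <- filter(data, state %in% states)
  }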
  
  graph_data <- data %>%
    group_by(digit, state, party_simplified) %>%
    mutate(digit = as.factor(digit)) %>%
    summarise(n = n()) %>%
    ungroup() %>%
    group_by(state, party_simplified) %>%
    mutate(prop = n/sum(n))
  
  plot <- ggplot(graph_data, aes(x = digit, y = prop, group = party_simplified, 
                                 color = party_simplified)) +
    geom_line(stat = "identity") +
    geom_function(fun = ~expected(.x), color = "grey", xlim = c(0, 9)) +
    geom_point() +
    facet_wrap(vars(state))+
    scale_y_continuous(labels = scales::percent_format(scale=100)) +
    labs(title = sprintf("Distribution of the %s Digit by Party", digit),
         color = "Party", y = "Percentage", x = "Digit") +
    scale_color_manual(values=c("blue", "red"))
  
  
  return(plot)
}

digit_dist(df, first = FALSE, states = c("GEORGIA", "VIRGINIA"))

I have checked that:

  1. the function that draws the line returns the expected values;
  2. the x values passed to the function are correct (0 through 9);
  3. the other lines are not shifted (I believe they are not, since they all start at x = 0, unlike the function line).
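
One further check that may be relevant: digit is converted to a factor before plotting, and ggplot2 places the levels of a discrete x variable at integer positions 1, 2, 3, and so on, regardless of their labels. The base-R sketch below (not part of my original code; axis_map and expected_at_position are hypothetical names) shows where the labels 0 to 9 would sit, and what a function evaluated at axis positions would need to do to line up with them. I am assuming this is the cause of the shift; I have not confirmed it.

```r
# Same expected() as in digit_dist() for the second digit.
digit_1 <- 1:9
expected <- Vectorize(function(x) sum(log10(1 + 1 / (10 * digit_1 + x))))

# A factor of the digits 0..9: the integer codes are the positions a
# discrete axis uses, so label "0" sits at position 1 and "9" at position 10.
digits <- factor(0:9)
axis_map <- data.frame(label = levels(digits), position = as.integer(digits))
print(axis_map)

# Hypothetical wrapper: convert an axis position back to the digit it labels
# before evaluating the expected probability.
expected_at_position <- function(p) expected(p - 1)

# At the positions where the points are drawn, the wrapper reproduces the
# expected values for digits 0..9.
print(all.equal(expected_at_position(axis_map$position), expected(0:9)))
```

If that assumption holds, something like geom_function(fun = ~expected(.x - 1), xlim = c(1, 10)) might align the line with the points, and it would also explain why the first-digit plot looks right, since labels 1 to 9 already sit at positions 1 to 9. Keeping digit numeric instead of converting it to a factor would be another option to try.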