Horizontal alignment of geom_text linked to emmeans and cld

I fitted a model to my data, then used the emmeans function to run a Tukey post-hoc test, followed by the cld function to add a compact letter display for significant differences.

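    library(emmeans)   # emmeans()
    library(multcomp)  # cld() generic; the emmeans method also needs multcompView installed

    Mod1a <- aov(Drywt~Treatment*Soil, data=Pots1)
    Mod1em_split <- emmeans(Mod1a,~Treatment|Soil, subset=(Pots1$Drywt))
    Mod1cld_split <- cld(Mod1em_split, Letters=letters, reversed=TRUE, by="Soil")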

After this I set up a ggplot to plot the emmeans as bars with standard error bars and the cld letters above the bars. The ggplot call uses the data frame generated by Mod1cld_split directly. The letters are not horizontally aligned with the bars, and no combination of position, nudge, or hjust settings that I have tried aligns them.

When I extracted the Mod1cld_split data to set up this Stack Overflow question and ran it through R, everything suddenly aligned perfectly. So the workaround is to export the data to a CSV file and build the data frame for the ggplot call from values copied out of that file (reading the CSV file in directly results in the same alignment issue). But this is quite roundabout: I would need to generate a CSV file for each plot I make and convert the data to R's format by hand, which is unreasonable.

Is there perhaps a solution, or another way to run emmeans and a cld (or equivalent) function on my model, whose output can be used directly in ggplot? I should note that this problem persists regardless of the model used: I get it on all my data, even in different R scripts and for different datasets.
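
One possible explanation, offered as an assumption rather than a confirmed diagnosis: cld() pads the .group column with spaces so that the letters line up in printed output, and geom_text centers the whole string, padding included. Retyping the values by hand would drop that padding, while reading a CSV file back in would keep it, which is consistent with the behaviour described above. The sketch below uses a hypothetical stand-in for as.data.frame(Mod1cld_split) (the padded strings are invented for illustration) and strips the padding with base R's trimws().

```r
# Hypothetical stand-in for as.data.frame(Mod1cld_split).
# The padded strings in .group are an assumption about what cld() returned.
cld_df <- data.frame(
  Treatment = c("Control1", "Control2", "Manure50kgha"),
  Soil      = "Haverhill",
  emmean    = c(1.03, 2.5975, 6.0475),
  SE        = c(0.697872552, 0.604375358, 0.604375358),
  .group    = c("     f", "    ef", "  cd  "),
  stringsAsFactors = FALSE
)

# Strip leading and trailing whitespace so geom_text centers only the letters.
cld_df$CLDletters <- trimws(cld_df$.group)

# Compare string widths before and after trimming.
print(nchar(cld_df$.group))
print(nchar(cld_df$CLDletters))
```

If the real .group column does turn out to contain padding, the same trimws() call on the cld() output would let the data frame go straight into ggplot without a CSV round trip.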

For reference, the data and ggplot code that do align are as follows:

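library(dplyr)      # %>% and filter()
library(ggplot2)
library(ggpattern)  # geom_bar_pattern(), scale_pattern_manual()

Testdata <- data.frame(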
  Treatment = c("CanolaHull10tha", "CanolaHull10thaTSP", "CanolaHull50kgha", "CanolaMeal10tha", "CanolaMeal10thaTSP",
                "CanolaMeal50kgha", "Control1", "Control2", "Manure10tha", "Manure10thaTSP", "Manure50kgha", 
                "TripleSuperPhosphate", "Willow10tha", "Willow10thaTSP", "Willow50kgha", "CanolaHull10tha", "CanolaHull10thaTSP", "CanolaHull50kgha", "CanolaMeal10tha", "CanolaMeal10thaTSP",
                "CanolaMeal50kgha", "Control1", "Control2", "Manure10tha", "Manure10thaTSP", "Manure50kgha", 
                "TripleSuperPhosphate", "Willow10tha", "Willow10thaTSP", "Willow50kgha"),
  Soil = c("Haverhill", "Haverhill", "Haverhill", "Haverhill", "Haverhill", "Haverhill", "Haverhill", "Haverhill", 
           "Haverhill", "Haverhill", "Haverhill", "Haverhill", "Haverhill", "Haverhill", "Haverhill", 
           "Oxbow", "Oxbow", "Oxbow", "Oxbow", "Oxbow", "Oxbow", "Oxbow", "Oxbow", "Oxbow", "Oxbow", "Oxbow", "Oxbow",
           "Oxbow", "Oxbow", "Oxbow"),
  emmean = c(7.8925, 9.255, 6.6725, 8.67, 10.75, 5.5475, 1.03, 2.5975, 8.4175, 9.4825, 6.0475, 8.755, 7.335,
             8.9025, 8.1775, 4.3475, 9.1675, 3.676666667, 8.8, 10.9, 2.7975, 1.3075, 2.62, 9.7325, 12.61, 6.755, 
             9.4675, 5.4, 11.7175, 10.05333333),
  SE = c(0.604375358, 0.604375358, 0.604375358, 0.604375358, 0.697872552, 0.604375358, 0.697872552, 0.604375358,
          0.604375358, 0.604375358, 0.604375358, 0.604375358, 0.604375358, 0.604375358, 0.604375358, 0.604375358,
          0.604375358, 0.697872552, 0.604375358, 0.604375358, 0.604375358, 0.604375358, 0.604375358, 0.604375358,
          0.604375358, 0.604375358, 0.604375358, 0.604375358, 0.604375358, 0.697872552),
  CLDletters = c("abcd", "ab", "bcd", "abc", "a", "de", "f", "ef", "abcd", "ab", "cd", "abc", "bcd", "abc", "abcd",
                  "de", "bc", "def", "bc", "ab", "ef", "f", "ef", "abc", "a", "cd", "bc", "de", "ab", "ab"))

Drywt_trtVar <- c("Control1", "Control2","CanolaHull50kgha","CanolaMeal50kgha","Manure50kgha","Willow50kgha",
                  "TripleSuperPhosphate")
Drywt_subVar <- Testdata %>%
  filter(Treatment %in% Drywt_trtVar)
(Drywt50kg <- ggplot(Drywt_subVar, aes(x=Treatment, y=emmean, pattern=Soil))+
    geom_bar_pattern(stat="identity", position=position_dodge2(padding=0.2), colour="black", fill="white", 
                     pattern_density=0.05, pattern_spacing=0.01)+
    scale_pattern_manual(values=c("Haverhill"="stripe", "Oxbow"="crosshatch"), 
                         labels=c("Haverhill", "Oxbow"))+
    geom_errorbar(aes(ymin=emmean - SE, ymax=emmean + SE), 
                  width=0.2, position=position_dodge(width=0.9)) +
    geom_text(aes(label = ifelse(Soil == "Haverhill", CLDletters, toupper(CLDletters)), y=emmean+SE+0.5),
              size = 6, position = position_dodge(width=0.9), hjust="dodge", check_overlap = T)+
    labs(y="Biomass yield (g) for chars at 50kg P/ha")+
    scale_x_discrete(labels=c("Control 1", "Control 2", "Canola Meal", "Canola Hull", "Manure", "Willow",
                              "Fert.\nPhosphorus"))+
    theme(legend.position="top", legend.justification="center", legend.key.size=unit(10,"mm"),
          legend.text=element_text(size=14), legend.title = element_text(size = 16),
          plot.title=element_text(size=18),
          axis.text.x=element_text(angle=90, vjust=0.5, hjust=1, size=18, face="bold", colour="black"),
          axis.title.x=element_blank(), axis.title.y=element_text(size=22, face="bold"),
          panel.background = element_blank(),
          panel.border=element_blank(), panel.grid.major=element_blank(),
          panel.grid.minor=element_blank(), axis.line=element_line(colour="black")))
ggsave(Drywt50kg, file="Pots1_Biomass_50kgPha.jpg", width=12, height=8, dpi=150)  
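
A separate thing worth ruling out, again as an assumption rather than a diagnosis: the plot above dodges the bars with position_dodge2() and the error bars and letters with position_dodge(). A minimal sketch of sharing one position object across all three layers is shown below. It uses plain geom_col() in place of geom_bar_pattern() to stay self-contained, and the four rows are taken from the Control1 and Control2 values in the test data.

```r
library(ggplot2)

# Four rows taken from the test data above (Control1 and Control2, both soils).
plot_df <- data.frame(
  Treatment  = rep(c("Control1", "Control2"), times = 2),
  Soil       = rep(c("Haverhill", "Oxbow"), each = 2),
  emmean     = c(1.03, 2.5975, 1.3075, 2.62),
  SE         = c(0.697872552, 0.604375358, 0.604375358, 0.604375358),
  CLDletters = c("f", "ef", "f", "ef")
)

# One shared position object, so every layer is dodged the same way.
dodge <- position_dodge(width = 0.9)

p <- ggplot(plot_df, aes(x = Treatment, y = emmean, group = Soil)) +
  geom_col(aes(fill = Soil), position = dodge, colour = "black") +
  geom_errorbar(aes(ymin = emmean - SE, ymax = emmean + SE),
                width = 0.2, position = dodge) +
  geom_text(aes(label = CLDletters, y = emmean + SE + 0.5),
            position = dodge, size = 6)

print(p)
```

Setting group = Soil explicitly in the top-level aes() gives the text layer the same grouping as the bars, which the dodge needs in order to offset the labels.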