I'm currently studying data analysis in R and RStudio, and I have an issue with a double Y-axis visualization. This is my code:
```r
library(dplyr)
library(ggplot2)

# Per-athlete mean and standard deviation of CDE
Test_Zscore <- traitement_CDE_LONG |>
  group_by(Athletes) |>
  mutate(moyenne = mean(CDE), SD = sd(CDE))

# One row per athlete, with the Z-score of CDE_sum
traitement_final <- left_join(CDE, Test_Zscore, by = "Athletes") |>
  select(Athletes, Date.x, CDE_sum, moyenne, SD) |>
  distinct(Athletes, .keep_all = TRUE) |>
  group_by(Athletes) |>
  mutate(Z_Score = (CDE_sum - moyenne) / SD)

plot_CDE_jour <- ggplot(data = traitement_final, aes(x = Athletes)) +
  geom_col(aes(y = CDE_sum), fill = "skyblue") + # Columns for the CDE values
  # Line for the Z-scores; adjust the scaling factor and offset to align with the CDE Y-axis
  geom_line(aes(y = Z_Score * 750 - 1500, group = 1), colour = "red") +
  scale_y_continuous(
    "CDE",
    # Secondary Y-axis for the Z-scores; adjust as needed
    sec.axis = sec_axis(~ (. + 1500) / 750 - 4, name = "Z-Scores")
  ) +
  scale_x_discrete("Athletes") +
  coord_cartesian(ylim = c(0, 1500)) + # Limits of the primary Y-axis
  labs(title = "CDE et Zscore par Athlète") +
  theme_minimal() +
  geom_hline(yintercept = -1.96 * 750 - 1500, linetype = "dashed", color = "blue") + # Threshold at -1.96
  geom_hline(yintercept = 1.96 * 750 - 1500, linetype = "dashed", color = "blue") # Threshold at 1.96

print(plot_CDE_jour)
```
Picture 1: What I get

Picture 2: What I want
My data:

```r
structure(list(Athletes = c("Abadie", "Abescat", "Antonescu",
"Auradou", "Balfet", "Barbaste", "Betham", "Boundjema", "Castinel",
"Chauvet"), Date.x = structure(c(19747, 19747, 19747, 19747,
19747, 19747, 19747, 19747, 19747, 19747), class = "Date"), CDE_sum = c(824,
690, 750, 481, 756, 764, 654, 516, 695, 746), moyenne = c(710.558181818182,
738.504504504505, 596.219117647059, 637.671287128713, 714.474698795181,
748.532978723404, 634.503260869565, 524.178947368421, 620.496330275229,
642.718348623853), SD = c(417.313941045778, 363.098405992192,
302.630508794043, 293.807319628573, 326.882040275206, 360.934928871125,
335.865907456306, 311.70092564648, 360.183333957143, 346.311225677975
), Z_Score = c(0.271838074466278, -0.133585010851157, 0.508147321186304,
-0.53324501012015, 0.127034514254312, 0.0428526585802353, 0.0580491758693049,
-0.0262397275576183, 0.206849297845734, 0.298233622586019)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L), groups = structure(list(
Athletes = c("Abadie", "Abescat", "Antonescu", "Auradou",
"Balfet", "Barbaste", "Betham", "Boundjema", "Castinel",
"Chauvet"), .rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -10L), .drop = TRUE))
```
With this code I get the visualization shown in "What I get".

As you can see, the geom_line() doesn't appear. I think it's a scaling issue, but I don't know how to fix it. I want a geom_col() like the one in the graph, with a geom_line() superimposed on its own scale. I also want two thresholds drawn with geom_hline(). The first Y-axis should run from 0 to 1500 and the second from -2 to 2. Could someone help me adjust my second Y-axis correctly, or suggest another way to do what I want? I'm also open to any suggestions that improve my code. Please forgive my English.
Thank you!
I tried several ways to scale my Y-axis, but none of them worked: there is always a problem with geom_line() or with the scaling.
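
One way to reason about the two scales described above (primary axis 0 to 1500, secondary axis -2 to 2) is as a single linear map and its exact inverse. The following is only a sketch under those assumed ranges, not a confirmed fix; the helper names `z_to_cde` and `cde_to_z` are hypothetical and do not come from ggplot2.

```r
# Assumed target ranges, taken from the description above:
# primary axis 0..1500 (CDE), secondary axis -2..2 (Z-scores).
cde_range <- c(0, 1500)
z_range   <- c(-2, 2)

# Slope and intercept of the linear map from Z-score units to CDE units
slope     <- diff(cde_range) / diff(z_range)     # 1500 / 4 = 375
intercept <- cde_range[1] - slope * z_range[1]   # 0 - 375 * (-2) = 750

# Hypothetical helpers: the forward map places Z-scores on the primary axis,
# and its exact inverse labels the secondary axis.
z_to_cde <- function(z) z * slope + intercept
cde_to_z <- function(y) (y - intercept) / slope

z_to_cde(c(-2, -1.96, 0, 1.96, 2))  # 0, 15, 750, 1485, 1500
cde_to_z(c(0, 750, 1500))           # -2, 0, 2

# For comparison, the transform used in the code above (Z * 750 - 1500) sends
# the observed Z-scores (about -0.53 to 0.51) to negative CDE values, which
# lie outside ylim = c(0, 1500):
range(c(-0.533, 0.508) * 750 - 1500)
```

Under these assumptions, the helpers would plug into the plot as `geom_line(aes(y = z_to_cde(Z_Score), group = 1))`, `sec_axis(~ cde_to_z(.), name = "Z-Scores")`, and `geom_hline(yintercept = z_to_cde(c(-1.96, 1.96)))`. The key point is that the formula passed to `sec_axis()` has to be the inverse of the transform applied to the line. In the code above, the inverse of `Z_Score * 750 - 1500` would be `(. + 1500) / 750`, without the extra `- 4`.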