I have trade data downloaded from UN Comtrade (https://comtradeplus.un.org/TradeFlow), which I have summarized to a high level.
The ns_eu_category variable is my own division of the world into these regions:
"East Asia and Pacific", "Global North", "Latin America and Caribbean", "Middle East and North Africa", "Non-EU Former Soviet Bloc Countries", "South Asia", and "Sub-Saharan Africa". I don't think this is the source of the problem, so we can ignore the exact divisions for now.
> longterm_trade_data
# A tibble: 7,364 × 6
ns_eu_category year sitc_code import_or_export value sector
<chr> <dbl> <chr> <chr> <dbl> <chr>
1 East Asia and Pacific 1962 0 Export 946694358 Food And Live Animals
2 East Asia and Pacific 1962 0 Import 745286120 Food And Live Animals
3 East Asia and Pacific 1962 1 Export 60846922 Beverages And Tobacco
4 East Asia and Pacific 1962 1 Import 67321814 Beverages And Tobacco
5 East Asia and Pacific 1962 2 Export 1479804622 Crude Materials, Inedible, Except Fuels
6 East Asia and Pacific 1962 2 Import 640428682 Crude Materials, Inedible, Except Fuels
7 East Asia and Pacific 1962 3 Export 482623764 Mineral Fuels, Lubric. And Related Mtrls
8 East Asia and Pacific 1962 3 Import 416870707 Mineral Fuels, Lubric. And Related Mtrls
9 East Asia and Pacific 1962 4 Export 66775599 Animal And Vegetable Oils,Fats And Waxes
10 East Asia and Pacific 1962 4 Import 42687574 Animal And Vegetable Oils,Fats And Waxes
# ℹ 7,354 more rows
# ℹ Use `print(n = ...)` to see more rows
I take these aggregated statistics and turn the value of each trade sector into a percentage so I can put it in an area graph:
library(tidyverse)  # dplyr (group_by, mutate), tidyr (drop_na), ggplot2

trade_data_sector <- longterm_trade_data %>%
group_by(ns_eu_category, year, import_or_export) %>%
mutate(total_of_sectors = sum(value)) %>%
ungroup() %>%
drop_na() %>%
mutate(percent = value / total_of_sectors)
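As a sanity check on this step, the shares within each region/year/flow group should add up to 1, at least up to floating-point rounding. Below is a minimal sketch on hypothetical toy data (the `toy` tibble and its values are made up for illustration and are not the Comtrade data):

```r
library(dplyr)

# Hypothetical toy data standing in for longterm_trade_data
toy <- tibble(
  ns_eu_category   = "Region A",
  year             = rep(c(1962, 1963), each = 3),
  import_or_export = "Export",
  sitc_code        = rep(c("0", "1", "2"), times = 2),
  value            = c(946694358, 60846922, 1479804622, 10, 20, 70)
)

# Same grouping as above: each sector's share of its group total,
# then the sum of those shares per group
shares <- toy %>%
  group_by(ns_eu_category, year, import_or_export) %>%
  mutate(percent = value / sum(value)) %>%
  summarise(total_share = sum(percent), .groups = "drop")

# The totals are 1 only up to rounding, so compare with a tolerance
# rather than with ==
shares %>% mutate(close_to_one = abs(total_share - 1) < 1e-9)
```

Running the same summary on trade_data_sector would show whether any group's total deviates from 1 by more than rounding error.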
I then try to produce an area graph:
# "East Asia and Pacific" "Global North"
# "Latin America and Caribbean" "Middle East and North Africa" "Non-EU Former Soviet Bloc Countries"
# "South Asia" "Sub-Saharan Africa"
region <- "Sub-Saharan Africa"
ix <- "Export"
trade_data_sector %>%
mutate(truncated_name = sector %>% substr(0L, 10L),
descriptor = paste0(sitc_code, ": ", truncated_name)) %>%
filter(ns_eu_category == region, import_or_export == ix) %>%
ggplot(aes(x = year, y = percent, fill = descriptor)) +
geom_area() +
theme_minimal() +
labs(title = paste0(ix, "s in ", region, " Over Time"),  # ix is the variable defined above; import_or_export is only a column
caption = "Source: UN COMTRADE Database 1962-2023") +
scale_y_continuous(breaks = seq(from = 0, to = 1, by = 0.1), labels = scales::percent, limits = c(0, 1)) +
scale_x_discrete(limits = 1962:2023, expand = c(0,0)) +
theme(
# panel.grid.major.y = element_line(color = "dark gray", linewidth = 0.1, linetype = "dashed"),
# panel.grid.major.x = element_blank(),
axis.ticks.x=element_line(linewidth=0.2),
axis.text.x = element_text(size = 6, family=my_font, angle=-90, vjust=0.5),
axis.title.x = element_text(size = 8, family=my_font),
axis.text.y=element_text(size = 6, family=my_font),
# axis.ticks.y=element_line(),
axis.title.y = element_text(size = 8, family=my_font),
panel.grid = element_blank(),
legend.position="bottom",
plot.title = element_text(size = 12, family=my_font),
plot.subtitle = element_text(size = 10, family=my_font),
legend.title = element_text( size=8, family=my_font),
legend.text = element_text( size=8, family=my_font),
strip.text = element_text(size=8, family=my_font),
legend.key.size = unit(0.3, "cm"),
plot.caption = element_text(size = 7, color="dark gray", family=my_font)
)
The result is this:
[Figure: graph of exports from Sub-Saharan Africa]
Note: the data from 2010-2022 is missing right now, so ignore that part of the graph.
Not only does it look a lot more erratic than it should, but there are entire sections where SITC Code 0 (Food and Live Animals) just disappears. As the following graph shows, there was never a time when this amount was zero:
sector_code <- "0"
trade_data_sector %>%
filter(ns_eu_category == region, import_or_export == ix, sitc_code == sector_code) %>%
ggplot(aes(x = year, y = value)) +
geom_line() +
theme_minimal() +
labs(title = paste0(ix, "s in ", region, " Over Time (Sector ", sector_code, ")"),  # use the variables defined above
caption = "Source: UN COMTRADE Database 1962-2023") +
scale_x_discrete(limits = 1962:2022) +
# scale_y_continuous(breaks = seq(from = 0, to = 600, by = 100), limits=c(0,700)) +
theme(
panel.grid.major.y = element_line(color = "dark gray", linewidth = 0.1, linetype = "dashed"),
# panel.grid.major.x = element_blank(),
# axis.ticks.x=element_blank(),
axis.text.x = element_text(size = 6, family=my_font, angle=-90, vjust=0.5),
axis.title.x = element_text(size = 8, family=my_font),
axis.text.y=element_text(size = 6, family=my_font),
# axis.ticks.y=element_line(),
axis.title.y = element_text(size = 8, family=my_font),
panel.grid = element_blank(),
legend.position="bottom",
plot.title = element_text(size = 10, family=my_font),
plot.subtitle = element_text(size = 8, family=my_font),
legend.title = element_text( size=8, family=my_font),
legend.text = element_text( size=8, family=my_font),
strip.text = element_text(size=8, family=my_font),
legend.key.size = unit(0.3, "cm"),
plot.caption = element_text(size = 7, color="dark gray", family=my_font)
)
[Figure: percentage of exports in Sector 0]
What could be causing this? This isn't just happening with Sub-Saharan Africa; these gaps appear for other regions of the world as well.
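The problem was the result of setting limits = c(0, 1) within scale_y_continuous. When limits are set on a continuous scale, ggplot2 by default replaces any value outside them with NA, and this is applied to the stacked values as well. So in any year where the stacked shares add up to slightly more than 1 (most likely through floating-point rounding), the top layer of the stack, here sector 0, is dropped. After removing the limits, the graph looks normal.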
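A minimal sketch of why the limits matter, assuming the default out-of-bounds handling (scales::censor) is what removes the layer. The `toy` data frame and its values are hypothetical; coord_cartesian is offered as one possible way to keep a fixed 0-100% axis without discarding data, not as the only fix:

```r
library(ggplot2)

# scales::censor is the default out-of-bounds handler for continuous scales.
# It turns anything outside the range into NA, with no tolerance, so a
# stacked total that overshoots 1 by rounding error is discarded.
top_of_stack <- c(0.5, 1, 1 + .Machine$double.eps)
censored <- scales::censor(top_of_stack, range = c(0, 1))
# The third element, just above 1, becomes NA

# Hypothetical toy shares standing in for trade_data_sector
toy <- data.frame(
  year       = rep(1962:1964, each = 3),
  descriptor = rep(c("0: Food", "1: Beverages", "2: Crude"), times = 3),
  percent    = c(0.2, 0.3, 0.5, 0.1, 0.4, 0.5, 0.3, 0.3, 0.4)
)

# coord_cartesian() zooms the view to 0-1 without censoring any data,
# unlike limits = c(0, 1) inside scale_y_continuous()
p <- ggplot(toy, aes(x = year, y = percent, fill = descriptor)) +
  geom_area() +
  scale_y_continuous(breaks = seq(0, 1, by = 0.1), labels = scales::percent) +
  coord_cartesian(ylim = c(0, 1))
```

Another option, if the limits need to stay on the scale itself, is to pass a different out-of-bounds function such as oob = scales::oob_keep (available in recent versions of the scales package) to scale_y_continuous.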