I have a frequency table that contains 800 rows. Here is a similar table for example:
table <- data.frame(number = 1:10,
                    units = c(800, 780, 500, 430, 200, 189, 110, 86, 54, 31))
I would like to create a table with fewer rows, using intervals that each contain roughly the same number of units. The intervals must be discrete and based on the number column. How can I do this in R?
If the above is not possible, some help with aggregating the table with personalised intervals would be much appreciated.
It seems that you are looking for binning. Try cut_interval() from {ggplot2}: it will create several groups (bins) of your values; then look at the levels. Another similar function is cut_number().
EDIT:
I apologise, it turned out that that function is pretty useless here.
It's still not completely clear what your desired output is.
I have two solutions.
Scenario 1: you just want to keep, let's say, every 3rd row. Here, you can simply subset with filter:
Output:
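As a rough illustration of Scenario 1 (a sketch, not the original code from this answer), every 3rd row can be kept by testing the row index modulo 3. The sketch uses base R so that it runs without extra packages; the data frame name freq is an assumption, chosen to avoid masking base::table, and the dplyr form is shown only as a comment.

```r
# Example data from the question (named freq here, an assumed name)
freq <- data.frame(number = 1:10,
                   units = c(800, 780, 500, 430, 200, 189, 110, 86, 54, 31))

# Keep every 3rd row, i.e. rows 3, 6 and 9
every_third <- freq[seq_len(nrow(freq)) %% 3 == 0, ]

# Equivalent with dplyr, assuming the package is installed:
# dplyr::filter(freq, dplyr::row_number() %% 3 == 0)

print(every_third)
```

Note that this simply drops the other rows, so their units are lost rather than aggregated.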
Scenario 2, based on bins - taking one value out of a bin:
Output:
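A minimal sketch of Scenario 2, assuming equal-width bins on the number column and assuming "one value out of a bin" means the first row of each bin. It uses base R's cut() in place of cut_interval() so that it runs without {ggplot2}; the names freq and first_per_bin are hypothetical.

```r
# Example data from the question (named freq here, an assumed name)
freq <- data.frame(number = 1:10,
                   units = c(800, 780, 500, 430, 200, 189, 110, 86, 54, 31))

# Split the number column into 3 equal-width bins
freq$bin <- cut(freq$number, breaks = 3)

# Take one row per bin: here, the first row that falls into each bin
first_per_bin <- freq[!duplicated(freq$bin), ]

print(first_per_bin)
```

With 3 bins over 1 to 10, the bins cover the numbers 1-4, 5-7 and 8-10, so the rows kept are those for numbers 1, 5 and 8.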
Choose the scenario you like. You can also provide the desired output to avoid confusion.
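If the goal is closer to the question's wording (intervals holding roughly the same number of units), one possible approach, not part of the two scenarios above, is to assign groups from the cumulative sum of units and then aggregate. This is a sketch in base R under stated assumptions: k = 3 target groups, and the names freq, grp, binned and custom are hypothetical.

```r
# Example data from the question (named freq here, an assumed name)
freq <- data.frame(number = 1:10,
                   units = c(800, 780, 500, 430, 200, 189, 110, 86, 54, 31))

k <- 3                                  # assumed number of intervals
target <- sum(freq$units) / k           # ideal units per interval

# A row goes to group g while the running total is within g * target;
# pmin() guards against rounding pushing the last row past k
grp <- pmin(ceiling(cumsum(freq$units) / target), k)

# Aggregate: interval bounds on the number column and total units per group
binned <- data.frame(
  from  = as.vector(tapply(freq$number, grp, min)),
  to    = as.vector(tapply(freq$number, grp, max)),
  units = as.vector(tapply(freq$units, grp, sum))
)
print(binned)

# Hand-picked ("personalised") intervals work the same way,
# e.g. numbers 1-2, 3-5 and 6-10
custom <- cut(freq$number, breaks = c(0, 2, 5, 10))
custom_units <- as.vector(tapply(freq$units, custom, sum))
print(custom_units)
```

For this data the cumulative-sum grouping gives intervals 1-1, 2-3 and 4-10 with 800, 1280 and 1100 units. Because rows cannot be split, the totals are only roughly equal, and a greedy rule like this will not always find the most balanced split.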