The yearmon class from the zoo package seems particularly useful for working with data that only has a year and month, but no day or time. However, I find it difficult to work with, particularly when trying to filter out a range of dates.
yearmon is kind of a weird data class. It displays dates like Jun 2002. This is more aesthetically pleasing, but (correct me if I'm wrong) underneath it's stored as the year plus a fraction of the year, with January being 0/12 and December being 11/12. So, Jun 2002, for example, would actually be about 2002.417 (i.e., 2002 + 5/12) under the hood.
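A minimal sketch of that representation, assuming the zoo package is installed. The variable name ym is only for illustration.

```r
library(zoo)

# Parse "2002-06" as a yearmon value (June 2002)
ym <- as.yearmon("2002-06")

# The underlying number is year + (month - 1)/12
num <- as.numeric(ym)
print(round(num, 3))   # 2002.417

# The stored value is 2002 + 5/12 (up to floating-point tolerance)
print(isTRUE(all.equal(num, 2002 + 5/12)))
```

Note that 2002.417 is only the rounded display of 2002.41666..., which matters once exact matching is involved.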
Let's assume now that you have monthly data over a ten year period (2000-2009), kinda like this.
   date      value
   <yearmon> <dbl>
 1 Jan 2000  10000
 2 Feb 2000  30000
 3 Mar 2000    250
 4 Apr 2000     20
 5 May 2000  50000
 6 Jun 2000 -90042
 7 Jul 2000  73400
 8 Aug 2000   4317
 9 Sep 2000   1000
10 Oct 2000    -22
You want to filter out rows that fall between June 2002 and November 2003 and keep everything else in your data set.
%in% is good for keeping only a specific range of values, but I often find taking the inverse of this function just as useful, like the following.
`%!in%` <- Negate(`%in%`)
My understanding is that you can't work with what yearmon displays. Meaning, you can't actually reference dates like "Jun 2002", but instead you need to work with what's stored under the hood (i.e., 2002.417). I would think it'd be sufficient to use filter() from dplyr to filter out the date range through something like the following:
my_data |>
filter(date %!in% c(2002.417:2003.833))
However, I find that this doesn't seem to work, as the months from June 2002 to November 2003 won't be removed. I think it might have something to do with the colon not working with yearmon somehow? I tried only removing June 2002 (i.e., removed the colon and 2003.833) and it was able to remove that single date. However, I find it doesn't like it when you try to specify a range of dates to remove. If the dates were numeric, say, written as 200206 and 200311, I've found that running my above code works in removing that period.
Does anyone know how to filter out a range of dates in class yearmon? I'm surprised that there doesn't seem to be much stuff about filtering out date ranges (usually I find online tutorials only show how to keep a specific range, rather than exclude).
Also, because I don't see a lot of discussion on it either, what are people's thoughts on the yearmon class from zoo? Do you find it useful? Many tutorials only seem to use yearmon for extracting the month or year into a separate column. Am I wrong to keep dates with only a month and year in class yearmon? Is it better to store them as class Date and just add 01 at the end for the day so it's like 2002-06-01?
I've tried using != instead of %!in%. I also tried feeding in c("Jun 2002":"Nov 2003"), but I get an NA/NaN error message. I've also tried converting it to a Date class and using c("2002-06-01":"2003-11-01"), but I think I also got an NA/NaN error, even after dropping the quotation marks and swapping the hyphens for slashes, c("2002/06/01":"2003/11/01"). I also tried dropping the concatenate function and brackets in case the numeric date values were sufficient, but that didn't work either. I also tried playing with the Negate of the between() function, but found that it didn't work and was making the code more complicated than it ought to be.
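I have never used yearmon before, but I believe your error stems from the use of the colon operator (:). Using a colon like 2000:2003 will produce integer values from 2000 to 2003. With non-integer values, it starts at the first value and increments by 1 for as long as it does not pass the end point, so 2002.417:2003.833 yields only 2002.417 and 2003.417. This would not produce the desired matches in your situation.

The function you are looking for is seq(), which lets you set the step size through its by parameter (1/12 for one month).

If you call seq() from 2002.417 to 2003.833 in steps of 1/12, you'll notice Nov 2003 (or 2003 + (11 - 1)/12, about 2003.833) is missing. This is because seq() cannot exceed the to parameter: starting from the rounded value 2002.417, the last step lands at about 2003.8337, just past 2003.833. Increasing the to parameter by a small amount will remedy that issue.

Using the appropriate seq() call in place of c() in your filter() should return the correct subset of data. Because these are rounded decimals rather than the exact stored values, it may be safer to wrap the result in as.yearmon() so that both sides are compared as yearmon values.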
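The points above can be sketched as follows. This is an illustration under stated assumptions, not necessarily the code originally posted: my_data here is a hypothetical stand-in with one row per month for 2000-2009, and the endpoints are built from exact fractions (2002 + 5/12, 2003 + 10/12) rather than the rounded decimals, which is a small variation on nudging the to parameter upward. Only base R and zoo are used.

```r
library(zoo)

# Hypothetical stand-in for my_data: one row per month, 2000-2009
my_data <- data.frame(value = 1:120)
my_data$date <- as.yearmon(2000 + (0:119) / 12)

# The colon operator steps by 1, so only two values come out
colon_vals <- 2002.417:2003.833
print(colon_vals)   # 2002.417 2003.417

# seq() with by = 1/12 steps one month at a time; as.yearmon()
# converts the numbers to yearmon values for Jun 2002 to Nov 2003
drop_months <- as.yearmon(seq(2002 + 5/12, 2003 + 10/12, by = 1/12))

# Option 1: exclude rows whose date is in the set of months to drop
kept_in <- my_data[!(my_data$date %in% drop_months), ]

# Option 2: exclude rows by comparing against the range endpoints
start <- as.yearmon("2002-06")
end   <- as.yearmon("2003-11")
kept_cmp <- my_data[my_data$date < start | my_data$date > end, ]

print(nrow(kept_in))    # 102 rows remain (120 minus 18 months)
print(nrow(kept_cmp))   # 102
```

The same logical conditions should work inside dplyr's filter(), for example filter(date < start | date > end). The comparison form avoids building a sequence at all, which sidesteps the rounding issue entirely.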