I am working with Netezza SQL.
I have the two following tables:
# table_a
id year
1 2010
1 2011
1 2012
1 2013
1 2014
2 2010
2 2011
2 2012
2 2013
2 2014
3 2010
3 2011
3 2012
3 2013
3 2014
# table_b
id year var1
1 2010 A
1 2014 B
2 2012 A
2 2013 C
3 2010 A
3 2011 Z
My Question: I am trying to write SQL code to accomplish the following task:
- Step 1: For a given ID in table_b, take the earliest available year : I want to add new rows for this id until 2014. For example, ID = 1 would have rows added for 2011, 2012, 2013. Id = 2 would have rows added only for 2014. ID = 3 would have rows added for 2012, 2013, 2014
- Step 2: Anytime a new row is added, the value of var1 is assigned the text 'not available'
- However, the only exception to Step 2: If a given ID has var1 = Z, then for all new rows that are added, var1 will be assigned a value of 'Z'.
The final answer would look something like this:
id year var1
1 2010 A
1 2011 not available
1 2012 not available
1 2013 not available
1 2014 B
2 2012 A
2 2013 C
2 2014 not available
3 2010 A
3 2011 Z
3 2012 Z
3 2013 Z
3 2014 Z
Here is what I tried so far:
Part 1: I understand that I need to first create the missing rows - I thinks can be done like this:
select
g.id,
h.year
from
(select id,
min(year) as start
from table_b
group by id)g
join table_a h on
h.id
between g.start and 2014)
But from here, I am not sure how to proceed. Can someone please show me how to continue working on this problem?
Note: Ideally, I would like to solve this problem with a set of basic CTEs as recursive functions are not available within Netezza.
Thanks!
Perform a LEFT JOIN:
https://www.ibm.com/docs/en/psfa/7.2.1?topic=joins-left-outer-joinleft-join
https://www.ibm.com/docs/en/db2/11.5?topic=functions-coalesce