Return Duplicate Rows

49 Views Asked by At

I'm executing below query in, but return duplicate rows when we do not have any records for "LastMonth" and "CurrentMonth". Please let me know where is the issue?

**Output:**
Project ProCount CurrentDateAvg thisWeekAvg thisMonthAvg lastMonthAvg           
xyz         0            0          0.19       0.09       0
xyz         5            0          0.0         0.0       0


WITH 
ThisWeek AS (SELECT  'xyz' as 'Project_Name', 0.19 as 'thisWeekAvg' ), 
ThisMonth AS ( SELECT  'xyz' as 'Project_Name', 0.09 as 'thisMonthAvg'), 
LastMonth AS( SELECT  'xyz' as 'Project_Name', 0 as 'lastMonthAvg'  where 1=2), 
CurrentDate AS( SELECT  'xyz' as 'Project_Name', 0 as 'CurrentDateAvg'  where 1=2), 
ProjectCount AS( SELECT  'xyz' as 'Project_Name', 5 AS 'ProCount' ) 
              SELECT 
                             COALESCE(w.Project_Name, m.Project_Name, l.Project_Name,c.Project_Name,pc.Project_Name) AS 'Project',
                             COALESCE(pc.ProCount,0) 'ProCount', 
                             COALESCE(c.CurrentDateAvg,0) 'CurrentDateAvg', 
                             COALESCE(w.thisWeekAvg,0) 'thisWeekAvg', 
                             COALESCE(m.thisMonthAvg,0) 'thisMonthAvg', 
                             COALESCE(l.lastMonthAvg,0) 'lastMonthAvg' 
                             FROM 
                             ThisWeek w 
                             FULL OUTER JOIN ThisMonth m ON         w.Project_Name = m.Project_Name
                             FULL OUTER JOIN LastMonth l    ON         m.Project_Name = l.Project_Name
                             FULL OUTER JOIN ProjectCount pc ON    l.Project_Name = pc.Project_Name 
                             FULL OUTER JOIN CurrentDate c ON         pc.Project_Name = c.Project_Name
2

There are 2 best solutions below

7
Thom A On

It's expected behavior because you aren't handling your NULL values in your JOINs.

Let's number them for ease:

ThisWeek w --1
FULL OUTER JOIN ThisMonth m ON w.Project_Name = m.Project_Name --2
FULL OUTER JOIN LastMonth l ON m.Project_Name = l.Project_Name --3
FULL OUTER JOIN ProjectCount pc ON l.Project_Name = pc.Project_Name --4
FULL OUTER JOIN CurrentDate c ON pc.Project_Name = c.Project_Name --5

So we have:

  1. This Week with a project called "xyz".
  2. This Month, with a project of "xyz" to JOIN to This Week (which also has "xyz")
  3. Last Month, with no projects, to JOIN to This Month (no rows JOINed)
  4. Current Data, with no projects, to JOIN to Last Month (no rows JOINed)
  5. Project Count, with a project of "xyz", to JOIN to no rows; creates a new row due to FULL OUTER JOIN. NULL <> 'xyz'

For stuff like this you likely want an IN:

WITH
    ThisWeek AS (
        SELECT
            'xyz' AS Project_Name,
            0.19 AS thisWeekAvg
    ),
    ThisMonth AS (
        SELECT
            'xyz' AS Project_Name,
            0.09 AS thisMonthAvg
    ),
    LastMonth AS (
        SELECT
            'xyz' AS Project_Name,
            0 AS lastMonthAvg
        WHERE
            1 = 2
    ),
    CurrentDate AS (
        SELECT
            'xyz' AS Project_Name,
            0 AS CurrentDateAvg
        WHERE
            1 = 2
    ),
    ProjectCount AS (
        SELECT
            'xyz' AS Project_Name,
            5 AS ProCount
    )
SELECT
    COALESCE(
        w.Project_Name,
        m.Project_Name,
        l.Project_Name,
        c.Project_Name,
        pc.Project_Name
    ) AS Project,
    COALESCE(pc.ProCount, 0) ProCount,
    COALESCE(c.CurrentDateAvg, 0) CurrentDateAvg,
    COALESCE(w.thisWeekAvg, 0) thisWeekAvg,
    COALESCE(m.thisMonthAvg, 0) thisMonthAvg,
    COALESCE(l.lastMonthAvg, 0) lastMonthAvg
FROM
    ThisWeek w
    FULL OUTER JOIN ThisMonth m ON w.Project_Name = m.Project_Name --WE don't need an IN here
    FULL OUTER JOIN LastMonth l ON l.Project_Name IN (w.Project_Name,m.Project_Name)
    FULL OUTER JOIN ProjectCount pc ON pc.Project_Name IN (w.Project_Name,m.Project_Name,l.Project_Name)
    FULL OUTER JOIN CurrentDate c ON c.Project_Name IN (w.Project_Name,m.Project_Name,l.Project_Name,pc.Project_Name);
2
Cetin Basoz On

It happens because there is not match to pc.Project_Name = l.Project_Name.

WITH ThisWeek
AS (SELECT 'xyz' AS Project_Name,
           0.19 AS thisWeekAvg),
     ThisMonth
AS (SELECT 'xyz' AS Project_Name,
           0.09 AS thisMonthAvg),
     LastMonth
AS (SELECT 'xyz' AS Project_Name,
           0 AS lastMonthAvg
    WHERE 1 = 2),
     CurrentDate
AS (SELECT 'xyz' AS Project_Name,
           0 AS CurrentDateAvg
    WHERE 1 = 2),
     ProjectCount
AS (SELECT 'xyz' AS Project_Name,
           5 AS ProCount)
SELECT COALESCE(w.Project_Name, m.Project_Name, l.Project_Name, c.Project_Name, pc.Project_Name) AS Project,
       COALESCE(pc.ProCount, 0) ProCount,
       COALESCE(c.CurrentDateAvg, 0) CurrentDateAvg,
       COALESCE(w.thisWeekAvg, 0) thisWeekAvg,
       COALESCE(m.thisMonthAvg, 0) thisMonthAvg,
       COALESCE(l.lastMonthAvg, 0) lastMonthAvg
FROM ThisWeek w
    FULL OUTER JOIN ThisMonth m
        ON w.Project_Name = m.Project_Name
    FULL OUTER JOIN LastMonth l
        ON m.Project_Name = l.Project_Name
           OR l.Project_Name = w.Project_Name
    FULL OUTER JOIN ProjectCount pc
        ON l.Project_Name = pc.Project_Name
           OR pc.Project_Name = w.Project_Name
           OR pc.Project_Name = m.Project_Name
    FULL OUTER JOIN CurrentDate c
        ON pc.Project_Name = c.Project_Name
           OR c.Project_Name = w.Project_Name
           OR c.Project_Name = m.Project_Name
           OR c.Project_Name = l.Project_Name;