I have a table called referrals
CREATE TABLE [dataproduct].[referrals]
(
[person_id] DOUBLE,
[medical_service_cd] DOUBLE,
[refer_from_organization_id]
[referral_org_name] STRING,
[refer_from_provider_id] DOUBLE,
[provider_from_name] STRING,
[refer_to_provider_id] DOUBLE,
[provider_to_name] STRING,
[outbound_encntr_id DOUBLE,
[order_id] DOUBLE,
[referral_written_dt_tm] STRING,
[requested_start_dt_tm] STRING,
[medical_service] STRING,
[referral_status] STRING,
[loc_code] STRING,
[service_type] STRING,
[referral_reason] STRING,
[treatment_text] STRING
)
I want to find patients, by person_id, who were referred to Neurology FOLLOWED BY Medical Genetics (as listed under medical_service) using referral_written_dt_tm. I confirmed that some patients had both medical services, but want to find the ones who had them in the desired order. so I tried this:
SELECT r.person_id,
COUNT(DISTINCT r.medical_service) as count,
row_number() OVER (PARTITION BY person_id ORDER BY referral_written_dt_tm ASC) AS row_num
FROM matt_dataproduct.referrals r
WHERE r.medical_service IN ('Medical Genetics',
'Neurology'
)
GROUP BY r.person_id,
HAVING count > 1;
I got no results. I just want a list of patients who meet the criteria (medical genetics followed by neurology). Is there some way I should revise my query? Something about it seems off.
One way is probably something like the following:
This simply joins the table to itself (once for Neurology and once for Medical Genetics) and finds rows where the same patient has Neurology before Medical Genetics. The
group byis being used to get distinct values (could usedistinctin theselectpart instead).You said the type of
eferral_written_dt_tmis a string (not an actual date). I'm assuming it's directly comparable without conversion (since you used it in your window function'sorder by), but if not you would replacern.eferral_written_dt_tm < rmg.eferral_written_dt_tmwith the appropriate functions or conversions so that it's comparable.You could use the
row_numberwindow function as you tried (or maybe a variation withlag) by wrapping it in another query to compare values. But in this case I don't see any big advantage over the simple self-join.