Using COALESCE in postgresql with UPDATE

344 Views Asked by At

I have a table of real estate data with lots of NULLS in address column. I want to replace these nulls based on the unique ID. (I'm following Alex the Analyst tutorial). However when I use this in postgresql, I'm getting an error:

ERROR: relation "a" does not exist

UPDATE a SET
propertyaddress = COALESCE(a.propertyaddress, b.propertyaddress)
FROM nashville_housing_data a
JOIN nashville_housing_data b
  ON a.parcelid = b.parcelid
AND a.uniqueid <> b.uniqueid
WHERE propertyaddress is null

Now, when I replaced a with the table name, it updated all address rows to one specific address.

5

There are 5 best solutions below

0
Bohemian On

You can't UPDATE aliases; you must use the table name.

Also, COALESCE is useless here because the condition WHERE propertyaddress is null guarantees the first term is always null, so just assign b.propertyaddress.

Try this:

UPDATE nashville_housing_data SET
propertyaddress = b.propertyaddress
FROM nashville_housing_data
JOIN nashville_housing_data b
  ON nashville_housing_data.parcelid = b.parcelid
AND nashville_housing_data.uniqueid <> b.uniqueid
WHERE nashville_housing_data.propertyaddress is null

If there can be more than one other rows with a matching parcelid, break the tie with MAX():

...
propertyaddress = MAX(b.propertyaddress)
...
0
Nisar Mehmood On

The Issue here is with the table alias 'a'. You cannot directly reference the updated table using an alias in the UPDATE statement. Instead, it would help if you directly referenced the table name. Try using the query with the alias 'a'.

UPDATE nashville_housing_data
SET propertyaddress = COALESCE(propertyaddress, b.propertyaddress)
FROM nashville_housing_data b
WHERE nashville_housing_data.propertyaddress IS NULL
  AND nashville_housing_data.parcelid = b.parcelid
  AND nashville_housing_data.uniqueid <> b.uniqueid;
0
lemon On

You can indeed use aliases, but you can't do it with the explicit join notation, because you're going to need add reference for your table "a" twice. In this context, it's better to reference the second table directly in the FROM clause, with aliasing, and add your join conditions inside the WHERE clause.

Additionally:

  • you don't need to use COALESCE, because you're matching table1 null values with table2 non null values (you can just take table2 values)
  • in place of a.uniqueid <> b.uniqueid, you can use b.propertyaddress IS NOT NULL, which should be slightly more efficient (when your records have all null values in your "t2.propertyaddress" column, you just skip the update on that "t1.uniqueid" value)
  • not really strictly needed, but as an optional preference, when I'm self-joining the same table I like to use t1 and t2, to remind myself it's always the same "t", but referenced twice
UPDATE nashville_housing_data t1 
SET propertyaddress = t2.propertyaddress
FROM nashville_housing_data t2
WHERE t1.parcelid = t2.parcelid
  AND t1.propertyaddress IS NULL
  AND t2.propertyaddress IS NOT NULL;

A little demo here.

0
Riinu Anslan On

I played around with this and came up with an answer to my own question:

UPDATE nashville_housing_data a
SET propertyaddress = COALESCE(a.propertyaddress, b.propertyaddress)
FROM nashville_housing_data b
WHERE a.parcelid = b.parcelid
AND a.uniqueid <> b.uniqueid
AND a.propertyaddress IS NULL;
0
Raja Rakshak On

It appears that your SQL query has a problem because the table's alias was used in the UPDATE statement. When using the UPDATE clause in PostgreSQL, the alias should refer to the target table that is being modified rather than the table you are joining. You should alter your query as follows to properly update the propertyaddress column based on the unique ID:

    UPDATE nashville_housing_data a
    SET propertyaddress= COALESCE(a.propertyaddress, b.propertyaddress)
    FROM nashville_housing_data b
    WHERE a.parcelid= b.parcelid
    AND a.uniqueid <> b.uniqueid
    AND a.propertyaddress IS NULL;

I think, by modifying the query, you should not get any error.