`EXECUTE` statement VS no `EXECUTE` statement in a function in PostgreSQL

131 Views Asked by At

I created person table, then inserted 2 rows into it as shown below:

CREATE TABLE person (
  id INT,
  name VARCHAR(20),
  age INT
);

INSERT INTO person (id, name, age) 
VALUES (1, 'John', 27), (2, 'David', 32);

Then with an EXECUTE statement, I created my_func() with my_age and my_id parameter as shown below:

CREATE FUNCTION my_func(my_age INT, my_id INT) RETURNS VOID AS $$
BEGIN                -- ↑↑↑↑↑↑      ↑↑↑↑↑
  EXECUTE 'UPDATE person SET age = $1 WHERE id = $2' USING my_age, my_id;
END;
$$ LANGUAGE plpgsql;

Or:

CREATE FUNCTION my_func(my_age INT, my_id INT) RETURNS VOID AS $$
BEGIN
  EXECUTE 'UPDATE person SET age = $1 WHERE id = $2' USING $1, $2;
END;
$$ LANGUAGE plpgsql;

Then, calling my_func() could update age of David to 56 as shown below:

postgres=# SELECT my_func(56, 2);
 my_func
---------

(1 row)

postgres=# SELECT * FROM person;
 id | name  | age
----+-------+-----
  1 | John  |  27
  2 | David |  56
(2 rows)

Next without an EXECUTE statement, I created my_func() with my_age and my_id parameter as shown below:

CREATE FUNCTION my_func(my_age INT, my_id INT) RETURNS VOID AS $$
BEGIN                -- ↑↑↑↑↑↑      ↑↑↑↑↑
  UPDATE person SET age = my_age WHERE id = my_id;
END;
$$ LANGUAGE plpgsql;

Or:

CREATE FUNCTION my_func(my_age INT, my_id INT) RETURNS VOID AS $$
BEGIN
  UPDATE person SET age = $1 WHERE id = $2;
END;
$$ LANGUAGE plpgsql;

Then, calling my_func() could update age of David to 56 as shown below:

postgres=# SELECT my_func(56, 2);
 my_func
---------

(1 row)

postgres=# SELECT * FROM person;
 id | name  | age
----+-------+-----
  1 | John  |  27
  2 | David |  56
(2 rows)

So, what is the difference between the function with and without an EXECUTE statement?

2

There are 2 best solutions below

3
Super Kai - Kazuya Ito On

With an EXECUTE statement, you can create my_func() with age and id parameter which are the same names as age and id column as shown below:

CREATE FUNCTION my_func(age INT, id INT) RETURNS VOID AS $$
BEGIN                -- ↑↑↑      ↑↑
  EXECUTE 'UPDATE person SET age = $1 WHERE id = $2' USING age, id;
END;                      -- ↑↑↑            ↑↑
$$ LANGUAGE plpgsql;

Or:

CREATE FUNCTION my_func(age INT, id INT) RETURNS VOID AS $$
BEGIN                -- ↑↑↑      ↑↑
  EXECUTE 'UPDATE person SET age = $1 WHERE id = $2' USING $1, $2;
END;                      -- ↑↑↑            ↑↑
$$ LANGUAGE plpgsql;

Then, calling my_func() could update age of David to 56 as shown below:

postgres=# SELECT my_func(56, 2);
 my_func
---------

(1 row)

postgres=# SELECT * FROM person;
 id | name  | age
----+-------+-----
  1 | John  |  27
  2 | David |  56
(2 rows)

And without an EXECUTE statement, you can still create my_func() with age and id parameter which are the same names as age and id column as shown below:

CREATE FUNCTION my_func(age INT, id INT) RETURNS VOID AS $$
BEGIN                -- ↑↑↑      ↑↑
  UPDATE person SET age = age WHERE id = id;
END;                   -- ↑↑↑       ↑↑
$$ LANGUAGE plpgsql;

Or:

CREATE FUNCTION my_func(age INT, id INT) RETURNS VOID AS $$
BEGIN                -- ↑↑↑      ↑↑
  UPDATE person SET age = $1 WHERE id = $2;
END;             -- ↑↑↑            ↑↑
$$ LANGUAGE plpgsql;

But, calling my_func() above gets the error below:

ERROR: column reference "id" is ambiguous

0
Erwin Brandstetter On

When to use EXECUTE?

EXECUTE in PL/pgSQL is used for dynamic SQL. For building and executing SQL statements on the fly. Mainly to involve input from users or system catalogs in other roles than just values - i.e. identifiers other key words of the SQL syntax. You don't normally use EXECUTE unless you need that.

One exception: While PL/pgSQL can save and reuse a generic query plan for SQL DML statements (much like prepared statements, but only after a couple of executions, there is some sophistication to this), EXECUTE prevents that from happening and forces a new query plan for every execution. And that can be (ab)used for highly irregular data distributions, where the query plan is better optimized for the given input (and Postgres does not realize this all by itself).

Since Postgres 12, you can also use the server variable plan_cache_mode to force generic or custom plans for non-dynamic DML statements.

It's hard to imagine this could make sense for your simple example. If person.id is the PK, it certainly doesn't. Basically, your version with EXECUTE is nonsense.

Related:

Always concatenate dynamic SQL carefully to defend against possible SQL injection! See:

One other side effect: The query string passed to EXECUTE executes in a separate scope. Variables and parameters of the surrounding code block are not visible there. (Preferably passed in as values with the USING clause.) Good to know, but don't abuse dynamic SQL just to hide naming collisions.

Hidden issue: naming conflicts

There are better solutions to resolve naming conflicts with function parameters. The manual has an overview.

Ideally, adopt a naming convention that avoids conflicts altogether.
If that's not an option, the simple fix is to table-qualify all column names (which is never a bad idea) and reference function parameters by ordinal position: $1 etc.

Simple solution

CREATE PROCEDURE my_proc(id int, age int)  -- ①
  LANGUAGE sql AS
$func$
UPDATE person p SET age = $2 WHERE p.id = $1;  -- ②
$func$;

① While the function doesn't return anything, you might just use a procedure instead. Either way, we have parameter names colliding with column names.

② References to ordinal positions of function parameters avoid ambiguity.

Alternative to demonstrate more options

CREATE FUNCTION my_func(id int, age int)
  LANGUAGE plpgsql AS
$func$
#variable_conflict use_variable    -- ③
BEGIN
   UPDATE person p    -- table alias for brevity
   SET    age = age   -- ④, ⑤
   WHERE  p.id = id;  -- ⑥, ⑤
END
$func$;

Using a function like you had it. Just as proof of concept.

#variable_conflict use_variable before the begin of the function body resolves ambiguities in favor of parameter or variable names. Typically unnecessary.

④ Target columns of an UPDATE are never ambiguous and cannot be table-qualified.

⑤ The ambiguous, unqualified names age and id resolve to function parameters due to #variable_conflict use_variable.

p.id is table-qualified and hence unambiguous.

fiddle

Related:

Aside, age as table column is typically nonsense. Store a birthday instead.