Re: row_to_json(), NULL values, and AS

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neil(dot)conway(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, pgsql-bugs(at)postgresql(dot)org
Subject: Re: row_to_json(), NULL values, and AS
Date: 2018-06-16 14:42:26
Message-ID: 14510.1529160146@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

I wrote:
> Neil Conway <neil(dot)conway(at)gmail(dot)com> writes:
>> On Fri, Jun 15, 2018 at 7:53 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I'm a bit hesitant to muck with this behavior, given that it's stood
>>> for ~20 years. However, if we did want to touch it, maybe the right
>>> thing would be to give up the absolutist position that f(x) and x.f
>>> are exactly interchangeable. We could say instead that we prefer the
>>> function interpretation if function syntax is used, and the column
>>> interpretation if column syntax is used.

>> Interesting! Your proposed change seems quite reasonable to me.

> Here's a proposed patch for that. (It needs to be applied over the
> fixes in <14497(dot)1529089235(at)sss(dot)pgh(dot)pa(dot)us>, which are unrelated but
> touch some of the same code.) I didn't add any test cases yet,
> but probably it's desirable to have one.

It occurred to me this morning that there's actually a dump/reload
hazard in the current behavior. Consider

regression=# create table t1 (f1 int, f2 text);
CREATE TABLE
regression=# create view v1 as select row_to_json(t1) as j from t1;
CREATE VIEW
regression=# alter table t1 add column row_to_json text;
ALTER TABLE
regression=# \d+ v1
View "public.v1"
Column | Type | Collation | Nullable | Default | Storage | Description
--------+------+-----------+----------+---------+----------+-------------
j | json | | | | extended |
View definition:
SELECT row_to_json(t1.*) AS j
FROM t1;

At this point, pg_dump will naively dump the view as

CREATE VIEW public.v1 AS
SELECT row_to_json(t1.*) AS j
FROM public.t1;

which, with the historical behavior, will be read as a reference to
the subsequently-added column, giving a view definition that means
something else entirely:

regression=# \d+ v1
View "public.v1"
Column | Type | Collation | Nullable | Default | Storage | Description
--------+------+-----------+----------+---------+----------+-------------
j | text | | | | extended |
View definition:
SELECT t1.row_to_json AS j
FROM t1;

The proposed change fixes this on the parsing side, with no need
for any hacking in ruleutils.c (hence it will fix it for pre-existing
dump files too).

So I'm now pretty well convinced that this is a good change and we
should slip it into v11. I'm still afraid to back-patch though.
I vaguely recall one or two prior complaints in this area, but not
so many that it's worth taking a chance of breaking applications
in minor releases.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2018-06-17 08:54:06 BUG #15245: pg_stat_all_tables does not include partition master tables
Previous Message Tom Lane 2018-06-15 21:57:50 Re: BUG #15244: Inherited table queries return null lines that do not exist

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2018-06-16 15:59:24 Re: POC: GROUP BY optimization
Previous Message John Dent 2018-06-16 14:21:27 Query Rewrite for Materialized Views (Postgres Extension)