Re: SQL/JSON: FOR ORDINALITY bug

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SQL/JSON: FOR ORDINALITY bug
Date: 2022-05-04 21:52:38
Message-ID: CAKFQuwbOU4mxua2pk9_df0sOt0HAeptkjms_jHq+kGXOgTvWLg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 4, 2022 at 1:43 PM David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
wrote:

> On Wed, May 4, 2022 at 1:09 PM Erik Rijkers <er(at)xs4all(dot)nl> wrote:
>
>> Op 04-05-2022 om 21:12 schreef Andrew Dunstan:
>> >
>> >>>>
>> >>>> I don't see how rowseq can be anything but 1. Each invocation of
>> >>
>> >>
>> >> After some further experimentation, I now think you must be right,
>> David.
>> >>
>> >> Also, looking at the DB2 docs:
>> >> https://www.ibm.com/docs/en/i/7.2?topic=data-using-json-table
>> >> (see especially under 'Handling nested information')
>>
>> You've probably noticed then that on that same page under 'Sibling
>> Nesting' is a statement that gives a 13-row resultset on DB2 whereas in
>> 15devel that statement yields just 10 rows. I don't know which is
>> correct.
>>
>>
> There should be 12 results (minimum would be 8 - 5 of which are used for
> real matches, plus 4 new row producing matches).
>
> Our result seems internally inconsistent; conceptually there are two kinds
> of nulls here and we cannot collapse them.
>

> null-val: we are outputting the record from the nested path but there is
> no actual value to output so we output null-val
> null-union: we are not outputting the record for the nested path (we are
> doing a different one) but we need to output something for this column so
> we output null-union.
>
>
Thinking this over - I think the difference is we implemented a FULL OUTER
JOIN to combine the siblings - including the behavior of that construct and
the absence of rows. DB2 took the word "UNION" for the plan modifier
literally and unioned (actually union all) the two subpaths together using
the null concepts above (though somehow ensuring that at least one row was
produced from each subpath...).

Thus we are indeed back to seeing whether the standard defines sibling
combining as union or join, or some other special construct. I'm now
leaning toward what we've done as at least being the more sane option.

Even if our outer join process is correct the existing wording is odd.

"Use FULL OUTER JOIN ON FALSE, so that both parent and child rows are
included into the output, with NULL values inserted into both child and
parent columns for all missing values."

I don't think it helps to mention parent here. This aspect of plan doesn't
concern itself with the final output, only the output of the subplan which
is then combined with the parent using a join. I would probably want to
phrase the default more like:

"This is the default option for joining the combined child rows to the
parent."

David J.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-05-04 22:32:00 Costing elided SubqueryScans more nearly correctly
Previous Message David Zhang 2022-05-04 21:38:54 Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit