Re: [BUGS] BUG #10728: json_to_recordset with nested json objects NULLs columns

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, matti(dot)hameister(at)technologygroup(dot)de, pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] BUG #10728: json_to_recordset with nested json objects NULLs columns
Date: 2014-06-23 20:01:03
Message-ID: 53A8877F.7060208@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers


On 06/23/2014 11:43 AM, Tom Lane wrote:
> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
>> Digging more into that, I have found the issue and a fix for it. It happens
>> that populate_recordset_object_start, which is used to initialize the
>> process for the population of the record, is taken *each* time a json
>> object is found, re-creating every time the hash table for the parsing
>> process, hence removing from PopulateRecordsetState all the entries already
>> parsed and creating the problem reported by Matti. The fix I am proposing
>> to fix this issue is rather simple: simply bypass the creation of the hash
>> table if lex_level > 1 as we are in presence of a nested object and rely on
>> the existing hash table.
> Yes, this code is clearly not handling the nested-objects case correctly.
> I had written a fix more or less equivalent to yours last night.
>
> However, it seems to me that these functions (json[b]_to_record[set]) are
> handling the nested-json-objects case in a fairly brain-dead fashion to
> start with. I would like to propose that we should think about getting
> rid of the use_json_as_text flag arguments altogether. What purpose do
> they serve? If we're going to the trouble of parsing the nested JSON
> objects anyway, why don't we just reconstruct from that data?
>
> (IOW, we probably actually should have nested hash tables in this case.
> I suspect that the current bug arose from incompletely-written logic
> to do it like that.)
>
> Since we've already decided to force an initdb for 9.4beta2, it's not
> quite too late to revisit this API, and I think it needs revisiting.
>
>

Looks like we have some problems in this whole area, not just the new
function, so we need to fix 9.3 also :-(

IIRC, originally, the intention was to disallow nested json objects, but
the use_json_as_text was put in as a possibly less drastic possibility.
If we get rid of it our only recourse is to error out if we encounter
nested json. I was probably remiss in not considering the likelihood of
a json target field.

I currently don't have lots of time to devote to this, sadly, but
Michael's patch looks like a good minimal fix.

cheers

andrew

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Steven Siebert 2014-06-23 20:26:25 Re: BUG #10680: LDAP bind password leaks to log on failed authentication
Previous Message Merlin Moncure 2014-06-23 16:19:05 Re: BUG #10728: json_to_recordset with nested json objects NULLs columns

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Karlsson 2014-06-23 20:07:45 Re: New functions in sslinfo module
Previous Message Jeff Davis 2014-06-23 20:00:11 Re: pgsql: Do all-visible handling in lazy_vacuum_page() outside its critic