Re: BUG #10728: json_to_recordset with nested json objects NULLs columns

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: matti(dot)hameister(at)technologygroup(dot)de
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #10728: json_to_recordset with nested json objects NULLs columns
Date: 2014-06-23 12:18:26
Message-ID: CAB7nPqTP1m5H=kNsm6mmv-5f7A99O7AP2X6E9ubb4ShZWq-COQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Mon, Jun 23, 2014 at 4:36 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Mon, Jun 23, 2014 at 2:13 PM, Michael Paquier <
> michael(dot)paquier(at)gmail(dot)com> wrote:
>
>> I am guessing that the bug origin is in pg_parse_json in the way nested
>> json is managed, it is the only code path of populate_recordset_worker
>> where a switch on JSON[B]OID is used.
>>
> Digging into that, I am seeing that the hash table used to find the field
> values queried in populate_recordset_object_end in the hash table
> (PopulateRecordsetState *)->json_hash has null entries for all the columns
> inserted before the last nested json value. For example in my last example
> with '[{"a":2,"b":3,"c":{"z":4}, "d":6}]', this results in having null
> values for "a" and "b", "c" and "d" remaining correct.
> populate_recordset_object_field_end inserts those values correctly within
> the hash table though, so something strange is going on when inserting in
> json_hash directly a json value.
>
Digging more into that, I have found the issue and a fix for it. It happens
that populate_recordset_object_start, which is used to initialize the
process for the population of the record, is taken *each* time a json
object is found, re-creating every time the hash table for the parsing
process, hence removing from PopulateRecordsetState all the entries already
parsed and creating the problem reported by Matti. The fix I am proposing
to fix this issue is rather simple: simply bypass the creation of the hash
table if lex_level > 1 as we are in presence of a nested object and rely on
the existing hash table.
Patch is attached, and should be backpatched to REL9_4_STABLE where
json_to_recordset has been introduced.
Regards,
--
Michael

Attachment Content-Type Size
20140623_fix_json_record_hash.patch text/x-diff 3.4 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tomas Vondra 2014-06-23 14:22:13 Re: BUG #10726: database gets to an older backup out of nowhere!!!
Previous Message marcel.hanke 2014-06-23 08:33:31 BUG #10736: Database does somtimes not startup again after restart

Browse pgsql-hackers by date

  From Date Subject
Next Message Nicholas White 2014-06-23 12:27:41 Re: Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls
Previous Message Gurjeet Singh 2014-06-23 12:17:07 Re: /proc/self/oom_adj is deprecated in newer Linux kernels