Re: Emitting JSON to file using COPY TO

From: Joe Conway <mail(at)joeconway(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, Davin Shearer <davin(at)apache(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Emitting JSON to file using COPY TO
Date: 2023-12-04 15:45:58
Message-ID: bcbb2a04-5df3-49f2-b304-28ae8edc5189@joeconway.com
Lists: pgsql-general pgsql-hackers

On 12/4/23 09:25, Andrew Dunstan wrote:
>
> On 2023-12-04 Mo 08:37, Joe Conway wrote:
>> On 12/4/23 07:41, Andrew Dunstan wrote:
>>>
>>> On 2023-12-03 Su 20:14, Joe Conway wrote:
>>>> (please don't top quote on the Postgres lists)
>>>>
>>>> On 12/3/23 17:38, Davin Shearer wrote:
>>>>> " being quoted as \\" breaks the JSON. It needs to be \".  This has
>>>>> been my whole problem with COPY TO for JSON.
>>>>>
>>>>> Please validate that the output is in proper format with correct
>>>>> quoting for special characters. I use `jq` on the command line to
>>>>> validate and format the output.
>>>>
>>>> I just hooked existing "row-to-json machinery" up to the "COPY TO"
>>>> statement. If the output is wrong (just for this use case?),
>>>> that would be a missing feature (or possibly a bug?).
>>>>
>>>> Davin -- how did you work around the issue with the way the built in
>>>> functions output JSON?
>>>>
>>>> Andrew -- comments/thoughts?
>>>
>>> I meant to mention this when I was making comments yesterday.
>>>
>>> The patch should not be using CopyAttributeOutText - it will try to
>>> escape characters such as \, which produces the effect complained of
>>> here, or else we need to change its setup so we have a way to inhibit
>>> that escaping.
>>
>>
>> Interesting.
>>
>> I am surprised this has never been raised as a problem with COPY TO
>> before.
>>
>> Should the JSON output, as produced by composite_to_json(), be sent
>> as-is with no escaping at all? If yes, is JSON somehow unique in this
>> regard?
>
>
> Text mode output is in such a form that it can be read back in using
> text mode input. There's nothing special about JSON in this respect -
> any text field will be escaped too. But output suitable for text mode
> input is not what you're trying to produce here; you're trying to
> produce valid JSON.
>
> So, yes, the result of composite_to_json, which is already suitably
> escaped, should not be further escaped in this case.

Gotcha.
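To make the problem concrete, here is a minimal Python sketch (illustrative only, not code from the patch) of why the extra backslash added by text-mode escaping breaks the JSON:

```python
import json

# Correct JSON: a literal double quote inside a string is escaped as \"
good = '{"f1":"line with \\" in it"}'

# Text-mode COPY escaping also escapes the backslash, producing \\" --
# the backslash pair is consumed as an escaped backslash, the following
# quote closes the string early, and the rest of the document is garbage.
bad = '{"f1":"line with \\\\" in it"}'

print(json.loads(good))
try:
    json.loads(bad)
except json.JSONDecodeError as exc:
    print("invalid JSON:", exc)
```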

This patch version uses CopySendData() instead and includes
documentation changes. Still lacks regression tests.

Hopefully this looks better. Any other particular strings I ought to
test with?

8<------------------
test=# copy (select * from foo limit 4) to stdout (format json, force_array true);
[
{"id":1,"f1":"line with \" in it: 1","f2":"2023-12-03T12:26:41.596053-05:00"}
,{"id":2,"f1":"line with ' in it: 2","f2":"2023-12-03T12:26:41.596173-05:00"}
,{"id":3,"f1":"line with \" in it: 3","f2":"2023-12-03T12:26:41.596179-05:00"}
,{"id":4,"f1":"line with ' in it: 4","f2":"2023-12-03T12:26:41.596182-05:00"}
]
8<------------------
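On test strings: beyond quotes, it is probably worth exercising backslashes, control characters, and non-ASCII text. A quick round-trip check in Python (an illustrative list, not the regression suite):

```python
import json

# Candidate strings for regression tests: quotes, backslashes,
# control characters, and non-ASCII text.
tricky = [
    'line with " in it',
    "line with ' in it",
    'backslash \\ before a quote "',
    'newline\nand tab\t',
    'non-ASCII: naïve résumé ✓',
]

for s in tricky:
    encoded = json.dumps(s)          # roughly what the JSON output path emits
    assert json.loads(encoded) == s  # must survive a round trip unchanged

print("all strings round-trip cleanly")
```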

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
copyto_json.003.diff text/x-patch 14.3 KB
