| From: | Junwang Zhao <zhjwpku(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Copy from JSON FORMAT. |
| Date: | 2026-03-22 08:59:01 |
| Message-ID: | CAEG8a3+wxMXGLcnrDzSF0qTDzH+K7_x0FWNAcu-H_j8gKiwFVw@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi hackers,
Commit 7dadd38cda introduced support for COPY TO in JSON format, but did
not include the corresponding COPY FROM functionality. The original
discussion [1] mentioned the possibility of supporting COPY FROM as
well, but that part never materialized (unless I missed something).
For a data format like JSON, basic round-trip capability seems essential:
if data can be exported via COPY TO, it should also be possible to import
it back using COPY FROM. This patch set tries to close that gap.
0001
A small, unrelated cleanup that switches several call sites to
TupleDescCompactAttr where appropriate. I noticed these
opportunistically while working on the JSON support.
0002
Adds support for COPY FROM with FORMAT json.
The core logic is outlined below.
Bytes are read into raw_buf and optionally transcoded into
input_buf. Instead of CopyReadLine, JSON mode uses CopyReadNextJson
to fetch the next row object via a small state machine:
- BEFORE_ARRAY: expect '[' or '{'
- BEFORE_OBJECT: in concat mode, expect next object or EOF
- IN_ARRAY: expect object, comma, or ']'
- IN_OBJECT: track brace depth to find the matching '}'
- IN_STRING / IN_STRING_ESC: skip braces inside strings
- ARRAY_END: after ']', no more rows
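To make the state list above concrete, here is a minimal standalone sketch of such a row splitter. It is not the code from the patch: it scans one complete in-memory string rather than refilling buffers incrementally, and every name other than the state names (JsonScanner, json_scanner_init, json_next_row, the JS_ prefix) is hypothetical. It is also deliberately lenient about commas and does not validate the contents of a row.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* State names mirror the list above; all other names are hypothetical. */
typedef enum
{
    JS_BEFORE_ARRAY,    /* nothing consumed yet: expect '[' or '{' */
    JS_BEFORE_OBJECT,   /* concatenated objects: expect '{' or EOF */
    JS_IN_ARRAY,        /* inside [...]: expect '{', ',' or ']' */
    JS_IN_OBJECT,       /* inside a row object, tracking brace depth */
    JS_IN_STRING,       /* inside a string literal of the row */
    JS_IN_STRING_ESC,   /* right after a backslash in a string */
    JS_ARRAY_END        /* after ']': no more rows */
} JsonScanState;

typedef struct
{
    const char *buf;
    size_t      len;
    size_t      pos;
    JsonScanState state;
    int         in_array;   /* state to return to once a row closes */
    int         depth;      /* current brace depth inside the row */
} JsonScanner;

void
json_scanner_init(JsonScanner *s, const char *buf)
{
    assert(s != NULL && buf != NULL);
    s->buf = buf;
    s->len = strlen(buf);
    s->pos = 0;
    s->state = JS_BEFORE_ARRAY;
    s->in_array = 0;
    s->depth = 0;
}

/*
 * Returns 1 and sets *start / *len when a row object closes, 0 at a clean
 * end of input, and -1 on input this sketch considers malformed.
 */
int
json_next_row(JsonScanner *s, size_t *start, size_t *len)
{
    for (; s->pos < s->len; s->pos++)
    {
        char        c = s->buf[s->pos];
        int         ws = isspace((unsigned char) c);

        switch (s->state)
        {
            case JS_BEFORE_ARRAY:
            case JS_BEFORE_OBJECT:
            case JS_IN_ARRAY:
                if (ws || (c == ',' && s->state == JS_IN_ARRAY))
                    break;      /* lenient: separators are not validated */
                if (c == '[' && s->state == JS_BEFORE_ARRAY)
                {
                    s->in_array = 1;
                    s->state = JS_IN_ARRAY;
                }
                else if (c == ']' && s->state == JS_IN_ARRAY)
                    s->state = JS_ARRAY_END;
                else if (c == '{')
                {
                    *start = s->pos;
                    s->depth = 1;
                    s->state = JS_IN_OBJECT;
                }
                else
                    return -1;
                break;
            case JS_IN_OBJECT:
                if (c == '"')
                    s->state = JS_IN_STRING;
                else if (c == '{')
                    s->depth++;
                else if (c == '}' && --s->depth == 0)
                {
                    *len = s->pos - *start + 1;
                    s->pos++;   /* step past the closing brace */
                    s->state = s->in_array ? JS_IN_ARRAY : JS_BEFORE_OBJECT;
                    return 1;
                }
                break;
            case JS_IN_STRING:
                if (c == '\\')
                    s->state = JS_IN_STRING_ESC;
                else if (c == '"')
                    s->state = JS_IN_OBJECT;
                break;
            case JS_IN_STRING_ESC:
                s->state = JS_IN_STRING;    /* escaped char is skipped */
                break;
            case JS_ARRAY_END:
                if (!ws)
                    return -1;  /* only whitespace may follow ']' */
                break;
        }
    }
    return (s->state == JS_BEFORE_ARRAY || s->state == JS_BEFORE_OBJECT ||
            s->state == JS_ARRAY_END) ? 0 : -1;
}
```

Calling json_next_row repeatedly on `[{"a":1},{"b":"}"}]` yields two rows; the '}' inside the string value does not close the second row because the scanner is in IN_STRING at that point.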
When a row object closes, copy_json_finalize_linebuf_for_row
reshapes line_buf to [row text][unparsed tail] and sets parse_pos
to the row length so jsonb_in sees only the current row.
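One possible reading of that reshaping step, as a standalone sketch: drop whatever precedes the row (for example '[' or a comma) so the row starts at offset 0, keep the unparsed tail right behind it, and return the row length as the parse position. LineBuf is a hypothetical stand-in for the real buffer type, and the function name below is made up; how the patch actually limits jsonb_in to the row slice is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the line buffer; data holds len + 1 bytes. */
typedef struct
{
    char       *data;
    size_t      len;
} LineBuf;

/*
 * Shift the buffer so it reads [row text][unparsed tail] and return the
 * offset where the tail begins (the row length).
 */
size_t
sketch_finalize_linebuf_for_row(LineBuf *lb, size_t row_start, size_t row_len)
{
    assert(row_start + row_len <= lb->len);

    /* memmove is required because source and destination overlap */
    memmove(lb->data, lb->data + row_start, lb->len - row_start);
    lb->len -= row_start;
    lb->data[lb->len] = '\0';

    return row_len;             /* caller stores this as parse_pos */
}
```

With the buffer `[{"a":1},{"a":2}]` and a row found at offset 1 with length 7, the buffer becomes `{"a":1},{"a":2}]` and the returned parse position is 7, so bytes 0..6 are the current row and everything from offset 7 on is still unparsed.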
NextCopyFromJsonRawFieldsInternal calls jsonb_in on that slice,
verifies a JSON object at the root, then looks up each target
column by name. Values are converted to C strings via
JsonbValueToCstring and stored in attribute_buf, following the
same pattern as text/CSV.
CopyFromJsonOneRow then runs the standard per-column input
functions, so type coercion matches ordinary textual input and
the existing COPY machinery for defaults and soft errors applies
unchanged.
[1] postgresql.org/message-id/flat/20231201230958.GA1786735%40nathanxps13
--
Regards
Junwang Zhao
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0002-Support-COPY-FROM-with-FORMAT-JSON.patch | application/octet-stream | 43.5 KB |
| v1-0001-use-TupleDescCompactAttr-where-possible.patch | application/octet-stream | 3.0 KB |