Re: Add header support to text format and matching feature

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Rémi Lapeyre <remi(dot)lapeyre(at)lenstra(dot)fr>, Zhihong Yu <zyu(at)yugabyte(dot)com>, David Steele <david(at)pgmasters(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Daniel Verite <daniel(at)manitou-mail(dot)org>
Subject: Re: Add header support to text format and matching feature
Date: 2022-06-13 02:32:13
Message-ID: 20220613023213.bj7qjnplvgfmx522@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Sun, Jun 12, 2022 at 09:36:13AM -0400, Andrew Dunstan wrote:
>
> On 2022-06-07 Tu 11:47, Julien Rouhaud wrote:
> >
> > First, probably nitpicking, the HEADER MATCH is allowed for COPY TO, is that
> > expected? The documentation isn't really explicit about it, but there's
> > nothing to match when exporting data it's a bit surprising. I'm not opposed to
> > have HEADER MATCH means HEADER ON for COPY TO, as as-is one can easily reuse
> > the commands history, but maybe it should be clearly documented?
>
>
> I think it makes more sense to have a sanity check to prevent HEADER
> MATCH with COPY TO.

I'm fine with it. I added such a check and mentioned it in the documentation.

> > Then, apparently HEADER MATCH doesn't let you do sanity checks against a custom
> > column list. This one looks like a clear oversight, as something like that
> > should be entirely valid IMHO:
> >
> > CREATE TABLE tbl(col1 int, col2 int);
> > COPY tbl (col2, col1) TO '/path/to/file' WITH (HEADER MATCH);
> > COPY tbl (col2, col1) FROM '/path/to/file' WITH (HEADER MATCH);
> >
> > but right now it errors out with:
> >
> > ERROR: column name mismatch in header line field 1: got "col1", expected "col2"
> >
> > Note that the error message is bogus if you specify attributes in a
> > different order from the relation, as the code is mixing access to the tuple
> > desc and access to the raw fields with the same offset.
> > [...]
> I think it should, but a temporary alternative would be to forbid HEADER
> MATCH with explicit column lists until we can make it work right.

I think it would still be problematic if the target table has dropped columns.
Fortunately, as I initially thought the problem is only due to a thinko in the
original commit which used a wrong variable for the raw_fields offset. Once
fixed (attached v1) I didn't see any other problem in the rest of the logic and
all the added regression tests work as expected.

Attachment Content-Type Size
v1-0001-Fix-processing-of-header-match-option-for-COPY.patch text/plain 6.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2022-06-13 02:58:57 Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Previous Message Thomas Munro 2022-06-12 22:36:02 Re: Collation version tracking for macOS