Re: Extend COPY FROM with HEADER <integer> to skip multiple lines

From: Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>
To: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Extend COPY FROM with HEADER <integer> to skip multiple lines
Date: 2025-07-02 05:39:43
Message-ID: CAOzEurTUk6AGwojR6+zCJbsJu3DZ779_tufj6aqR6NbcnHsQGw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 30, 2025 at 4:24 PM Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
>
>
> > > I have a few minor comment on the patch.
> > >
> > > + if (ival < 0)
> > > + ereport(ERROR,
> > > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > > + errmsg("%s requires a Boolean value, a non-negative "
> > > + "integer, or the string \"match\"",
> > > + def->defname)));
> > >
> > > ereport(ERROR,
> > > (errcode(ERRCODE_SYNTAX_ERROR),
> > > - errmsg("%s requires a Boolean value or \"match\"",
> > > + errmsg("%s requires a Boolean value, a non-negative integer, "
> > > + "or the string \"match\"",
> > > def->defname)));
> > >
> > > These two pieces of code raise the same error, but with different error codes:
> > > one returns ERRCODE_INVALID_PARAMETER_VALUE, while the other returns ERRCODE_SYNTAX_ERROR.
> > >
> > > I believe the former is more appropriate, although the existing code uses the
> > > latter. In any case, the error codes should be consistent.
> >
> > I'm not sure there's an actual rule like "the error code must match
> > if the error message is the same." But if using the same message with
> > a different error code is confusing, I'm fine with changing
> > the earlier message. For example:
> >
> > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > - errmsg("%s requires a Boolean value, a non-negative "
> > - "integer, or the string \"match\"",
> > + errmsg("a negative integer value cannot be "
> > + "specified for %s",
> > def->defname)));
>
> Fair point. There might not be any explicit rule.
> However, I feel it somewhat confusing.
>
> After looking into the code, it seems that ERRCODE_SYNTAX_ERROR tends to be used
> for invalid keywords or parameter types, while ERRCODE_INVALID_PARAMETER_VALUE
> is typically used for values that are out of the acceptable range.
>
> So, the proposed change seems reasonable to me.

Thank you for the review. The change looks good to me, too. A new
patch is attached.

> Regarding the documentation, how about explicitly stating that when MATCH is specified, only
> the first line is skipped? While this may seem obvious, it’s worth clarifying, as the semantics
> of the HEADER option have become a bit more complex with this change.

Agreed. I have updated the documentation as follows:

+ lines are discarded. If the option is set to <literal>MATCH</literal>,
+ the number and names of the columns in the header line must exactly
+ match those of the table and, in order, after which the header line is
+ discarded; otherwise an error is raised. The <literal>MATCH</literal>

--
Best regards,
Shinya Kato
NTT OSS Center

Attachment Content-Type Size
v4-0001-Add-support-for-multi-line-header-skipping-in-COP.patch application/octet-stream 14.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message torikoshia 2025-07-02 05:46:34 Re: Improve tab completion for COPY
Previous Message Ashutosh Bapat 2025-07-02 05:20:09 Using failover slots for PG-non_PG logical replication