Re: Extend COPY FROM with HEADER <integer> to skip multiple lines

From: Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Extend COPY FROM with HEADER <integer> to skip multiple lines
Date: 2025-06-26 10:12:03
Message-ID: CAOzEurSRr8J9QmDn9TSCf09XrVN_+7Q-zHgXb0Ni8+_HZbeaBw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 26, 2025 at 4:36 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
wrote:

> On 2025/06/26 14:35, Shinya Kato wrote:
> >
> > > > So it seems better for you to implement the patch at first and then
> > > > check the performance overhead etc if necessary.
> > >
> > > Thank you for your advice. I will create a patch.
> >
> > I created a patch.
>
> Thanks for the patch!
>

Thank you for your review.

When I compiled with the patch applied, I got the following warning:
>
> copyfromparse.c:834:20: error: ‘done’ may be used uninitialized
> [-Werror=maybe-uninitialized]
> 834 | if (done)
> | ^
>

Oh, sorry. I missed it.
I fixed it to initialize done = false.

> + lines are discarded. An integer value greater than 1 is only valid
> for
> + <command>COPY FROM</command> commands.
>
>
> This might be slightly misleading since 0 is also a valid value for COPY
> FROM.
>

That's certainly true. I fixed it below:
+ lines are discarded. An integer value greater than 1 is not allowed
for
+ <command>COPY TO</command> commands.

> + for (int i = 0; i < cstate->opts.header.skip_lines; i++)
> + {
> + cstate->cur_lineno++;
> + done = CopyReadLine(cstate, is_csv);
> + }
>
> If "done" is true, shouldn't we break out of the loop immediately?
> Otherwise, with a large HEADER value, we could end up wasting a lot of
> cycles unnecessarily.

Exactly, fixed.

> +defGetCopyHeaderOption(DefElem *def, bool is_from)
> {
> + CopyHeaderOption option;
>
> To avoid repeating the same code like "option.match = false" in every case,
> how about initializing the struct like "option = {false, 1}"?
>

Exactly, fixed.

>
>
> +
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("HEADER
> must be non-negative integer")));
>
> This message could be confusing since HEADER can also accept a Boolean or
> "match".
> Maybe it's better to use the same message as "%s requires a Boolean value,
> integer value,
> or \"match\"",? "integer value"? If so, it's also better to use
> "non-negative integer"
> instead of just "integer".
>

Yes, I fixed it to use the same message which replaced "integer" to
"non-negative integer".

Original error message "%s requires a Boolean value, integer value, or
\"match\"" should also be fixed to "non-negative integer"?

--
Best regards,
Shinya Kato
NTT OSS Center

Attachment Content-Type Size
v2-0001-Add-support-for-multi-line-header-skipping-in-COP.patch application/octet-stream 13.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2025-06-26 10:16:29 Re: PG18 protocol version
Previous Message Fujii Masao 2025-06-26 10:04:48 Re: pg_dump misses comments on NOT NULL constraints