Re: Extend COPY FROM with HEADER <integer> to skip multiple lines

From: Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>
To: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)postgresql(dot)org
Date: 2025-06-26 05:35:34
Message-ID: CAOzEurSvbOyW7k4Sjc_e7yDyr9NB-N2CObVoL=D4zu9_9r4_pw@mail.gmail.com
Lists: pgsql-hackers

> > So it seems better for you to implement the patch at first and then
> > check the performance overhead etc if necessary.
>
> Thank you for your advice. I will create a patch.

I created a patch.

As you can see from the patch, I believe the performance impact is
negligible. The only changes are to how the HEADER option is stored and
the addition of a loop that skips the specified number of header lines
before the data rows are read.

By design, if the HEADER value is larger than the number of lines in the
file, the command completes with zero rows loaded and does not return an
error. This is the same behavior as specifying HEADER true for a CSV file
that has zero rows.
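
To make the intended behavior concrete, here is a rough model of the skip
loop in Python. It is only an illustration under my own assumptions, not
code from the patch: the function name copy_from_sketch and the
header_lines parameter are hypothetical, and the real change lives in the
C code of COPY FROM.

```python
import io


def copy_from_sketch(stream, header_lines=0):
    """Hypothetical model of COPY FROM with HEADER <integer>.

    Skips up to header_lines lines, then returns the remaining lines as rows.
    This is not PostgreSQL code; it only mirrors the behavior described above.
    """
    for _ in range(header_lines):
        # readline() returns "" at end of input. Running out of lines while
        # skipping is not an error: stop and load whatever is left (nothing).
        if stream.readline() == "":
            break
    return [line.rstrip("\n") for line in stream]


data = "report title\nid,name\n1,alice\n2,bob\n"

# Two header lines are skipped, the two data rows are loaded.
rows = copy_from_sketch(io.StringIO(data), header_lines=2)

# HEADER larger than the file: zero rows loaded, no error.
too_many = copy_from_sketch(io.StringIO(data), header_lines=10)
```

The loop runs once per header line and does nothing per data row, which
is why I expect the overhead to be negligible.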

I will add this patch to the next CommitFest.

Thoughts?

--
Best regards,
Shinya Kato
NTT OSS Center

Attachment Content-Type Size
v1-0001-Add-support-for-multi-line-header-skipping-in-COP.patch application/octet-stream 13.6 KB
