Re: Extend COPY FROM with HEADER <integer> to skip multiple lines

From: Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Extend COPY FROM with HEADER <integer> to skip multiple lines
Date: 2025-06-10 00:43:10
Message-ID: CAOzEurQkgzy4nmvGFXhG0o1Wz6hTEpCNr7HSBmFGY=-uv+geSg@mail.gmail.com
Lists: pgsql-hackers

> > However, a similar proposal was made earlier [1], and seemingly
> > some hackers weren't in favor of it. It's probably worth reading
> > that thread to understand the previous concerns.
> >
> > [1] https://postgr.es/m/CALAY4q8nGSXp0P5uf56vn-mD7reWqZP5k6PS1CGUm26X4FsYJA@mail.gmail.com
>
> Oh, I missed it. I will check it soon.

I read it.

There are clear differences from the earlier proposal. My sole
motivation is to skip multiple headers, and I don't believe a feature
for skipping footers is necessary. To be clear, I think it's best to
simply extend the current HEADER option.

Regarding the concern about adding ETL-like functionality, this
feature is already implemented in other RDBMSs, which is why I believe
it is also necessary for PostgreSQL.

Honestly, I haven't implemented it yet, so I'm not sure about the
performance. However, I don't expect it to be significantly different
from the current HEADER option that skips a single line.

> I think the earlier proposal went rather further than this one, which I
> suspect can be implemented fairly cheaply.

Yes, I think that's exactly right.

> I don't have terribly strong feelings about it, but matching a feature
> implemented elsewhere has some attraction if it can be done easily.
>
> OTOH I'm a bit curious to know what software produces multi-line CSV
> headers.

Both Pandas and R can create CSV files with multi-line headers
(although I don't personally think this is desirable). Furthermore,
various systems sometimes generate reports as CSV files that
unexpectedly contain multiple header lines.
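
To make the use case concrete, here is a minimal Python sketch of what skipping several header lines amounts to on the client side today. It is only an illustration: the function name read_rows_after_header, the header_lines parameter, and the sample data are all hypothetical, and it does not reflect how COPY is or would be implemented in PostgreSQL. It skips physical lines rather than parsed CSV records, which is an assumption made to match the "skip multiple lines" wording of the subject.

```python
import csv
import io
from itertools import islice


def read_rows_after_header(stream, header_lines):
    """Return the CSV data rows that follow the first `header_lines` lines.

    Hypothetical helper for illustration only; it skips physical lines,
    not parsed CSV records.
    """
    if header_lines < 0:
        raise ValueError("header_lines must be non-negative")
    # Drop the leading header lines, then parse the remainder as CSV.
    data_lines = islice(stream, header_lines, None)
    return list(csv.reader(data_lines))


# Made-up sample: two header lines followed by two data rows.
SAMPLE = (
    "sales,sales,cost\n"
    "q1,q2,q1\n"
    "10,20,5\n"
    "30,40,15\n"
)

rows = read_rows_after_header(io.StringIO(SAMPLE), 2)
print(rows)
```

With header_lines set to 1 this behaves like skipping a single header line; larger values simply discard more leading lines before parsing begins.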

--
Best regards,
Shinya Kato
NTT OSS Center
