Re: Make COPY extendable in order to support Parquet and other formats

From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: Make COPY extendable in order to support Parquet and other formats
Date: 2022-06-23 08:38:29
Message-ID: CAJ7c6TOvJE_kwBAoo7-6QT8PRvAgc2SrPfUQJOKAFTq_QeEiFw@mail.gmail.com
Lists: pgsql-hackers

Andres, Tom,

> > I suspect that we'd first need a patch to refactor the existing copy code a
> > good bit to clean things up. After that it hopefully will be possible to plug
> > in a new format without being too intrusive.
>
> I think that step 1 ought to be to convert the existing formats into
> plug-ins, and demonstrate that there's no significant loss of performance.

Yep, this looks like a promising strategy to me too.

> I know little about parquet - can it support FROM STDIN efficiently?

Parquet is a compressed binary format that stores data grouped by
columns [1]. I wouldn't assume that reading from STDIN is a primary
use case for this particular format.
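
To illustrate why column grouping matters for streaming, here is a
minimal Python sketch. It is not Parquet's actual encoding: the names
`to_row_stream` and `to_column_groups` and the `group_size` parameter
are made up for this example. The point is only that a row-oriented
format can pass each row along as it arrives, while a column-grouped
layout has to buffer a whole group of rows before any column can be
written out.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[object, ...]


def to_row_stream(rows: Iterable[Row]) -> Iterator[Row]:
    """Row-oriented output: each row can be emitted as soon as it is read."""
    for row in rows:
        yield row


def to_column_groups(
    rows: Iterable[Row], group_size: int
) -> Iterator[List[List[object]]]:
    """Column-grouped output: buffer up to `group_size` rows, then emit
    one list per column.

    Hypothetical illustration only; a real columnar format adds encoding,
    compression and metadata on top of this idea.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    buffered: List[Row] = []
    for row in rows:
        buffered.append(row)
        if len(buffered) == group_size:
            # Transpose the buffered rows into per-column lists.
            yield [list(col) for col in zip(*buffered)]
            buffered = []
    if buffered:
        # Flush the final, possibly smaller, group.
        yield [list(col) for col in zip(*buffered)]


rows = [(1, "a"), (2, "b"), (3, "c")]
groups = list(to_column_groups(rows, group_size=2))
# groups == [[[1, 2], ["a", "b"]], [[3], ["c"]]]
```

With `group_size=2`, nothing is produced until the second row has been
read, which is the buffering cost a streaming consumer would have to
accept.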

[1]: https://parquet.apache.org/docs/file-format/

--
Best regards,
Aleksander Alekseev
