Re: Make COPY extendable in order to support Parquet and other formats

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>, Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: Make COPY extendable in order to support Parquet and other formats
Date: 2022-06-24 14:14:01
Message-ID: cad1cec1-a148-c488-bf51-5821cc1a9b16@dunslane.net
Lists: pgsql-hackers


On 2022-06-23 Th 21:45, Andres Freund wrote:
> Hi,
>
> On 2022-06-23 11:38:29 +0300, Aleksander Alekseev wrote:
>>> I know little about parquet - can it support FROM STDIN efficiently?
>> Parquet is a compressed binary format with data grouped by columns
>> [1]. I wouldn't assume that this is a primary use case for this
>> particular format.
> IMO decent COPY FROM / TO STDIN support is crucial, because otherwise you
> can't do COPY from/to a client. Which would make the feature unusable for
> anybody not superuser, including just about all users of hosted PG.
>

+1

Note that Parquet puts the metadata at the end of each file, which makes
it nice to write but somewhat unfriendly for streaming readers, which
would have to accumulate the whole file in order to process it.
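
To make the layout concrete, here is a rough sketch (an illustration only, not
part of any proposed patch). A Parquet file ends with the footer metadata,
then a 4-byte little-endian footer length, then the magic bytes "PAR1". The
helper name locate_footer is hypothetical, and the sketch assumes an
unencrypted file whose complete image is already in memory.

```python
import struct

PARQUET_MAGIC = b"PAR1"  # magic bytes at both the start and the end of a file


def locate_footer(buf):
    """Return (offset, length) of the footer metadata in a complete file image.

    Hypothetical helper: it only finds the footer, it does not decode the
    Thrift-encoded metadata stored there.
    """
    # Smallest possible file: leading magic + footer length + trailing magic.
    if len(buf) < 12 or buf[:4] != PARQUET_MAGIC or buf[-4:] != PARQUET_MAGIC:
        raise ValueError("not a complete Parquet file image")
    # The footer length sits in the 4 bytes just before the trailing magic.
    (footer_len,) = struct.unpack("<I", buf[-8:-4])
    footer_start = len(buf) - 8 - footer_len
    if footer_start < 4:
        raise ValueError("footer length exceeds file size")
    return footer_start, footer_len


# Toy file image: magic, 10 bytes of "column data", a 4-byte stand-in footer.
image = b"PAR1" + b"x" * 10 + b"META" + struct.pack("<I", 4) + b"PAR1"
start, length = locate_footer(image)
footer = image[start:start + length]
```

The point is that the reader cannot find the footer until the last 8 bytes have
arrived, so a COPY FROM STDIN consumer would have to buffer or spool the input
before it could interpret any of the column data.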

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
