Re: New Copy Formats - avro/orc/parquet

From:	Nicolas Paris <niparisco(at)gmail(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, pgsql-general(at)postgresql(dot)org
Subject:	Re: New Copy Formats - avro/orc/parquet
Date:	2018-02-11 20:00:12
Message-ID:	20180211200012.2agrfocyaf42td5v@gmail.com
Lists:	pgsql-general

> > That is true, but the question is how significant the overhead is. If
> > it's 50% then reducing it would make perfect sense. If it's 1% then no
> > one is going to be bothered by it.
>
> I think it's pretty clear that it's going to be way way much more than
> 1%.

Good news, but I'm not sure I understand why.

> It's trivial to construct cases where input parsing / output
> formatting takes the majority of the time.

Binary -> ORC
       ^
       |
PROGRAM parsing/output formatting on the fly

> And a lot of that you're going to be able to avoid with binary formats.

Still, the above diagram shows both a parsing step and a formatting step, doesn't it?
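
To make the PROGRAM box in the diagram concrete, here is a minimal sketch of the parsing half, assuming the PROGRAM is a Python script that reads a COPY (FORMAT binary) stream. The function names are hypothetical, the ORC writing side is left out, and the sketch assumes a stream without OIDs. It only splits the stream into rows of raw field bytes; decoding each field according to its column type would still be up to the converter.

```python
import io
import struct

# Fixed 11-byte signature that opens a COPY ... (FORMAT binary) stream.
PGCOPY_SIGNATURE = b"PGCOPY\n\xff\r\n\x00"


def read_exact(stream, n):
    """Read exactly n bytes or fail (hypothetical helper)."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("truncated COPY binary stream")
    return data


def iter_copy_binary_rows(stream):
    """Yield each row as a list of raw field bytes (None for SQL NULL)."""
    if read_exact(stream, 11) != PGCOPY_SIGNATURE:
        raise ValueError("not a COPY binary stream")
    # Header: 32-bit flags field, then 32-bit header extension length.
    flags, ext_len = struct.unpack("!II", read_exact(stream, 8))
    if flags & (1 << 16):
        raise ValueError("streams with OIDs are not handled in this sketch")
    read_exact(stream, ext_len)  # skip the header extension area
    while True:
        # Each tuple starts with a 16-bit field count; -1 is the trailer.
        (nfields,) = struct.unpack("!h", read_exact(stream, 2))
        if nfields == -1:
            return
        row = []
        for _ in range(nfields):
            # Each field: 32-bit length (-1 means NULL), then that many bytes.
            (length,) = struct.unpack("!i", read_exact(stream, 4))
            row.append(None if length == -1 else read_exact(stream, length))
        yield row


# Hand-built sample: one row holding an int4 value 42 and a NULL.
sample = (
    PGCOPY_SIGNATURE
    + struct.pack("!II", 0, 0)
    + struct.pack("!h", 2)
    + struct.pack("!i", 4) + struct.pack("!i", 42)
    + struct.pack("!i", -1)
    + struct.pack("!h", -1)
)
rows = list(iter_copy_binary_rows(io.BytesIO(sample)))
```

In a real pipeline the same function would be fed sys.stdin.buffer from something like COPY t TO PROGRAM 'python3 convert.py' WITH (FORMAT binary), where the table name t and the script name convert.py are placeholders. Even in this form the converter still walks every length word and every field, which is the step the diagram is asking about.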

In response to

Re: New Copy Formats - avro/orc/parquet at 2018-02-11 17:16:53 from Andres Freund

Responses

Re: New Copy Formats - avro/orc/parquet at 2018-02-11 20:03:14 from Andres Freund
