Re: Making "COPY partitioned_table FROM" faster

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, khuddleston(at)pivotal(dot)io
Subject: Re: Making "COPY partitioned_table FROM" faster
Date: 2018-07-27 16:45:14
Message-ID: CA+TgmoaLgmJR_dqy81DPWWF4mqmRheNMdjj5cOTBq6edz7MxpQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 25, 2018 at 10:30 PM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> I agree RANGE partition is probably the most likely case to benefit
> from this optimisation, but I just don't think that HASH could never
> benefit and LIST probably sits somewhere in the middle.
>
> HASH partitioning might be useful in cases like partitioning by
> "sensor_id". It does not seem that unreasonable that someone might
> want to load all the data for an entire sensor at once.

Another case might be restoring a dump with --load-via-partition-root.
Granted, you probably shouldn't specify that option unless you expect
rows to end up in different partition than they were in the original
cluster, but it's possible somebody might do it out of an abundance of
caution if, say, they are doing an automated dump restore on a machine
that may or may not have different endian-ness than the original.

I think making it adaptive, as you've done, is definitely better than
a heuristic based on the partitioning type.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2018-07-27 16:51:17 Re: Deprecating, and scheduling removal of, pg_dump's tar format.
Previous Message Andrew Gierth 2018-07-27 16:33:51 Re: Early WIP/PoC for inlining CTEs