Re: Hash Functions

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, amul sul <sulamul(at)gmail(dot)com>
Subject: Re: Hash Functions
Date: 2017-05-13 03:06:46
Message-ID: 20170513030646.4p2awpt4gt2ldbtn@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-05-12 21:56:30 -0400, Robert Haas wrote:
> Cheap isn't free, though. It's got a double-digit percentage overhead
> rather than a large-multiple-of-the-runtime overhead as triggers do,
> but people still won't want to pay it unnecessarily, I think.

That should be partiall addressable with reasonable amounts of
engineering though. Efficiently computing the target partition in a
hash-partitioned table can be implemented very efficiently, and adding
infrastructure for multiple bulk insert targets in copy should be quite
doable as well. It's also work that's generally useful, not just for
backups.

The bigger issue to me here wrt pg_dump is that partitions can restored
in parallel, but that'd probably not work as well if dumped
separately. Unless we'd do the re-routing on load, rather than when
dumping - which'd also increase cache locality, by most of the time
(same architecture/encoding/etc) having one backend insert into the same
partition.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-05-13 04:42:01 Re: multi-column range partition constraint
Previous Message Michael Paquier 2017-05-13 02:58:49 Re: Addition of pg_dump --no-publications