Re: When to use PARTITION BY HASH?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Oleksandr Shulgin <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc: Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: When to use PARTITION BY HASH?
Date: 2020-06-04 14:32:42
Message-ID: CAMkU=1w0JzvpjrNFmQJ94w1+JrF+XeY0_5f9Z9R4mZr6hOOpkg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

On Wed, Jun 3, 2020 at 7:55 AM Oleksandr Shulgin <
oleksandr(dot)shulgin(at)zalando(dot)de> wrote:

With hash partitioning you are not expected, in general, to end up with a
> small number of partitions being accessed more heavily than the rest. So
> your indexes will also not fit into memory.
>
> I have the feeling that using a hash function to distribute rows simply
> contradicts the basic assumption of when you would think of partitioning
> your table at all: that is to make sure the most active part of the table
> and indexes is small enough to be cached in memory.
>

While hash partitioning doesn't appeal to me, I think this may be overly
pessimistic. It would not be all that unusual for your customers to take
turns being highly active and less active. Especially if you do occasional
bulk loads all with the same customer_id for any given load, for example.
So while you might not have a permanently hot partition, you could have
partitions which are hot in turn. Of course you could get the same benefit
(and probably better) with list or range partitioning rather than hash, but
then you have to maintain those lists or ranges when you add new customers.

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Susan Joseph 2020-06-04 15:00:57 PostgreSQL 11 with SSL on Linux
Previous Message Matt Zagrabelny 2020-06-04 14:21:27 Re: Can we get SQL Server-like cross database queries

Browse pgsql-performance by date

  From Date Subject
Next Message Philip Semanchuk 2020-06-04 16:41:35 Re: increased max_parallel_workers_per_gather results in fewer workers?
Previous Message Magnus Hagander 2020-06-04 09:30:54 Re: increased max_parallel_workers_per_gather results in fewer workers?