Re: Postgresql 12, 512 partition by hash. Slow select

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Arya F <arya6000(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Postgresql 12, 512 partition by hash. Slow select
Date: 2020-04-05 18:55:00
Message-ID: 31605.1586112900@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Arya F <arya6000(at)gmail(dot)com> writes:
> I have a table with 120 million rows of data spread among 512
> partitioned by hash table. The id column of the table is a uuid, which
> is what is being used for the partition hash and it's also the PK for
> the table.

> The table has a text column, which also has a btree index on it. A
> select query on an identical non-partitioned table takes 0.144
> seconds, but on the partitioned table it takes 5.689 seconds.

> Am I missing something in my setup? Or is this expected? I do know
> having more than 100 partitions in prior versions of PostgreSQL 12
> would cause a major slow down, but from what I read PostgreSQL 12
> addresses that now?

You have your expectations calibrated wrongly, I suspect.

Your default expectation with a table with many partitions should be
that queries will have to hit all those partitions and it will take a
long time. If the query is such that the system can prove that it
only needs to access one partition, then it can be fast --- but those
proof rules are not superlatively bright, and they're especially not
bright for hash partitioning since that has so little relationship
to WHERE restrictions that practical queries would use. But if the
query WHERE isn't restricting the partitioning key at all, as I suspect
is the case for your query, then there's certainly no chance of not
having to search all the partitions.

If you showed us the specific table declaration and query you're
working with, it might be possible to offer more than generalities.

In general though, partitioning should be a last resort when you've
got so much data that you have no other choice. I doubt that you
are there at all with 100M rows, and you are certainly not at a point
where using hundreds of partitions is a good idea. They are not
cost-free, by a very long shot. And when you do partition, you
typically need to think hard about what the partitioning rule will be.
I'm afraid that hash partitioning is more of a shiny trap for novices
than it is a useful tool, because it doesn't organize the data into
meaningful sub-groups.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Arya F 2020-04-05 19:50:18 Re: Postgresql 12, 512 partition by hash. Slow select
Previous Message Andreas Kretschmer 2020-04-05 18:41:03 Re: Postgresql 12, 512 partition by hash. Slow select