From: Arya F <arya6000(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Postgresql 12, 512 partition by hash. Slow select
Date: 2020-04-05 19:50:18
Message-ID: CAFoK1azuLEF=Q_wmFgXcxo5bkiG_06VzD18svk9gBNCR+YEqeg@mail.gmail.com
Lists: pgsql-performance
On Sun, Apr 5, 2020 at 2:55 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Arya F <arya6000(at)gmail(dot)com> writes:
> > I have a table with 120 million rows of data spread across 512
> > hash partitions. The id column of the table is a uuid, which is
> > used as the hash partition key and is also the PK for the table.
>
> > The table has a text column, which also has a btree index on it. A
> > select query on an identical non-partitioned table takes 0.144
> > seconds, but on the partitioned table it takes 5.689 seconds.
>
> > Am I missing something in my setup? Or is this expected? I do know
> > that having more than 100 partitions in versions of PostgreSQL
> > prior to 12 would cause a major slowdown, but from what I read,
> > PostgreSQL 12 addresses that now?
>
> You have your expectations calibrated wrongly, I suspect.
>
> Your default expectation with a table with many partitions should be
> that queries will have to hit all those partitions and it will take a
> long time. If the query is such that the system can prove that it
> only needs to access one partition, then it can be fast --- but those
> proof rules are not superlatively bright, and they're especially not
> bright for hash partitioning since that has so little relationship
> to WHERE restrictions that practical queries would use. But if the
> query WHERE isn't restricting the partitioning key at all, as I suspect
> is the case for your query, then there's certainly no chance of not
> having to search all the partitions.
>
> If you showed us the specific table declaration and query you're
> working with, it might be possible to offer more than generalities.
>
> In general though, partitioning should be a last resort when you've
> got so much data that you have no other choice. I doubt that you
> are there at all with 100M rows, and you are certainly not at a point
> where using hundreds of partitions is a good idea. They are not
> cost-free, by a very long shot. And when you do partition, you
> typically need to think hard about what the partitioning rule will be.
> I'm afraid that hash partitioning is more of a shiny trap for novices
> than it is a useful tool, because it doesn't organize the data into
> meaningful sub-groups.
>
> regards, tom lane
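To illustrate the pruning behavior Tom describes, here is a minimal sketch (table and column names are hypothetical, not my actual schema): with hash partitioning, the planner can only prune when the WHERE clause pins down the partition key itself.

```sql
-- Hypothetical table mirroring the setup described above.
CREATE TABLE addresses (
    id   uuid NOT NULL,
    addr text NOT NULL,
    PRIMARY KEY (id)
) PARTITION BY HASH (id);

CREATE TABLE addresses_p0 PARTITION OF addresses
    FOR VALUES WITH (MODULUS 512, REMAINDER 0);
-- ... and so on for remainders 1 through 511 ...
CREATE INDEX ON addresses (addr);

-- Equality on the partition key: the planner hashes the value and
-- prunes to a single partition, so this can be fast.
EXPLAIN SELECT * FROM addresses
WHERE id = '4f0e4bdc-6f10-4a96-9f97-43a0e6b9a7c0';

-- Restricting only a non-key column: no pruning is possible, so the
-- plan must visit all 512 partitions (and their btree indexes).
EXPLAIN SELECT * FROM addresses
WHERE addr = '123 Main St';
```

The second EXPLAIN is presumably what my slow query looks like: 512 index scans appended together instead of one.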
The table will at some point have more than 1 billion rows; the
information stored is international residential addresses. I have been
trying to figure out a way of spreading the data fairly evenly
throughout multiple partitions, but I was unable to come up with a way
of splitting the data so that Postgres does not have to search through
all the partitions.
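One direction I have been considering, sketched below with hypothetical names: since the data is addresses, partition on a column the queries could actually filter on, such as a country code, using LIST partitioning instead of hashing the uuid. This assumes queries can always supply the country.

```sql
-- Hypothetical alternative: make the partition key a column that
-- practical queries restrict, so pruning can apply.
CREATE TABLE addresses (
    id           uuid NOT NULL,
    country_code char(2) NOT NULL,
    addr         text NOT NULL,
    PRIMARY KEY (country_code, id)  -- partition key must be in the PK
) PARTITION BY LIST (country_code);

CREATE TABLE addresses_us PARTITION OF addresses FOR VALUES IN ('US');
CREATE TABLE addresses_de PARTITION OF addresses FOR VALUES IN ('DE');
-- ... one partition (or one per group of smaller countries) each ...

-- With the country in the WHERE clause, only one partition is scanned:
EXPLAIN SELECT * FROM addresses
WHERE country_code = 'US' AND addr = '123 Main St';
```

The distribution would be uneven across countries, but the sub-groups would at least be meaningful, which seems to be the point being made above.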