Re: PostgreSQL 11 higher Planning time on Partitioned table

From: Ravi Garg <ravi(dot)garg(at)yahoo(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: PostgreSQL 11 higher Planning time on Partitioned table
Date: 2020-02-24 19:40:16
Message-ID: 843623673.5437661.1582573216373@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Justin,
>I didn't hear how large the tables and indexes are.+-------------------------------------------+------------------+--------------------------------------------+|              table_name                   | pg_relation_size |  pg_total_relation_size - pg_relation_size |+-------------------------------------------+------------------+--------------------------------------------+| TransactionLog_20200213                   |      95646646272 | 4175699968                                 || TransactionLog_20200212                   |      95573344256 | 4133617664                                 || TransactionLog_20200211                   |      91477336064 | 3956457472                                 || TransactionLog_20200210                   |       8192000000 |  354344960                                 || TransactionLog_20200214                   |       6826672128 |  295288832                                 || TransactionLog_20200220                   |       1081393152 |   89497600                                 || pg_catalogpg_attribute                    |          3088384 |    2220032                                 || TransactionLog_20190925                   |          1368064 |      90112  (174 such partitions)          |+-------------------------------------------+------------------+--------------------------------------------+ > Do you mean that a given query is only going to hit 2 partitions ?  Or do you> mean that all but the most recent 2 partitions are "archival" and won't be
> needed by future queries ?
Yes all queries will hit only 2 partitions (e.g. if we do daily partition, queries will hit only today's and yesterday's partition).
> You should determine what an acceptable planning speed is, or the best balance> of planning/execution time.  Try to detach half your current partitions and, if> that gives acceptable performance, then partition by day/2 or more.  You could> make a graph of (planning and total) time vs npartitions, since I think it's> likely to be nonlinear.> I believe others have reported improved performance under v11 with larger> numbers of partitions, by using "partitions of partitions".  So you could try> making partitions by month themselves partitioned by day.
FYI, these are the observations I am getting with various number of partition and a multilevel partition with respect to Un-Partitioned.+---------------+----------------------+----------------+---------------+--------------+---------------+--------------+---------------+--------------+| Testcase      | Partition Count      | Records in     | Select        | Select       | Update        | Update       | insert        | insert       ||               |                      | each Partition | planning (ms) | execute (ms) | planning (ms) | execute (ms) | planning (ms) | execute (ms) |+---------------+----------------------+----------------+---------------+--------------+---------------+--------------+---------------+--------------+| Single Level  |   6                  | 1000           |  1.162        | 0.045        |  2.112        | 0.115        | 1.261         | 0.178        || Partition     |  30                  | 1000           |  2.879        | 0.049        |  5.146        | 0.13         | 1.243         | 0.211        ||               | 200                  | 1000           | 18.479        | 0.087        | 31.385        | 0.18         | 1.253         | 0.468        |+---------------+----------------------+----------------+---------------+--------------+---------------+--------------+---------------+--------------+| Multi Level   | 6 Partition having   | 1000           | 3.6032        | 0.0695       | x             | x            | x             | x            || Partition     | 30 subpartition each |                |               |              |               |              |               |              |+---------------+----------------------+----------------+---------------+--------------+---------------+--------------+---------------+--------------+| UnPartitioned | NA                   | 430 Million    | 0.0875        | 0.0655       | x             | x            | x             | x            |+---------------+----------------------+----------------+---------------+--------------+---------------+--------------+---------------+--------------+
> If you care about INSERT performance, you probably need to make at least a> single partition's index fit within shared_buffers (or set shared_buffers such> that it fits).  Use transactions around your inserts.  If your speed is not> limited by I/O, you could further use multiple VALUES(),() inserts, or maybe> prepared statements.  Maybe synchronous_commit=off.> > If you care about (consistent) SELECT performance, you should consider> VACUUMing the tables after bulk inserts, to set hint bits (and since> non-updated tuples won't be hit by autovacuum).  Or maybe VACUUM FREEZE to> freeze tuples (since it sounds like a typical page is unlikely to ever be> updated).
Sure, I'll evaluate these settings, thanks.
Thanks and Regards,
Ravi Garg

On Sunday, 23 February, 2020, 08:40:58 pm IST, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

On Sun, Feb 23, 2020 at 10:57:29AM +0000, Ravi Garg wrote:
>    - Currently we are thinking to have Daily partitions and as we need to keep 6 months of data thus 180 Partitions.However we have liberty to reduce the number of partitions to weekly/fortnightly/monthly, If we get comparable performance. 

I didn't hear how large the tables and indexes are.

>    - We need to look current partition and previous partition for all of our use-cases/queries.

Do you mean that a given query is only going to hit 2 partitions ?  Or do you
mean that all but the most recent 2 partitions are "archival" and won't be
needed by future queries ?

> Can you please suggest what sort of combinations/partition strategy we can test considering data-volume/vacuum etc. Also let me know if some of the pg_settings can help us tuning this (I have attached my pg_settings).

You should determine what an acceptable planning speed is, or the best balance
of planning/execution time.  Try to detach half your current partitions and, if
that gives acceptable performance, then partition by day/2 or more.  You could
make a graph of (planning and total) time vs npartitions, since I think it's
likely to be nonlinear.

I believe others have reported improved performance under v11 with larger
numbers of partitions, by using "partitions of partitions".  So you could try
making partitions by month themselves partitioned by day.

>    - Our use case is limited to simple selects (we don't join with the other
>    tables) however, we are expecting ~70 million records inserted per day
>    and there would be couple of updates on each records where average record
>    size would be ~ 1.5 KB.

>  shared_buffers                        | 1048576

If you care about INSERT performance, you probably need to make at least a
single partition's index fit within shared_buffers (or set shared_buffers such
that it fits).  Use transactions around your inserts.  If your speed is not
limited by I/O, you could further use multiple VALUES(),() inserts, or maybe
prepared statements.  Maybe synchronous_commit=off.

If you care about (consistent) SELECT performance, you should consider
VACUUMing the tables after bulk inserts, to set hint bits (and since
non-updated tuples won't be hit by autovacuum).  Or maybe VACUUM FREEZE to
freeze tuples (since it sounds like a typical page is unlikely to ever be
updated).

--
Justin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Ravi Garg 2020-02-24 19:44:34 Re: PostgreSQL 11 higher Planning time on Partitioned table
Previous Message Merlin Moncure 2020-02-24 16:34:03 Re: DB running out of memory issues after upgrade