Re: Should we warn against using too many partitions?

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should we warn against using too many partitions?
Date: 2019-06-10 08:11:02
Message-ID: CA+HiwqHZ_YTu5ZR7q_dcQ+XO-57_bc1j=q5gzgrmw+B+qNJnQw@mail.gmail.com
Lists: pgsql-hackers

Hi,

Thanks for the updated patches.

On Sun, Jun 9, 2019 at 5:29 AM David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> On Fri, 7 Jun 2019 at 19:00, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > Maybe:
> >
> > ... Removal of unwanted data is also a factor to consider when
> > planning your partitioning strategy as an entire partition can be
> > removed fairly quickly, especially if the partition keys are chosen
> > such that all data that can be deleted together are grouped into
> > separate partitions.
>
> It seems like a good idea to change this to have this mention the
> benefits rather than the drawbacks. I've reworded it, but not using
> your exact words as it seems the "especially" means that a partition
> can be removed faster with properly chosen partition keys, which is
> not the case.
>
> I also split this out into its own paragraph since it's talking about
> something quite different from the previous paragraph.

Did you forget to split it out? In the v4 patches, I still see this point
mentioned in the same paragraph as before:

+ <para>
+ One of the most critical design decisions will be the column or columns
+ by which you partition your data. Often the best choice will be to
+ partition by the column or set of columns which most commonly appear in
+ <literal>WHERE</literal> clauses of queries being executed on the
+ partitioned table. <literal>WHERE</literal> clause items that match and
+ are compatible with the partition key can be used to prune unneeded
+ partitions. Removal of unwanted data is also a factor to consider when
+ planning your partitioning strategy. An entire partition can be detached
+ fairly quickly, so it may be beneficial to design the partition strategy
+ in such a way that all data to be removed at once is located in a single
+ partition.
+ </para>

> > 2.
> >
> > + ... For example, if you choose to have one partition
> > + per customer and you currently have a small number of large customers,
> > + what will the implications be if in several years you obtain a large
> > + number of small customers.
> >
> > The sentence could be rewritten a bit. Maybe as:
> >
> > ... For example, choosing a design with one partition per customer,
> > because you currently have a small number of large customers, will not
> > scale well several years down the line when you might have a large
> > number of small customers.
> >
> > Btw, doesn't it suffice here to say "large number of customers"
> > instead of "large number of small customers"?
>
> I'm not really trying to imply to plan for business growth here, I'm
> trying to angle it as "what if your business changes".

Hmm, okay. I thought you were intending this as an example of how a
particular partitioning design may not *scale with time*.

> I've reworded
> this slightly and it now says "what will the implications be if in
> several years you instead find yourself with a large number of small
> customers."

I suggest "consider the implications" in place of "what will the
implications be...". Also, a user may choose a particular design (one
partition per customer) *because* of their business situation (a small
number of large customers), so I suggest linking the two clauses with
"because". With these two changes, the whole sentence reads as better
connected, imho:

For example, if you choose to have one partition per customer because
you currently have a small number of large customers, consider the
implications if in several years you instead find yourself with a
large number of small customers.

Thanks,
Amit
