Re: Creating foreign key on partitioned table is too slow

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "kato-sho(at)fujitsu(dot)com" <kato-sho(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Creating foreign key on partitioned table is too slow
Date: 2019-10-30 22:19:05
Message-ID: CAKJS1f8v-fUG8YpaAGj309ZuALo3aEk7f6cqMHr_AVwz1fKXug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 31 Oct 2019 at 07:30, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> On Thu, Oct 24, 2019 at 04:28:38PM -0700, Andres Freund wrote:
> >Hi,
> >
> >On 2019-10-23 05:59:01 +0000, kato-sho(at)fujitsu(dot)com wrote:
> >> To benchmark with tpcb model, I tried to create a foreign key in the partitioned history table, but backend process killed by OOM.
> >> the number of partitions is 8192. I tried in master(commit: ad4b7aeb84).
> >
> >Obviously this should be improved. But I think it's also worthwhile to
> >note that using 8k partitions is very unlikely to be a good choice for
> >anything. The metadata, partition pruning, etc overhead is just going to
> >be very substantial.
> >
>
> True. Especially with two partitioned tables, each with 8k partitions.

In Ottawa this year, Andres and I briefly talked about the possibility
of making a series of changes to how equalfuncs.c works. The idea was
to make it easy by using some pre-processor magic to allow us to
create another version of equalfuncs which would let us have an equal
comparison function that returns -1 / 0 / +1 rather than just true or
false. This would allow us to Binary Search Trees of objects. I
identified that EquivalenceClass.ec_members would be better written as
a BST to allow much faster lookups in get_eclass_for_sort_expr().

The implementation I had in mind for the BST was a compact tree that
instead of using pointers for the left and right children, it just
uses an integer to reference the array element number. This would
allow us to maintain very fast insert-order traversals. Deletes would
need to decrement all child references greater than the deleted index.
This is sort of on-par with how the new List implementation in master.
i.e deletes take additional effort, but inserts are fast if there's
enough space in the array for a new element, traversals are
cache-friendly, etc. I think trees might be better than hash tables
for this as a hash function needs to hash all fields, whereas a
comparison function can stop when it finds the first non-match.

This may also be able to help simplify the code in setrefs.c to get
rid of the complex code around indexed tlists. tlist_member() would
become O(log n) instead of O(n), so perhaps there'd be not much point
in having both search_indexed_tlist_for_var() and
search_indexed_tlist_for_non_var().

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-10-30 22:29:07 Re: Creating foreign key on partitioned table is too slow
Previous Message Peter Geoghegan 2019-10-30 21:59:16 Re: Thoughts on nbtree with logical/varwidth table identifiers, v12 on-disk representation