Re: Declarative partitioning optimization for large amount of partitions

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Declarative partitioning optimization for large amount of partitions
Date: 2017-03-01 01:36:24
Message-ID: 2e319dc9-27cc-6a5e-3c86-ac8b4107c689@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2017/02/28 23:25, Aleksander Alekseev wrote:
> Hello.
>
> I decided to figure out whether current implementation of declarative
> partitioning has any bottlenecks when there is a lot of partitions. Here
> is what I did [1].

Thanks for sharing.

> Then:
>
> ```
> # 2580 is some pk that exists
> echo 'select * from part_test where pk = 2580;' > t.sql
> pgbench -j 7 -c 7 -f t.sql -P 1 -T 300 eax
> ```
>
> `perf top` showed to bottlenecks [2]. A stacktrace for the first one
> looks like this [3]:
>
> ```
> 0x00000000007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 '\000') at pgstat.c:1689
> 1689 if (entry->t_id == rel_id)
> #0 0x00000000007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 '\000') at pgstat.c:1689
> #1 0x00000000007a4275 in pgstat_initstats (rel=0x7f4af3fd41f8) at pgstat.c:1666
> #2 0x00000000004c7090 in relation_open (relationId=25696, lockmode=0) at heapam.c:1137
> #3 0x00000000004c72c9 in heap_open (relationId=25696, lockmode=0) at heapam.c:1291
> (skipped)
> ```
>
> And here is a stacktrace for the second bottleneck [4]:
>
> ```
> 0x0000000000584fb1 in find_all_inheritors (parentrelId=16393, lockmode=1, numparents=0x0) at pg_inherits.c:199
> 199 forboth(lo, rels_list, li, rel_numparents)
> #0 0x0000000000584fb1 in find_all_inheritors (parentrelId=16393, lockmode=1, numparents=0x0) at pg_inherits.c:199
> #1 0x000000000077fc9f in expand_inherited_rtentry (root=0x1badcb8, rte=0x1b630b8, rti=1) at prepunion.c:1408
> #2 0x000000000077fb67 in expand_inherited_tables (root=0x1badcb8) at prepunion.c:1335
> #3 0x0000000000767526 in subquery_planner (glob=0x1b63cc0, parse=0x1b62fa0, parent_root=0x0, hasRecursion=0 '\000', tuple_fraction=0) at planner.c:568
> (skipped)
> ```
>
> The first one could be easily fixed by introducing a hash table
> (rel_id -> pgStatList entry). Perhaps hash table should be used only
> after some threshold. Unless there are any objections I will send a
> corresponding patch shortly.

I have never thought about this one really.

> I didn't explored the second bottleneck closely yet but at first glance
> it doesn't look much more complicated.

I don't know which way you're thinking of fixing this, but a planner patch
to implement faster partition-pruning will have taken care of this, I
think. As you may know, even declarative partitioned tables currently
depend on constraint exclusion for partition-pruning and planner's current
approach of handling inheritance requires to open all the child tables
(partitions), whereas the new approach hopefully shouldn't need to do
that. I am not sure if looking for a more localized fix for this would be
worthwhile, although I may be wrong.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2017-03-01 01:58:00 PATCH: Configurable file mode mask
Previous Message Lukas Fittl 2017-03-01 01:01:29 [PATCH] Use $ parameters as replacement characters for pg_stat_statements