Re: Patch: ResourceOwner optimization for tables with many partitions

From: Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: ResourceOwner optimization for tables with many partitions
Date: 2015-12-09 13:30:52
Message-ID: 20151209163052.48d2d43a@fujitsu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, Robert

Thanks for your review. I believe I fixed items 1, 2 and 3 (see
attachment). Also I would like to clarify item 4.

> 4. It mixes together multiple ideas in a single patch, not only
> introducing a hashing concept but also striping a brand-new layer of
> abstraction across the resource-owner mechanism. I am not sure that
> layer of abstraction is a very good idea, but if it needs to be done,
> I think it should be a separate patch.

Do I right understand that you suggest following?

Current patch should be split in two parts. In first patch we create
and use ResourceArray with array-based implementation (abstraction
layer). Then we apply second patch which change ResourceArray
implementation to hashing based (optimization).

Best regards,
Aleksander

On Tue, 8 Dec 2015 17:30:33 -0500
Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Dec 4, 2015 at 7:15 AM, Aleksander Alekseev
> <a(dot)alekseev(at)postgrespro(dot)ru> wrote:
> > Current implementation of ResourceOwner uses arrays to store
> > resources like TupleDescs, Snapshots, etc. When we want to release
> > one of these resources ResourceOwner finds it with linear scan.
> > Granted, resource array are usually small so its not a problem most
> > of the time. But it appears to be a bottleneck when we are working
> > with tables which have a lot of partitions.
> >
> > To reproduce this issue:
> > 1. run `./gen.pl 10000 | psql my_database postgres`
> > 2. run `pgbench -j 8 -c 8 -f q.sql -T 100 my_database`
> > 3. in second terminal run `sudo perf top -u postgres`
> >
> > Both gen.pl and q.sql are attached to this message.
> >
> > You will see that postgres spends a lot of time in
> > ResourceOwnerForget* procedures:
> >
> > 32.80% postgres [.] list_nth
> > 20.29% postgres [.] ResourceOwnerForgetRelationRef
> > 12.87% postgres [.] find_all_inheritors
> > 7.90% postgres [.] get_tabstat_entry
> > 6.68% postgres [.] ResourceOwnerForgetTupleDesc
> > 1.17% postgres [.] hash_search_with_hash_value
> > ... < 1% ...
> >
> > I would like to suggest a patch (see attachment) witch fixes this
> > bottleneck. Also I discovered that there is a lot of code
> > duplication in ResourceOwner. Patch fixes this too. The idea is
> > quite simple. I just replaced arrays with something that could be
> > considered hash tables, but with some specific optimizations.
> >
> > After applying this patch we can see that bottleneck is gone:
> >
> > 42.89% postgres [.] list_nth
> > 18.30% postgres [.] find_all_inheritors
> > 10.97% postgres [.] get_tabstat_entry
> > 1.82% postgres [.] hash_search_with_hash_value
> > 1.21% postgres [.] SearchCatCache
> > ... < 1% ...
> >
> > For tables with thousands partitions we gain in average 32.5% more
> > TPS.
> >
> > As far as I can see in the same time patch doesn't make anything
> > worse. `make check` passes with asserts enabled and disabled. There
> > is no performance degradation according to both standard pgbench
> > benchmark and benchmark described above for tables with 10 and 100
> > partitions.
>
> I do think it's interesting to try to speed up the ResourceOwner
> mechanism when there are many resources owned. We've had some forays
> in that direction in the past, and I think it's useful to continue
> that work. But I also think that this patch, while interesting, is
> not something we can seriously consider committing in its current
> form, because:
>
> 1. It lacks adequate comments and submission notes to explain the
> general theory of operation of the patch.
>
> 2. It seems to use uint8 * rather freely as a substitute for char * or
> void *, which doesn't seem like a great idea.
>
> 3. It doesn't follow PostgreSQL formatting conventions (uncuddled
> curly braces, space after if and before open paren, etc.).
>
> 4. It mixes together multiple ideas in a single patch, not only
> introducing a hashing concept but also striping a brand-new layer of
> abstraction across the resource-owner mechanism. I am not sure that
> layer of abstraction is a very good idea, but if it needs to be done,
> I think it should be a separate patch.
>

Attachment Content-Type Size
resource-owner-optimization-v2.patch text/x-patch 36.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2015-12-09 14:09:09 Re: WIP: Rework access method interface
Previous Message Ants Aasma 2015-12-09 13:05:14 Re: W-TinyLfu for cache eviction