| From: | Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru> | 
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> | 
| Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Patch: ResourceOwner optimization for tables with many partitions | 
| Date: | 2015-12-09 13:30:52 | 
| Message-ID: | 20151209163052.48d2d43a@fujitsu | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hello, Robert
Thanks for your review. I believe I fixed items 1, 2 and 3 (see
attachment). Also I would like to clarify item 4.
> 4. It mixes together multiple ideas in a single patch, not only
> introducing a hashing concept but also striping a brand-new layer of
> abstraction across the resource-owner mechanism.  I am not sure that
> layer of abstraction is a very good idea, but if it needs to be done,
> I think it should be a separate patch.
Do I right understand that you suggest following?
Current patch should be split in two parts. In first patch we create
and use ResourceArray with array-based implementation (abstraction
layer). Then we apply second patch which change ResourceArray
implementation to hashing based (optimization).
Best regards,
Aleksander
On Tue, 8 Dec 2015 17:30:33 -0500
Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Dec 4, 2015 at 7:15 AM, Aleksander Alekseev
> <a(dot)alekseev(at)postgrespro(dot)ru> wrote:
> > Current implementation of ResourceOwner uses arrays to store
> > resources like TupleDescs, Snapshots, etc. When we want to release
> > one of these resources ResourceOwner finds it with linear scan.
> > Granted, resource array are usually small so its not a problem most
> > of the time. But it appears to be a bottleneck when we are working
> > with tables which have a lot of partitions.
> >
> > To reproduce this issue:
> > 1. run `./gen.pl 10000 | psql my_database postgres`
> > 2. run `pgbench -j 8 -c 8 -f q.sql -T 100 my_database`
> > 3. in second terminal run `sudo perf top -u postgres`
> >
> > Both gen.pl and q.sql are attached to this message.
> >
> > You will see that postgres spends a lot of time in
> > ResourceOwnerForget* procedures:
> >
> >  32.80%  postgres               [.] list_nth
> >  20.29%  postgres               [.] ResourceOwnerForgetRelationRef
> >  12.87%  postgres               [.] find_all_inheritors
> >   7.90%  postgres               [.] get_tabstat_entry
> >   6.68%  postgres               [.] ResourceOwnerForgetTupleDesc
> >   1.17%  postgres               [.] hash_search_with_hash_value
> >  ... < 1% ...
> >
> > I would like to suggest a patch (see attachment) witch fixes this
> > bottleneck. Also I discovered that there is a lot of code
> > duplication in ResourceOwner. Patch fixes this too. The idea is
> > quite simple. I just replaced arrays with something that could be
> > considered hash tables, but with some specific optimizations.
> >
> > After applying this patch we can see that bottleneck is gone:
> >
> >  42.89%  postgres               [.] list_nth
> >  18.30%  postgres               [.] find_all_inheritors
> >  10.97%  postgres               [.] get_tabstat_entry
> >   1.82%  postgres               [.] hash_search_with_hash_value
> >   1.21%  postgres               [.] SearchCatCache
> >  ... < 1% ...
> >
> > For tables with thousands partitions we gain in average 32.5% more
> > TPS.
> >
> > As far as I can see in the same time patch doesn't make anything
> > worse. `make check` passes with asserts enabled and disabled. There
> > is no performance degradation according to both standard pgbench
> > benchmark and benchmark described above for tables with 10 and 100
> > partitions.
> 
> I do think it's interesting to try to speed up the ResourceOwner
> mechanism when there are many resources owned.  We've had some forays
> in that direction in the past, and I think it's useful to continue
> that work.  But I also think that this patch, while interesting, is
> not something we can seriously consider committing in its current
> form, because:
> 
> 1. It lacks adequate comments and submission notes to explain the
> general theory of operation of the patch.
> 
> 2. It seems to use uint8 * rather freely as a substitute for char * or
> void *, which doesn't seem like a great idea.
> 
> 3. It doesn't follow PostgreSQL formatting conventions (uncuddled
> curly braces, space after if and before open paren, etc.).
> 
> 4. It mixes together multiple ideas in a single patch, not only
> introducing a hashing concept but also striping a brand-new layer of
> abstraction across the resource-owner mechanism.  I am not sure that
> layer of abstraction is a very good idea, but if it needs to be done,
> I think it should be a separate patch.
> 
| Attachment | Content-Type | Size | 
|---|---|---|
| resource-owner-optimization-v2.patch | text/x-patch | 36.5 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Alexander Korotkov | 2015-12-09 14:09:09 | Re: WIP: Rework access method interface | 
| Previous Message | Ants Aasma | 2015-12-09 13:05:14 | Re: W-TinyLfu for cache eviction |