Re: contrib/cache_scan (Re: What's needed for cache-only table scan?)

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: contrib/cache_scan (Re: What's needed for cache-only table scan?)
Date: 2014-03-12 06:26:10
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F897B8@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

Thanks for your efforts!
>                  Head     patched   Diff
> Select - 500K    772ms    2659ms    -200%
> Insert - 400K    3429ms   1948ms    43% (I am not sure how it improved in this case)
> delete - 200K    2066ms   3978ms    -92%
> update - 200K    3915ms   5899ms    -50%
>
> This patch shows very well how the custom scan can be used, but the
> patch as it is has some performance problems which need to be
> investigated.
>
> I attached the test script file used for the performance test.
>
First of all, it seems to me that your test case uses a data set small
enough to be held entirely in memory: roughly, 500K records of 200 bytes
each consume about 100MB. Your configuration allocates 512MB of
shared_buffers, and about 3GB of OS-level page cache is available.
(Note that Linux uses free memory as disk cache adaptively.)
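
As a rough check of that estimate, here is a minimal sketch of the arithmetic. The row count, record size, and memory sizes are the figures quoted in this thread; per-tuple headers and page overhead are ignored, so the real on-disk size would be somewhat larger.

```python
# Back-of-the-envelope working-set estimate (figures from this thread).
# Assumption: tuple headers and page overhead are ignored.
ROWS = 500_000            # rows in the SELECT test
RECORD_BYTES = 200        # approximate record width
SHARED_BUFFERS_MB = 512   # shared_buffers in the test configuration
PAGE_CACHE_MB = 3 * 1024  # "about 3GB" of OS-level page cache

data_mb = ROWS * RECORD_BYTES / (1024 * 1024)
fits_shared_buffers = data_mb < SHARED_BUFFERS_MB
fits_page_cache = data_mb < PAGE_CACHE_MB

print(f"data set: {data_mb:.1f} MB")
print(f"fits in shared_buffers: {fits_shared_buffers}")
print(f"fits in OS page cache:  {fits_page_cache}")
```

Under these assumptions the whole table fits several times over in shared_buffers alone, so the test hardly touches the disk.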

This cache is designed to hide the latency of disk accesses, so this test
case does not match its intended use.
(Also, the primary purpose of this module is to demonstrate
heap_page_prune_hook as a way to hook vacuuming, so simple code was
preferred over a more complicated implementation with better performance.)

I could reproduce the overall trend: a non-cached scan is faster than
the cached scan if the buffers are in memory. Probably this comes from
the cost of walking down the T-tree index by ctid on each reference.
The performance penalty on UPDATE and DELETE likely comes from
the trigger invocation per row.
I could also observe a small performance gain on INSERT.
It's strange to me, too. :-(
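
To illustrate why a per-reference index walk can lose to a plain scan when everything is already in memory, here is a toy model. It is not the cache_scan code: a binary search over a sorted list of (block, offset) pairs stands in for the T-tree, and the table layout (1000 blocks of 50 tuples) is hypothetical.

```python
# Toy model: cost of looking up every tuple by ctid through an ordered
# index versus visiting each tuple once in a sequential scan.
# Assumption: binary search over a sorted list stands in for the T-tree.

def lookup_steps(keys, target):
    """Lower-bound binary search; returns the number of probes made."""
    lo, hi, steps = 0, len(keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps

# Hypothetical table: 1000 blocks with 50 tuples each, keyed by ctid.
ctids = [(blk, off) for blk in range(1000) for off in range(1, 51)]

seq_steps = len(ctids)                                  # one visit per tuple
idx_steps = sum(lookup_steps(ctids, c) for c in ctids)  # one walk per tuple

print(f"sequential visits: {seq_steps}")
print(f"index probes:      {idx_steps}")
print(f"probes per tuple:  {idx_steps / seq_steps:.1f}")
```

In this model every reference pays a logarithmic number of probes, which only pays off if it avoids a disk read; with all buffers in memory there is nothing to save.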

On the other hand, the discussion around the custom-plan interface
affects this module because it uses this API as its foundation.
Please wait a few days while I rebase the cache_scan module onto
the newer custom-plan interface, which I submitted just a moment
ago.

Also, is it really necessary to tune the performance of this
example module for heap_page_prune_hook?
Even though I have a few ideas to improve the cache performance,
such as inserting multiple rows at once or copying chunks locally
instead of walking down the T-tree, I'm not sure whether that is
productive in the current v9.4 timeframe. ;-(

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Haribabu Kommi
> Sent: Wednesday, March 12, 2014 1:14 PM
> To: Kohei KaiGai
> Cc: Kaigai Kouhei(海外 浩平); Tom Lane; PgHacker; Robert Haas
> Subject: Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only
> table scan?)
>
> On Thu, Mar 6, 2014 at 10:15 PM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
> > 2014-03-06 18:17 GMT+09:00 Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>:
> >> I will update you later regarding the performance test results.
> >>
>
> I ran the performance test on the cache scan patch and below are the readings.
>
> Configuration:
>
> Shared_buffers - 512MB
> cache_scan.num_blocks - 600
> checkpoint_segments - 255
>
> Machine:
> OS - centos - 6.4
> CPU - 4 core 2.5 GHZ
> Memory - 4GB
>
>                  Head     patched   Diff
> Select - 500K    772ms    2659ms    -200%
> Insert - 400K    3429ms   1948ms    43% (I am not sure how it improved in this case)
> delete - 200K    2066ms   3978ms    -92%
> update - 200K    3915ms   5899ms    -50%
>
> This patch shows very well how the custom scan can be used, but the
> patch as it is has some performance problems which need to be
> investigated.
>
> I attached the test script file used for the performance test.
>
> Regards,
> Hari Babu
> Fujitsu Australia
