Re: Patch for fast gin cache performance improvement

From: "Etsuro Fujita" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: "'Ian Link'" <ian(at)ilink(dot)io>, "'pgsql-hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch for fast gin cache performance improvement
Date: 2013-09-26 13:02:53
Message-ID: 006801cebab8$b6352c90$229f85b0$@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Ian,

> This patch contains a performance improvement for the fast gin cache. As you
> may know, the performance of the fast gin cache decreases with its size.
> Currently, the size of the fast gin cache is tied to work_mem. The size of
> work_mem can often be quite high. The large size of work_mem is inappropriate
> for the fast gin cache size. Therefore, we created a separate cache size
called
> gin_fast_limit. This global variable controls the size of the fast gin cache,
> independently of work_mem. Currently, the default gin_fast_limit is set to
128kB.
> However, that value could need tweaking. 64kB may work better, but it's hard
> to say with only my single machine to test on.

> On my machine, this patch results in a nice speed up. Our test queries improve
> from about 0.9 ms to 0.030 ms. Please feel free to use the test case yourself:
> it should be attached. I can look into additional test cases (tsvectors) if
> anyone is interested.

> In addition to the global limit, we have provided a per-index limit:
> fast_cache_size. This per-index limit begins at -1, which means that it is
> disabled. If the user does not specify a per-index limit, the index will
simply
> use the global limit.

I had a look over this patch. I think this patch is interesting and very
useful. Here are my review points:

1. Patch applies cleanly.
2. make, make install and make check is good.
3. I did performance evaluation using your test queries with 64kB and 128kB of
gin_fast_limit (or fast_cache_size), and saw that both values achieved the
performance gains over gin_fast_limit = '256MB'. 64kB worked better than 128kB.
64kB improved from 1.057 ms to 0.075 ms. Great!
4. In my understanding, the small value of gin_fast_limit/fast_cache_size leads
to the increase in GIN search performance, which, however, leads to the decrease
in GIN update performance. Am I right? If so, I think the tradeoff should be
noted in the documentation.
5. The following documents in Chapter 57. GIN Indexes need to be updated:
* 57.3.1. GIN Fast Update Technique
* 57.4. GIN Tips and Tricks
6. I would like to see the results for the additional test cases (tsvectors).
7. The commented-out elog() code should be removed.
8. I think there are no issues in this patch. However, I have one question: how
this patch works in the case where gin_fast_limit/fast_cache_size = 0? In this
case, in my understanding, this patch inserts new entries into the pending list
temporarily and immediately moves them to the main GIN data structure using
ginInsertCleanup(). Am I right? If so, that is obviously inefficient.

Sorry for the delay.

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2013-09-26 13:27:13 Re: dynamic shared memory
Previous Message Bruce Momjian 2013-09-26 12:58:11 Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE