Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: HEAD seems to generate larger WAL regarding GIN index

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jesper Krogh <jesper(at)krogh(dot)cc>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: HEAD seems to generate larger WAL regarding GIN index
Date: 2014-10-30 10:30:52
Message-ID: 5452135C.5010706@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2014/10/09 11:49), Etsuro Fujita wrote:
> (2014/10/08 22:51), Fujii Masao wrote:
>> On Wed, Sep 24, 2014 at 11:10 AM, Etsuro Fujita
>> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>>> On Wed, Sep 10, 2014 at 10:37 PM, Alvaro Herrera
>>>> <alvherre(at)2ndquadrant(dot)com> wrote:
>>>>> Fujii Masao wrote:
>>>>>> On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita
>>>>>> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

>>>>>>> PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting.
>>>>>>> Wouldn't it be easy-to-use to have only one parameter,
>>>>>>> PENDING_LIST_CLEANUP_SIZE? How about setting
>>>>>>> PENDING_LIST_CLEANUP_SIZE
>>>>>>> to
>>>>>>> work_mem as the default value when running the CREATE INDEX command?

>>>>>> So what about introduing pending_list_cleanup_size also as GUC?
>>>>>> That is, users can set the threshold by using either the reloption or
>>>>>> GUC.

>>>>> Yes, I think having both a GUC and a reloption makes sense -- the GUC
>>>>> applies to all indexes, and can be tweaked for individual indexes
>>>>> using
>>>>> the reloption.

>>> OK, I'd vote for your idea of having both the GUC and the reloption.
>>> So, I
>>> think the patch needs to be updated. Fujii-san, what plan do you
>>> have about
>>> the patch?

>> Please see the attached patch. In this patch, I introduced the GUC
>> parameter,
>> pending_list_cleanup_size. I chose 4MB as the default value of the
>> parameter.
>> But do you have any better idea about that default value?

> It seems reasonable to me that the GUC has the same default value as
> work_mem. So, +1 for the default value of 4MB.

>> BTW, I moved the CommitFest entry of this patch to next CF 2014-10.

> OK, I'll review the patch in the CF.

Thank you for updating the patch! Here are my review comments.

* The patch applies cleanly and make and make check run successfully. I
think that the patch is mostly good.

* In src/backend/utils/misc/guc.c:
+ {
+ {"pending_list_cleanup_size", PGC_USERSET, CLIENT_CONN_STATEMENT,
+ gettext_noop("Sets the maximum size of the pending list for GIN
index."),
+ NULL,
+ GUC_UNIT_KB
+ },
+ &pending_list_cleanup_size,
+ 4096, 0, MAX_KILOBYTES,
+ NULL, NULL, NULL
+ },

ISTM it'd be better to use RESOURCES_MEM, not CLIENT_CONN_STATEMENT. No?
Also why not set min to 64, not to 0, in accoradance with that of
work_mem?

* In src/backend/utils/misc/postgresql.conf.sample:
Likewise, why not put this variable in the section of RESOURCE USAGE,
not in that of CLIENT CONNECTION DEFAULTS.

* In src/backend/access/common/reloptions.c:
+ {
+ {
+ "pending_list_cleanup_size",
+ "Maximum size of the pending list for this GIN index, in kilobytes.",
+ RELOPT_KIND_GIN
+ },
+ -1, 0, MAX_KILOBYTES
+ },

As in guc.c, why not set min to 64, not to 0?

* In src/include/access/gin.h:
/* GUC parameter */
extern PGDLLIMPORT int GinFuzzySearchLimit;
+ extern int pending_list_cleanup_size;

The comment should be "GUC parameters".

* In src/backend/access/gin/ginfast.c:
+ /* GUC parameter */
+ int pending_list_cleanup_size = 0;

Do we need to substitute 0 for pending_list_cleanup_size?

* In doc/src/sgml/config.sgml:
+ <varlistentry id="guc-pending-list-cleanup-size"
xreflabel="pending_list_cleanup_size">
+ <term><varname>pending_list_cleanup_size</varname>
(<type>integer</type>)

As in postgresql.conf.sample, ISTM it'd be better to explain this
variable in the section of Resource Consumption (maybe in "Memory"), not
in that of Client Connection Defaults.

* In doc/src/sgml/gin.sgml:
! <literal>pending_list_cleanup_size</>. To avoid fluctuations in
observed

ISTM it'd be better to use <varname> for pending_list_cleanup_size, not
<literal>, here.

* In doc/src/sgml/ref/create_index.sgml:
+ <term><literal>PENDING_LIST_CLEANUP_SIZE</></term>

IMHO, it seems to me better for this variable to be in lowercase in
accordance with the GUC version. Also, I think that the words "GIN
indexes accept a different parameter:" in the section of "Index Storage
Parameters" in this reference page would be "GIN indexes accept
different parameters:".

Sorry for the delay in reviewing the patch.

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-10-30 11:31:04 Re: Lockless StrategyGetBuffer() clock sweep
Previous Message Tomas Vondra 2014-10-30 10:29:39 Re: WIP: multivariate statistics / proof of concept