Re: GIN fast insert

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN fast insert
Date: 2009-02-19 13:36:48
Message-ID: 499D6070.3010103@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> and I still think that's bogus. It seems clear that this is going to
> change much faster than plans are going to be invalidated, and if
> autovacuum is doing its job, the pending list won't get long enough to
> matter much anyway, right? I don't think this patch should be
> touching gincostestimate at all.
Why not? For any estimation function it's possible to invent workload which will
change cost (or optimal plan) much faster than plan's invalidation.
Gincostestimate depends on statistic on table, not on real index's state, and
plan's invalidation will occur after analyze run.

> I am thinking that it is may not be necessary to remove the
> gingettuple interface (as you did in v0.28.2). Forcing a cleanup of
> the pending list seems like a reasonable workaround. We don't expect
> this situation to come up frequently, so if the method we use to
Agree

> handle it is not terribly efficient, oh well. The one thing that
> concerns me is - what will happen in a hot standby environment, when
> that patch is committed? In that situation, I believe that we can't
> call modify any heap or index pages, so...

I don't see a problems here, because indexes in postgres don't depend on any
transaction's ids or modes as heap depends. WAL-logger works without that
knowledge too. May be I missed something here or don't understand.

Although heap's pages could be changed by setting commit-status bits on tuple
even in read-only transaction, but that changes are not WAL-logged. That is
correct for index's page too.

> 1. The description of the "fastupdate" reloption should be reworded
> for consistency with other options:
> Enables "fast update" feature for this GIN index
Fixed

>
> 2. Why is this implemented as a string reloption rather than a boolean
> reloption? It seems like you want to model this on
> autovacuum_enabled.
Fixed

> 3. Documentation wordsmithing. You have the following paragraph:
Done

> 4. More wordsmithing. In the following paragraph, you have:
Done

> 5. In textsearch.sgml, you say that GIN indexes are moderately slower
> to update, but about 10x slower without fastupdate. Can we provide a
> real number in place of "moderately"? I don't know whether to think
> this means 20% or 2x.
Will make exact measurement some later :)

> 6. One of the comment changes in startScanEntry is simply a correction
> of a typographical error ("deletion" for "deletition"). You might as
> well commit this change separately and remove it from this patch.
Sorry, I don't find what you mean. I found only "begining" (and fixed it)

> 7. pg_stat_get_fresh_inserted_tuples. I am not crazy about the fact
> that we call this the pending list in some places, fast update in some
> places, and now, here, fresh tuples. Let's just call it fast insert
> tuples.
It's really inserted ("fresh") tuples in table since last vacuum. This number
doesn't depend on existence of GIN indexes or its modes.

> 8. tbm_check_tuple. The opening curly brace should be uncuddled. The
> curly braces around wordnum = bitnum = 0 are superfluous.
It's a copy/paste code from tbm_add_tuples, and I'd like
if
{
...
}
else
{
...
}

much more than
if
..
else
{

}

>
> 9. gincostestimate. There are a lot of superfluous whitespace changes
> here, and some debugging code that obviously wasn't fully removed.
Cleaned.

> 10. GinPageOpaqueData. Surely we can come up with a better name than
> GIN_LIST. This is yet another name for the same concept. Why not call
> this GIN_FAST_INSERT_LIST?
Like other defines here - GIN_DATA, not a GIN_DATA_PAGE, GIN_LEAF - not a
GIN_LEAF_TREE_PAGE.

>
> 11. ginInsertCleanup. "Inserion" is a typo.
fixed

> Unfortunately, I don't understand all of this patch well enough to
> give it as thorough a going-over as it deserves, so my apologies for
> whatever I've missed.
Thank you a lot for your reviewing.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
fast_insert_gin-0.29.1.gz application/x-tar 24.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message ohp 2009-02-19 14:20:08 Re: pg_restore new option -m
Previous Message Bruce Momjian 2009-02-19 13:20:03 Re: vacuumdb --freeze