Re: GIN fast insert

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN fast insert
Date: 2009-02-23 14:50:11
Message-ID: 603c8f070902230650p40657826x854f9bcb25196ee9@mail.gmail.com
Lists: pgsql-hackers

On Mon, Feb 23, 2009 at 4:56 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> It would be helpful if Heikki or Simon could jump in here, but my
>> understanding is that cleaning up the pending list is a read-write
>> operation. I don't think we can do that on a hot standby server.
>
> From reading the docs with the patch, the pending list is merged into the
> main index when a VACUUM is performed. I (think I) can see that
> additions to the pending list are WAL logged, so that will work in Hot
> Standby. I also see ginEntryInsert() calls during the move from pending
> list to main index which means that is WAL logged also. AFAICS this
> *could* work during Hot Standby mode.
> Best check here http://wiki.postgresql.org/wiki/Hot_Standby#Usage
> rather than attempt to read the patch.
> Teodor, can you confirm:
> * we WAL log the insert into the pending list
> * we WAL log the move from the pending list to the main index
> * that we maintain the pending list correctly during redo so that it can
> be accessed by index scans
>
> The main thing with Hot Standby is that we can't do any writes. So a
> pending list cannot change solely because of a gingettuple call on the
> *standby*.

That's what I thought. Thanks for confirming.

> But that's easy to disable. If all the inserts happened on
> the primary node and all the reads happened on the standby, then pending
> list would never be cleaned up if the cleanup is triggered only by read.

No, that wouldn't happen, because the inserts would trigger VACUUM on
the primary, and VACUUM merges the pending list into the main index.

> I would suggest that we trigger cleanup by read at threshold size X and
> trigger cleanup by insert at threshold size 5X. That avoids the strange
> case mentioned, but generally ensures only reads trigger cleanup. (But
> why do we want that??)

I think that's actually not what we want. What we want is for VACUUM
to deal with it. Unfortunately that's hard to guarantee since, for
example, someone might turn autovacuum off. So the issue is what do
we do when we're in the midst of an index scan and our TIDBitmap has
become lossy. Right now, the answer is that we clean up the pending
list from inside the index scan and then retry the index scan. I
don't think that's going to work.
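
To make the problem concrete, here is a toy Python model of the
clean-up-and-retry behavior described above. It is not PostgreSQL code:
the class, the function names, and the max_exact_tids capacity are all
invented for illustration, and the real TIDBitmap logic is more involved.
It only shows why a scan that has to write cannot run on a read-only
standby.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGinIndex:
    """Toy stand-in for a GIN index with a pending list (hypothetical)."""
    main: dict = field(default_factory=dict)     # key -> set of TIDs
    pending: list = field(default_factory=list)  # (key, tid) not yet merged
    read_only: bool = False                      # True models a hot standby

    def insert(self, key, tid):
        # New entries go to the pending list first (the deferred insert).
        self.pending.append((key, tid))

    def cleanup_pending(self):
        # Merging the pending list into the main index is a write.
        if self.read_only:
            raise RuntimeError("cannot write on a read-only standby")
        for key, tid in self.pending:
            self.main.setdefault(key, set()).add(tid)
        self.pending.clear()


def scan(index, key, max_exact_tids):
    """Return matching TIDs; clean up and retry if the bitmap would go lossy."""
    pending_hits = {tid for k, tid in index.pending if k == key}
    if len(pending_hits) > max_exact_tids:
        # The exact bitmap overflowed: merge the pending list, then retry
        # against the main index only.
        index.cleanup_pending()
        pending_hits = set()
    return sorted(index.main.get(key, set()) | pending_hits)
```

With read_only=False the overflowing scan succeeds after the merge; with
read_only=True the same scan raises, which is the hot standby case.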

I'm starting to think that the right thing to do here is to create a
non-lossy option for TIDBitmap. Tom has been advocating just losing
the index scan AM altogether, but that risks losing performance in
cases where a LIMIT would have stopped the scan well prior to
completion.
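
The LIMIT concern can be sketched in a few lines of Python. This is only
an illustration under simplifying assumptions: the function names are
hypothetical, and a plain list of integers stands in for the index
entries. A tuple-at-a-time scan can stop as soon as LIMIT is satisfied,
while a bitmap-style scan has to visit every match before it returns the
first one.

```python
from itertools import islice


def index_scan(postings, visited):
    """Yield TIDs one at a time, the way a gettuple-style scan would."""
    for tid in postings:
        visited.append(tid)  # record the work actually done
        yield tid


def bitmap_scan(postings, visited):
    """Collect every matching TID before returning any of them."""
    bitmap = set()
    for tid in postings:
        visited.append(tid)
        bitmap.add(tid)
    return iter(sorted(bitmap))


def with_limit(rows, limit):
    """Stop pulling rows once the limit is reached, like LIMIT does."""
    return list(islice(rows, limit))


postings = list(range(1000))

seen_by_index_scan = []
first_rows = with_limit(index_scan(postings, seen_by_index_scan), 3)

seen_by_bitmap_scan = []
same_rows = with_limit(bitmap_scan(postings, seen_by_bitmap_scan), 3)
```

Both calls return the same three rows, but the first visits 3 entries
and the second visits all 1000.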

> I found many parts of the patch and docs quite confusing because of the
> way things are named. For me, this is a deferred or delayed insert
> technique to allow batching. I would prefer if everything used one
> description, rather than "fast", "pending", "delayed" etc.

I mentioned this in my previous review (perhaps not quite so
articulately) and I completely agree with you. It's clear enough
reading the patch because you know that all the changes in the patch
must be related to each other, but once it's applied it's going to be
tough to figure out.

> Personally, I see ginInsertCleanup() as a scheduled task unrelated to
> vacuum. Making the deferred tasks happen at vacuum time is just a
> convenient way of having a background task occur regularly. That's OK
> for now, but I would like to be able to request a background task
> without having to hook into AV.

This has been discussed previously and I assume you will be submitting
a patch at some point, since no one else has volunteered to implement
it. I think autovacuum is the right way to handle this particular
case because it is a cleanup operation that is not dependent on time
but on write activity and hooks into more or less the same stats
infrastructure, but I don't deny the existence of other cases that
would benefit from a scheduler.

...Robert
