Re: Reducing relation locking overhead

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Gregory Maxwell <gmaxwell(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reducing relation locking overhead
Date: 2005-12-02 22:34:57
Message-ID: 1133562897.2906.673.camel@localhost.localdomain
Lists: pgsql-hackers

On Fri, 2005-12-02 at 19:04 -0300, Alvaro Herrera wrote:
> Gregory Maxwell wrote:
> > On 02 Dec 2005 15:25:58 -0500, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> > > I suspect this comes out of a very different storage model from Postgres's.
> > >
> > > Postgres would have no trouble building an index of the existing data using
> > > only shared locks. The problem is that any newly inserted (or updated) records
> > > could be missing from such an index.
> > >
> > > To do it you would then have to gather up all those newly inserted records.
> > > And of course while you're doing that new records could be inserted. And so
> > > on.

CREATE INDEX uses SnapshotAny, so the scan that feeds the build can
easily include rows added after the CREATE INDEX started. When the scan
is exhausted we could mark the last TID scanned and return to it after
the sort/build to pick up whatever arrived in the meantime.
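
To make that concrete, here is a rough sketch of the two-phase scan in
toy Python (purely illustrative, not backend code; the list-of-(tid,
key) heap model and all the names are invented for the example):

    # Toy model: the heap is a list of (tid, key) pairs in TID order.
    # Other sessions may append to it while the build runs.

    def initial_build(heap):
        index = {}                    # key -> tid, stand-in for the btree
        last_tid = 0                  # high-water mark of the first scan

        # Phase 1: scan everything currently in the heap
        # (SnapshotAny-style) and remember how far we got.
        for tid, key in list(heap):
            index[key] = tid
            last_tid = max(last_tid, tid)
        return index, last_tid

    def catch_up_once(index, heap, last_tid):
        # Phase 2: go back to the high-water mark and index only the
        # rows that appeared while phase 1 and the sort/build ran.
        for tid, key in list(heap):
            if tid > last_tid:
                index[key] = tid
                last_tid = tid
        return last_tid

The point is simply that the high-water mark lets the second pass touch
only what the first pass missed.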

> > > There's no guarantee it would ever finish, though I suppose you could
> > > detect the situation if the size of the new batch wasn't converging to 0 and
> > > throw an error.
> >
> > After you're mostly caught up, change locking behavior to block
> > further updates while the final catchup happens. This could be driven
> > by a heuristic that says make up to N attempts to catch up without
> > blocking, after that just take a lock and finish the job. Presumably
> > the catchup would be short compared to the rest of the work.
>
> The problem is that you need to upgrade the lock at the end of the
> operation. This is very deadlock prone, and likely to abort the whole
> operation just when it's going to finish. Is this a showstopper? Tom
> seems to think it is. I'm not sure anyone is going to be happy if they
> find that their two-day reindex was aborted just when it was going to
> finish.

If that is the only objection to such a seriously useful feature, then
we should look at making some exceptions (I do understand the lock
upgrade issue).

Greg has come up with an exceptional idea here, so can we look deeper?
We already know other systems have done it.

What types of statement would cause the index build to fail? How else
can we prevent them from executing while the index is being built?
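
For what it's worth, Greg's catch-up heuristic might look roughly like
this, in the same toy Python model as above (take_exclusive_lock and
release_lock are just placeholders for whatever the real lock upgrade
would be, and N is arbitrary):

    # Try up to N non-blocking catch-up passes; if the deltas do not
    # converge to zero, block further writes and finish the job.

    N_ATTEMPTS = 3

    def take_exclusive_lock():        # placeholder: the deadlock-prone step
        pass

    def release_lock():               # placeholder
        pass

    def finish_build(index, heap, last_tid):
        for _ in range(N_ATTEMPTS):
            new_rows = [(t, k) for t, k in heap if t > last_tid]
            if not new_rows:
                return last_tid       # converged without ever blocking
            for tid, key in new_rows:
                index[key] = tid
                last_tid = tid

        # Still not converged: take the blocking lock for the final pass.
        take_exclusive_lock()
        try:
            for tid, key in heap:
                if tid > last_tid:
                    index[key] = tid
                    last_tid = tid
        finally:
            release_lock()
        return last_tid

If the deltas shrink on each pass we never reach the blocking phase; if
they don't, we hit exactly the lock upgrade problem Alvaro describes
above.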

Best Regards, Simon Riggs
