Re: Reducing relation locking overhead

From: Hannu Krosing <hannu(at)skype(dot)net>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Gregory Maxwell <gmaxwell(at)gmail(dot)com>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reducing relation locking overhead
Date: 2005-12-08 09:58:50
Message-ID: 1134035930.3641.29.camel@localhost.localdomain
Lists: pgsql-hackers

On a fine day, Thu, 2005-12-08 at 01:08, Jim C. Nasby wrote:
> On Thu, Dec 08, 2005 at 08:57:42AM +0200, Hannu Krosing wrote:
> > On a fine day, Thu, 2005-12-08 at 00:16, Jim C. Nasby wrote:
> > > On Sat, Dec 03, 2005 at 10:15:25AM -0500, Greg Stark wrote:
> > > > Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> > > > > What's worse, once you have excluded writes you have to rescan the entire
> > > > > table to be sure you haven't missed anything. So in the scenarios where this
> > > > > whole thing is actually interesting, ie enormous tables, you're still
> > > > > talking about a fairly long interval with writes locked out. Maybe not as
> > > > > long as a complete REINDEX, but long.
> > > >
> > > > I was thinking you would set a flag to disable use of the FSM for
> > > > inserts/updates while the reindex was running. So you would know where to find
> > > > the new tuples, at the end of the table after the last tuple you read.
> > >
> > > What about keeping a separate list of new tuples? Obviously we'd only do
> > > this when an index was being built on a table.
> >
> > The problem with a separate list is that it can be huge. For example, on a
> > table with 200 inserts/updates per second, an index build lasting 6 hours
> > would accumulate a total of 6*3600*200 = 4320000 new tuples.
>
> Sure, but it's unlikely that such a table would be very wide, so 4.3M
> tuples would probably only amount to a few hundred MB of data. It's also
> possible that this list could be vacuumed by whatever the regular vacuum
> process is for the table.
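
The two estimates quoted above can be checked with a quick back-of-envelope
calculation. This is only a sketch: the function names are made up for
illustration, and the 100-byte tuple width is an assumed value chosen to match
the "not very wide" remark, not a figure from the thread.

```python
def accumulated_tuples(rate_per_sec, build_hours):
    """Number of tuples written while the index build is running."""
    return rate_per_sec * build_hours * 3600


def list_size_mb(n_tuples, bytes_per_tuple):
    """Approximate size of the side list in MB (1 MB = 10**6 bytes)."""
    return n_tuples * bytes_per_tuple / 1_000_000


# 200 inserts/updates per second over a 6-hour build, as in the thread.
n = accumulated_tuples(200, 6)
print(n)                     # 4320000

# ASSUMPTION: 100 bytes per tuple, purely illustrative.
print(list_size_mb(n, 100))  # 432.0
```

With that assumed width the list comes to roughly 430 MB, which is consistent
with "a few hundred MB"; wider rows scale the figure linearly.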

I think that keeping such a list as part of the table at a well-defined
location (like pages N to M) is the best strategy, as it will
automatically make all new tuples available to parallel processes and
avoid both duplicate storage and the need to change the
insert/update code.
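
To make the idea concrete, here is a toy model of it, not PostgreSQL code. The
class, the function names and the `append_only` flag are all hypothetical; the
flag stands in for Greg's "disable use of the FSM" suggestion, and the
"concurrent" inserts are simply replayed between the two scans rather than run
in parallel.

```python
class ToyTable:
    """Heap modeled as a list of fixed-capacity pages (lists of keys)."""

    def __init__(self, page_capacity=4):
        self.page_capacity = page_capacity
        self.pages = [[]]
        self.append_only = False  # hypothetical stand-in for "FSM disabled"

    def insert(self, key):
        if not self.append_only:
            # Normal mode: reuse free space in any page.
            for page in self.pages:
                if len(page) < self.page_capacity:
                    page.append(key)
                    return
        elif len(self.pages[-1]) < self.page_capacity:
            # Build mode: only the last page may receive new tuples.
            self.pages[-1].append(key)
            return
        self.pages.append([key])

    def begin_build(self):
        """Start a fresh page so every new tuple lands at page N or later."""
        self.append_only = True
        self.pages.append([])
        return len(self.pages) - 1

    def end_build(self):
        self.append_only = False


def scan(table, first_page, end_page, index):
    """Add every tuple in pages [first_page, end_page) to the index."""
    for page_no in range(first_page, end_page):
        for slot, key in enumerate(table.pages[page_no]):
            index[key] = (page_no, slot)


def build_index(table, concurrent_keys):
    n = table.begin_build()
    index = {}
    scan(table, 0, n, index)      # main pass over the pre-existing pages
    for key in concurrent_keys:   # writes arriving during the build
        table.insert(key)
    m = len(table.pages)
    scan(table, n, m, index)      # catch-up pass over pages N to M only
    table.end_build()
    return index, n, m


table = ToyTable(page_capacity=2)
for key in (1, 2, 3):
    table.insert(key)
index, n, m = build_index(table, [10, 11, 12])
print(n, m)           # 2 4
print(sorted(index))  # [1, 2, 3, 10, 11, 12]
```

The catch-up pass touches only the pages added since the build started, and
the insert path needs no separate bookkeeping. A real implementation would
still have to deal with writes that arrive during the catch-up pass itself,
which this sketch ignores.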

---------------
Hannu
