Re: old synchronized scan patch

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Hannu Krosing <hannu(at)skype(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Luke Lonergan <llonergan(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org, Eng <eng(at)intranet(dot)greenplum(dot)com>
Subject: Re: old synchronized scan patch
Date: 2006-12-06 22:27:36
Message-ID: 1165444056.2048.45.camel@dogma.v10.wvs
Lists: pgsql-hackers

On Wed, 2006-12-06 at 12:48 -0600, Jim C. Nasby wrote:
> On Tue, Dec 05, 2006 at 09:09:39AM -0800, Jeff Davis wrote:
> > That being said, I can just lock the hint table (the shared memory hint
> > table, not the relation) and only update the hint every K pages, as Neil
> > Conway suggested when I first proposed it. If we find a K small enough
> > so the feature is useful, but large enough that we're sure there won't
> > be contention, this might be a good option. However, I don't know that
> > we would eliminate the contention, because if K is a constant (rather
> > than random), the backends would still all want to update that shared
> > memory table at the same time.
>
> What about some algorithm where only one backend will update the hint
> entry (perhaps the first one, or the slowest one (ie: lowest page
> number))? ISTM that would eliminate a lot of contention, and if you get
> clever with the locking scheme you could probably allow other backends
> to do non-blocking reads except when the page number passes a 4-byte
> value (assuming 4-byte int updates are atomic).
>
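
A minimal sketch of the "update the hint every K pages" idea quoted above, under stated assumptions: the names HintTable, update_pages and peak_simultaneous_updates are hypothetical, and the backends are assumed to scan in lockstep, one page per step. It only illustrates why a constant K makes every backend hit the lock at the same page, while a per-backend offset spreads the updates out. It is not the patch's code.

```python
import threading
from collections import Counter


class HintTable:
    """Hypothetical stand-in for the shared-memory hint table (not the relation)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hints = {}

    def report(self, relation, page):
        # Every update takes the lock, so simultaneous updates contend here.
        with self._lock:
            self._hints[relation] = page

    def lookup(self, relation):
        # Returns None when no scan has reported a position yet.
        with self._lock:
            return self._hints.get(relation)


def update_pages(total_pages, k, offset=0):
    """Pages at which one backend reports: every k pages, shifted by offset."""
    return [p for p in range(total_pages) if p % k == offset % k]


def peak_simultaneous_updates(total_pages, k, offsets):
    """Largest number of backends that report at the same page.

    Assumes all backends scan in lockstep; offsets holds one entry per backend.
    """
    counts = Counter()
    for offset in offsets:
        counts.update(update_pages(total_pages, k, offset))
    return max(counts.values())


if __name__ == "__main__":
    # Constant K, no offset: all eight backends want the lock at once.
    print(peak_simultaneous_updates(1000, 16, [0] * 8))
    # Distinct offsets: at most one backend updates at any given page.
    print(peak_simultaneous_updates(1000, 16, list(range(8))))
```

In practice the offsets could be drawn at random per backend. Real scans also drift apart in speed, so the lockstep assumption overstates the collisions somewhat.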

If we have one backend in charge, how does it pass the torch when it
finishes the scan? I think you're headed back in the direction of an
independent "scanning" process. That's not unreasonable, but there are a
lot of other issues to deal with.

One thought of mine goes something like this: A scanner process starts
up and scans with a predefined length of a cache trail in the
shared_buffers, perhaps a chunk of buffers used like a circular list (so
it doesn't interfere with caching). When a new scan starts, it could
request a block from this scanner process and begin the scan there. If
the new scan keeps up with the scanner process, it will always be
getting cached data. If it falls behind, the request turns into a new
block request. In theory, the scan could actually catch back up to the
scanner process after falling behind.
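
A rough model of the scanner-process idea above, as a sketch only. ScannerTrail and FollowerScan are hypothetical names, positions are simple block counters, and "cached" just means "within the last trail_length blocks the scanner read". Nothing here touches shared_buffers or real I/O.

```python
class ScannerTrail:
    """Hypothetical scanner process that keeps a fixed-length trail of blocks."""

    def __init__(self, relation_blocks, trail_length):
        self.relation_blocks = relation_blocks
        self.trail_length = trail_length
        self.reads = 0  # total blocks the scanner has read so far

    def advance(self, n=1):
        # The scanner reads n more blocks; the oldest ones drop off the trail.
        self.reads += n

    def join_position(self):
        # A new scan begins at the block the scanner read most recently.
        return max(0, self.reads - 1)

    def is_cached(self, position):
        # True while the position is still inside the circular trail.
        return self.reads - self.trail_length <= position < self.reads

    def block_number(self, position):
        # Positions wrap around the relation, like a circular scan.
        return position % self.relation_blocks


class FollowerScan:
    """Hypothetical ordinary scan that starts where the scanner currently is."""

    def __init__(self, scanner):
        self.scanner = scanner
        self.start = scanner.join_position()
        self.position = self.start
        self.cache_hits = 0
        self.disk_reads = 0

    def read_next(self):
        # A position outside the trail turns into a new block request.
        hit = self.scanner.is_cached(self.position)
        if hit:
            self.cache_hits += 1
        else:
            self.disk_reads += 1
        self.position += 1
        return hit

    def done(self):
        # The scan is complete once it has covered every block once.
        return self.position - self.start >= self.scanner.relation_blocks


if __name__ == "__main__":
    scanner = ScannerTrail(relation_blocks=100, trail_length=8)
    scanner.advance(10)
    scan = FollowerScan(scanner)
    print(scan.read_next())   # inside the trail
    scanner.advance(20)       # the scanner runs ahead; the scan falls behind
    print(scan.read_next())   # outside the trail, so a new block request
```

If the follower then reads faster than the scanner advances, its position re-enters the trail and reads become cache hits again, which is the "catch back up" case.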

We could use a meaningful event (like activity on a particular relation)
to start/stop the scanner process.

It's just another idea, but I'm still not all that sure that
synchronization is necessary.

Does anyone happen to have an answer on whether OS-level readahead is
system-wide, or per-process? I expect that it's system-wide, but Tom
raised the issue and it may be a drawback if some OSs do per-process
readahead.

Regards,
Jeff Davis
