Re: idea for concurrent seqscans

From:	Jeff Davis <jdavis-pgsql(at)empires(dot)org>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: idea for concurrent seqscans
Date:	2005-02-25 17:36:40
Message-ID:	1109353000.4089.134.camel@jeff
Lists:	pgsql-hackers

On Fri, 2005-02-25 at 13:38 +0000, Simon Riggs wrote:
> On Fri, 2005-02-25 at 00:34 -0800, Jeff Davis wrote:
> > I had an idea that might improve parallel seqscans on the same relation.
> >
> > If you have lots of concurrent seqscans going on a large relation, the
> > cache hit ratio is very low. But, if the seqscans are concurrent on the
> > same relation, there may be something to gain by starting a seqscan near
> > the page being accessed by an already-in-progress seqscan, and wrapping
> > back around to that start location. That would make some use of the
> > shared buffers, which would otherwise just be cache pollution.
>
> This is cool and was on my list of would-like-to-implement features.
>
> It's usually known as Synchronised Scanning. AFAIK it is free of any
> patent restriction: it has already been implemented by both Teradata and
> RedBrick.
>
> > This is the first time I've really modified the PG source code to do
> > anything that looked promising, so this is more of a question than
> > anything else. Is it promising? Is this a potentially good approach? I'm
> > happy to post more test data and more documentation, and I'd also be
> > happy to bring the code to production quality.
>
> I'll be happy to help you do this, at least for design and code review.
>
> I'll come back later with more detailed comments on your thoughts so
> far.
>

Good to hear. I'll clean up the code and document some more tests. Two
questions come to mind right now:
(1) Do we care about reverse scans being done with synchronized
scanning? If so, is there a good way to know in advance whether it is
going to be a forward or reverse scan (i.e. before heap_getnext())?
(2) Where is the appropriate place to put the page location of an
in-progress scan? Are there other pieces of shared memory that aren't
disk buffers that I should be making use of?
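
To make question (2) concrete, here is a rough sketch of the wrap-around
page order, assuming a hypothetical per-relation hint that holds the block
most recently read by some in-progress scan. The names (ScanHint, BlockNum,
sync_scan_start, sync_scan_nth_block, sync_scan_report) are made up for
illustration and are not existing PostgreSQL APIs. A real version would keep
the hint in shared memory with suitable locking, which this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNum;      /* hypothetical stand-in for a block number */

/* Hypothetical per-relation hint. In a real backend this would live in
 * shared memory and need locking; both are omitted here. */
typedef struct ScanHint
{
    int      valid;             /* nonzero once a scan has reported a position */
    BlockNum last_block;        /* block most recently read by a running scan */
} ScanHint;

/* Pick the starting block for a new scan: join the in-progress scan if the
 * hint is usable, otherwise start at block 0 as a plain seqscan would. */
static BlockNum
sync_scan_start(const ScanHint *hint, BlockNum nblocks)
{
    if (hint->valid && hint->last_block < nblocks)
        return hint->last_block;
    return 0;
}

/* Return the i-th block visited (0 <= i < nblocks) by a scan that starts at
 * 'start' and wraps around, so every block is read exactly once. */
static BlockNum
sync_scan_nth_block(BlockNum start, BlockNum nblocks, BlockNum i)
{
    assert(nblocks > 0 && start < nblocks && i < nblocks);
    return (BlockNum) (((uint64_t) start + i) % nblocks);
}

/* Record progress so that scans starting later can begin nearby. */
static void
sync_scan_report(ScanHint *hint, BlockNum block)
{
    hint->last_block = block;
    hint->valid = 1;
}
```

With a 10-block relation and the hint at block 7, a new scan would read
blocks 7, 8, 9, 0, 1, ..., 6. This only covers forward scans; the reverse
case from question (1) is left open.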

> > However, before I spend
> > too much more time on that, I'd like to get a general response from a
> > 3rd party to let me know if I'm off base.
>
> Third party?
>

A 2nd party? Anyone else? That was a typo :)

Regards,
Jeff Davis

In response to

Re: idea for concurrent seqscans at 2005-02-25 13:38:50 from Simon Riggs

Responses

Re: idea for concurrent seqscans at 2005-02-25 17:54:45 from Tom Lane
