Re: [PATCHES] Proposed patch: synchronized_scanning GUC variable

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Neil Conway <neilc(at)samurai(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: [PATCHES] Proposed patch: synchronized_scanning GUC variable
Date:	2008-01-30 16:36:33
Message-ID:	1201710993.4453.92.camel@ebony.site
Lists:	pgsql-hackers pgsql-patches

On Mon, 2008-01-28 at 16:21 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > Rather than having a boolean GUC, we should have a number and make the
> > parameter "synchronised_scan_threshold".
>
> This would open up a can of worms I'd prefer not to touch, having to do
> with whether the buffer-access-strategy behavior should track that or
> not. As the note in heapam.c says,
>
> * If the table is large relative to NBuffers, use a bulk-read access
> * strategy and enable synchronized scanning (see syncscan.c). Although
> * the thresholds for these features could be different, we make them the
> * same so that there are only two behaviors to tune rather than four.
>
> It's a bit late in the cycle to be revisiting that choice. Now we do
> already have three behaviors to worry about (BAS on and syncscan off)
> but throwing in a randomly settable knob will take it back to four,
> and we have no idea how that fourth case will behave. The other tack we
> could take (having the one GUC variable control both thresholds) is
> not good since it will result in pg_dump trashing the buffer cache.

I'm still not very happy with any of the options here.

BAS is great if you didn't want to trash the cache, but it's also
annoying to people who really did want to load a large table into
cache. However we set it, we're going to have problems because not
everybody has the same database.

We're trying to guess which data is in memory and which is on disk and
then act accordingly. That question cannot be answered solely by how
big shared_buffers is. The answer really ought to depend on a combination
of (at least) shared_buffers and total database size. I think we must
either put some more intelligence into the setting of the threshold, or
give it to the user as a parameter, possibly as a parameter not
mentioned in the sample .conf.
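
To make the "give it to the user as a parameter" option concrete, here is a
minimal sketch. It is not PostgreSQL source: every name in it (ScanPolicy,
effective_threshold, choose_scan_policy, threshold_blocks) is hypothetical,
and the default of one quarter of the buffer pool is an assumption made for
illustration. It keeps the bulk-read strategy and synchronized scanning tied
to a single threshold, as the heapam.c note describes, and only lets that one
threshold be overridden.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types and names; an illustration, not PostgreSQL code. */
typedef struct ScanPolicy
{
    bool use_bulk_read;   /* use a bulk-read buffer access strategy (BAS) */
    bool use_sync_scan;   /* enable synchronized scanning */
} ScanPolicy;

/*
 * Return the table size (in blocks) above which a scan counts as "large".
 * threshold_blocks > 0 is a user override; otherwise the value is derived
 * from the buffer pool size. The nbuffers / 4 default is an assumption.
 */
static int64_t
effective_threshold(int64_t nbuffers, int64_t threshold_blocks)
{
    if (threshold_blocks > 0)
        return threshold_blocks;
    return nbuffers / 4;
}

/*
 * Both features follow the same threshold, so there are still only two
 * behaviors to reason about: "small table" and "large table".
 */
static ScanPolicy
choose_scan_policy(int64_t table_blocks, int64_t nbuffers,
                   int64_t threshold_blocks)
{
    ScanPolicy policy;
    bool       large;

    large = table_blocks > effective_threshold(nbuffers, threshold_blocks);
    policy.use_bulk_read = large;
    policy.use_sync_scan = large;
    return policy;
}
```

Under these assumptions, a 50,000-block table counts as large with 100,000
buffers (derived threshold 25,000) but no longer does with 400,000 buffers
(derived threshold 100,000), so raising shared_buffers silently changes how
that table is scanned. Calling choose_scan_policy(50000, 400000, 25000) pins
the earlier behavior regardless of the pool size.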

If we set the threshold unintelligently or in a way that cannot be
overridden, we will still get weird bug reports from people who set
shared_buffers higher and got a performance drop.

We need to make a final decision on this quickly, so I'll say no more on
this for 8.3 to help that process.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
