Re: Synchronized Scan update

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Synchronized Scan update
Date: 2007-03-05 18:39:34
Message-ID: 1173119974.13722.277.camel@dogma.v10.wvs
Lists: pgsql-hackers

On Sun, 2007-03-04 at 11:54 +0000, Simon Riggs wrote:
> > (2) sync_scan_offset: Start a new scan this many pages before a
> > currently running scan to take advantage of the pages
> > that are likely already in cache.
>
> I'm somewhat dubious about this parameter, I have to say, even though I
> am eager for this feature. It seems like a "magic" parameter that works
> only when we have the right knowledge to set it correctly.
>

That was my concern about this parameter also.

> How will we know what to default it to and how will we know whether to
> set it higher or lower for better performance? Does that value vary
> according to the workload on the system? How?
>

Perhaps people would only set this parameter when they know it will
help, and for more complex (or varied) usage patterns they'd set
sync_scan_offset to 0 to be safe.

My thinking on the subject (and this is only backed up by very basic
tests) is that there are basically two situations where setting this
parameter too high can hurt:
(1) The offset is too close to the limit of your physical memory, and
the scans end up diverging when they could have been kept together.
(2) You're using a lot of CPU and the backends aren't processing the
buffers as fast as your I/O system is delivering them. This will prevent
the scans from converging.

If your CPUs are well below capacity and you choose an offset
significantly smaller than your effective cache size, I don't think it
will hurt.
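
To make the offset idea concrete, here is a minimal sketch of how a new
scan's starting page could be derived from a running scan's position. It
is an illustration only, not code from the patch: the function name, its
arguments, and the wraparound and clamping behaviour are all assumptions.

```python
def sync_scan_start_page(current_page: int, offset_pages: int, table_pages: int) -> int:
    """Hypothetical helper: pick the page where a new scan would start.

    current_page  -- page the already-running scan is reading now
    offset_pages  -- stand-in for sync_scan_offset, in pages
    table_pages   -- total number of pages in the table
    """
    if table_pages <= 0:
        raise ValueError("table_pages must be positive")
    if not 0 <= current_page < table_pages:
        raise ValueError("current_page must lie inside the table")

    # Assumption: never start more than one full lap behind the running scan.
    offset = max(0, min(offset_pages, table_pages - 1))

    # Assumption: scans are circular, so stepping back past page 0 wraps
    # around to the end of the table.
    return (current_page - offset) % table_pages
```

With an offset of 0 the new scan simply joins at the running scan's
current page, which is the "safe" setting mentioned above.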

> I'm worried that we get a feature that works well on simple tests and
> not at all in real world circumstances. I don't want to cast doubt on
> what could be a great patch or be negative: I just see that the feature
> relies on the dynamic behaviour of the system. I'd like to see some
> further studies on how this works to make sure that we can realistically
> know how to set this knob, that it's the correct knob, and that it is
> the only one we need.

I will do some better tests on some better hardware this week and next
week. I hope that sheds some light.

> Further thoughts: It sounds like sync_scan_offset is related to
> effective_cache_size. Can you comment on whether that might be
> something we can use as well/instead? (i.e. set the scan offset to, say,
> K * effective_cache_size, 0.1 <= K <= 0.5)?
>
> Might we do roughly the same thing with sync_scan_threshold as well, and
> just have enable_sync_scan instead? i.e. sync_scan_threshold =
> effective_cache_size? When would those two parameters not be connected
> directly to each other?
>

Originally, these parameters were in terms of the effective_cache_size.
Somebody else convinced me that it was too confusing to have the
variables dependent on each other, so I made them independent. I don't
have a strong opinion either way.
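
For reference, the dependent form Simon describes could look roughly
like the sketch below. This is only an illustration of the proposal as
quoted: the function name, the default K of 0.25, and the use of pages
as the unit are assumptions, and nothing here reflects how the patch
currently computes the offset.

```python
def offset_from_effective_cache(effective_cache_pages: int, k: float = 0.25) -> int:
    """Hypothetical helper: derive a scan offset as K * effective_cache_size.

    effective_cache_pages -- effective_cache_size expressed in pages
    k                     -- fraction of the cache to use, 0.1 <= k <= 0.5
    """
    if effective_cache_pages < 0:
        raise ValueError("effective_cache_pages must be non-negative")
    if not 0.1 <= k <= 0.5:
        raise ValueError("k must be between 0.1 and 0.5")

    # Truncate to a whole number of pages.
    return int(k * effective_cache_pages)
```

The trade-off is the one discussed above: a single derived value means
fewer knobs to tune, at the cost of coupling the scan offset to another
setting.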

Regards,
Jeff Davis
