Re: Synchronized Scan update

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Synchronized Scan update
Date: 2007-03-13 00:46:17
Message-ID: 1173746777.23455.90.camel@dogma.v10.wvs
Lists: pgsql-hackers

On Mon, 2007-03-12 at 13:21 +0000, Simon Riggs wrote:
> So based on those thoughts, sync_scan_offset should be fixed at 16,
> rather than being variable. In addition, ss_report_loc() should only
> report its position every 16 blocks, rather than do this every time,
> which will reduce overhead of this call.

If we fix sync_scan_offset at 16, we might as well just get rid of it.
Sync scans are only useful on large tables, and getting a free 16 pages
over a scan isn't worth the trouble. However, even without
sync_scan_offset, sync scans are still a valuable feature.

I agree that ss_report_loc() doesn't need to report on every call, and
if there's any significant overhead it should report less often. Do you
think the overhead is significant for such a simple function?
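
To make the idea concrete, here is a minimal sketch of reporting only
every 16 blocks. It is not the code from the patch: ScanHint,
REPORT_INTERVAL and report_loc_throttled() are hypothetical names, and
the real ss_report_loc() would write to shared memory rather than to a
plain struct.

```c
#include <stdint.h>

/* Assumed interval, taken from the "every 16 blocks" suggestion. */
#define REPORT_INTERVAL 16

/* Hypothetical stand-in for the shared per-table scan position hint. */
typedef struct
{
    uint32_t last_reported;  /* block number most recently published */
    unsigned report_count;   /* how many times the hint was written */
} ScanHint;

/*
 * Called once per block read. Publishes the position only when the
 * block number is a multiple of REPORT_INTERVAL, so most calls return
 * without touching the hint.
 */
static void
report_loc_throttled(ScanHint *hint, uint32_t blockno)
{
    if (blockno % REPORT_INTERVAL != 0)
        return;

    hint->last_reported = blockno;
    hint->report_count++;
}
```

With this shape the hint can be up to REPORT_INTERVAL - 1 blocks stale,
which is the price of the reduced write traffic.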

>
> To match that, scan_recycle_buffers should be fixed at 32. So GUCs for
> sync_scan_offset and scan_recycle_buffers would not be required at all.
>
> IMHO we can also remove sync_scan_threshold and just use NBuffers
> instead. That way we get the benefit of both patches or neither, making
> it easier to understand what's going on.

I like the idea of reducing tuning parameters, but we should, at a
minimum, still allow an on/off button for sync scans. My tests revealed
that the wrong combination of OS/FS/IO-Scheduler/Controller could result
in bad I/O behavior.

> If need be, the value of scan_recycle_buffers can be varied upwards
> should the scans drift apart, as a way of bringing them back together.

If the scans aren't being brought together, that means that one of the
scans is CPU bound or outside the combined cache trail (shared_buffers
+ OS buffer cache).

> We aren't tracking whether they are together or apart, so I would like
> to see some debug output from synch scans to allow us to assess how far
> behind the second scan is as it progresses. e.g.
> LOG: synch scan currently on block N, trailing pathfinder by M blocks
> issued every 128 blocks as we go through the scans.
>
> Thoughts?
>

It's hard to track where all the scans are currently. One of the
advantages of my patch is its simplicity: the scans don't need to know
about other specific scans, and there is no concept in the code of a
"head" scan or a "pack".
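
As a rough illustration of that design (a sketch under my own
assumptions, not the patch itself): each scan starts at whatever block
the shared hint currently names, overwrites the hint as it goes, and
wraps around until it has read every block once. TableScanHint, Scan,
scan_begin() and scan_next() are hypothetical names, and locking is
left out.

```c
#include <stdint.h>

/* Hypothetical shared state: one block number per table, nothing else. */
typedef struct
{
    uint32_t hint_block;
} TableScanHint;

/* Hypothetical per-backend scan state. */
typedef struct
{
    uint32_t nblocks;   /* table size in blocks */
    uint32_t current;   /* next block to read */
    uint32_t scanned;   /* blocks read so far */
} Scan;

/* Start at the hinted block instead of block 0. */
static void
scan_begin(Scan *s, const TableScanHint *h, uint32_t nblocks)
{
    s->nblocks = nblocks;
    s->current = nblocks ? h->hint_block % nblocks : 0;
    s->scanned = 0;
}

/*
 * Fetch the next block number. Returns 1 and sets *blockno while
 * blocks remain, 0 once every block has been visited exactly once.
 * The scan publishes its position without knowing who else is reading.
 */
static int
scan_next(Scan *s, TableScanHint *h, uint32_t *blockno)
{
    if (s->scanned == s->nblocks)
        return 0;

    *blockno = s->current;
    h->hint_block = s->current;                 /* last writer wins */
    s->current = (s->current + 1) % s->nblocks; /* wrap at end of table */
    s->scanned++;
    return 1;
}
```

Because the only shared state is a single block number, a backend that
dies mid-scan leaves behind nothing worse than a stale hint.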

There is no easy way to tell which scan is ahead and which is behind.
There was a discussion when I submitted this proposal at the beginning
of 8.3, but I didn't see enough benefit to justify all of the costs and
risks associated with scans communicating between each other. I
certainly can't implement that kind of thing before feature freeze, and
I think there's a risk of lock contention for the communication
required. I'm also concerned that -- if the scans are too
interdependent -- it would make postgres less robust against the
disappearance of a single backend (i.e. what if the backend that is
leading a scan dies?).

Regards,
Jeff Davis
