synchronize_seqscans' description is a bit misleading

From: Gurjeet Singh <gurjeet(at)singh(dot)im>
To: PostgreSQL Docs <pgsql-docs(at)postgresql(dot)org>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: synchronize_seqscans' description is a bit misleading
Date: 2013-04-11 01:57:06
Message-ID: CABwTF4VwxS+jjT2RZSzHny5LArW+jFjFn5uiGH8cTRCXETGNag@mail.gmail.com
Lists: pgsql-docs pgsql-hackers

If I'm reading the code right [1], this GUC does not actually *synchronize*
the scans, but instead just makes sure that a new scan starts from a block
that was reported by some other backend performing a scan on the same
relation.
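
To make that concrete, here is a toy Python sketch of the behaviour as I understand it. It is not the code in heapam.c: every name in it (scan_position_hint, report_position, choose_start_block, scan_order) is made up for illustration, and the wrap-around is my assumption about how a scan that starts mid-table still visits every block.

```python
# Toy model only: all names here are hypothetical, not PostgreSQL's API.
scan_position_hint = {}  # relation name -> last block reported by any scan


def report_position(relation, block):
    """A running scan publishes the block it is currently reading."""
    scan_position_hint[relation] = block


def choose_start_block(relation):
    """A new scan starts at the last reported block, or at 0 if none."""
    return scan_position_hint.get(relation, 0)


def scan_order(relation, nblocks):
    """Blocks visited by a new scan: begin at the hint, wrap around once."""
    start = choose_start_block(relation)
    return [(start + i) % nblocks for i in range(nblocks)]


# Some backend is currently at block 6 of a 10-block table.
report_position("original_table", 6)
print(scan_order("original_table", 10))  # [6, 7, 8, 9, 0, 1, 2, 3, 4, 5]
```

Note that the hint is consulted only once, when the scan starts. Nothing in this model keeps the two scans together afterwards.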

Since the backends scanning the relation may be processing the relation at
different speeds, even though each one took the hint when starting the
scan, they may end up being out of sync with each other. Even in a single
query, there may be different scan nodes scanning different parts of the
same relation, and even they don't synchronize with each other (and for
good reason).
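
A rough way to see the effect of that drift is the simulation below. It is only a sketch under loud assumptions: the scans advance a fixed number of blocks per tick, and the cache is a tiny LRU that I made up for the example, not a model of PostgreSQL's buffer manager. The function name and parameters are hypothetical too.

```python
from collections import OrderedDict


def count_disk_reads(nblocks, rates, cache_size):
    """Count cache misses when several scans start together at block 0.

    rates[i] is how many blocks scan i processes per tick (an assumption
    standing in for "different processing speeds").
    """
    cache = OrderedDict()           # toy LRU cache: block number -> True
    positions = [0] * len(rates)    # next block each scan will read
    disk_reads = 0
    while any(p < nblocks for p in positions):
        for i, rate in enumerate(rates):
            for _ in range(rate):
                if positions[i] >= nblocks:
                    break           # this scan has finished the table
                block = positions[i]
                if block in cache:
                    cache.move_to_end(block)       # hit: shared I/O
                else:
                    disk_reads += 1                # miss: goes to disk
                    cache[block] = True
                    if len(cache) > cache_size:
                        cache.popitem(last=False)  # evict least recent
                positions[i] += 1
    return disk_reads


same_speed = count_disk_reads(1000, rates=[1, 1], cache_size=16)
different_speed = count_disk_reads(1000, rates=[2, 1], cache_size=16)
print(same_speed, different_speed)
```

In this toy model two scans running at the same speed read each block from disk exactly once, while a faster scan paired with a slower one causes more disk reads, because the slower scan falls further behind than the cache can cover.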

Assuming that all scans on a table are always synchronized may lead some to
wrongly believe that adding more backends scanning the same table will not
incur any extra I/O; that is, only one stream of blocks will be read from
disk no matter how many backends you add to the mix. I noticed this while
creating partition tables, each one with a CREATE TABLE AS SELECT FROM
original_table (to avoid WAL generation). Running more than 3 such
transactions caused the disk read throughput to behave unpredictably,
sometimes even dipping below 1 MB/s for a few seconds at a stretch.

Please note that I am not complaining about the implementation, which I
think is the best we can do without making backends wait for each other.
It's just that the documentation [2] implies that the scans are
synchronized through the entire run, which is clearly not the case. So I'd
like the docs to be improved to reflect that.

How about something like:

<doc>
synchronize_seqscans (boolean)
This allows sequential scans of large tables to start from a point in
the table that is already being read by another backend. This increases the
probability that concurrent scans read the same block at about the same
time and hence share the I/O workload. Note that, due to the difference in
speeds of processing the table, the backends may eventually get out of
sync, and hence stop sharing the I/O workload.

When this is enabled, ... The default is on.
</doc>

Best regards,

[1] src/backend/access/heap/heapam.c
[2] http://www.postgresql.org/docs/9.2/static/runtime-config-compatible.html#GUC-SYNCHRONIZE-SEQSCANS

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.
