Re: Parallel Seq Scan

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, John Gorman <johngorman2(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-01-19 13:20:36
Message-ID: CA+TgmoZN66-m9+DL_BR4b0Z1tYPi7nZQiJ+Nmtq-d2u9E0H9wQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 19, 2015 at 2:24 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Okay, I think I got the idea of what you want to achieve via
> prefetching. So assuming prefetch_distance = 100 and
> prefetch_increment = 50 (prefetch_distance / 2), it seems to me
> that as soon as there are fewer than 100 blocks in the prefetch
> quota, it will fetch the next 50 blocks, which means the system
> will always be approximately 50 blocks ahead. That will ensure
> that this algorithm always performs a sequential scan; however,
> it eventually turns into a system where one worker reads from
> disk and the other workers read from OS buffers into shared
> buffers and then fetch the tuples. The only downside I can see
> in this approach is that there could be times during execution
> when some or all workers have to wait on the worker doing the
> prefetching, but I think we should try this approach and see how
> it works.
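
If I'm reading that correctly, the bookkeeping you're describing would
look roughly like the sketch below. This is purely illustrative -- none
of these names exist in the patch, and I'm assuming PrefetchBuffer() as
the prefetch primitive:

#include "postgres.h"

#include "storage/bufmgr.h"
#include "utils/rel.h"

/* Illustrative knobs only; prefetch_distance could become a GUC. */
static int  prefetch_distance = 100;
static int  prefetch_increment = 50;    /* prefetch_distance / 2 */

/*
 * Run by whichever worker currently holds the prefetch role, each time
 * it advances the shared scan position.
 */
static void
maybe_prefetch_ahead(Relation rel, BlockNumber scan_pos,
                     BlockNumber nblocks, BlockNumber *prefetch_target)
{
    /* Blocks already requested ahead of the current scan position. */
    BlockNumber quota = (*prefetch_target > scan_pos) ?
        *prefetch_target - scan_pos : 0;

    if (quota < (BlockNumber) prefetch_distance)
    {
        BlockNumber limit = Min(*prefetch_target + prefetch_increment,
                                nblocks);
        BlockNumber blkno;

        for (blkno = *prefetch_target; blkno < limit; blkno++)
            PrefetchBuffer(rel, MAIN_FORKNUM, blkno);

        *prefetch_target = limit;
    }
}

So only the worker holding the prefetch role issues the advisory reads,
and everyone else should mostly be pulling blocks the OS has already
read, which matches what you describe.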

Right. We probably want to make prefetch_distance a GUC. After all,
we currently rely on the operating system for prefetching, and the
operating system has a setting for this, at least on Linux (blockdev
--getra). It's possible, however, that we don't need this at all,
because the OS might be smart enough to figure it out for us. It's
probably worth testing, though.
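
If we do add it, I'd expect it to be just another entry in guc.c's
ConfigureNamesInt table, something along these lines (the name, group,
and default here are invented for illustration):

    {
        {"prefetch_distance", PGC_USERSET, RESOURCES_ASYNCHRONOUS,
            gettext_noop("Number of blocks a parallel sequential scan "
                         "tries to stay prefetched ahead of the workers."),
            NULL
        },
        &prefetch_distance,
        100, 0, INT_MAX,
        NULL, NULL, NULL
    },

That would put it right next to effective_io_concurrency, which is the
closest knob we have today.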

> Another thing is that I think prefetching is not supported on all
> platforms (Windows), and on such systems, as per the above algorithm,
> we would need to rely on the block-by-block method.

Well, I think we should try to set up a test to see if this is hurting
us. First, do a sequential scan of a relation at least twice
as large as RAM. Then, do a parallel sequential scan of the same
relation with 2 workers. Repeat these in alternation several times.
If the operating system is accomplishing meaningful readahead, and the
parallel sequential scan is breaking it, then since the test is
I/O-bound I would expect to see the parallel scan actually being
slower than the normal way.
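
To be concrete, I'm thinking of something like the harness below. It's
only a sketch: it assumes a table called big_t that is at least twice
the size of RAM, and it assumes the patch's GUC for the number of
workers is called parallel_seqscan_degree -- substitute whatever the
patch actually exposes.

/* build with: cc pscan_test.c -lpq */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <libpq-fe.h>

static double
run_query(PGconn *conn, const char *sql)
{
    struct timespec t0, t1;
    PGresult   *res;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    res = PQexec(conn, sql);
    if (PQresultStatus(res) != PGRES_TUPLES_OK &&
        PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        exit(1);
    }
    PQclear(res);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int
main(void)
{
    /* Connection parameters come from the usual PG* environment variables. */
    PGconn     *conn = PQconnectdb("");
    int         i;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* Alternate serial and 2-worker parallel scans several times. */
    for (i = 0; i < 5; i++)
    {
        run_query(conn, "SET parallel_seqscan_degree = 0");  /* assumed GUC */
        printf("serial:   %.1f s\n",
               run_query(conn, "SELECT count(*) FROM big_t"));

        run_query(conn, "SET parallel_seqscan_degree = 2");
        printf("parallel: %.1f s\n",
               run_query(conn, "SELECT count(*) FROM big_t"));
    }

    PQfinish(conn);
    return 0;
}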

Or perhaps there is some other test that would be better (ideas
welcome), but the point is that we may need something like this, and
we should try to figure out whether we do before spending too much
time on it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
