Re: Parallel Seq Scan vs kernel read ahead

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan vs kernel read ahead
Date: 2020-05-22 18:30:20
Message-ID: CA+TgmoZ6u4w5D19QVbNM4kqG+oJ4jyFwnCs5mg_d+h9yhkpHwQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 19, 2020 at 10:23 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Good experiment. IIRC, we have discussed a similar idea during the
> development of this feature but we haven't seen any better results by
> allocating in ranges on the systems we have tried. So, we went with
> the current approach which is more granular and seems to allow better
> parallelism. I feel we need to ensure that we don't regress
> parallelism in existing cases, otherwise, the idea sounds promising to
> me.

I think there's a significant difference. The idea I remember being
discussed at the time was to divide the relation into equal parts at
the very start and give one part to each worker. I think that carries
a lot of risk of some workers finishing much sooner than others. This
idea, AIUI, is to divide the relation into chunks that are small
compared to the size of the relation, but larger than 1 block. That
carries some risk of an unequal division of work, as has already been
noted, but it's much less, especially if we use smaller chunk sizes
once we get close to the end, as proposed here.
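
To make the distinction concrete, here is a rough sketch of the chunked
scheme as described above. It is not the patch under discussion: the
function name, the max_chunk default, and the rule used to shrink chunks
near the end (remaining blocks divided by twice the worker count) are all
assumptions chosen for illustration.

```python
def chunk_ranges(nblocks, nworkers, max_chunk=64):
    """Yield (start_block, length) ranges covering blocks 0..nblocks-1.

    Hypothetical allocator: each call by a worker would take the next
    range. Chunks are larger than one block but small relative to the
    relation, and they shrink as the scan approaches the end.
    """
    next_block = 0
    while next_block < nblocks:
        remaining = nblocks - next_block
        # Assumed ramp-down rule: never hand out more than max_chunk
        # blocks, and near the end hand out progressively smaller
        # ranges (down to a single block) so no worker is left with a
        # large range while the others have nothing to do.
        size = max(1, min(max_chunk, remaining // (2 * nworkers)))
        yield next_block, size
        next_block += size


if __name__ == "__main__":
    for start, length in chunk_ranges(1000, nworkers=4):
        print(start, length)
```

By contrast, dividing the relation into equal parts up front would amount
to a single range of roughly nblocks / nworkers blocks per worker, with no
way to rebalance if one worker falls behind.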

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
