Re: Parallel Seq Scan vs kernel read ahead

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan vs kernel read ahead
Date: 2020-07-15 03:52:17
Message-ID: CAApHDvrpTvn6Cu6oicpjQPLUTVjzgOfJJC4VG=zdnTyZ2Oe_Rg@mail.gmail.com
Lists: pgsql-hackers

On Wed, 15 Jul 2020 at 14:51, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Jul 15, 2020 at 5:55 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > If we've not seen any performance regressions within 1 week, then I
> > propose that we (pending final review) push this to allow wider
> > testing.
>
> I think Soumyadeep has reported a regression case [1] with the earlier
> version of the patch. I am not sure if we have verified that the
> situation improves with the latest version of the patch. I request
> Soumyadeep to please try once with the latest patch.

Yeah, it would be good to see some more data points on that test.
Jumping from 2 up to 6 workers just leaves us to guess where the
performance started to become bad. It would be good to know if the
regression is repeatable or if it was affected by some other process.

I see the disk type on that report was Google PersistentDisk. I don't
pretend to be any sort of expert on network filesystems, but I guess a
regression would be possible in that test case if, say, there was an
additional caching layer of very limited size between the kernel
cache and the disks, maybe on a remote machine. If that layer were
doing some sort of prefetching to try to reduce latency and requests
to the actual disks, then perhaps going up to 6 workers with a chunk
size of 64 (as Thomas' patch used at that time) caused more misses in
that cache due to the requests exceeding what had already been
prefetched.
That's just a stab in the dark. Maybe someone with knowledge of these
network file systems can come up with a better theory.

It would be good to see EXPLAIN (ANALYZE, BUFFERS) with SET
track_io_timing = on; for each value of max_parallel_workers.
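
As a rough sketch of how such a test run could be scripted, the
following Python helper prints a psql script that repeats one query
under each worker limit. The helper name build_script and the table
name t_heap are hypothetical, not taken from the report. Setting
max_parallel_workers_per_gather alongside max_parallel_workers is an
assumption on my part, since the per-gather limit also caps how many
workers a single scan can use.

```python
# Hypothetical helper: build a psql script that runs the same query
# once per parallel worker limit, with I/O timing enabled.
def build_script(query, worker_counts):
    # track_io_timing normally needs superuser rights to change.
    lines = ["SET track_io_timing = on;"]
    for n in worker_counts:
        lines.append(f"SET max_parallel_workers = {n};")
        # Assumption: raise the per-gather cap too, otherwise its
        # default would limit the number of workers actually launched.
        lines.append(f"SET max_parallel_workers_per_gather = {n};")
        lines.append(f"EXPLAIN (ANALYZE, BUFFERS) {query};")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # t_heap is a placeholder table name; substitute the test table.
    print(build_script("SELECT count(*) FROM t_heap", range(0, 7)), end="")
```

The output can be saved to a file and fed to psql with -f, which
gives one plan per worker count instead of only the 2 and 6 worker
data points.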

David

> [1] - https://www.postgresql.org/message-id/CADwEdoqirzK3H8bB%3DxyJ192EZCNwGfcCa_WJ5GHVM7Sv8oenuA%40mail.gmail.com
