From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan vs kernel read ahead
Date: 2020-07-17 09:18:17
Message-ID: CAA4eK1+ZX9OL=NVSFZ4L5ADxNLmXsKbH0Oso40sdGcqE1yTnHw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 17, 2020 at 11:35 AM k(dot)jamison(at)fujitsu(dot)com
<k(dot)jamison(at)fujitsu(dot)com> wrote:
>
> On Wednesday, July 15, 2020 12:52 PM (GMT+9), David Rowley wrote:
>
> >On Wed, 15 Jul 2020 at 14:51, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >>
> >> On Wed, Jul 15, 2020 at 5:55 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> >>> If we've not seen any performance regressions within 1 week, then I
> >>> propose that we (pending final review) push this to allow wider
> >>> testing.
> >>
> >> I think Soumyadeep has reported a regression case [1] with the earlier
> >> version of the patch. I am not sure if we have verified that the
> >> situation improves with the latest version of the patch. I request
> >> Soumyadeep to please try once with the latest patch.
> >...
> >Yeah, it would be good to see some more data points on that test.
> >Jumping from 2 up to 6 workers just leaves us to guess where the
> >performance started to become bad. It would be good to know if the
> >regression is repeatable or if it was affected by some other process.
> >...
> >It would be good to see EXPLAIN (ANALYZE, BUFFERS) with SET
> >track_io_timing = on; for each value of max_parallel_workers.
>
> Hi,
>
> If I'm following the thread correctly, we may see gains from this patch
> of Thomas and David, but we need to test its effects on different
> filesystems. David has also clarified through benchmark tests that
> synchronize_seqscans is not affected as long as the parallel scan's
> chunk size is capped at 8192.
>
> I also agree that having control over this through a GUC could be
> beneficial for users; however, that can be discussed in another thread
> or in future development.
>
> David Rowley wrote:
> >I'd like to propose that if anyone wants to do further testing on
> >other operating systems with SSDs or HDDs then it would be good if
> >that could be done within a 1 week from this email. There are various
> >benchmarking ideas on this thread for inspiration.
>
> I'd like to join in testing it, in this case using an HDD; the results
> are at the bottom. Due to my machine's limitations, I only tested
> 0~6 workers; even if I increase max_parallel_workers_per_gather beyond
> that, the query planner still caps the workers at 6. I also set
> track_io_timing to on, as per David's recommendation.
>
> Tested on:
> XFS filesystem, HDD virtual machine
> RHEL4, 64-bit,
> 4 CPUs, Intel Core Processor (Haswell, IBRS)
> PostgreSQL 14devel on x86_64-pc-linux-gnu
>
>
> ----Test Case (Soumyadeep's) [1]
>
> shared_buffers = 32MB (to use OS page cache)
>
> create table t_heap as select generate_series(1, 100000000) i; --about 3.4GB size
>
> SET track_io_timing = on;
>
> \timing
>
> set max_parallel_workers_per_gather = 0; --0 to 6
>
> SELECT count(*) from t_heap;
> EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) from t_heap;
>
> [Summary]
> I used the same query from the thread. However, the SQL query execution
> time and query planner execution time do not vary much between master
> and patched.
> OTOH, in terms of I/O stats, I observed a similar regression in both
> master and patched as we increase max_parallel_workers_per_gather.
>
> It could also be possible that each benchmark run for a given
> max_parallel_workers_per_gather setting is affected by the previous
> one. IOW, later benchmark runs benefit from data cached at the OS
> level by previous runs.
>

Yeah, I think to some extent that is visible in the results because,
after the patch, the execution time is reduced significantly at 0
workers, whereas there is not much difference at the other worker
counts. For the non-parallel case (0 workers), there shouldn't be any
difference. Also, I am not sure why the number of shared hits improves
after the patch; probably due to caching effects?
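To compare the shared-hit figures across runs, the numbers can be pulled
straight out of the EXPLAIN (ANALYZE, BUFFERS) text; a small sketch (the
helper name and the psql invocation are my own for illustration, not
something from the thread):

```shell
# Hypothetical helper: sum every "shared hit=N" figure from
# EXPLAIN (ANALYZE, BUFFERS) output read on stdin.
shared_hits() {
    grep -o 'shared hit=[0-9]*' | cut -d= -f2 | awk '{s += $1} END {print s + 0}'
}

# Example (assumes a running server and the t_heap table from the test case):
#   psql -qAt -c "EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM t_heap;" \
#       | shared_hits
```

Comparing that single number per run, per worker count, makes it easier
to see whether master and patched really differ or whether the OS cache
is warming up between runs.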

> Any advice?

I think recreating the database and restarting the server after each
run might help in getting consistent results. Also, you might want to
take the median of three runs.
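A rough driver along those lines could look like this; the paths, the
drop_caches step, and the helper function are assumptions for
illustration, not something prescribed in the thread:

```shell
# Print the middle value of three numeric timings, for the
# "median of three runs" advice.
median_of_three() {
    printf '%s\n' "$1" "$2" "$3" | sort -n | sed -n '2p'
}

# Rough per-setting loop (assumed paths; requires a running cluster):
#   for w in 0 1 2 3 4 5 6; do
#       for run in 1 2 3; do
#           pg_ctl -D "$PGDATA" restart                       # cold shared buffers
#           sync; echo 3 | sudo tee /proc/sys/vm/drop_caches  # cold OS page cache
#           # ... run the EXPLAIN (ANALYZE, BUFFERS) query, record its time
#       done
#       # median_of_three "$t1" "$t2" "$t3"
#   done
```

Dropping the OS page cache between runs should also address the concern
above that later runs benefit from data cached by earlier ones.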

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
