Re: Parallel Seq Scan vs kernel read ahead

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan vs kernel read ahead
Date: 2020-07-15 02:51:00
Message-ID: CAA4eK1JR_SnWZs0AiJO2=x2z59n_Cj0bJbLN54EKzCun9OB78A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 15, 2020 at 5:55 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Tue, 14 Jul 2020 at 19:13, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> >
> > On Fri, Jun 26, 2020 at 3:33 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > > On Tue, Jun 23, 2020 at 11:53 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > > > In summary, based on these tests, I don't think we're making anything
> > > > worse in regards to synchronize_seqscans if we cap the maximum number
> > > > of blocks to allocate to each worker at once to 8192. Perhaps there's
> > > > some argument for using something smaller than that for servers with
> > > > very little RAM, but I don't personally think so as it still depends
> > > > on the table size and it's hard to imagine tables in the hundreds of
> > > > GBs on servers that struggle with chunk allocations of 16MB. The
> > > > table needs to be at least ~70GB to get an 8192 chunk size with the
> > > > current v2 patch settings.
> > >
> > > Nice research. That makes me happy. I had a feeling the maximum useful
> > > chunk size ought to be more in this range than the larger values we
> > > were discussing before, but I didn't even think about the effect on
> > > synchronized scans.
> >
> > +1. This seems about right to me. We can always reopen the
> > discussion if someone shows up with evidence in favour of a tweak to
> > the formula, but this seems to address the basic problem pretty well,
> > and also fits nicely with future plans for AIO and DIO.
>
> Thank you both for having a look at the results.
>
> I'm now pretty happy with this too. I do understand that we've not
> exactly exhaustively tested all our supported operating systems.
> However, we've seen some great speedups with Windows 10 and Linux with
> SSDs. Thomas saw great speedups with FreeBSD with the original patch
> using chunk sizes of 64 blocks. (I wonder if it's worth verifying that
> it increases further with the latest patch with the same test you did
> in the original email on this thread?)
>
> I'd like to propose that if anyone wants to do further testing on
> other operating systems with SSDs or HDDs then it would be good if
> that could be done within 1 week from this email. There are various
> benchmarking ideas on this thread for inspiration.
>

Yeah, I agree that further testing on other operating systems within
that week would be good.

> If we've not seen any performance regressions within 1 week, then I
> propose that we (pending final review) push this to allow wider
> testing.

I think Soumyadeep reported a regression case [1] with the earlier
version of the patch. I am not sure whether we have verified that the
situation improves with the latest version. Soumyadeep, could you
please retest with the latest patch?

[1] - https://www.postgresql.org/message-id/CADwEdoqirzK3H8bB%3DxyJ192EZCNwGfcCa_WJ5GHVM7Sv8oenuA%40mail.gmail.com
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
