Re: large regression for parallel COPY

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: large regression for parallel COPY
Date: 2016-04-06 06:48:36
Message-ID: alpine.DEB.2.10.1604060841310.25561@sto
Lists: pgsql-hackers


Hello Robert,

> I tried the same test mentioned in the original post on cthulhu (EDB
> machine, CentOS 7.2, 8 sockets, 8 cores per socket, 2 threads per core,
> Xeon E7-8830 @ 2.13 GHz). I attempted to test both the effects of
> multi_extend_v21 and the *_flush_after settings.

I'm not sure about the intrinsic effectiveness of
{backend,writer}_flush_after, especially on HDDs. For the checkpointer
(checkpoint_flush_after) a great deal of effort goes into generating large
sequential writes, but there are no such provisions for other write
activity. I'm not sure how the write activity of the "parallel" copy is
organized, but it sounds like it will generate fewer sequential writes than
before, and flushing could accentuate the negative performance impact.

This might suggest that the benefits of these two settings are more
irregular and harder to predict, so their default value should be 0 (aka off)?

Or maybe warn clearly in the documentation about the uncertain effects of
these two settings?
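
For the first option, turning both settings off would look roughly like the
postgresql.conf fragment below. This is only a sketch: it assumes that
"writer_flush_after" above corresponds to the setting named
bgwriter_flush_after, and it deliberately leaves checkpoint_flush_after
untouched.

```
# postgresql.conf (sketch, not a tested recommendation)
backend_flush_after = 0     # 0 turns off forced writeback for regular backends
bgwriter_flush_after = 0    # assumed name for "writer_flush_after"; 0 = off
```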

--
Fabien.
