Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?
Date: 2015-12-21 11:53:29
Message-ID: 5677E839.3030205@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 12/21/2015 12:03 PM, Heikki Linnakangas wrote:
> On 17/12/15 19:07, Robert Haas wrote:
>> On Mon, Dec 14, 2015 at 6:08 PM, Tomas Vondra
>> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>> So we know that we should expect about
>>>
>>> (prev_wal_bytes - wal_bytes) + (prev_wal_fpw_bytes - wal_fpw_bytes)
>>>
>>> ( regular WAL ) + ( FPW WAL )
>>>
>>> to be produced until the end of the current checkpoint. I don't
>>> have a clear idea how to transform this into the 'progress' yet,
>>> but I'm pretty sure tracking the two types of WAL is a key to a
>>> better solution. The x^1.5 is probably a step in the right
>>> direction, but I don't feel particularly confident about the 1.5
>>> (which is rather arbitrary).
>>
>> If it works well empirically, does it really matter that it's
>> arbitrary? I mean, the entire planner is full of fairly arbitrary
>> assumptions about which things to consider in the cost model and
>> which to ignore. The proof that we have made good decisions there
>> is in the query plans it generates. (The proof that we have made
>> bad decisions in some cases is in the query plans, too.)
>
> Agreed.

What if it only seems to work well because it was tested on cases it was
designed for? What about the workloads that behave differently?

Whenever we make changes to costing and query planning, we carefully
consider counter-examples and cases where it might fail. I see nothing
like that in this thread - all I see is a bunch of pgbench tests, which
seem rather insufficient to me.
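
For reference, the estimate quoted above can be written down as a rough
Python sketch. The function names, the clamping at zero, and the reading
of x as a progress fraction between 0 and 1 are assumptions made for
illustration; none of this is code from the patch.

```python
def expected_remaining_wal(prev_wal_bytes, prev_wal_fpw_bytes,
                           wal_bytes, wal_fpw_bytes):
    """Estimate the WAL still to be produced in the current checkpoint.

    prev_* are the totals observed during the previous checkpoint cycle,
    wal_* are the amounts produced so far in the current cycle.
    Clamping at zero is an assumption, covering the case where the
    current cycle has already produced more than the previous one.
    """
    regular = max(prev_wal_bytes - wal_bytes, 0)        # regular WAL
    fpw = max(prev_wal_fpw_bytes - wal_fpw_bytes, 0)    # FPW WAL
    return regular + fpw


def power_adjusted(x, exponent=1.5):
    """Apply the x^1.5 adjustment to a fraction x in [0, 1].

    Treating x as a WAL-based progress fraction is an assumption here.
    """
    return x ** exponent


if __name__ == "__main__":
    # Hypothetical byte counts, chosen only to exercise the formula.
    print(expected_remaining_wal(1000, 600, 200, 500))
    for x in (0.1, 0.25, 0.5, 0.9):
        print(x, power_adjusted(x))
```

For any x strictly between 0 and 1, x ** 1.5 is smaller than x, so the
adjusted value lags the raw one; how much it should lag is exactly what
the 1.5 encodes.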

>
>> I think a bigger problem for this patch is that Heikki seems to have
>> almost completely disappeared.
>
> Yeah, there's that problem too :-).
>
> The reason I didn't commit this back then was lack of performance
> testing. I'm fairly confident that this would be a significant
> improvement for some workloads, and shouldn't hurt much even in the
> worst case. But I did only a little testing on my laptop. I think
> Simon was in favor of just committing it immediately, and Fabien
> wanted to see more performance testing before committing.
>
> I was hoping that Digoal would re-run his original test case, and
> report back on whether it helps. Fabien had a performance test setup,
> for testing another patch, but he didn't want to run it to test this
> patch. Amit did some testing, but didn't see a difference. We can
> take that as a positive sign - no regression - or as a negative sign,
> but I think that basically means that his test was just not sensitive
> to the FPW issue.
>
> So Tomas, if you're willing to do some testing on this, that would
> be brilliant!

I'm ready to spend some time on this, assuming we can agree on what
tests to run. Can we come up with realistic workloads where we expect
the patch might actually work poorly?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
