Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?
Date: 2015-12-23 19:27:15
Message-ID: 567AF593.3030005@2ndquadrant.com
Lists: pgsql-hackers

On 12/21/2015 01:11 PM, Heikki Linnakangas wrote:
> On 21/12/15 13:53, Tomas Vondra wrote:
>> On 12/21/2015 12:03 PM, Heikki Linnakangas wrote:
>>> On 17/12/15 19:07, Robert Haas wrote:
>>>> If it works well empirically, does it really matter that it's
>>>> arbitrary? I mean, the entire planner is full of fairly arbitrary
>>>> assumptions about which things to consider in the cost model and
>>>> which to ignore. The proof that we have made good decisions there
>>>> is in the query plans it generates. (The proof that we have made
>>>> bad decisions in some cases is in the query plans, too.)
>>>
>>> Agreed.
>>
>> What if it only seems to work well because it was tested on cases it was
>> designed for? What about the workloads that behave differently?
>>
>> Whenever we make changes to costing and query planning, we carefully
>> consider counter-examples and cases where it might fail. I see nothing
>> like that in this thread - all I see is a bunch of pgbench tests, which
>> seems rather insufficient to me.
>
> Agreed on that too.
>
>> I'm ready to spend some time on this, assuming we can agree on what
>> tests to run. Can we come up with realistic workloads where we expect
>> the patch might actually work poorly?
>
> I think the worst case scenario would be the case where there is no
> FPW-related WAL burst at all, and checkpoints are always triggered by
> max_wal_size rather than checkpoint_timeout. In that scenario, the
> compensation formula will cause the checkpoint to be too lazy in the
> beginning, and it will have to catch up more aggressively towards the
> end of the checkpoint cycle.
>
> One such scenario might be to do only COPYs into a table with no
> indexes. Or hack pgbench to concentrate all the updates on only a very
> few rows. There will be a FPW on those few pages initially, but the
> spike will be much shorter. Or set full_page_writes=off, and hack the
> patch to do compensation even when full_page_writes=off, and then just
> run pgbench.
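
To make that worst case concrete, here is a toy model of the scheduling
check. It is only a sketch: the exponent is a hypothetical stand-in for
the patch's compensation formula (which is not quoted in this message),
and the function names are made up for illustration. WAL is assumed to
grow linearly, i.e. there is no FPW burst for the formula to compensate.

```python
# Toy model only: the exponent below is a hypothetical stand-in for the
# patch's compensation formula, which is not quoted in this message.
COMPENSATION_EXPONENT = 1.5  # assumption, for illustration


def target_progress(wal_fraction, compensate):
    """Fraction of the checkpoint that should be written by now.

    wal_fraction is the share of the WAL budget (max_wal_size) consumed
    since the checkpoint started, in the range 0.0 to 1.0.
    """
    if compensate:
        # Discount early WAL on the theory that it is inflated by FPWs.
        return wal_fraction ** COMPENSATION_EXPONENT
    return wal_fraction


def is_on_schedule(progress, wal_fraction, compensate):
    """True if the checkpointer has written enough for this point."""
    return progress >= target_progress(wal_fraction, compensate)


if __name__ == "__main__":
    # Linear WAL growth: no FPW spike at the start of the cycle.
    for wal_fraction in (0.25, 0.50, 0.75, 1.00):
        plain = target_progress(wal_fraction, compensate=False)
        comp = target_progress(wal_fraction, compensate=True)
        print(f"WAL used {wal_fraction:.2f}: target {plain:.3f} "
              f"uncompensated, {comp:.3f} compensated")
```

With any exponent above 1 the compensated target stays below the
uncompensated one until the very end, so the writes it skips early have
to be made up late in the cycle.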

OK, the COPY scenario looks interesting and also realistic because it
probably applies to systems doing batch loads.

So that's one test to do. Can we come up with some others?

We probably do want to do a bunch of pgbench tests, with various scales
and also distributions - the gaussian/exponential distributions seem
useful for simulating OLTP systems that usually have just a small active
set (instead of touching all the data). This surely affects how much FPW
we do and at what point - my expectation is that the non-uniform
distributions will have a long tail of FPW.

So I was thinking about these combinations:

* modes: uniform, gaussian, exponential
* scales: 1000 (15GB), 10000 (150GB)
* clients: 1, 2, 4, 8, 16 (to see impact on scalability, if any)

Each combination needs to run for at least an hour or two, possibly with
multiple runs. I'll also try running this on both an SSD-based system and
a system with 10k RPM drives, because those will probably behave differently.
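
The matrix above can be enumerated with a short script, which also
shows how long the whole thing takes. This is a sketch under
assumptions: the per-mode script files (uniform.sql etc.), the database
name "bench" and the one-hour duration are placeholders, and only the
standard pgbench options -c, -j, -s, -T and -f are used.

```python
import itertools

MODES = ["uniform", "gaussian", "exponential"]
SCALES = [1000, 10000]          # roughly 15GB and 150GB
CLIENTS = [1, 2, 4, 8, 16]
DURATION_SECONDS = 3600         # assumption: one hour per run
DBNAME = "bench"                # hypothetical database name


def build_matrix():
    """Return one dict per (mode, scale, clients) combination."""
    runs = []
    for mode, scale, clients in itertools.product(MODES, SCALES, CLIENTS):
        command = [
            "pgbench",
            "-c", str(clients),          # concurrent client sessions
            "-j", str(clients),          # one worker thread per client
            "-s", str(scale),            # report scale for custom scripts
            "-T", str(DURATION_SECONDS),
            "-f", f"{mode}.sql",         # hypothetical per-mode script
            DBNAME,
        ]
        runs.append({"mode": mode, "scale": scale,
                     "clients": clients, "command": command})
    return runs


if __name__ == "__main__":
    runs = build_matrix()
    hours = len(runs) * DURATION_SECONDS / 3600
    print(f"{len(runs)} runs, {hours:.0f} hours per pass per machine")
    for run in runs:
        print(" ".join(run["command"]))
```

Each scale needs its own initialization (pgbench -i -s SCALE) before
its runs, and the totals double for two-hour runs or for a second pass.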

Also, are we tracking the amount of FPW during the checkpoint,
somewhere? That'd be useful, at least for this patch. Or do we need to
just track the amount of WAL produced?
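
For the fallback of tracking total WAL, sampling the WAL position
before and after a run and taking the difference should be enough. The
sketch below does that arithmetic on LSN strings such as those returned
by pg_current_xlog_location(); pg_xlog_location_diff() computes the same
thing on the server. The helper names are made up for illustration, and
note that this gives all WAL, not the FPW share.

```python
def lsn_to_bytes(lsn):
    """Convert an LSN string like '16/B374D848' to a byte position.

    The part before the slash is the high 32 bits and the part after
    it is the low 32 bits, both in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_bytes_between(start_lsn, end_lsn):
    """Bytes of WAL generated between two sampled positions."""
    return lsn_to_bytes(end_lsn) - lsn_to_bytes(start_lsn)


if __name__ == "__main__":
    # Hypothetical positions sampled at the start and end of a run.
    produced = wal_bytes_between("0/1000000", "0/2000000")
    print(f"{produced / (1024 * 1024):.0f} MB of WAL produced")
```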

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
