Re: checkpointer continuous flushing

From: Andres Freund <andres(at)anarazel(dot)de>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: checkpointer continuous flushing
Date: 2015-08-10 19:28:14
Lists: pgsql-hackers

On August 10, 2015 8:24:21 PM GMT+02:00, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
>Hello Andres,
>> You can't allocate 4GB with palloc(), it has a builtin limit against
>> allocating more than 1GB.
>Argh, too bad, I assumed very naively that palloc was malloc in

It is, but there's some layering (memory pools/contexts) on top. You can get huge allocations with the huge variants, e.g. MemoryContextAllocHuge().
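To make the limit concrete, here is a minimal standalone sketch of the kind of size check involved. It assumes the usual 1 GB - 1 cap for ordinary allocations and a much larger cap for the huge variants; the names prefixed with sketch_ / SKETCH_ are made up for this example and are not the real macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the values mirror the 1 GB - 1 cap discussed above. */
#define SKETCH_MAX_ALLOC_SIZE      ((size_t) 0x3fffffff)   /* 1 GB - 1 */
#define SKETCH_MAX_ALLOC_HUGE_SIZE (SIZE_MAX / 2)          /* assumed cap for huge requests */

/* Would an ordinary palloc-style request of this size be accepted? */
int sketch_alloc_size_is_valid(size_t size)
{
    return size <= SKETCH_MAX_ALLOC_SIZE;
}

/* Would a huge-allocation request of this size be accepted? */
int sketch_alloc_huge_size_is_valid(size_t size)
{
    return size <= SKETCH_MAX_ALLOC_HUGE_SIZE;
}

/* Pick the check that applies; allow_huge models asking for the huge variant. */
int sketch_request_ok(size_t size, int allow_huge)
{
    assert(allow_huge == 0 || allow_huge == 1);
    return allow_huge ? sketch_alloc_huge_size_is_valid(size)
                      : sketch_alloc_size_is_valid(size);
}
```

With these definitions a request of exactly 1 GB fails the ordinary check but passes the huge one, which is why a multi-gigabyte array has to go through the huge path.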

>Then the file would be fsynced twice: if the fsync is done properly
>have already been flushed to disk) then it would not cost much, and
>it sometimes twice on some file would not be a big issue. The code
>also detect such event and log a warning, which would give a hint about
>how often it occurs in practice.

Right. At the cost of keeping track of all files...
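The bookkeeping cost could look roughly like the sketch below: remember every file that has been fsynced and count repeats, where real code would presumably log the warning. Everything here (FsyncTracker, tracker_note_fsync, the fixed cap, the linear scan) is hypothetical and only illustrates the idea; it is not what the patch does.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED_FILES 64    /* arbitrary cap for this sketch */

/* Hypothetical tracker: which files were already fsynced in this checkpoint. */
typedef struct FsyncTracker
{
    unsigned file_ids[MAX_TRACKED_FILES];
    size_t   count;     /* distinct files seen so far */
    unsigned repeats;   /* how often a file was fsynced again */
} FsyncTracker;

void tracker_init(FsyncTracker *t)
{
    assert(t != NULL);
    t->count = 0;
    t->repeats = 0;
}

/*
 * Record an fsync of file_id.
 * Returns 1 if this is the first fsync of that file, 0 if it is a repeat
 * (the place where a warning would be logged), -1 if the tracker is full.
 */
int tracker_note_fsync(FsyncTracker *t, unsigned file_id)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++)
    {
        if (t->file_ids[i] == file_id)
        {
            t->repeats++;
            return 0;
        }
    }
    if (t->count == MAX_TRACKED_FILES)
        return -1;
    t->file_ids[t->count++] = file_id;
    return 1;
}
```

The memory and lookup cost grows with the number of files touched, which is the overhead being pointed at.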

>>>> If the pivot element changes its identity won't the result be
>pretty much
>>>> random?
>>> That would be a very unlikely event, given the short time spent in
>>> qsort.
>> Meh, we don't want to rely on "likeliness" on such things.
>My main argument is that even if it occurs, and the qsort result is
>wrong, it does not change correctness, it just mean that the actual
>will be less in order than wished. If it occurs, one pivot separation
>would be quite strange, but then others would be right, so the buffers
>would be "partly sorted".

It doesn't matter for correctness today, correct. But it makes it impossible to rely on it, too.

>Another issue I see is that even if buffers are locked within cmp, the
>status may change between two cmp...

Sure. That's not what I'm suggesting. Earlier versions of the patch kept an array of buffer headers exactly because of that.

>I do not think that locking all
>buffers for sorting them is an option. So on the whole, I think that
>locking buffers for sorting is probably not possible with the simple
>efficient) lightweight approach used in the patch.

Yes, the other version has a higher space overhead. I'm not convinced that's meaningful compared to the space shared buffers take.
And I'm rather doubtful it's a loss performance-wise on a loaded server: all the buffer headers are touched on other cores, and doing the sort with indirection will greatly increase bus traffic.
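A minimal sketch of the "copy the keys first" variant, to show why the comparator then cannot see a key change mid-sort. The struct and function names (SharedHeader, SortItem, snapshot_and_sort) are invented for illustration, the key fields are simplified, and a real implementation would read each header under its lock while copying.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared buffer header that others may modify. */
typedef struct SharedHeader
{
    unsigned file_id;
    unsigned block_no;
} SharedHeader;

/* Private copy of the sort key plus the index of the buffer it came from. */
typedef struct SortItem
{
    unsigned file_id;
    unsigned block_no;
    int      buf_id;
} SortItem;

/* Compares private copies only; never dereferences shared memory. */
int sort_item_cmp(const void *a, const void *b)
{
    const SortItem *x = a;
    const SortItem *y = b;

    if (x->file_id != y->file_id)
        return x->file_id < y->file_id ? -1 : 1;
    if (x->block_no != y->block_no)
        return x->block_no < y->block_no ? -1 : 1;
    return 0;
}

/* Read every header once into out[], then sort the copies by (file, block). */
void snapshot_and_sort(const SharedHeader *shared, int n, SortItem *out)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
    {
        out[i].file_id = shared[i].file_id;
        out[i].block_no = shared[i].block_no;
        out[i].buf_id = i;
    }
    qsort(out, (size_t) n, sizeof(SortItem), sort_item_cmp);
}
```

The trade-off is the one discussed above: the copies take extra space, but the sort touches only local memory instead of chasing headers that other cores keep dirtying.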

>The good news, as I argued before, is that the order is only advisory
>help with performance, but the correctness is really that all
>buffers are written and fsync is called in the end, and does not depend
>the buffer order. That is how it currently works anyway

It's not particularly desirable to have a performance feature that works less well if the server is heavily and concurrently loaded. The likelihood of bogus sort results will increase with the churn rate in shared buffers.


Please excuse brevity and formatting - I am writing this on my mobile phone.
