Re: checkpointer continuous flushing - V18

From:	Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: checkpointer continuous flushing - V18
Date:	2016-02-21 21:49:35
Message-ID:	alpine.DEB.2.10.1602212236000.3927@sto
Lists:	pgsql-hackers

>>>> [...] I do think that this whole writeback logic really does make
>>>> sense *per table space*,
>>>
>>> Leads to less regular IO, because if your tablespaces are evenly sized
>>> (somewhat common) you'll sometimes end up issuing sync_file_range's
>>> shortly after each other. For latency outside checkpoints it's
>>> important to control the total amount of dirty buffers, and that's
>>> obviously independent of tablespaces.
>>
>> I do not understand/buy this argument.
>>
>> The underlying IO queue is per device, and table spaces should be per device
>> as well (otherwise what is the point?), so you should want to coalesce and
>> "writeback" pages per device as well. Calling sync_file_range on distinct
>> devices should probably be issued more or less randomly, and should not
>> interfere one with the other.
>
> The kernel's dirty buffer accounting is global, not per block device.

Sure, but this is not my point. My point is that "sync_file_range" moves
buffers to the device io queues, which are per device. If there is one
queue in pg and many queues on many devices, the whole point of coalescing
to get sequential writes is somehow lost.
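
To make the coalescing argument concrete, here is a toy model, not PostgreSQL code. It assumes writes arrive round-robin across tablespaces, that a writeback context holds a fixed number of pending entries (32 here, an arbitrary choice), and that a flush sorts the entries and merges adjacent blocks into one range per contiguous run. The names coalesce, simulate and mean_range_length are made up for the illustration. The model only measures how long the flushed ranges are; it says nothing about the kernel's global dirty accounting or about several tablespaces sharing one device.

```python
from collections import defaultdict


def coalesce(pending):
    """Sort (tablespace, block) pairs and merge adjacent blocks into
    (tablespace, start_block, length) ranges."""
    ranges = []
    for ts, blk in sorted(pending):
        if ranges and ranges[-1][0] == ts and ranges[-1][1] + ranges[-1][2] == blk:
            ts0, start, length = ranges[-1]
            ranges[-1] = (ts0, start, length + 1)
        else:
            ranges.append((ts, blk, 1))
    return ranges


def simulate(n_tablespaces, blocks_per_ts, capacity, per_tablespace):
    """Feed round-robin writes into either one shared context or one
    context per tablespace, and return every range that gets flushed."""
    contexts = defaultdict(list)
    flushed = []
    for blk in range(blocks_per_ts):
        for ts in range(n_tablespaces):
            key = ts if per_tablespace else 0  # 0 = the single shared context
            ctx = contexts[key]
            ctx.append((ts, blk))
            if len(ctx) >= capacity:
                flushed.extend(coalesce(ctx))
                ctx.clear()
    # Flush whatever is left at the end of the run.
    for ctx in contexts.values():
        flushed.extend(coalesce(ctx))
    return flushed


def mean_range_length(ranges):
    """Average number of blocks per flushed range."""
    return sum(length for _, _, length in ranges) / len(ranges)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        shared = mean_range_length(simulate(n, 64, 32, per_tablespace=False))
        split = mean_range_length(simulate(n, 64, 32, per_tablespace=True))
        print(f"{n} tablespaces: shared={shared:.1f} per-tablespace={split:.1f}")
```

Under these assumptions the shared context produces ranges whose length shrinks in proportion to the number of tablespaces, while per-tablespace contexts keep the range length at the context capacity. Whether that difference shows up in real IO behaviour is exactly what is being disputed here.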

> It's also actually rather common to have multiple tablespaces on a
> single block device. Especially if SANs and such are involved; where you
> don't even know which partitions are on which disks.

OK, some people would not benefit if they use many tablespaces on one
device. Too bad, but that does not look like a very useful setting anyway,
and I do not think it would do much harm in this case.

>> If you use just one context, the more table spaces the less performance
>> gains, because there is less and less aggregation thus sequential writes per
>> device.
>>
>> So for me there should really be one context per tablespace. That would
>> suggest a hashtable or some other structure to keep and retrieve them, which
>> would not be that bad, and I think that it is what is needed.
>
> That'd be much easier to do by just keeping the context in the
> per-tablespace struct. But anyway, I'm really doubtful about going for
> that; I had it that way earlier, and observing IO showed it not being
> beneficial.

ISTM that you would need a significant number of tablespaces to see the
benefit. If you do not do that, the more table spaces the more random the
IOs, which is disappointing. Also, "the cost is marginal", so I do not see
any good argument not to do it.

--
Fabien.

In response to

Re: checkpointer continuous flushing - V18 at 2016-02-21 20:08:29 from Andres Freund
