Re: Checksums by default?

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checksums by default?
Date: 2017-01-23 10:26:01
Message-ID: 23c5dc60-e72e-0323-f448-8f3acfa51dcc@2ndquadrant.com

On 01/23/2017 09:57 AM, Amit Kapila wrote:
> On Mon, Jan 23, 2017 at 1:18 PM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> On 01/23/2017 08:30 AM, Amit Kapila wrote:
>>>
>>>
>>> I think if we can get data for pgbench read-write workload when data
>>> doesn't fit in shared buffers but fit in RAM, that can give us some
>>> indication. We can try by varying the ratio of shared buffers w.r.t
>>> data. This should exercise the checksum code both when buffers are
>>> evicted and at next read. I think it also makes sense to check the
>>> WAL data size for each of those runs.
>>>
>>
>> Yes, I'm thinking that's pretty much the worst case for OLTP-like workload,
>> because it has to evict buffers from shared buffers, generating a continuous
>> stream of writes. Doing that on good storage (e.g. PCI-e SSD or possibly
>> tmpfs) will further limit the storage overhead, making the time spent
>> computing checksums much more significant. Makes sense?
>>
>
> Yeah, I think that can be helpful with respect to WAL, but for data,
> if we are considering the case where everything fits in RAM, then
> faster storage might or might not help.
>

I'm not sure I understand. Why wouldn't faster storage help? It's only a
matter of generating enough dirty buffers (that get evicted from shared
buffers) to saturate the storage. With slower storage you'll hit that
limit at maybe 100 MB/s; with a PCI-e SSD it might be more like 1 GB/s.
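
For the record, a rough sketch of the runs Amit describes above - scale
factor, paths, client counts and durations are just placeholders to be
tuned to the machine, and the WAL functions are spelled the 9.6 way
(pg_current_xlog_location / pg_xlog_location_diff, renamed in 10):

SCALE=1000      # ~15 GB of pgbench data -- pick it so it fits in RAM
DURATION=900    # seconds per run

for CHECKSUMS in "" "--data-checksums"; do
  rm -rf /mnt/fast-ssd/data
  initdb $CHECKSUMS -D /mnt/fast-ssd/data

  for SB in 1GB 2GB 4GB 8GB; do
    pg_ctl -D /mnt/fast-ssd/data -o "-c shared_buffers=$SB" -w start
    pgbench -i -s $SCALE postgres

    WAL_START=$(psql -Atc "SELECT pg_current_xlog_location()" postgres)
    pgbench -M prepared -c 16 -j 16 -T $DURATION postgres

    # WAL volume generated by the run, in bytes
    psql -Atc "SELECT pg_xlog_location_diff(pg_current_xlog_location(), '$WAL_START')" postgres

    pg_ctl -D /mnt/fast-ssd/data -w stop
  done
done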

Of course, if the main bottleneck is somewhere else (e.g. hitting 100%
CPU utilization before putting any pressure on storage), that's not
going to make much difference.
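
A simple way to tell which case we're in is to watch the box during the
run with the usual Linux tools (device names will obviously differ):

iostat -x 5   # %util near 100 on the data device => storage-bound
sar -u 5      # %user + %system near 100 => CPU-bound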

Or perhaps I missed something important?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
