Re: disk writes within a transaction

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: 2xlp - ListSubscriptions <postgres(at)2xlp(dot)com>
Cc: pgsql-general general <pgsql-general(at)postgresql(dot)org>
Subject: Re: disk writes within a transaction
Date: 2017-02-17 21:05:11
Message-ID: CAMkU=1wmztpcQ7egvfCkn_UztC6-DQ0UH7-xrNWyTukiu-7u7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Feb 16, 2017 at 11:33 AM, 2xlp - ListSubscriptions <
postgres(at)2xlp(dot)com> wrote:

> Can someone enlighten me to how postgres handles disk writing? I've read
> some generic remarks about buffers, but that's about it.
>
> We have a chunk of code that calls Postgres in a less-than-optimal way
> within a transaction block. I'm wondering where to prioritize fixing it,
> as the traffic on the wire isn't an issue.
>
> Basically the code looks like this:
>
> begin;
> update foo set foo.a='1' where foo.bar = 1;
> ...
> update foo set foo.b='2' where foo.bar = 1;
> ...
> update foo set foo.c='3' where foo.bar = 1;
> commit;
>
> If the updates are likely to be a memory based operation, consolidating
> them can wait. If they are likely to hit the disk, I should schedule
> refactoring this code sooner than later.
>

You are going to generate more volume of WAL data, which has to reach disk
eventually. Although it is likely they will all be consolidated into about
the same number of physical writes and syncs.

You are also likely to inhibit the Heap-only-tuple mechanism, because you
will end up with 4 copies of the row which all have to fit in the same
block. If they don't, it has to migrate some of them to a different block
plus do index maintenance, so you will generate more dirty blocks that
way. How many more depends on how many indexes you have, and whether the
columns being updated are themselves included in indexes.

There is also a CPU issue when the same tuple is updated repeatedly in a
single transaction. Each update has to wade through all the previous row
versions, so it is an N^2 operation in the number of updates.

It will probably be easier to refactor the code than to quantify just how
much damage it does.

cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Arnold Somogyi 2017-02-18 00:53:45 Multiply ON CONFLICT ON CONSTRAINT
Previous Message Adrian Klaver 2017-02-17 20:56:32 Re: PostgreSQL mirroring from RPM install to RPM install-revisited