Re: UPDATEDs slowing SELECTs in a fully cached database

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "lars" <lhofhansl(at)yahoo(dot)com>
Cc: "Ivan Voras" <ivoras(at)freebsd(dot)org>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: UPDATEDs slowing SELECTs in a fully cached database
Date: 2011-07-12 20:51:24
Message-ID: 4E1C6D7C020000250003F268@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

lars <lhofhansl(at)yahoo(dot)com> wrote:

> So a read of a row *will* trigger dead tuple pruning, and that
> requires WAL logging, and this is known/expected?

Yes, because pruning dead line pointers will make subsequent reads
faster. It's intended to be an optimization.

> This is actually the only answer I am looking for. :) I have not
> seen this documented anywhere.

You would currently need to look at the README-HOT file or source
code, I think. There probably should be some mention in the user
docs, but I haven't noticed any, and it is more technical than most
of the documentation gets. Perhaps a "note" block somewhere...

> The fact that a select (maybe a big analytical query we'll run)
> touching many rows will update the WAL and wait (apparently) for
> that IO to complete is making a fully cached database far less
> useful.

Well, I've never run into this because I have directly attached
storage through a RAID controller with a battery-backed cache
configured for write-back. The pruning is pretty light on CPU
usage, and with a BBU controller, the WAL writes just move from one
cache to another.

If that's not an option for you, you could contrive to have the
update code reread the modified rows after COMMIT, or configure your
autovacuum to be very aggressive so that a background process
usually takes care of this before a SELECT gets to it. And there's
a good chance that tuning your query and/or running with literal
values available to the planner would be a big net win even without
this issue; if this issue is a problem for you, it's just another
reason to do that tuning.

> Just dropped the table to test something so I can't get the plan
> right now. Will send an update as soon as I get
> it setup again.

I'll be surprised if you don't see a seqscan. The most interesting
bit at this point (at least to me) is working on tuning the cost
factors for the planner.

-Kevin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message lars 2011-07-12 20:58:22 Re: UPDATEDs slowing SELECTs in a fully cached database
Previous Message lars 2011-07-12 20:04:57 Re: UPDATEDs slowing SELECTs in a fully cached database