Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)", )

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)", )
Date: 2005-11-01 00:56:04
Message-ID: 20051101005604.GU20349@pervasive.com
Lists: pgsql-hackers pgsql-patches

On Mon, Oct 31, 2005 at 09:02:59PM -0300, Alvaro Herrera wrote:
> Jim C. Nasby wrote:
> > Now that I've got a little better idea of what this code does, I've
> > noticed something interesting... this issue is happening on an 8-way
> > machine, and NUM_SLRU_BUFFERS is currently defined at 8. Doesn't this
> > greatly increase the odds of buffer conflicts? Bug aside, would it be
> > better to set NUM_SLRU_BUFFERS higher for a larger number of CPUs?
>
> We had talked about increasing NUM_SLRU_BUFFERS depending on
> shared_buffers, but it didn't get done. Something to consider for 8.2
> though. I think you could have better performance by increasing that
> setting, while at the same time diminishing the possibility that the race
> condition appears.

OK, I'll look into that. This database is definitely having issues due
to the sheer transaction volume, so maybe that will help.

If NUM_SLRU_BUFFERS were to be tied to something, wouldn't it make more
sense to tie it to wal_buffers, though? For example, a data warehouse
might have a very high shared_buffers setting but most likely won't have
a high transaction rate. ISTM that most databases with a high transaction
rate are likely to have increased wal_buffers.
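
To make the idea concrete, here is a rough sketch of what sizing the SLRU
buffers from wal_buffers might look like. Nothing here is from the
PostgreSQL source: the function name, the one-to-one scaling, and the
upper cap of 64 are all assumptions for illustration. Only the floor of 8
reflects the current NUM_SLRU_BUFFERS definition mentioned above.

```c
/* Hypothetical bounds, for illustration only. */
#define SLRU_BUFFERS_MIN 8      /* matches today's fixed NUM_SLRU_BUFFERS */
#define SLRU_BUFFERS_MAX 64     /* arbitrary cap, an assumption */

/*
 * Hypothetical helper: derive the SLRU buffer count from wal_buffers
 * instead of using a compile-time constant.  The 1:1 ratio is a guess,
 * not a measured or recommended value.
 */
int
slru_buffers_from_wal_buffers(int wal_buffers)
{
    int n = wal_buffers;

    /* Never go below the current fixed size. */
    if (n < SLRU_BUFFERS_MIN)
        n = SLRU_BUFFERS_MIN;

    /* Keep the buffer array from growing without bound. */
    if (n > SLRU_BUFFERS_MAX)
        n = SLRU_BUFFERS_MAX;

    return n;
}
```

With these made-up bounds, a default-sized wal_buffers keeps the current
8 buffers, while a high-transaction system that has raised wal_buffers
gets proportionally more, up to the cap.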

> I think you should also consider increasing PGPROC_MAX_CACHED_SUBXIDS
> (src/include/storage/proc.h), because that should decrease the chance
> that the subtrans area needs to be scanned. By how much, however, I
> wouldn't know -- it depends on the number of subtransactions you
> typically have; I guess you could activate the measuring code in
> procarray.c to get a figure.

AFAIK they're not using subtransactions at all, but I'll check.

Is there anywhere this stuff is documented other than in code? It sounds
like an advanced tuning guide would be very valuable for environments
like this one...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
