Re: Bug: Buffer cache is not scan resistant

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(at)enterprisedb(dot)com>
Cc: "Mark Kirkwood" <markir(at)paradise(dot)net(dot)nz>, "Gavin Sherry" <swm(at)alcove(dot)com(dot)au>, "Luke Lonergan" <llonergan(at)greenplum(dot)com>, "PGSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Doug Rady" <drady(at)greenplum(dot)com>, "Sherry Moore" <sherry(dot)moore(at)sun(dot)com>
Subject: Re: Bug: Buffer cache is not scan resistant
Date: 2007-03-05 18:55:46
Message-ID: 20896.1173120946@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Pavan Deolasee" <pavan(at)enterprisedb(dot)com> writes:
> I am wondering whether seqscan would set the usage_count to 1 or to a higher
> value. usage_count is incremented while unpinning the buffer. Even if
> we use
> page-at-a-time mode, won't the buffer itself would get pinned/unpinned
> every time seqscan returns a tuple ? If thats the case, the overhead would
> be O(BM_MAX_USAGE_COUNT * N) for every N reads.

No, it's only once per page. There's a good deal of PrivateRefCount
thrashing that goes on while examining the individual tuples, but the
shared state only changes when we leave the page, because we hold pin
continuously on the current page of a seqscan. If you don't believe
me, insert some debug printouts for yourself.

> How about smaller value for BM_MAX_USAGE_COUNT ?

This is not relevant to the problem: we are concerned about usage count
1 versus 0, not the other end of the range.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Gregory Stark 2007-03-05 19:02:32 Re: Bug: Buffer cache is not scan resistant
Previous Message Josh Berkus 2007-03-05 18:46:16 Re: Bug: Buffer cache is not scan resistant