Re: Bug: Buffer cache is not scan resistant

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Luke Lonergan <LLonergan(at)greenplum(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Doug Rady <drady(at)greenplum(dot)com>, Sherry Moore <sherry(dot)moore(at)sun(dot)com>
Subject: Re: Bug: Buffer cache is not scan resistant
Date: 2007-03-05 20:11:37
Message-ID: 1173125497.13722.315.camel@dogma.v10.wvs
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2007-03-05 at 03:51 -0500, Luke Lonergan wrote:
> The Postgres shared buffer cache algorithm appears to have a bug. When
> there is a sequential scan the blocks are filling the entire shared
> buffer cache. This should be "fixed".
>
> My proposal for a fix: ensure that when relations larger (much larger?)
> than buffer cache are scanned, they are mapped to a single page in the
> shared buffer cache.
>

I don't see why we should strictly limit sequential scans to one buffer
per scan. I assume you mean one buffer per scan, but that raises these
two questions:

(1) What happens when there are more seq scans than cold buffers
available?
(2) What happens when two sequential scans need the same page, do we
double-up?

Also, the first time we access any heap page of a big table, we are very
unsure whether we will access it again, regardless of whether it's part
of a seq scan or not.

In our current system of 4 LRU lists (depending on how many times a
buffer has been referenced), we could start "more likely" (e.g. system
catalogs, index pages) pages in higher list, and heap reads from big
tables in the lowest possible list. Assuming, of course, that has any
benefit (frequently accessed cache pages are likely to move up in the
lists very quickly anyway).

But I don't think we should eliminate caching of heap pages in big
tables all together. A few buffers might be polluted during the scan,
but most of the time they will be replacing other low-priority pages
(perhaps from another seq scan) and probably be replaced again quickly.
If that's not happening, and it's polluting frequently-accessed pages, I
agree that's a bug.

Regards,
Jeff Davis

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2007-03-05 20:12:34 Re: Bug: Buffer cache is not scan resistant
Previous Message Simon Riggs 2007-03-05 20:06:18 Re: Bug: Buffer cache is not scan resistant