Re: Seq scans status update

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Seq scans status update
Date: 2007-05-30 19:36:54
Message-ID: 465DD256.3010702@enterprisedb.com
Lists: pgsql-patches

Jeff Davis wrote:
> On Tue, 2007-05-29 at 17:43 -0700, Jeff Davis wrote:
>>> Hmm. But we probably don't want the same buffer in two different
>>> backends' rings, either. You *sure* the sync-scan patch has no
>>> interaction with this one?
>>>
>> I will run some tests again tonight, I think the interaction needs more
>> testing than I did originally. Also, I'm not sure that the hardware I
>> have is sufficient to test those cases.
>>
>
> I ran some sanity tests last night with cvs head, plus my
> syncscan20-cvshead.patch, plus scan_recycle_buffers.v3.patch.
>
> It passed the sanity tests at least.
>
> I did see that there was more interference with sync_seqscan_threshold=0
> (always on) and scan_recycle_buffers=0 (off) than I had previously seen
> with 8.2.4, so I will test again against 8.2.4 to see why that might be.
> The interference that I saw was still quite small, a scan moving
> concurrently with 9 other scans was about 10% slower than a scan running
> alone -- which is still very good compared with plain cvs head and no
> sync scan -- it's just not ideal.
>
> However, turning scan_recycle_buffers between 0 (off), 16, 32, and 128
> didn't have much effect. At 32 it appeared to be about 1% worse during
> 10 scans, but that may have been noise. The other values I tried didn't
> have any difference that I could see.
>
> This was really just a quick sanity test, I think more hard data would
> be useful.

The interesting question is whether the small buffer ring is big enough
to let all synchronized scans process a page before it is recycled.
Keep an eye on pg_buffercache to see whether it fills up with pages from
the table you're querying.

I just ran a quick test with 4 concurrent scans on a dual-core system,
and it looks like we do "leak" buffers from the rings because they're
pinned at the time they would be recycled. A full scan of a 30GB table
took just under 7 minutes, and starting after a postmaster restart it
took ~4-5 minutes until all of the 320MB of shared_buffers were used.
That means we're leaking a buffer from the ring very roughly on every
20-40th ReadBuffer call, but I'll put in some proper instrumentation and
test with different configurations to get a better picture.
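
To make the "leak" concrete, here is a toy model of the effect, not the
patch's actual code. It assumes a scan cycles through a fixed-size ring
and, whenever the buffer it wants to recycle is still pinned by another
backend, takes a fresh buffer from the shared pool instead. The names
simulate_ring and is_pinned are hypothetical and exist only for this
sketch.

```python
def simulate_ring(ring_size, n_reads, is_pinned):
    """Toy model of one scan reusing a small ring of buffers.

    is_pinned(read_no, buf_id) is a hypothetical callback that reports
    whether some other backend still holds a pin on buf_id at the moment
    this scan wants to recycle it.

    Returns (buffers_touched, leaked): the number of distinct buffers the
    scan pulled from the shared pool, and how many of those were taken
    because the ring slot's buffer was pinned.
    """
    ring = [None] * ring_size   # buffer id currently held in each ring slot
    next_free = 0               # next unused buffer id in the shared pool
    leaked = 0

    for read_no in range(n_reads):
        slot = read_no % ring_size
        buf = ring[slot]
        if buf is None:
            # Ring slot not populated yet: take a buffer from the pool.
            ring[slot] = next_free
            next_free += 1
        elif is_pinned(read_no, buf):
            # Cannot recycle a pinned buffer: the old one stays behind in
            # shared_buffers and the slot gets a fresh buffer instead.
            leaked += 1
            ring[slot] = next_free
            next_free += 1
        # Otherwise the buffer is recycled in place and nothing leaks.

    return next_free, leaked


if __name__ == "__main__":
    # Assumed pin rate for illustration only: every 30th read hits a pin.
    touched, leaked = simulate_ring(32, 3200, lambda n, b: n % 30 == 0)
    print(touched, leaked)
```

With no pins the scan never touches more than ring_size buffers. With
any steady rate of pinned buffers the number of buffers touched grows
linearly with the number of reads, which is how a long scan can
eventually fill shared_buffers.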

Synchronized scans give such a big benefit when they're applicable that
I think some cache-spoiling is acceptable, and in fact unavoidable in
some scenarios. It's much better than the 8.2 behavior anyway.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
