Re: Synchronized Scan benchmark results

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Synchronized Scan benchmark results
Date: 2007-04-04 17:23:49
Message-ID: 1175707429.4152.64.camel@dogma.v10.wvs
Lists: pgsql-hackers

On Wed, 2007-04-04 at 10:40 +0100, Simon Riggs wrote:
> > That makes no sense to me, so it's probably a fluke (by which I mean
> > some other activity on the system, perhaps swapping some large
> > applications). The second two tests are consistent with all the other
> > numbers I got, but the first one took 40 seconds longer than I would
> > expect. I'll do a simple re-test tonight.
>
> What did you set scan_recycle_buffers to? The default was zero.
>
> I think v2 of the patch interpreted that setting as meaning attempt to
> reuse the same buffer again immediately, which probably wouldn't be
> optimal. Which is why I issued v3... I think you'll need to set
> scan_recycle_buffers = 0 (==off in v3) and scan_recycle_buffers = 32 to
> get sensible comparison figures.
>

I used v2 with the default setting in those tests, so I think that means
it reused the same buffer.

By the way, in another run of that test the result came out at 165s,
which is consistent with the other results. I think the machine must
have been swapping out applications or something when I ran the slow
one... who knows.

> So please can you use v3 for any further testing. Thanks.

I'll use v3 of the patch as located here:

http://archives.postgresql.org/pgsql-hackers/2007-03/msg00709.php

By the way, it might be easier to find the right one if the archives
contained filenames for the attachments. Am I missing something obvious?

> > > I would like to see some tests with different queries that have varying
> > > I/O and CPU requirements to see if they stay together too. That won't
> > > block the patch, but it will help everybody understand what the range of
> > > real world applicability there is in this. I'd guess this can benefit us
> > > sufficiently frequently in most cases that its worth it.
> >
> > I'll do some more varied tests. The best idea I've come up with so far
> > is to do something that requires random seeking going concurrently with
> > the scans.
>
> No, what I mean is different kinds of scans:
> - a simple scan like count(*)

I'll use the same "scan.rb" benchmark as before.

> - a more complex one that does buckets of cycles per tuple

I'll use a modified "scan.rb" that does a computation in the select list
(I'll declare the function volatile so that it is recomputed for each
tuple).
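
As a rough sketch of what those two query variants might look like, here
is a small Python helper that just builds the SQL text. This is not the
actual "scan.rb"; the table name, column name, and the function name
burn_cycles are all hypothetical, and the amount of work per call is an
arbitrary placeholder.

```python
def make_volatile_function_sql(func_name="burn_cycles", iterations=100):
    """Build CREATE FUNCTION text for a hypothetical CPU-burning function.

    The function is declared VOLATILE so it cannot be folded into a
    constant and has to be evaluated again for every tuple.
    """
    return (
        f"CREATE FUNCTION {func_name}(integer) RETURNS double precision AS $$\n"
        f"    SELECT sum(sqrt((i + abs($1) % 7)::float8))\n"
        f"    FROM generate_series(1, {iterations}) AS g(i)\n"
        f"$$ LANGUAGE sql VOLATILE;"
    )


def make_scan_queries(table="big_table", column="id", func_name="burn_cycles"):
    """Return the simple and the CPU-heavy scan query (names are assumed)."""
    return {
        # Cheap per tuple: the scan is almost purely I/O bound.
        "simple": f"SELECT count(*) FROM {table};",
        # Expensive per tuple: the function runs once for each row scanned.
        "cpu_heavy": f"SELECT sum({func_name}({column})) FROM {table};",
    }


if __name__ == "__main__":
    print(make_volatile_function_sql())
    for name, sql in make_scan_queries().items():
        print(f"-- {name}")
        print(sql)
```

Raising the iterations value is one way to shift the mix from I/O bound
toward CPU bound without changing the scan itself.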

> - a hash join

This is where I got stuck.

* If it's one big ( > NBuffers/2 ) table and one small table, the small
table will only serve to occupy some shared_buffers (right?)
* If it's two big tables, a join would be a major operation. I don't
think it would even choose a hash join in that situation, right?

To summarize, in the next round of testing I will:
* disable sync_seqscan_offset completely
* use scan_recycle_buffers=0 and 32
* keep testing against 8.2.3 for consistency, unless you suggest
otherwise.

Regards,
Jeff Davis
