Re: [PERFORM] pgbench to the MAXINT

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] pgbench to the MAXINT
Date: 2011-02-09 19:40:08
Message-ID: 20110209194008.GX4116@tamriel.snowman.net
Lists: pgsql-hackers pgsql-performance

Greg,

* Greg Smith (greg(at)2ndquadrant(dot)com) wrote:
> I took that complexity out and just put a hard line
> in there instead: if scale>=20000, you get bigints. That's not
> very different from the real limit, and it made documenting when the
> switch happens easy to write and to remember.

Agreed completely on this.
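
For reference, the arithmetic behind "the real limit" can be sketched as
below. This assumes pgbench creates 100,000 pgbench_accounts rows per unit
of scale and that the account id is a signed 32-bit int; the helper names
are hypothetical and not taken from the patch.

```c
#include <stdint.h>

/* Assumption: 100,000 pgbench_accounts rows are created per scale unit. */
#define ROWS_PER_SCALE 100000

/* Largest scale whose highest account id still fits in a signed 32-bit int. */
static int64_t max_int32_scale(void)
{
    return INT32_MAX / ROWS_PER_SCALE;
}

/* Hypothetical helper mirroring the hard line quoted above (scale>=20000). */
static int needs_bigint(int64_t scale)
{
    return scale >= 20000;
}
```

Under those assumptions the true cutoff works out to a scale of 21474, so
the round 20000 gives up only about 7% of the range an int could cover.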

> It turns out that even though I've been running an i386 Linux on
> here, it's actually a 64-bit CPU. (I think the 32-bit install may be
> an artifact of Adobe Flash install issues, sadly.) So this may not be
> as good of a test case as I'd hoped.

Actually, I would think it'd still be sufficient. If you're under a
32-bit kernel you're not going to be using the extended registers, etc,
that would be available under a 64-bit kernel. That said, the idea that
we should care about 32-bit systems these days, in a benchmarking tool,
is, well, silly, imv.

> 1) A look into the expected range of the rand() function suggests
> the glibc implementation normally provides 30 bits of resolution, so
> about 1 billion numbers. You'll have >1B rows in a pgbench database
> once the scale goes over 10,000. So without a major overhaul of how
> random number generation is treated here, people can expect the
> distribution of rows touched by a test run to get less even once the
> database scale gets very large.

Just wondering, did you consider just calling random() twice and
smashing the results together..?
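
A minimal sketch of that idea, assuming each random() call yields 31
usable bits (0 to 2^31-1, as POSIX specifies). The helper names are
hypothetical and nothing here is taken from pgbench itself; the range
mapping uses a plain modulo, so a slight bias remains.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* expose POSIX random() under strict -std modes */
#endif
#include <stdint.h>
#include <stdlib.h>

/* Assumption: each random() call returns 31 usable bits (0 .. 2^31-1). */
#define RANDOM_BITS 31

/* Glue two 31-bit draws into one 62-bit value; hi fills the upper bits. */
static uint64_t combine_draws(uint32_t hi, uint32_t lo)
{
    return ((uint64_t) hi << RANDOM_BITS) | (uint64_t) lo;
}

/* Hypothetical helper: one 62-bit value built from two random() calls. */
static uint64_t random62(void)
{
    uint32_t hi = (uint32_t) random();
    uint32_t lo = (uint32_t) random();

    return combine_draws(hi, lo);
}

/*
 * Hypothetical helper: map a 62-bit draw into [min, max].
 * Assumes min <= max and max - min < 2^62. Plain modulo, so values are
 * very slightly non-uniform unless the span divides 2^62 evenly.
 */
static int64_t random_between(int64_t min, int64_t max)
{
    uint64_t span = (uint64_t) (max - min) + 1;

    return min + (int64_t) (random62() % span);
}
```

With 62 bits available, something like random_between(1, 3000000000) can
reach every row even when the table is well past 2^31 rows, which a single
31-bit draw cannot do.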

> I added another warning paragraph
> to the end of the docs in this update to mention this. Long-term, I
> suspect we may need to adopt a superior 64-bit RNG approach,
> something like a Mersenne Twister perhaps. That's a bit more than
> can be chewed on during 9.1 development though.

I tend to agree that we should be able to improve the random number
generation in the future. Additionally, imv, we should be able to say
"pgbench version X isn't comparable to version Y" in the release notes
or something, or have separate version #s for it which make it clear
what can be compared to each other and what can't. Painting ourselves
into a corner by saying every pgbench release must generate results that
can be compared to every other released version of pgbench just isn't
practical.

> 2) I'd rate the odds as good that there's one or more corner-case bugs in
> \setrandom or \setshell I haven't found yet, just from the way that
> code was converted. Those have some changes I haven't specifically
> tested exhaustively yet. I don't see any issues when running the
> most common two pgbench tests, but that doesn't mean every part of
> that 32 -> 64 bit conversion was done correctly.

I'll take a look. :)

Thanks,

Stephen
