Re: Thoughts on statistics for continuously advancing columns

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, Nathan Boley <npboley(at)gmail(dot)com>
Subject: Re: Thoughts on statistics for continuously advancing columns
Date: 2009-12-30 19:21:37
Message-ID: 1262200897.15659.4.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On tis, 2009-12-29 at 22:08 -0500, Tom Lane wrote:
> This seems like a fundamentally broken approach, first because "time
> between analyzes" is not even approximately a constant, and second
> because it assumes that we have a distance metric for all datatypes.

Maybe you could compute a correlation between the column values and the
transaction numbers to recognize a continuously advancing column. It
wouldn't tell you much about how fast they are advancing, but at least
the typical use cases of serial and current timestamp columns should
clearly stick out. And then instead of assuming that a value beyond the
histogram bound doesn't exist, you assume for example the average
frequency, which should be pretty good for the serial and timestamp
cases. (Next step: Fourier analysis ;-) )

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2009-12-30 19:23:03 Re: PATCH: Add hstore_to_json()
Previous Message Joshua D. Drake 2009-12-30 19:20:51 Re: Thoughts on statistics for continuously advancing columns