Re: Thoughts on statistics for continuously advancing columns

From:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Peter Eisentraut <peter_e(at)gmx(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, Nathan Boley <npboley(at)gmail(dot)com>
Subject:	Re: Thoughts on statistics for continuously advancing columns
Date:	2009-12-31 18:56:05
Message-ID:	m2ljgjht16.fsf@hi-media.com
Lists:	pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Actually, in the problematic cases, it's interesting to consider the
> following strategy: when scalarineqsel notices that it's being asked for
> a range estimate that's outside the current histogram bounds, first try
> to obtain the actual current max() or min() of the column value --- this
> is something we can get fairly cheaply if there's a btree index on the
> column. If we can get it, plug it into the histogram, replacing the
> high or low bin boundary. Then estimate as we currently do. This would
> work reasonably well as long as re-analyzes happen at a time scale such
> that the histogram doesn't move much overall, ie, the number of
> insertions between analyzes isn't a lot compared to the number of rows
> per bin. We'd have some linear-in-the-bin-size estimation error because
> the modified last or first bin actually contains more rows than other
> bins, but it would certainly work a lot better than it does now.

I know very little about statistics in general, but your proposal seems
straightforward enough for me to understand, and it looks good: +1.

Regards,
--
dim

In response to

Re: Thoughts on statistics for continuously advancing columns at 2009-12-30 19:55:20 from Tom Lane
