Re: estimating # of distinct values

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: estimating # of distinct values
Date: 2010-12-27 23:04:20
Message-ID: 4D18C7140200002500038BF5@gw.wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Well, first, those scans occur only once every few hundred million
> transactions, which is not likely a suitable timescale for
> maintaining statistics.

I was assuming that the pass of the entire table was priming for the
incremental updates described at the start of this thread. I'm not
clear on how often the base needs to be updated for the incremental
updates to keep the numbers "close enough".

> And second, we keep on having discussions about rejiggering
> the whole tuple-freezing strategy. Even if piggybacking on those
> scans looked useful, it'd be unwise to assume it'll continue to
> work the same way it does now.

Sure, it might need to trigger its own scan in the face of heavy
deletes anyway, since the original post points out that the
algorithm handles inserts better than deletes. But as long as we
currently have some sequential pass of the data, it seemed sane to
piggyback on it when possible. And maybe we should be considering
things like this when we weigh the pros and cons of rejiggering.
This issue of correlated values comes up pretty often....
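[Editor's note: the insert/delete asymmetry mentioned above is a general property of hash-based distinct-count sketches. The thread does not specify the algorithm, so as a hedged illustration only, here is a minimal K-Minimum-Values (KMV) sketch in Python: inserts are cheap incremental updates, but a delete cannot be applied without rescanning the data, because the removed value's hash may be one of the k retained minima. All names here are hypothetical, not from the patch under discussion.]

```python
import bisect
import hashlib

class KMVSketch:
    """K-Minimum-Values distinct-count sketch.

    Keeps the k smallest hash values seen (mapped into [0, 1)) and
    estimates ndistinct as (k - 1) / kth_smallest_hash. Inserts are
    O(log k); deletes are unsupported, since evicting a hash from the
    k minima would require rescanning the table to find its successor.
    """

    def __init__(self, k=64):
        self.k = k
        self.mins = []  # sorted list of the k smallest hashes in [0, 1)

    @staticmethod
    def _hash(value):
        # Deterministic 64-bit hash mapped into the unit interval.
        h = hashlib.sha256(str(value).encode()).digest()
        return int.from_bytes(h[:8], "big") / 2**64

    def insert(self, value):
        x = self._hash(value)
        i = bisect.bisect_left(self.mins, x)
        if i < len(self.mins) and self.mins[i] == x:
            return  # duplicate value: sketch is unchanged
        self.mins.insert(i, x)
        if len(self.mins) > self.k:
            self.mins.pop()  # keep only the k smallest

    def estimate(self):
        if len(self.mins) < self.k:
            # Fewer than k distinct hashes seen: the count is exact.
            return len(self.mins)
        return int((self.k - 1) / self.mins[-1])
```

With k = 64 the relative error is roughly 1/sqrt(k - 2), about 13%, which is why a periodic full-table pass (such as one piggybacked on an existing sequential scan) is still needed to re-prime the estimate after heavy deletes.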

-Kevin
