Re: estimating # of distinct values

From: Csaba Nagy <ncslists(at)googlemail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: estimating # of distinct values
Date: 2011-01-05 13:43:06
Message-ID: 1294234986.3889.22.camel@clnt-sysecm-cnagy
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 2010-12-30 at 21:02 -0500, Tom Lane wrote:
> How is an incremental ANALYZE going to work at all?

How about a kind of continuous analyze ?

Instead of analyzing just once and then drop the intermediate results,
keep them on disk for all tables and then piggyback the background
writer (or have a dedicated process if that's not algorithmically
feasible) and before writing out stuff update the statistics based on
the values found in modified buffers. Probably it could take a random
sample of buffers to minimize overhead, but if it is done by a
background thread the overhead could be minimal anyway on multi-core
systems.

Not sure this makes sense at all, but if yes it would deliver the most
up to date statistics you can think of.

Cheers,
Csaba.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2011-01-05 13:54:18 Streaming base backups
Previous Message Florian Pflug 2011-01-05 13:41:35 Re: Support for negative index values in array fetching