Re: estimating # of distinct values

From: Csaba Nagy <ncslists(at)googlemail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: estimating # of distinct values
Date: 2011-01-05 13:43:06
Message-ID: 1294234986.3889.22.camel@clnt-sysecm-cnagy
Lists: pgsql-hackers

On Thu, 2010-12-30 at 21:02 -0500, Tom Lane wrote:
> How is an incremental ANALYZE going to work at all?

How about a kind of continuous ANALYZE?

Instead of analyzing just once and then dropping the intermediate results,
keep them on disk for all tables, and then piggyback on the background
writer (or have a dedicated process if that's not algorithmically
feasible): before it writes out modified buffers, update the statistics
based on the values found in them. It could probably take a random
sample of buffers to minimize overhead, but if this is done by a
background thread the overhead could be minimal anyway on multi-core
systems.
Not sure this makes sense at all, but if it does, it would deliver the most
up-to-date statistics you can think of.
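
To make the idea a bit more concrete, here is a minimal sketch in Python rather than backend C. Everything in it is an assumption for illustration: the message does not name an estimator, so a K-minimum-values (KMV) sketch stands in for the "intermediate results", and `on_buffer_flush` is a hypothetical hook, not an existing PostgreSQL function. It only shows the shape of the approach: a small persistent summary per column, updated from a random sample of modified buffers just before they are written out.

```python
import hashlib
import heapq
import random


class KMVSketch:
    """K-minimum-values sketch: remembers the k smallest hashes seen so far.

    Stand-in for the per-column "intermediate results" kept between runs.
    The choice of KMV is an assumption, not something the proposal specifies.
    """

    def __init__(self, k=256):
        self.k = k
        self._heap = []        # negated hashes, so heap[0] is the largest kept
        self._members = set()  # hashes currently kept, to skip duplicates

    @staticmethod
    def _hash(value):
        # Map a value to a pseudo-uniform float in [0, 1).
        digest = hashlib.sha1(repr(value).encode()).digest()
        return int.from_bytes(digest[:8], "big") / 2.0 ** 64

    def add(self, value):
        h = self._hash(value)
        if h in self._members:
            return
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest kept one: swap them.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        """Estimated number of distinct values added so far."""
        if len(self._heap) < self.k:
            return float(len(self._heap))  # fewer than k hashes: exact count
        kth_smallest = -self._heap[0]
        return (self.k - 1) / kth_smallest


def on_buffer_flush(sketch, column_values, sample_rate=0.1, rng=random):
    """Hypothetical hook, called just before a modified buffer is written.

    Only a random sample of buffers is inspected, to keep overhead low.
    Returns True if this buffer was folded into the statistics.
    """
    if rng.random() >= sample_rate:
        return False
    for value in column_values:
        sketch.add(value)
    return True
```

Two caveats apply even to this toy version. A sketch like this only ever grows, so deleted or updated values are never forgotten, and values that appear only in buffers skipped by the sampling are never seen, so the estimate drifts away from the truth in both directions over time.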

