Re: estimating # of distinct values

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: estimating # of distinct values
Date: 2010-12-30 20:23:18
Message-ID: 1293740412-sup-9219@alvh.no-ip.org
Lists: pgsql-hackers

Excerpts from Tomas Vondra's message of Thu Dec 30 16:38:03 -0300 2010:

> > Since the need to regularly VACUUM tables hit by updates or deletes
> > won't go away any time soon, we could piggy-back the bit field
> > rebuilding onto VACUUM to avoid a second scan.
>
> Well, I guess it's a bit more complicated. First of all, there's a local
> VACUUM when doing HOT updates. Second, you need to handle inserts too
> (what if the table just grows?).
>
> But I'm not a VACUUM expert, so maybe I'm wrong and this is the right
> place to handle rebuilds of distinct stats.

I was thinking that we could have two different ANALYZE modes, one
"full" and one "incremental"; autovacuum could be modified to use one or
the other depending on how many changes there are (of course, the user
could request one or the other, too; not sure what should be the default
behavior). So the incremental one wouldn't worry about deletes, only
inserts, and could be called very frequently. The other one would
trigger a full table scan (or nearly so) to produce a better estimate in
the face of many deletions.
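
To make the idea a bit more concrete, here is a rough sketch of how the
choice between the two modes might be made. Everything in it is an
assumption for illustration only: the names (AnalyzeMode, TableChanges,
choose_analyze_mode), the per-table counters, and the 10% threshold are
all hypothetical, and none of this is existing autovacuum code.

```python
from dataclasses import dataclass
from enum import Enum


class AnalyzeMode(Enum):
    INCREMENTAL = "incremental"  # folds in newly inserted rows only
    FULL = "full"                # rescans the table (or nearly so)


@dataclass
class TableChanges:
    """Hypothetical per-table counters since the last full ANALYZE."""
    live_rows: int
    inserted: int  # kept for completeness; inserts alone never force FULL
    deleted: int   # would include the old versions of updated rows


def choose_analyze_mode(changes: TableChanges,
                        delete_fraction_threshold: float = 0.1) -> AnalyzeMode:
    """Pick FULL once deletions are a large enough share of the table.

    The 10% default is an arbitrary placeholder, not a proposed value.
    """
    if changes.live_rows <= 0:
        # Nothing reliable to compare against, so rescan.
        return AnalyzeMode.FULL
    if changes.deleted / changes.live_rows > delete_fraction_threshold:
        return AnalyzeMode.FULL
    return AnalyzeMode.INCREMENTAL
```

With these assumptions, a table that only grows always gets the cheap
incremental pass, while one that has lost more than a tenth of its rows
gets the full scan. A user-requested mode would simply bypass this
function.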

I haven't followed this discussion closely so I'm not sure that this
would be workable.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
