Re: Optimisation deficiency: currval('seq')-->seq scan, constant-->index scan

From: Tiago Antão <tra(at)fct(dot)unl(dot)pt>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jules Bean <jules(at)jellybean(dot)co(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Optimisation deficiency: currval('seq')-->seq scan, constant-->index scan
Date: 2000-08-23 15:03:42
Message-ID: Pine.LNX.4.21.0008231543340.4273-100000@eros.si.fct.unl.pt
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On Wed, 23 Aug 2000, Tom Lane wrote:

> Yes, we know about that one. We have stats about the most common value
> in a column, but no information about how the less-common values are
> distributed. We definitely need stats about several top values not just
> one, because this phenomenon of a badly skewed distribution is pretty
> common.

An end-biased histogram has stats on top values and also on the least
frequent values. So if a there is a selection on a value that is well
bellow average, the selectivity estimation will be more acurate. On some
research papers I've read, it's refered that this is a better approach
than equi-width histograms (which are said to be the "industry" standard).

I not sure whether to use a table or a array attribute on pg_stat for
the histogram, the problem is what could be expected from the size of the
attribute (being a text). I'm very affraid of the cost of going through
several tuples on a table (pg_histogram?) during the optimization phase.

One other idea would be to only have better statistics for special
attributes requested by the user... something like "analyze special
table(column)".

Best Regards,
Tiago

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-08-23 15:20:28 Re: Problem with insert
Previous Message Tom Lane 2000-08-23 14:46:28 Re: analyze.c