Re: index usage (and foreign keys/triggers)

From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Patrik Kudo <kudo(at)pingpong(dot)net>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: index usage (and foreign keys/triggers)
Date: 2003-02-27 16:21:46
Message-ID: 20030227081353.T86039-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 27 Feb 2003, Patrik Kudo wrote:

> Stephan Szabo wrote:
> >>>be lower, or that perhaps there's some level of clustering in the data
> >>>that's not being picked up. You might want to try raising the
> >>>number of statistics buckets and re-analyzing just to see if that helps.
> >>
> >>I'm afraid I'm a bit too new at this kind of tweaking... do you mean the
> >>"default_statistics_target"? In that case I tried to raise it from the
> >>default 10 to as high as 45, but without any other result than vacuum
> >>analyze being slower. Did I understand your suggestion right?
> >
> >
> > I'd thought about doing it with ALTER TABLE ALTER COLUMN SET STATISTICS,
> > but I would think that it would probably have worked with default as well.
>
> What exactly does this setting do and how does it affect the
> planner/optimizer? I couldn't find much about this in the docs.

It changes the number of entries that get stored in the statistics table,
but IIRC in this case (potentially more importantly) also raises the
number of rows that it grabs as its sample.

> > Is it possible that the data has local clustering on the field (many
> > rows with the same value stuck together) while not being terribly ordered
> > overall? That's a case that the statistics don't really cover right now
> > (there have been some discussions of this in the past)
>
> How can I find this out? A simple "select * from login" and just browse
> the result, or is there any automated way to analyze this?

There's probably a reasonable way to automate it, although I don't know if
anyone's done it. Technically I think want you really want to know is for
a given value of the search key how many pages of the database heap file
can you find such rows on versus the total number of pages in the table.
You might be able to estimate the number of distinct pages that live rows
are taking by something like:
select ctid, col from table order by col;
and parsing it in perl (AFAIK the first part of the ctid column is the
page). You can get the total number from vacuum verbose's output or
an estimate from last vacuum (I think) from pg_class.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message scott.marlowe 2003-02-27 16:22:57 Re: 7.4?
Previous Message scott.marlowe 2003-02-27 15:59:51 Re: Is renaming a database easy or dangerous