Re: varchar index joins not working?

From: "Adam Gundy" <adam(at)starsilk(dot)net>
To: "Richard Huxton" <dev(at)archonet(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: varchar index joins not working?
Date: 2008-04-10 18:54:25
Message-ID: 6f55f1270804101154t3d4a01eco2b7bff7a0053af38@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Apr 10, 2008 at 8:52 AM, Adam Gundy <adam(at)starsilk(dot)net> wrote:
> Richard Huxton wrote:
> > How many distinct values do you have in groups.groupid and
> group_access.group_id?
> >
>
> for the small database (since it shows the same problem):
>
> group_access: 280/268
> groups: 2006/139
>
> for the large database:
>
> group_access: same
> groups: 1712647/140
>
> the groupid key is an MD5 hash, so it should be uniformly distributed.
> maybe that throws the stats? but, again, char works, varchar doesn't :-(

OK, I'm thinking the varchar/char part is not the issue.

the database is very unbalanced, most of the groups are 1000 or less
records, with one group occupying 95% of the records.

I *think* that when I analyze using char instead of varchar, it is
recording a stat for the large group, but for some reason with varchar
doesn't add a stat for that one.

so, the real question is, how do I fix this? I can turn the stats way
up to 1000, but that doesn't guarantee that I'll get a stat for the
large group :-(

can I turn the statistics off completely for this column? I'm guessing
that if I can, that will mean it takes a guess based on the number of
distinct values in the groups table, which is still large number of
records, possibly enough to trigger the seqscan anyway.

does postgres have a way of building a 'counted index' that the
planner can use for it's record counts? some way of forcibly
maintaining a stat for every group?

the groups are not related to one another - is it possible to
partition them into their own indexes somehow?

ahh. lots of questions, no (obvious to me) answers from googling around.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Stosberg 2008-04-10 21:28:51 Re: recommendations for web/db connection pooling or DBD::Gofer reviews
Previous Message Greg Smith 2008-04-10 18:47:59 Re: large tables and simple "= constant" queries using indexes