Re: Interpreting query plan

From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: Chris Smith <cdsmith(at)twu(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Interpreting query plan
Date: 2004-07-06 19:41:28
Message-ID: 20040706122813.M19308@megazone.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, 6 Jul 2004, Chris Smith wrote:

> Stephan Szabo wrote:
>
> > I wonder if in practice this query uses distinct to get around a
> > problem with subqueries.
>
> Yes, it does exactly that. The database was recently moved over to PostgreSQL
> 7.4. The thought of switching over to IN-subqueries is a bit scary, since the
> query generator is a really over-generalized mess of about 7000 LOC in 17
> different source files; but I'll certainly look at that.

It might be worth just converting some by hand to see what explain analyze
says about them in contrast to the original.

> > These steps are for the distinct. It's not alot of actual time, but
> > if the row set returned was large enough to exceed sort_mem the sort
> > might start going off to disk and be slower.
>
> Indeed, that looks like it could be a problem for some of our larger customers
> who have up to tens of thousands of users. The IN form would avoid this sort?

The IN form would potentially use a generally different plan. It still
uses sort_mem for some of its determinations, but I *think* you'd get more
in than you would be for the sort. If you have the RAM and are doing
queries like this alot, you might want to raise sort_mem if you haven't
changed it from the default.

> > The row estimate is pretty reasonable, estimated 629 versus actual
> > 753. How many rows are in useraccount? I'm wondering if 629 is a
> > reasonable percentage of the rows to see if seq scan is reasonable
> > here.
>
> Total? On this server, it's currently 2566.

Okay, so 629 doing a seq scan is pretty reasonable if the table doesn't
have alot of empty space.

> > Here the estimate isn't so good, estimated 115 vs actual 1328. You
> > might want to consider raising the groupid column's statistics target
> > and re-analyzing to see if you can get a better estimate.
>
> Alright. So how exactly do I raise the groupid column's statistics target?

Oops, I mentioned it in a previous version of that paragraph and
apparently removed it upon editing.

ALTER TABLE <table> ALTER COLUMN <column> SET STATISTICS <integer>.
Maybe try 100 to see what it gives you.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bruno Wolff III 2004-07-06 20:39:05 Re: backup using cron
Previous Message Chris Browne 2004-07-06 19:27:54 Re: postgresql +AMD64 +big address spaces - does it work?