Re: slow query on tables with new columns added.

From: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
To: "M(dot) D(dot)" <lists(at)turnkey(dot)bz>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: slow query on tables with new columns added.
Date: 2011-09-26 23:45:59
Message-ID: CAP_rwwmUwW6a6CD+5AQXaOTo5==US5gN+Nz5vOhYDesSLqOkwA@mail.gmail.com
Lists: pgsql-performance

2011/9/26 M. D. <lists(at)turnkey(dot)bz>

> I have full access to the database, but no access to the application
> source code. If creating an index will help, I can do that, but with the
> columns I don't see it helping as I don't have access to the application
> source to change that.
>
> So yes, by changing settings, I would like to know if there's any memory
> settings I can change to help or create an index. There is an index on the
> customer_id column in the gltx table, so I'm not sure what else could be
> done.
>
> If there was a way to create a select trigger, I would do it and return 0
> for both columns on that customer_id as it should always be 0.
>
>
>
Hi

I didn't respond earlier because I honestly don't see any easy way of
speeding up this query.

The memory settings seem fine for this size of data.

It does not look like indexes alone will change much. You can certainly add
a specially crafted partial index on gltx.customer_id WHERE
(gltx.inactive_on IS NULL) AND (gltx.posted = 'Y'), but that is likely to
gain you a few percent at most.

The problem here might be the type of the join columns - they appear to be
about 10 characters wide, which is not an ideal choice (that's one of the
reasons why I'm a fan of artificial integer primary keys).

You _could_ try running the query with enable_mergejoin = off and see what
happens.
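A quick way to try that, entirely within your own session (it does not
affect other connections):

```sql
SET enable_mergejoin = off;
EXPLAIN ANALYZE <your query here>;   -- compare the plan and timing
RESET enable_mergejoin;
```

If the plan improves dramatically, that points at a planner costing issue
rather than a missing index.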

You can check if the problem persists after dumping and reloading to another
db.

If app modification were possible, you could materialize the data _before_
it is queried - using a summary table and appropriate triggers to keep it
up to date.
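A minimal sketch of that approach - table, column, and function names here
are all hypothetical, and the real aggregate depends on what the
application actually reads (on pre-9.5 servers you would replace the
ON CONFLICT upsert with an UPDATE-then-INSERT):

```sql
-- Hypothetical per-customer summary table.
CREATE TABLE customer_totals (
    customer_id  text PRIMARY KEY,
    amount_total numeric NOT NULL DEFAULT 0
);

-- Keep the summary current as rows arrive in gltx.
CREATE FUNCTION gltx_totals_trg() RETURNS trigger AS $$
BEGIN
    INSERT INTO customer_totals (customer_id, amount_total)
    VALUES (NEW.customer_id, NEW.amount)
    ON CONFLICT (customer_id)
    DO UPDATE SET amount_total = customer_totals.amount_total + NEW.amount;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER gltx_totals_ai
    AFTER INSERT ON gltx
    FOR EACH ROW EXECUTE PROCEDURE gltx_totals_trg();
```

Reads then hit the tiny summary table instead of scanning gltx; a full
version would also handle UPDATE and DELETE.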

Regarding your last comment - that for that customer_id the values should
always be 0 - if it's a persistent business rule, I would try to create a
CHECK constraint to reflect it. With some luck and fiddling,
constraint_exclusion might help speed up your query.
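Something along these lines - the constraint name, the 'SPECIAL' customer
id, and the amount column are placeholders for your actual rule:

```sql
-- Encode "this customer's amounts are always zero" as a CHECK.
ALTER TABLE gltx
    ADD CONSTRAINT gltx_special_customer_zero
    CHECK (customer_id <> 'SPECIAL' OR amount = 0);

-- The planner only considers CHECKs for pruning when this is enabled:
SET constraint_exclusion = on;
```

If a query's WHERE clause provably contradicts the CHECK, the planner can
skip scanning the table entirely.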

Also, if there is something special about customer_id distribution - table
partitioning might be an option.
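In current releases that means inheritance-based partitioning; a bare-bones
sketch, again with a placeholder 'SPECIAL' customer id:

```sql
-- Child tables carry CHECKs describing which rows they hold.
CREATE TABLE gltx_special (
    CHECK (customer_id = 'SPECIAL')
) INHERITS (gltx);

CREATE TABLE gltx_regular (
    CHECK (customer_id <> 'SPECIAL')
) INHERITS (gltx);

-- With constraint_exclusion on, a query filtering on customer_id
-- scans only the matching child table.
```

You would also need rules or triggers to route inserts to the right child,
so this only pays off if the distribution really is that skewed.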

Ok, that's a long list - hope this helps, and good luck.

After all, you can always throw more hardware at the problem - or hire a Pg
magician :-)
