Re: How can I make this query faster (resend)

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Cstdenis <cstdenis(at)voicio(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: How can I make this query faster (resend)
Date: 2006-05-22 15:20:18
Message-ID: 20060522152016.GM64371@pervasive.com
Lists: pgsql-performance

On Fri, May 19, 2006 at 03:56:49PM -0700, Cstdenis wrote:
> (It's been an hour and I don't see my message on the list, so I'm sending it again. I've moved the queries and the EXPLAIN ANALYZE output out of the email in case it was rejected for being too long.)
>
> query: http://pastebin.ca/57218
>
> In the pictures table, all the rating columns share one composite index:
>
> CREATE INDEX idx_rating ON pictures USING btree (rating_nudity, rating_violence, rating_sex, rating_racism, rating_spoilers, rating_yaoi, rating_yuri, rating_profanity);
>
> The approved, date_submitted, and user_id columns also have their own btree indexes.
>
> In the picture_categories table, pid and cat_id each have their own btree index, plus a combined index on both.
>
> Full table definition: http://pastebin.ca/57219
>
> The cat_id and rating values vary from query to query. The one listed above took 54 seconds in a test run just now. Here is the EXPLAIN ANALYZE output: http://pastebin.ca/57220

pictures is the interesting table here. It looks like the planner would
do better to choose something other than a nested loop on it. Try
running EXPLAIN ANALYZE on the query with enable_nestloop=off and see
what you get (you'll need to compare it to what you get with
enable_nestloop on to see what the change is).
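
Something like this in psql will do it; both settings are session-local, so
nothing else is affected (substitute the real query from your pastebin for
the placeholder):

    SET enable_nestloop = off;
    EXPLAIN ANALYZE <your query from http://pastebin.ca/57218>;
    RESET enable_nestloop;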

> Both pictures and picture_categories have about 287,000 rows.
>
> This query needs to run in under about a second, or it kills my site by clogging Apache slots (Apache maxes out at 256, and I can have several hundred people on my site at a time). How can I make it run faster?
>
>
> The server is a dual Xeon with a gig of RAM, dedicated mostly to PostgreSQL.
> Here are the changed lines in my postgresql.conf: http://pastebin.ca/57222

I suspect the low work_mem may be why it's using a nested loop. In
addition to the test above, it would be interesting to see what happens
to the plan if you set work_mem to 10000.
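
work_mem is also session-local, so you can test it from psql without touching
postgresql.conf (the integer value is in kB, so 10000 is roughly 10MB per sort
or hash step):

    SET work_mem = 10000;
    EXPLAIN ANALYZE <the same query>;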

To be honest, you're pushing things by expecting a machine with only 1G of
RAM to serve 300 active connections. How large is the database itself?
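
If you're on 8.1, something like this should show the on-disk size of the
current database (untested here, but both functions are in 8.1):

    SELECT pg_size_pretty(pg_database_size(current_database()));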

> I know hyperthreading is considered something that can slow down a server, but with my very high concurrency (averaging about 400-500 concurrent users during peak hours) I am hoping the extra virtual CPUs will help. Anyone have experience that says different at high concurrency?

Best bet is to try it and see. Generally, people find HT hurts, but I
recently saw it double the performance of pgbench on a Windows XP
machine, so it's possible that Windows is just more clever about how to
use it than Linux is.
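
A rough way to compare, assuming the contrib pgbench is installed (the scale
and client counts are just examples):

    createdb bench
    pgbench -i -s 10 bench        # build a scale-10 test set
    pgbench -c 50 -t 1000 bench   # 50 clients, 1000 transactions each

Run it once with HT enabled and once with it disabled in the BIOS and compare
the tps numbers.
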
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
