Skip site navigation (1) Skip section navigation (2)

Re: Why so slow?

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: Alessandro Gagliardi <alessandro(at)path(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why so slow?
Date: 2012-02-17 19:21:47
Message-ID: 4F3EA8CB.5010304@pinpointresearch.com (view raw or flat)
Thread:
Lists: pgsql-performance
On 02/17/2012 10:34 AM, Alessandro Gagliardi wrote:
> Comparing
> SELECT DISTINCT(user_id) FROM blocks JOIN seen_its USING (user_id) 
> WHERE seen_its.created BETWEEN (now()::date - interval '8 
> days')::timestamp AND now()::date::timestamp
> to
> SELECT DISTINCT(user_id) FROM seen_its WHERE created BETWEEN 
> (now()::date - interval '8 days')::timestamp AND now()::date::timestamp
> the difference is 100x.
> ...
Though I could figure it out, it would be helpful to actually specify 
which query is faster and to post the explain of *both* queries.

But in general, it is not terribly unusual to find that rewriting a 
query can lead the planner to generate a superior plan. Trying and 
testing different ways of writing a query is a standard tuning technique.

There are also version-specific issues with some versions of PostgreSQL 
preferring ...where foo in (select... and others preferring ...where 
exists (select...

If you are planning to ramp up to high volumes it is also *very* 
important to test and tune using the size of database you plan to have 
on the hardware you will use in production. You cannot extrapolate from 
a dev database on an i486 (?!?) machine to a production server with more 
spindles, different RAID setup, different CPU, more cores, vastly more 
memory, etc.

In the case of your queries, the second one eliminates a join and gives 
the planner an easy way to optimize using the available indexes so I'm 
not surprised it's faster.

Note: I am guessing that your seen_its table just grows and grows but is 
rarely, if ever, modified. If it is basically a log-type table it will 
be a prime candidate for partitioning on date and queries like this will 
only need to access a couple relatively small child tables instead of 
one massive one.

Cheers,
Steve


In response to

  • Why so slow? at 2012-02-17 18:34:50 from Alessandro Gagliardi

Responses

pgsql-performance by date

Next:From: Alessandro GagliardiDate: 2012-02-17 22:20:29
Subject: Re: Why so slow?
Previous:From: Alessandro GagliardiDate: 2012-02-17 18:34:50
Subject: Why so slow?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group