On 02/17/2012 10:34 AM, Alessandro Gagliardi wrote:
> SELECT DISTINCT(user_id) FROM blocks JOIN seen_its USING (user_id)
> WHERE seen_its.created BETWEEN (now()::date - interval '8
> days')::timestamp AND now()::date::timestamp
> SELECT DISTINCT(user_id) FROM seen_its WHERE created BETWEEN
> (now()::date - interval '8 days')::timestamp AND now()::date::timestamp
> the difference is 100x.
Though I could figure it out, it would be helpful to actually specify
which query is faster and to post the explain of *both* queries.
But in general, it is not terribly unusual to find that rewriting a
query can lead the planner to generate a superior plan. Trying and
testing different ways of writing a query is a standard tuning technique.
There are also version-specific issues with some versions of PostgreSQL
preferring ...where foo in (select... and others preferring ...where
If you are planning to ramp up to high volumes it is also *very*
important to test and tune using the size of database you plan to have
on the hardware you will use in production. You cannot extrapolate from
a dev database on an i486 (?!?) machine to a production server with more
spindles, different RAID setup, different CPU, more cores, vastly more
In the case of your queries, the second one eliminates a join and gives
the planner an easy way to optimize using the available indexes so I'm
not surprised it's faster.
Note: I am guessing that your seen_its table just grows and grows but is
rarely, if ever, modified. If it is basically a log-type table it will
be a prime candidate for partitioning on date and queries like this will
only need to access a couple relatively small child tables instead of
one massive one.
In response to
- Why so slow? at 2012-02-17 18:34:50 from Alessandro Gagliardi
pgsql-performance by date
|Next:||From: Alessandro Gagliardi||Date: 2012-02-17 22:20:29|
|Subject: Re: Why so slow?|
|Previous:||From: Alessandro Gagliardi||Date: 2012-02-17 18:34:50|
|Subject: Why so slow?|