Re: Query optimization using order by and limit

From: Michael Viscuso <michael(dot)viscuso(at)getcarbonblack(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Query optimization using order by and limit
Date: 2011-09-22 22:55:47
Message-ID: 4E7BBCF3.4080502@getcarbonblack.com
Lists: pgsql-performance


Stephen,

I spent the better part of the day implementing an application-layer
nested loop and it seems to be working well. Of course it's a little
slower than a Postgres-only solution because it has to pass data back
and forth for each daily table query until it reaches the limit, but at
least I don't have "runaway" queries like I was seeing before. That
should be a pretty good stopgap for the time being.
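
For reference, each pass of the loop issues roughly this query, with
the LIMIT shrunk by however many rows have been collected so far (the
table and column names below are illustrative, not my actual schema):

    -- One daily child table per iteration, newest first; stop as soon
    -- as the running total of returned rows reaches the original limit.
    SELECT *
    FROM events_20110922
    WHERE hosts_guid = '2007075705813916178'
    ORDER BY event_time DESC
    LIMIT 10;  -- remaining = original limit - rows collected so far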

I was really hoping there was a Postgres-exclusive answer though! :) If
there are any other suggestions, it's a simple flag in my application to
query the other way again...

Thanks for all your help - and I'm still looking to change those
numerics to bigints, just haven't figured out the best way yet.
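
If it turns out to be as simple as a straight column rewrite, something
like the following (against a hypothetical daily table, since my real
schema isn't shown here) is probably where I'll start, with the caveat
that it rewrites the whole table under an exclusive lock:

    -- numeric -> bigint conversion, one child table at a time
    ALTER TABLE events_20110922
        ALTER COLUMN hosts_guid TYPE bigint
        USING hosts_guid::bigint;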

Mike

On 9/22/2011 10:53 AM, Stephen Frost wrote:
> * Michael Viscuso (michael(dot)viscuso(at)getcarbonblack(dot)com) wrote:
>> Adding the final condition hosts_guid = '2007075705813916178' is what
>> ultimately kills it: http://explain.depesz.com/s/8zy. By adding the
>> host_guid, it spends considerably more time in the older tables than
>> without this condition and I'm not sure why.
>
> What I think is happening here is that PG is pushing down that filter
> (not typically a bad thing..), but with that condition, it's going to
> scan the index until it finds a match for that filter before returning
> back up, only to have that result cut out due to the limit. Having
> those columns as numerics isn't helping here, but the bigger issue is
> having to check all those tuples for a match to the filter.
>
> Mike, the filter has to be applied before the order by/limit, since
> those clauses operate on the filtered result (you wouldn't want a
> 'where x = 2 limit 10' to return early just because it scanned 10
> records where x didn't equal 2).
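
To make that concrete with a generic example (names here are
illustrative only):

    -- LIMIT counts only rows that survive the WHERE clause: this has to
    -- keep scanning until it finds 10 rows with x = 2, however many
    -- non-matching rows it reads along the way.
    SELECT * FROM t WHERE x = 2 LIMIT 10;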
>
> What would be great is if PG would realize that the CHECK constraints
> prevent earlier records from being in these earlier tables, so it
> shouldn't need to consider them at all once the records from the
> 'latest' table have been found and the limit reached (reverse all this
> for an 'ascending' query, of course), which we can do when there's no
> order by. I don't believe we have that kind of logic or that
> information available at this late stage - the CHECK constraints are
> used to eliminate the impossible-to-match tables, but that's it.
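
For context, constraint exclusion works from per-child CHECK
constraints along these lines (hypothetical DDL, assuming the usual
inheritance-based daily partitioning):

    -- The planner can prove from the CHECK constraint that this child
    -- cannot hold rows outside its one-day range, and skip it entirely
    -- when the query's date predicate cannot overlap that range.
    CREATE TABLE events_20110922 (
        CHECK (event_time >= '2011-09-22' AND event_time < '2011-09-23')
    ) INHERITS (events);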
>
> One option, which isn't great of course, would be to implement your own
> 'nested loop' construct (something I typically despise..) in the
> application, which just walks backwards from the latest table, pulls
> whatever records it can from each day, and stops once it hits the
> limit.
>
> Thanks,
>
> Stephen

