Re: LIMIT causes huge slow down

From: Grégory Giannoni <greg(at)wmaker(dot)net>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: LIMIT causes huge slow down
Date: 2014-03-13 17:05:38
Message-ID: 3E89FAAB-C4EB-45F0-B7C9-64DA24A73899@wmaker.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


Le 12 mars 2014 à 17:34, David Johnston a écrit :

> Harry Rossignol wrote
>> hello
>>
>> i am just a lowly application developer, but i always include my 'where'
>> fields in my order by specification.
>> I.E. ORDER BY id_webzine,id_flickr,id_picasa,id_photo rather then just
>> id_photo
> [...]
> WRT to question posed: probably the easiest workaround is to move the
> unlimited query to a WITH clause and then apply the limit separately.
>
> You should indicate what version of PostgreSQL you are using.
>
> Sorry I'm not much help on the how and why of the actual plan choices here.
> The sorting is constant but since the limit and the where clause target
> different tables a full evaluation is needed to determine a solution so
> picking individual rows, which is what I see happening, doesn't by you
> anything.
>
> But, for all I know your using an old version and this undesirable behavior
> has already been found and fixed.

Hi,

thank you for your answer. I'm using PostgreSQL 9.3.3.

Anyway, I tried Harry's idea.. and the result is for me very surprising : it worked.

=> explain analyze SELECT id_photo, id_album, galerie_photo.etat, galerie_photo.titre, galerie_photo.dimension FROM galerie_photo JOIN galerie_album USING(id_album) WHERE galerie_album.id_webzine='18' AND id_flickr=0 AND id_picasa=0 ORDER BY id_photo DESC, id_album DESC LIMIT 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=20388.78..20388.80 rows=10 width=37) (actual time=31.402..31.406 rows=10 loops=1)
-> Sort (cost=20388.78..20401.30 rows=5010 width=37) (actual time=31.400..31.401 rows=10 loops=1)
Sort Key: galerie_photo.id_photo, galerie_photo.id_album
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.84..20280.51 rows=5010 width=37) (actual time=0.087..27.920 rows=8247 loops=1)
-> Index Scan using galerie_album_picasa_flickr_idw_pos_idx on galerie_album (cost=0.42..251.20 rows=225 width=4) (actual time=0.032..0.529 rows=256 loops=1)
Index Cond: ((id_picasa = 0) AND (id_flickr = 0) AND (id_webzine = 18))
-> Index Scan using galerie_photo_id_album_idx on galerie_photo (cost=0.43..88.45 rows=57 width=37) (actual time=0.019..0.092 rows=32 loops=256)
Index Cond: (id_album = galerie_album.id_album)
Total runtime: 31.465 ms
(10 rows)

Adding a second order clause make the query planner choose the same plan that without the LIMIT, and perfs are OK.

It is against my thoughts that adding sort element necessary slow down the process... but it work (for this particular case).

Best regards,

Grégory Giannoni.

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Fujii Masao 2014-03-13 18:59:45 Re: BUG #9118: WAL Sender does not disconnect replication clients during shutdown
Previous Message David Johnston 2014-03-13 16:16:45 Re: BUG #9555: pg_dump for tables with inheritance recreates the table with the wrong order of columns