Re: Slow query: select * order by XXX desc offset 10 limit 10

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: Alexander Farber <alexander(dot)farber(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Slow query: select * order by XXX desc offset 10 limit 10
Date: 2011-10-14 11:34:05
Message-ID: BF898984-6BCA-4652-BCA3-019CBAB922C5@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 14 Oct 2011, at 11:14, Alexander Farber wrote:

> I've added 3 new indices on both tables:
>
>
> quincy=> \d quincynoreset
> Table "public.quincynoreset"
> Column | Type | Modifiers
> -------------+-----------------------------+---------------
> appsversion | character varying(30) |
> beta_prog | character varying(20) |
> category | character varying(120) |
> catinfo | character varying(120) |
> details | character varying(50) |
> devinfo | character varying(4000) |
> emailid | character varying(16) |
> email | character varying(320) |
> formfactor | character varying(10) |
> id | character varying(20) | not null
> imei | character varying(25) |
> name | character varying(20) |
> osversion | character varying(30) |
> pin | character varying(12) |
> qdatetime | timestamp without time zone |
> copied | timestamp without time zone | default now()
> Indexes:
> "quincynoreset_pkey" PRIMARY KEY, btree (id)
> "quincynoreset_appsversion_index" btree (appsversion)
> "quincynoreset_osversion_index" btree (osversion)
> "quincynoreset_qdatetime_index" btree (qdatetime)

(...)

> quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as
> QDATETIME_2,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO
> from quincyview where qdatetime <= now() order by QDATETIME desc
> offset 10 limit 10;
>
>
> QUERY PLAN
>
>
> ----------------------------------------------------------------------------------------------------------------------
> ----------------------------------------------------------------------------------------------------------------------
> ----------------------------------------------------------------------------------------------------------------------
> ----------------------------------------------
> Limit (cost=558551.88..558551.91 rows=10 width=1172)
> -> Sort (cost=558551.86..560883.79 rows=932773 width=1172)
> Sort Key: quincyview.qdatetime
> -> Subquery Scan quincyview (cost=482428.59..533731.10
> rows=932773 width=1172)
> -> Unique (cost=482428.59..522071.44 rows=932773 width=252)
> -> Sort (cost=482428.59..484760.52 rows=932773 width=252)
> Sort Key: quincynoreset.qdatetime,
> quincynoreset.id, quincynoreset.name, quincynoreset.cate
> gory, quincynoreset.appsversion, quincynoreset.osversion,
> quincynoreset.beta_prog, quincynoreset.catinfo, quincynorese
> t.details, quincynoreset.devinfo, quincynoreset.email,
> quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, q
> uincynoreset.formfactor, quincynoreset.copied
> -> Append (cost=0.00..55177.71
> rows=932773 width=252)
> -> Seq Scan on quincynoreset
> (cost=0.00..39171.89 rows=807446 width=242)
> Filter: (qdatetime <= now())
> -> Seq Scan on quincytrack
> (cost=0.00..6678.09 rows=125327 width=315)
> Filter: (qdatetime <= now())
> (12 rows)

An explain analyse would have been a bit more informative.

Anyway, I think you get the sequential scans because the UNION requires to sort all the data from both tables to guarantee that the results are unique (hence that long Sort Key at the 7th line of explain output).
For that reason, an index on qdatetime alone won't help much, especially when most of your data has qdatetime <= now(), which is probably the case.

It doesn't matter that you only want 10 results from that set, the database will first have to figure out which those rows are. That gets more complicated because they can come from two different tables, due to the UNION.

Do you really need unique results from that view, or are duplicates acceptable (one from each table)? In that case, try UNION ALL instead of UNION.

If you do need unique results, then you could create an index on the combination of all those fields. That should take out the need for those sequential scans.

Alban Hertroys

--
The scale of a problem often equals the size of an ego.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alban Hertroys 2011-10-14 11:42:13 Re: dll files missing in postgrsql bin folder in Windows
Previous Message Thom Brown 2011-10-14 11:12:53 VACUUM touching file but not updating relation