Any ideas how can I speed up this query?

From: Priyank Tiwari <priyankgt(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Any ideas how can I speed up this query?
Date: 2015-07-28 07:52:16
Message-ID: CADbPz3G+rUaf3gDm5QQ97=38E2HNd619ZJ=9v61uY24p3P0xUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I have following table definition with 6209888 rows in it. It stores the
occurrences of species in various regions.

*TABLE DEFINITION*

Column | Type | Modifiers

--------------+------------------------+----------------------------------------------------------

id | integer | not null default
nextval('occurrences_id_seq'::regclass)

gbifid | integer | not null

sname | character varying(512) |

cname | character varying(512) |

species | character varying(512) |

location | geometry | not null

month | integer |

year | integer |

event_date | date |

dataset_key | character varying(512) |

taxon_key | character varying(512) |

taxon_rank | character varying(512) |

record_basis | character varying(512) |

category_id | integer |

country | character varying(512) |

lat | double precision |

lng | double precision |

Indexes:

"occurrences_pkey" PRIMARY KEY, btree (id)

"unique_occurrences_gbifid" UNIQUE, btree (gbifid)

"index_occurences_taxon_key" btree (taxon_key)

"index_occurrences_category_id" btree (category_id)

"index_occurrences_cname" btree (cname)

"index_occurrences_country" btree (country)

"index_occurrences_lat" btree (lat)

"index_occurrences_lng" btree (lng)

"index_occurrences_month" btree (month)

"index_occurrences_sname" btree (sname)

"occurrence_location_gix" gist (location)

I am trying to fetch the count of number of occurrences within a certain
region. I save the location of each occurrence as a geometric field as well
as lat, lng combination. Both fields are indexed. The query that is issued
is as follows.

*QUERY*

SELECT COUNT(*) FROM "occurrences" WHERE ("lat" >= -27.91550355958 AND
"lat" <= -27.015680440420002 AND "lng" >= 152.13307044728307 AND "lng" <=
153.03137355271693 AND "category_id" = 1 AND (ST_Intersects(
ST_Buffer(ST_PointFromText('POINT(152.582222 -27.465592)')::geography,
50000)::geography, location::geography)));

The problem is it takes more than acceptable time to execute the query.
Below is the explain analyze output for the same query.

*EXPLAIN ANALYZE QUERY OUTPUT (**http://explain.depesz.com/s/p2a
<http://explain.depesz.com/s/p2a>)*

Aggregate (cost=127736.06..127736.07 rows=1 width=0) (actual
time=13491.678..13491.679 rows=1 loops=1)

Buffers: shared hit=3 read=56025

-> Bitmap Heap Scan on occurrences (cost=28249.46..127731.08 rows=1995
width=0) (actual time=528.053..13388.458 rows=167511 loops=1)

Recheck Cond: ((lat >= (-27.91550355958)::double precision) AND
(lat <= (-27.01568044042)::double precision) AND (lng >=
152.133070447283::double precision) AND (lng <= 153.031373552717::double
precision))

Rows Removed by Index Recheck: 748669

Filter: ((category_id = 1) AND
('0103000020E6100000010000002100000090D8AD28D32263403905504558773BC0CADDAF0384226340E7AD43F4E38D3BC0B559D93A98216340B7BE554692A33BC0C904C18C18206340DF8EA9338DB73BC052F75181131E6340A1D9E30E0FC93BC00BDCB5E39C1B6340A1A40E496AD73BC074D30D03CD1863405DD7BF5110E23BC05DD3A2C0BF156340439B784797E83BC078E9287593126340EF9E5C37BEEA3BC072EB40B9670F63409E25A1BA6FE83BC06964481F5C0C6340B331B5D2C2E13BC08F6785ED8E0963409979FBF9F9D63BC0135DB3E71B0763402F78807480C83BC0E321E8351B0563405E96CB00E6B63BC0672CD874A00363403BC84B1BD9A23BC018FAE8F8B90263400314CD15208D3BC039DE8C4A70026340653B324F91763BC0F5BA5CDFC502634086322DDE0A603BC0E90E8A10B7036340C5FA1C046A4A3BC01ECCC14C3A056340CE4011BC82363BC022F38481400763407655DBB517253BC05B3B5AB6B50963404C9AA306D3163BC079EF01D3810C634010A732D03F0C3BC05152F188890F634044CE5D16C5053BC0EDDABA57AF126340926E14EFA1033BC08D7E9CA3D41563401EFB9E2DEB053BC071A929D5DA186340A3F1F29C8A0C3BC049A7E478A41B63404F8BD2CF3F173BC04B409855161E634057CC1080A2253BC0F9C65E70182063404BEB146926373BC0349D83F596216340449EEA7C204B3BC07803D7FD82226340A16F1747CD603BC090D8AD28D32263403905504558773BC0'::geography
&& (location)::geography) AND
(_st_distance('0103000020E6100000010000002100000090D8AD28D32263403905504558773BC0CADDAF0384226340E7AD43F4E38D3BC0B559D93A98216340B7BE554692A33BC0C904C18C18206340DF8EA9338DB73BC052F75181131E6340A1D9E30E0FC93BC00BDCB5E39C1B6340A1A40E496AD73BC074D30D03CD1863405DD7BF5110E23BC05DD3A2C0BF156340439B784797E83BC078E9287593126340EF9E5C37BEEA3BC072EB40B9670F63409E25A1BA6FE83BC06964481F5C0C6340B331B5D2C2E13BC08F6785ED8E0963409979FBF9F9D63BC0135DB3E71B0763402F78807480C83BC0E321E8351B0563405E96CB00E6B63BC0672CD874A00363403BC84B1BD9A23BC018FAE8F8B90263400314CD15208D3BC039DE8C4A70026340653B324F91763BC0F5BA5CDFC502634086322DDE0A603BC0E90E8A10B7036340C5FA1C046A4A3BC01ECCC14C3A056340CE4011BC82363BC022F38481400763407655DBB517253BC05B3B5AB6B50963404C9AA306D3163BC079EF01D3810C634010A732D03F0C3BC05152F188890F634044CE5D16C5053BC0EDDABA57AF126340926E14EFA1033BC08D7E9CA3D41563401EFB9E2DEB053BC071A929D5DA186340A3F1F29C8A0C3BC049A7E478A41B63404F8BD2CF3F173BC04B409855161E634057CC1080A2253BC0F9C65E70182063404BEB146926373BC0349D83F596216340449EEA7C204B3BC07803D7FD82226340A16F1747CD603BC090D8AD28D32263403905504558773BC0'::geography,
(location)::geography, 0::double precision, false) < 1e-05::double
precision))

Rows Removed by Filter: 6357

Heap Blocks: exact=29947 lossy=22601

Buffers: shared hit=3 read=56025

-> BitmapAnd (cost=28249.46..28249.46 rows=32476 width=0)
(actual time=519.091..519.091 rows=0 loops=1)

Buffers: shared read=3477

-> Bitmap Index Scan on index_occurrences_lat
(cost=0.00..11691.20 rows=365877 width=0) (actual time=218.999..218.999
rows=392415 loops=1)

Index Cond: ((lat >= (-27.91550355958)::double
precision) AND (lat <= (-27.01568044042)::double precision))

Buffers: shared read=1444

-> Bitmap Index Scan on index_occurrences_lng
(cost=0.00..16557.01 rows=517658 width=0) (actual time=285.211..285.211
rows=550523 loops=1)

Index Cond: ((lng >= 152.133070447283::double
precision) AND (lng <= 153.031373552717::double precision))

Buffers: shared read=2033

Planning time: 2.812 ms

Execution time: 13493.617 ms

(19 rows)

It seems that the planner is underestimating the number of rows
returned in Bitmap
Heap Scan on occurrences. I have run vacuum analyze on this table couple of
times, but it still produces the same result. Any idea how I can speed up
this query? How I can assist planner in providing better row estimates for
Bitmap Heap Scan section?

*POSTGRESQL VERSION INFO*

version

------------------------------------------------------------------------------------------------------

PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu
4.8.2-19ubuntu1) 4.8.2, 64-bit

*HARDWARE*

I am running the Postgresql instance on a digital ocean vm with 1 core, SSD
disk and 1 GB of ram.

Appreciate your help.

Thanks,
Priyank

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Graeme B. Bell 2015-07-28 08:19:00 Re: Any ideas how can I speed up this query?
Previous Message Jeff Janes 2015-07-26 00:43:06 Re: Are many idle connections bad?