Re: Slow query when pg_trgm is in inner lopp

From: Sasa Vilic <sasavilic(at)gmail(dot)com>
To: Matthew Hall <mhall(at)mhcomputing(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow query when pg_trgm is in inner lopp
Date: 2018-06-20 14:13:18
Message-ID: CAOJhpYdVxeZNxojtpeeR6a4YR67E-1OKsPgDT+Us-4R338WXWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Matthew,

thank you for query response.

There is no particular reason for using GIST instead of GIN. We only
recently discovered pg_trgm so we are new to this. What I read is that GIN
can be faster then GIST but it really depends on query and on amount of
data. Nevertheless, both index are by magnitude order slower then btree
index, right? I tried simple query on our production server (explain
analyze select * from navdata.point where identifier like 'LOWW') where I
am 100% sure there is SSD and we have random_page_cost = 1, and query
itself takes 43 ms. That is not much of the difference compared to test
server.

What interest me, is whenever PG is aware of different costs for different
index types. Given that there is also index on guid which is used on
relationship, in our case it is always better to use that index and filter,
then to use both indexes and BitmapAnd.

Regarding test server, I believe that it is a SSD, but I will get
confirmation for this. I tried changing random_page_cost on test server
from 2.0 to 1.0 (that should be right value for SSD, right?) and also to
4.0 and I get same results.

Regards
Sasa Vilic

On 20 June 2018 at 15:29, Matthew Hall <mhall(at)mhcomputing(dot)net> wrote:

> Is there a reason you used GIST on your pg_trgm indices and not GIN? In my
> tests and previous posts on here, it nearly always performs worse. Also,
> did you make sure if it's really SSD and set the random_page_cost
> accordingly?
>
> Matthew Hall
>
> On Jun 20, 2018, at 8:21 AM, Sasa Vilic <sasavilic(at)gmail(dot)com> wrote:
>
> Hi everyone,
>
> we have a new query that performs badly with specific input parameters. We
> get worst performance when input data is most restrictive. I have partially
> identified a problem: it always happens when index scan is done in inner
> loop
> and index type is pg_trgm. We also noticed that for simple query
> (
> select * from point where identifier = 'LOWW' vs
> select * from point where identifier LIKE 'LOWW'
> )
> the difference between btree index and pg_trgm index can be quite high:
> 0.009 ms vs 32.0 ms.
>
> What I would like to know is whenever query planner is aware that some
> index
> types are more expensive the the others and whenever it can take that into
> account?
>
> I will describe background first, then give you query and its analysis for
> different parameters and in the end I will write about all required
> information
> regarding setup (Postgres version, Schema, metadata, hardware, etc.)
>
> I would like to know whenever this is a bug in query planner or not and
> what
> could we do about it.
>
> ############################################################
> ####################
> # Background
> ############################################################
> ####################
>
> We have a database with navigational data for civil aviation.
> Current query is working on two tables: point and route.
> Point represents a navigational point on Earth and route describes a route
> between two points.
>
> Query that we have finds all routes between two set of points. A set is a
> dynamically/loosely defined by pattern given by the user input. So for
> example
> if user wants to find all routes between international airports in Austria
> toward London Heathrow, he or she would use 'LOW%' as
> :from_point_identifier
> and 'EGLL' as :to_point_identifier. Please keep in mind that is a simple
> case,
> and that user is allowed to define search term any way he/she see it fit,
> i.e. '%OW%', 'EG%'.
>
> SELECT
> r.*
> FROM navdata.route r
> INNER JOIN navdata.point op ON r.frompointguid = op.guid
> INNER JOIN navdata.point dp ON r.topointguid = dp.guid
> WHERE
> r.routeidentifier ILIKE :route_identifier
> AND tsrange(r.startvalid, r.endvalid) @> :validity :: TIMESTAMP
> AND (NOT :use_sources :: BOOLEAN OR r.source = ANY (:sources :: VARCHAR
> []))
> AND CONCAT(op.identifier, '') ILIKE :from_point_identifier
> AND op.type = ANY (:point_types :: VARCHAR [])
> AND tsrange(op.startvalid, op.endvalid) @> :validity :: TIMESTAMP
> AND dp.identifier ILIKE :to_point_identifier :: VARCHAR
> AND dp.type = ANY (:point_types :: VARCHAR [])
> AND tsrange(dp.startvalid, dp.endvalid) @> :validity :: TIMESTAMP
> ORDER BY r.routeidentifier
> LIMIT 1000
>
>
> Most of the tables we have follows this layout principle:
> * uid - is primary key
> * guid - is globally unique key (i.e. London Heathrow could for example
> change it identifier EGLL, but our internal guid will stay same)
> * startvalid, endvalid - defines for which period is entry valid. Entires
> with
> same guid should not have overlapping validity.
>
> We don't use foreign keys for two reasons:
> * We need to do live migration without downtime. Creating a foreign key on
> huge dataset could take quite some time
> * Relationship between entities are defined based on guid and not on uid
> (primary key).
>
> ############################################################
> ####################
> # Query analysis
> ############################################################
> ####################
>
> ------------------------------------------------------------
> --------------------
> # Case 1 : We search for all outgoing routes from Vienna International
> Airport
> ------------------------------------------------------------
> --------------------
>
> EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
> SELECT
> r.*
> FROM navdata.route r
> INNER JOIN navdata.point op ON r.frompointguid = op.guid
> INNER JOIN navdata.point dp ON r.topointguid = dp.guid
> WHERE
> r.routeidentifier ILIKE '%'
> AND tsrange(r.startvalid, r.endvalid) @> now() :: TIMESTAMP
> AND (NOT false :: BOOLEAN OR r.source = ANY (ARRAY[] :: VARCHAR []))
> AND op.identifier ILIKE '%LOWW%'
> AND op.type = ANY (ARRAY['PA'] :: VARCHAR [])
> AND tsrange(op.startvalid, op.endvalid) @> now() :: TIMESTAMP
> AND dp.identifier ILIKE '%' :: VARCHAR
> AND dp.type = ANY (ARRAY['PA'] :: VARCHAR [])
> AND tsrange(dp.startvalid, dp.endvalid) @> now() :: TIMESTAMP
> ORDER BY r.routeidentifier
> LIMIT 1000
>
> Limit (cost=666.58..666.58 rows=1 width=349) (actual
> time=358.466..359.688 rows=1000 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid, r.topointguid,
> r.sidguid, r.starguid, r.routeinformation, r.routetype, r.startvalid,
> r.endvalid, r.revisionuid, r.source, r.fufi, r.grounddistance_excl_sidstar,
> r.from_first, r.dep_airports, r.dst_airports, r.tag,
> r.expanded_route_string, r.route_geometry
> Buffers: shared hit=29786 read=1
> -> Sort (cost=666.58..666.58 rows=1 width=349) (actual
> time=358.464..358.942 rows=1000 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid, r.topointguid,
> r.sidguid, r.starguid, r.routeinformation, r.routetype, r.startvalid,
> r.endvalid, r.revisionuid, r.source, r.fufi, r.grounddistance_excl_sidstar,
> r.from_first, r.dep_airports, r.dst_airports, r.tag,
> r.expanded_route_string, r.route_geometry
> Sort Key: r.routeidentifier
> Sort Method: quicksort Memory: 582kB
> Buffers: shared hit=29786 read=1
> -> Nested Loop (cost=149.94..666.57 rows=1 width=349) (actual
> time=291.681..356.261 rows=1540 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid,
> r.topointguid, r.sidguid, r.starguid, r.routeinformation, r.routetype,
> r.startvalid, r.endvalid, r.revisionuid, r.source, r.fufi,
> r.grounddistance_excl_sidstar, r.from_first, r.dep_airports,
> r.dst_airports, r.tag, r.expanded_route_string, r.route_geometry
> Buffers: shared hit=29786 read=1
> -> Nested Loop (cost=149.51..653.92 rows=1 width=349)
> (actual time=291.652..300.076 rows=1546 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid,
> r.topointguid, r.sidguid, r.starguid, r.routeinformation, r.routetype,
> r.startvalid, r.endvalid, r.revisionuid, r.source, r.fufi,
> r.grounddistance_excl_sidstar, r.from_first, r.dep_airports,
> r.dst_airports, r.tag, r.expanded_route_string, r.route_geometry
> Buffers: shared hit=13331 read=1
> -> Bitmap Heap Scan on navdata.point op
> (cost=5.75..358.28 rows=2 width=16) (actual time=95.933..96.155 rows=1
> loops=1)
> Output: op.uid, op.guid, op.airportguid,
> op.identifier, op.icaocode, op.name, op.type, op.coordinates, op.fir,
> op.navaidfrequency, op.elevation, op.magneticvariance, op.startvalid,
> op.endvalid, op.revisionuid, op.source, op.leveltype
> Recheck Cond: ((op.identifier)::text ~~*
> '%LOWW%'::text)
> Filter: (((op.type)::text = ANY
> ('{PA}'::text[])) AND (tsrange(op.startvalid, op.endvalid) @>
> (now())::timestamp without time zone))
> Rows Removed by Filter: 50
> Heap Blocks: exact=51
> Buffers: shared hit=4974 read=1
> -> Bitmap Index Scan on idx_point_08
> (cost=0.00..5.75 rows=178 width=0) (actual time=95.871..95.871 rows=51
> loops=1)
> Index Cond: ((op.identifier)::text ~~*
> '%LOWW%'::text)
> Buffers: shared hit=4924
> -> Bitmap Heap Scan on navdata.route r
> (cost=143.77..147.80 rows=2 width=349) (actual time=195.711..202.308
> rows=1546 loops=1)
> Output: r.uid, r.routeidentifier,
> r.frompointguid, r.topointguid, r.sidguid, r.starguid, r.routeinformation,
> r.routetype, r.startvalid, r.endvalid, r.revisionuid, r.source, r.fufi,
> r.grounddistance_excl_sidstar, r.from_first, r.dep_airports,
> r.dst_airports, r.tag, r.expanded_route_string, r.route_geometry
> Recheck Cond: ((r.frompointguid = op.guid) AND
> (tsrange(r.startvalid, r.endvalid) @> (now())::timestamp without time zone))
> Filter: ((r.routeidentifier)::text ~~* '%'::text)
> Heap Blocks: exact=1231
> Buffers: shared hit=8357
> -> BitmapAnd (cost=143.77..143.77 rows=2
> width=0) (actual time=195.501..195.501 rows=0 loops=1)
> Buffers: shared hit=7126
> -> Bitmap Index Scan on idx_route_02
> (cost=0.00..6.85 rows=324 width=0) (actual time=0.707..0.707 rows=4295
> loops=1)
> Index Cond: (r.frompointguid =
> op.guid)
> Buffers: shared hit=21
> -> Bitmap Index Scan on idx_route_07
> (cost=0.00..135.49 rows=4693 width=0) (actual time=193.881..193.881
> rows=579054 loops=1)
> Index Cond: (tsrange(r.startvalid,
> r.endvalid) @> (now())::timestamp without time zone)
> Buffers: shared hit=7105
> -> Index Scan using cidx_point on navdata.point dp
> (cost=0.43..12.63 rows=1 width=16) (actual time=0.009..0.034 rows=1
> loops=1546)
> Output: dp.uid, dp.guid, dp.airportguid,
> dp.identifier, dp.icaocode, dp.name, dp.type, dp.coordinates, dp.fir,
> dp.navaidfrequency, dp.elevation, dp.magneticvariance, dp.startvalid,
> dp.endvalid, dp.revisionuid, dp.source, dp.leveltype
> Index Cond: (dp.guid = r.topointguid)
> Filter: (((dp.type)::text = ANY ('{PA}'::text[])) AND
> ((dp.identifier)::text ~~* '%'::text) AND (tsrange(dp.startvalid,
> dp.endvalid) @> (now())::timestamp without time zone))
> Rows Removed by Filter: 7
> Buffers: shared hit=16455
> Planning time: 4.603 ms
> Execution time: 360.180 ms
>
> * 360 ms. That is quite fine for our standards. *
>
> ------------------------------------------------------------
> --------------------
> # Case 2 : We search for all routes between Vienna International Airport
> and
> London Heathrow (here is where trouble begins)
> ------------------------------------------------------------
> --------------------
>
> EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
> SELECT
> r.*
> FROM navdata.route r
> INNER JOIN navdata.point op ON r.frompointguid = op.guid
> INNER JOIN navdata.point dp ON r.topointguid = dp.guid
> WHERE
> r.routeidentifier ILIKE '%'
> AND tsrange(r.startvalid, r.endvalid) @> now() :: TIMESTAMP
> AND (NOT false :: BOOLEAN OR r.source = ANY (ARRAY[] :: VARCHAR []))
> AND op.identifier ILIKE '%LOWW%'
> AND op.type = ANY (ARRAY['PA'] :: VARCHAR [])
> AND tsrange(op.startvalid, op.endvalid) @> now() :: TIMESTAMP
> AND dp.identifier ILIKE '%EGLL%' :: VARCHAR
> AND dp.type = ANY (ARRAY['PA'] :: VARCHAR [])
> AND tsrange(dp.startvalid, dp.endvalid) @> now() :: TIMESTAMP
> ORDER BY r.routeidentifier
> LIMIT 1000
>
>
> Limit (cost=659.57..659.58 rows=1 width=349) (actual
> time=223118.664..223118.714 rows=36 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid, r.topointguid,
> r.sidguid, r.starguid, r.routeinformation, r.routetype, r.startvalid,
> r.endvalid, r.revisionuid, r.source, r.fufi, r.grounddistance_excl_sidstar,
> r.from_first, r.dep_airports, r.dst_airports, r.tag,
> r.expanded_route_string, r.route_geometry
> Buffers: shared hit=12033194
> -> Sort (cost=659.57..659.58 rows=1 width=349) (actual
> time=223118.661..223118.681 rows=36 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid, r.topointguid,
> r.sidguid, r.starguid, r.routeinformation, r.routetype, r.startvalid,
> r.endvalid, r.revisionuid, r.source, r.fufi, r.grounddistance_excl_sidstar,
> r.from_first, r.dep_airports, r.dst_airports, r.tag,
> r.expanded_route_string, r.route_geometry
> Sort Key: r.routeidentifier
> Sort Method: quicksort Memory: 35kB
> Buffers: shared hit=12033194
> -> Nested Loop (cost=157.35..659.56 rows=1 width=349) (actual
> time=4290.975..223118.490 rows=36 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid,
> r.topointguid, r.sidguid, r.starguid, r.routeinformation, r.routetype,
> r.startvalid, r.endvalid, r.revisionuid, r.source, r.fufi,
> r.grounddistance_excl_sidstar, r.from_first, r.dep_airports,
> r.dst_airports, r.tag, r.expanded_route_string, r.route_geometry
> Buffers: shared hit=12033194
> -> Nested Loop (cost=149.32..649.49 rows=1 width=349)
> (actual time=319.717..367.139 rows=2439 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid,
> r.topointguid, r.sidguid, r.starguid, r.routeinformation, r.routetype,
> r.startvalid, r.endvalid, r.revisionuid, r.source, r.fufi,
> r.grounddistance_excl_sidstar, r.from_first, r.dep_airports,
> r.dst_airports, r.tag, r.expanded_route_string, r.route_geometry
> Buffers: shared hit=15788
> -> Bitmap Heap Scan on navdata.point dp
> (cost=5.75..358.28 rows=2 width=16) (actual time=124.922..125.008 rows=1
> loops=1)
> Output: dp.uid, dp.guid, dp.airportguid,
> dp.identifier, dp.icaocode, dp.name, dp.type, dp.coordinates, dp.fir,
> dp.navaidfrequency, dp.elevation, dp.magneticvariance, dp.startvalid,
> dp.endvalid, dp.revisionuid, dp.source, dp.leveltype
> Recheck Cond: ((dp.identifier)::text ~~*
> '%EGLL%'::text)
> Filter: (((dp.type)::text = ANY
> ('{PA}'::text[])) AND (tsrange(dp.startvalid, dp.endvalid) @>
> (now())::timestamp without time zone))
> Rows Removed by Filter: 6
> Heap Blocks: exact=7
> Buffers: shared hit=6786
> -> Bitmap Index Scan on idx_point_08
> (cost=0.00..5.75 rows=178 width=0) (actual time=124.882..124.882 rows=7
> loops=1)
> Index Cond: ((dp.identifier)::text ~~*
> '%EGLL%'::text)
> Buffers: shared hit=6779
> -> Bitmap Heap Scan on navdata.route r
> (cost=143.57..145.60 rows=1 width=349) (actual time=194.785..237.128
> rows=2439 loops=1)
> Output: r.uid, r.routeidentifier,
> r.frompointguid, r.topointguid, r.sidguid, r.starguid, r.routeinformation,
> r.routetype, r.startvalid, r.endvalid, r.revisionuid, r.source, r.fufi,
> r.grounddistance_excl_sidstar, r.from_first, r.dep_airports,
> r.dst_airports, r.tag, r.expanded_route_string, r.route_geometry
> Recheck Cond: ((r.topointguid = dp.guid) AND
> (tsrange(r.startvalid, r.endvalid) @> (now())::timestamp without time zone))
> Filter: ((r.routeidentifier)::text ~~* '%'::text)
> Heap Blocks: exact=1834
> Buffers: shared hit=9002
> -> BitmapAnd (cost=143.57..143.57 rows=1
> width=0) (actual time=194.460..194.460 rows=0 loops=1)
> Buffers: shared hit=7168
> -> Bitmap Index Scan on idx_route_03
> (cost=0.00..6.66 rows=298 width=0) (actual time=2.326..2.326 rows=15148
> loops=1)
> Index Cond: (r.topointguid =
> dp.guid)
> Buffers: shared hit=63
> -> Bitmap Index Scan on idx_route_07
> (cost=0.00..135.49 rows=4693 width=0) (actual time=190.001..190.001
> rows=579054 loops=1)
> Index Cond: (tsrange(r.startvalid,
> r.endvalid) @> (now())::timestamp without time zone)
> Buffers: shared hit=7105
> -> Bitmap Heap Scan on navdata.point op (cost=8.03..10.06
> rows=1 width=16) (actual time=91.321..91.321 rows=0 loops=2439)
> Output: op.uid, op.guid, op.airportguid,
> op.identifier, op.icaocode, op.name, op.type, op.coordinates, op.fir,
> op.navaidfrequency, op.elevation, op.magneticvariance, op.startvalid,
> op.endvalid, op.revisionuid, op.source, op.leveltype
> Recheck Cond: ((op.guid = r.frompointguid) AND
> ((op.identifier)::text ~~* '%LOWW%'::text))
> Filter: (((op.type)::text = ANY ('{PA}'::text[])) AND
> (tsrange(op.startvalid, op.endvalid) @> (now())::timestamp without time
> zone))
> Rows Removed by Filter: 0
> Heap Blocks: exact=252
> Buffers: shared hit=12017406
> -> BitmapAnd (cost=8.03..8.03 rows=1 width=0)
> (actual time=91.315..91.315 rows=0 loops=2439)
> Buffers: shared hit=12017154
> -> Bitmap Index Scan on cidx_point
> (cost=0.00..2.04 rows=6 width=0) (actual time=0.017..0.017 rows=8
> loops=2439)
> Index Cond: (op.guid = r.frompointguid)
> Buffers: shared hit=7518
> -> Bitmap Index Scan on idx_point_08
> (cost=0.00..5.75 rows=178 width=0) (actual time=91.288..91.288 rows=51
> loops=2439)
> Index Cond: ((op.identifier)::text ~~*
> '%LOWW%'::text)
> Buffers: shared hit=12009636
> Planning time: 5.162 ms
> Execution time: 223118.858 ms
>
> * Please pay attention to index scan on idx_point_08. It takes on average
> 91 ms
> and it is executed 2439 times = 221949 ms. That is where we spend most of
> the
> time. *
>
> ------------------------------------------------------------
> --------------------
> # Case 3 : We again search for all routes between Vienna International
> Airport
> and London Heathrow, but this time I use CONCAT(op.identifier, '') as
> optimization fence.
> ------------------------------------------------------------
> --------------------
>
> EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
> SELECT
> r.*
> FROM navdata.route r
> INNER JOIN navdata.point op ON r.frompointguid = op.guid
> INNER JOIN navdata.point dp ON r.topointguid = dp.guid
> WHERE
> r.routeidentifier ILIKE '%'
> AND tsrange(r.startvalid, r.endvalid) @> now() :: TIMESTAMP
> AND (NOT false :: BOOLEAN OR r.source = ANY (ARRAY[] :: VARCHAR []))
> AND CONCAT(op.identifier, '') ILIKE '%LOWW%'
> AND op.type = ANY (ARRAY['PA'] :: VARCHAR [])
> AND tsrange(op.startvalid, op.endvalid) @> now() :: TIMESTAMP
> AND dp.identifier ILIKE '%EGLL%' :: VARCHAR
> AND dp.type = ANY (ARRAY['PA'] :: VARCHAR [])
> AND tsrange(dp.startvalid, dp.endvalid) @> now() :: TIMESTAMP
> ORDER BY r.routeidentifier
> LIMIT 1000
>
> Limit (cost=662.16..662.17 rows=1 width=349) (actual
> time=411.756..411.808 rows=36 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid, r.topointguid,
> r.sidguid, r.starguid, r.routeinformation, r.routetype, r.startvalid,
> r.endvalid, r.revisionuid, r.source, r.fufi, r.grounddistance_excl_sidstar,
> r.from_first, r.dep_airports, r.dst_airports, r.tag,
> r.expanded_route_string, r.route_geometry
> Buffers: shared hit=43025
> -> Sort (cost=662.16..662.17 rows=1 width=349) (actual
> time=411.755..411.776 rows=36 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid, r.topointguid,
> r.sidguid, r.starguid, r.routeinformation, r.routetype, r.startvalid,
> r.endvalid, r.revisionuid, r.source, r.fufi, r.grounddistance_excl_sidstar,
> r.from_first, r.dep_airports, r.dst_airports, r.tag,
> r.expanded_route_string, r.route_geometry
> Sort Key: r.routeidentifier
> Sort Method: quicksort Memory: 35kB
> Buffers: shared hit=43025
> -> Nested Loop (cost=149.75..662.15 rows=1 width=349) (actual
> time=316.518..411.656 rows=36 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid,
> r.topointguid, r.sidguid, r.starguid, r.routeinformation, r.routetype,
> r.startvalid, r.endvalid, r.revisionuid, r.source, r.fufi,
> r.grounddistance_excl_sidstar, r.from_first, r.dep_airports,
> r.dst_airports, r.tag, r.expanded_route_string, r.route_geometry
> Buffers: shared hit=43025
> -> Nested Loop (cost=149.32..649.49 rows=1 width=349)
> (actual time=314.704..326.873 rows=2439 loops=1)
> Output: r.uid, r.routeidentifier, r.frompointguid,
> r.topointguid, r.sidguid, r.starguid, r.routeinformation, r.routetype,
> r.startvalid, r.endvalid, r.revisionuid, r.source, r.fufi,
> r.grounddistance_excl_sidstar, r.from_first, r.dep_airports,
> r.dst_airports, r.tag, r.expanded_route_string, r.route_geometry
> Buffers: shared hit=15788
> -> Bitmap Heap Scan on navdata.point dp
> (cost=5.75..358.28 rows=2 width=16) (actual time=123.267..123.310 rows=1
> loops=1)
> Output: dp.uid, dp.guid, dp.airportguid,
> dp.identifier, dp.icaocode, dp.name, dp.type, dp.coordinates, dp.fir,
> dp.navaidfrequency, dp.elevation, dp.magneticvariance, dp.startvalid,
> dp.endvalid, dp.revisionuid, dp.source, dp.leveltype
> Recheck Cond: ((dp.identifier)::text ~~*
> '%EGLL%'::text)
> Filter: (((dp.type)::text = ANY
> ('{PA}'::text[])) AND (tsrange(dp.startvalid, dp.endvalid) @>
> (now())::timestamp without time zone))
> Rows Removed by Filter: 6
> Heap Blocks: exact=7
> Buffers: shared hit=6786
> -> Bitmap Index Scan on idx_point_08
> (cost=0.00..5.75 rows=178 width=0) (actual time=123.232..123.232 rows=7
> loops=1)
> Index Cond: ((dp.identifier)::text ~~*
> '%EGLL%'::text)
> Buffers: shared hit=6779
> -> Bitmap Heap Scan on navdata.route r
> (cost=143.57..145.60 rows=1 width=349) (actual time=191.429..201.176
> rows=2439 loops=1)
> Output: r.uid, r.routeidentifier,
> r.frompointguid, r.topointguid, r.sidguid, r.starguid, r.routeinformation,
> r.routetype, r.startvalid, r.endvalid, r.revisionuid, r.source, r.fufi,
> r.grounddistance_excl_sidstar, r.from_first, r.dep_airports,
> r.dst_airports, r.tag, r.expanded_route_string, r.route_geometry
> Recheck Cond: ((r.topointguid = dp.guid) AND
> (tsrange(r.startvalid, r.endvalid) @> (now())::timestamp without time zone))
> Filter: ((r.routeidentifier)::text ~~* '%'::text)
> Heap Blocks: exact=1834
> Buffers: shared hit=9002
> -> BitmapAnd (cost=143.57..143.57 rows=1
> width=0) (actual time=191.097..191.097 rows=0 loops=1)
> Buffers: shared hit=7168
> -> Bitmap Index Scan on idx_route_03
> (cost=0.00..6.66 rows=298 width=0) (actual time=2.349..2.349 rows=15148
> loops=1)
> Index Cond: (r.topointguid =
> dp.guid)
> Buffers: shared hit=63
> -> Bitmap Index Scan on idx_route_07
> (cost=0.00..135.49 rows=4693 width=0) (actual time=186.640..186.640
> rows=579054 loops=1)
> Index Cond: (tsrange(r.startvalid,
> r.endvalid) @> (now())::timestamp without time zone)
> Buffers: shared hit=7105
> -> Index Scan using cidx_point on navdata.point op
> (cost=0.43..12.65 rows=1 width=16) (actual time=0.033..0.033 rows=0
> loops=2439)
> Output: op.uid, op.guid, op.airportguid,
> op.identifier, op.icaocode, op.name, op.type, op.coordinates, op.fir,
> op.navaidfrequency, op.elevation, op.magneticvariance, op.startvalid,
> op.endvalid, op.revisionuid, op.source, op.leveltype
> Index Cond: (op.guid = r.frompointguid)
> Filter: (((op.type)::text = ANY ('{PA}'::text[])) AND
> (concat(op.identifier, '') ~~* '%LOWW%'::text) AND (tsrange(op.startvalid,
> op.endvalid) @> (now())::timestamp without time zone))
> Rows Removed by Filter: 8
> Buffers: shared hit=27237
> Planning time: 3.381 ms
> Execution time: 411.944 ms
>
> * We are back into acceptable margin. *
>
> ############################################################
> ####################
> # Postgres version
> ############################################################
> ####################
>
> PostgreSQL 9.6.9 on x86_64-pc-linux-gnu (Ubuntu 9.6.9-2.pgdg16.04+1),
> compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609, 64-bit
>
> ############################################################
> ####################
> # Schema
> ############################################################
> ####################
>
> Currently, our tables are heavily indexed due to refactoring process and
> need to
> work with old and new version of software. Once we are finished, lot of
> indexes shell be removed.
>
> CREATE TABLE navdata.point (
> uid uuid NOT NULL,
> guid uuid NULL,
> airportguid uuid NULL,
> identifier varchar(5) NULL,
> icaocode varchar(2) NULL,
> "name" varchar(255) NULL,
> "type" varchar(2) NULL,
> coordinates geography NULL,
> fir varchar(5) NULL,
> navaidfrequency float8 NULL,
> elevation float8 NULL,
> magneticvariance float8 NULL,
> startvalid timestamp NULL,
> endvalid timestamp NULL,
> revisionuid uuid NULL,
> "source" varchar(4) NULL,
> leveltype varchar(1) NULL,
> CONSTRAINT point_pkey PRIMARY KEY (uid)
> )
> WITH (
> OIDS=FALSE
> ) ;
> CREATE INDEX cidx_point ON navdata.point USING btree (guid) ;
> CREATE INDEX idx_point_01 ON navdata.point USING btree (identifier, guid) ;
> CREATE INDEX idx_point_03 ON navdata.point USING btree (identifier) ;
> CREATE INDEX idx_point_04 ON navdata.point USING gist (coordinates) WHERE
> (airportguid IS NULL) ;
> CREATE INDEX idx_point_05 ON navdata.point USING btree (identifier
> text_pattern_ops) ;
> CREATE INDEX idx_point_06 ON navdata.point USING btree (airportguid) ;
> CREATE INDEX idx_point_07 ON navdata.point USING gist (coordinates) ;
> CREATE INDEX idx_point_08 ON navdata.point USING gist (identifier
> gist_trgm_ops) ;
> CREATE INDEX idx_point_09 ON navdata.point USING btree (type) ;
> CREATE INDEX idx_point_10 ON navdata.point USING gist (name gist_trgm_ops)
> ;
> CREATE INDEX idx_point_11 ON navdata.point USING btree (type, identifier
> text_pattern_ops) ;
> CREATE INDEX idx_point_12 ON navdata.point USING gist
> (upper((identifier)::text) gist_trgm_ops) ;
> CREATE INDEX idx_point_13 ON navdata.point USING gist (upper((name)::text)
> gist_trgm_ops) ;
> CREATE INDEX idx_point_tmp ON navdata.point USING btree (leveltype) ;
> CREATE INDEX point_validity_idx ON navdata.point USING gist
> (tsrange(startvalid, endvalid)) ;
>
> CREATE TABLE navdata.route (
> uid uuid NOT NULL,
> routeidentifier varchar(3) NULL,
> frompointguid uuid NULL,
> topointguid uuid NULL,
> sidguid uuid NULL,
> starguid uuid NULL,
> routeinformation varchar NULL,
> routetype varchar(5) NULL,
> startvalid timestamp NULL,
> endvalid timestamp NULL,
> revisionuid uuid NULL,
> "source" varchar(4) NULL,
> fufi uuid NULL,
> grounddistance_excl_sidstar float8 NULL,
> from_first bool NULL,
> dep_airports varchar NULL,
> dst_airports varchar NULL,
> tag varchar NULL,
> expanded_route_string varchar NULL,
> route_geometry geometry NULL,
> CONSTRAINT route_pkey PRIMARY KEY (uid)
> )
> WITH (
> OIDS=FALSE
> ) ;
> CREATE INDEX idx_route_01 ON navdata.route USING btree (uid) ;
> CREATE INDEX idx_route_02 ON navdata.route USING btree (frompointguid) ;
> CREATE INDEX idx_route_03 ON navdata.route USING btree (topointguid) ;
> CREATE INDEX idx_route_04 ON navdata.route USING btree (fufi) ;
> CREATE INDEX idx_route_05 ON navdata.route USING btree (source,
> routeidentifier, startvalid, endvalid) ;
> CREATE INDEX idx_route_06 ON navdata.route USING gist (routeinformation
> gist_trgm_ops) ;
> CREATE INDEX idx_route_07 ON navdata.route USING gist (tsrange(startvalid,
> endvalid)) ;
> CREATE INDEX idx_route_09 ON navdata.route USING gist (routeidentifier
> gist_trgm_ops) ;
>
> ############################################################
> ####################
> # Table metadata
> ############################################################
> ####################
>
> relname |relpages |reltuples |relallvisible |relkind |relnatts
> |relhassubclass |reloptions |pg_table_size |
> --------|---------|----------|--------------|--------|------
> ---|---------------|-----------|--------------|
> route |36600 |938573 |36595 |r |22
> |false |NULL |299941888 |
> point |95241 |2156454 |95241 |r |17
> |false |NULL |780460032 |
>
>
> ############################################################
> ####################
> # History
> ############################################################
> ####################
>
> This is a new query, because data layer is being refactored.
>
> ############################################################
> ####################
> # Hardware
> ############################################################
> ####################
>
> Postgres is running on virtual machine.
>
> * CPU: 8 cores assigned
>
> processor : 7
> vendor_id : AuthenticAMD
> cpu family : 21
> model : 2
> model name : AMD Opteron(tm) Processor 6380
> stepping : 0
> microcode : 0xffffffff
> cpu MHz : 2500.020
> cache size : 2048 KB
> physical id : 0
> siblings : 8
> core id : 7
> cpu cores : 8
> apicid : 7
> initial apicid : 7
> fpu : yes
> fpu_exception : yes
> cpuid level : 13
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm
> rep_good nopl extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 popcnt
> aes xsave avx f16c hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a
> misalignsse 3dnowprefetch osvw xop fma4 vmmcall bmi1 arat
> bugs : fxsave_leak sysret_ss_attrs
> bogomips : 4998.98
> TLB size : 1536 4K pages
> clflush size : 64
> cache_alignment : 64
> address sizes : 42 bits physical, 48 bits virtual
> power management:
>
>
> * Memory: 32 GB
>
> * Disk: Should be ssd, but unfortunattely I don't know which model.
>
> ############################################################
> ####################
> # bonnie++
> ############################################################
> ####################
>
> Using uid:111, gid:118.
> format_version,bonnie_version,name,concurrency,seed,file_
> size,io_chunk_size,putc,putc_cpu,put_block,put_block_cpu,
> rewrite,rewrite_cpu,getc,getc_cpu,get_block,get_block_cpu,
> seeks,seeks_cpu,num_files,max_size,min_size,num_dirs,file_
> chunk_size,seq_create,seq_create_cpu,seq_stat,seq_stat_
> cpu,seq_del,seq_del_cpu,ran_create,ran_create_cpu,ran_
> stat,ran_stat_cpu,ran_del,ran_del_cpu,putc_latency,put_
> block_latency,rewrite_latency,getc_latency,get_block_
> latency,seeks_latency,seq_create_latency,seq_stat_
> latency,seq_del_latency,ran_create_latency,ran_stat_
> latency,ran_del_latency
> Writing intelligently...done
> Rewriting...done
> Reading intelligently...done
> start 'em...done...done...done...done...done...
> 1.97,1.97,v6565testdb01,1,1529491960,63G,,,,133872,20,
> 96641,17,,,469654,41,+++++,+++,,,,,,,,,,,,,,,,,,,2117ms,
> 2935ms,,270ms,4760us,,,,,,
> Writing intelligently...done
> Rewriting...done
> Reading intelligently...done
> start 'em...done...done...done...done...done...
> 1.97,1.97,v6565testdb01,1,1529491960,63G,,,,190192,26,
> 143595,23,,,457357,37,+++++,+++,,,,,,,,,,,,,,,,,,,595ms,
> 2201ms,,284ms,6110us,,,,,,
> Writing intelligently...done
> Rewriting...done
> Reading intelligently...done
> start 'em...done...done...done...done...done...
> 1.97,1.97,v6565testdb01,1,1529491960,63G,,,,542936,81,
> 153952,25,,,446369,37,+++++,+++,,,,,,,,,,,,,,,,,,,347ms,
> 3678ms,,101ms,5632us,,,,,,
> Writing intelligently...done
> Rewriting...done
> Reading intelligently...done
> start 'em...done...done...done...done...done...
> 1.97,1.97,v6565testdb01,1,1529491960,63G,,,,244155,33,
> 157543,26,,,441115,38,16111,495,,,,,,,,,,,,,,,,,,,638ms,
> 2667ms,,195ms,9068us,,,,,,
>
>
> ############################################################
> ####################
> # Maintenance Setup
> ############################################################
> ####################
>
> Autovacuum: yes
>
> ############################################################
> ####################
> # postgresql.conf
> ############################################################
> ####################
>
> max_connections = 4096 # (change requires restart)
> shared_buffers = 8GB # (change requires restart)
> huge_pages = try # on, off, or try
> work_mem = 4MB # min 64kB
> maintenance_work_mem = 2GB # min 1MB
> dynamic_shared_memory_type = posix # the default is the first option
> shared_preload_libraries = 'pg_stat_statements'
> pg_stat_statements.max = 10000
> pg_stat_statements.track = all
> wal_level = replica # minimal, replica, or logical
> wal_buffers = 16MB
> max_wal_size = 2GB
> min_wal_size = 1GB
> checkpoint_completion_target = 0.7
> max_wal_senders = 4 # max number of walsender processes
> random_page_cost = 2.0
> effective_cache_size = 24GB
> default_statistics_target = 100 # range 1-10000
>
> ############################################################
> ####################
> # Statistics
> ############################################################
> ####################
>
> frac_mcv |tablename |attname |n_distinct |n_mcv
> |n_hist |
> --------------|----------|----------------------------|-----
> --------|------|-------|
> |route |uid |-1 |
> |101 |
> 0.969699979 |route |routeidentifier |78 |2
> |76 |
> 0.44780004 |route |frompointguid |2899 |100
> |101 |
> 0.441700101 |route |topointguid |3154 |100
> |101 |
> 0.0368666835 |route |sidguid |2254 |100
> |101 |
> 0.0418333709 |route |starguid |3182 |100
> |101 |
> 0.0515667647 |route |routeinformation |-0.335044593 |100
> |101 |
> 0.0528000034 |route |routetype |3 |3
> | |
> 0.755399942 |route |startvalid |810 |100
> |101 |
> 0.962899983 |route |endvalid |22 |3
> |19 |
> 0.00513333362 |route |revisionuid |-0.809282064 |2
> |101 |
> 0.97906667 |route |source |52 |4
> |48 |
> |route |fufi |0 |
> | |
> 0.00923334155 |route |grounddistance_excl_sidstar |-0.552667081 |100
> |101 |
> 0.0505000018 |route |from_first |2 |2
> | |
> 0.0376333408 |route |dep_airports |326 |52
> |101 |
> 0.0367666557 |route |dst_airports |388 |57
> |101 |
> |point |uid |-1 |
> |101 |
> 0.00185333542 |point |guid |-0.164169937 |100
> |101 |
> 0.0573133379 |point |airportguid |23575 |100
> |101 |
> 0.175699964 |point |identifier |209296 |1000
> |1001 |
> 0.754063368 |point |icaocode |254 |41
> |101 |
> 0.00352332788 |point |name |37853 |100
> |101 |
> 0.999230027 |point |type |11 |6
> |5 |
> |point |coordinates |-1 |
> | |
> 0.607223332 |point |fir |281 |62
> |101 |
> 0.0247033276 |point |navaidfrequency |744 |100
> |101 |
> 0.0320866667 |point |elevation |14013 |100
> |101 |
> 0.0011433335 |point |magneticvariance |-0.587834716 |100
> |101 |
> 0.978270054 |point |startvalid |35 |12
> |23 |
> 0.978176594 |point |endvalid |30 |11
> |19 |
> 0.978123426 |point |revisionuid |62 |12
> |50 |
> 0.99999994 |point |source |3 |3
> | |
> 0.777056634 |point |leveltype |7 |7
> | |
>
> ############################################################
> ####################
>
> I am looking forward to your suggestions.
>
> Thanks in advance!
>
> Sasa Vilic
>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2018-06-20 14:53:52 Re: Slow query when pg_trgm is in inner lopp
Previous Message Matthew Hall 2018-06-20 13:29:19 Re: Slow query when pg_trgm is in inner lopp