Quick Links

Query uses incorrect index

From:	pasman pasmański <pasman(dot)p(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Query uses incorrect index
Date:	2010-12-21 14:33:21
Message-ID:	AANLkTi=OeYtT=zEOP7aZg7vbvZD+so=Tk_W+UoOezJGt@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

hello.

I ve the table NumeryA with 3 indices. Query below uses incorrect index.

SELECT
A."NKA",
A."NTA",
Min("PołączeniaMin") || ',' || Max("PołączeniaMax") AS "Biling",
Sum("Ile")::text AS "Ilość CDR",
R."LP"::text AS "Sprawa",
R."Osoba weryfikująca" AS "Osoba",
to_char(min("Wartość"),'FM9999990D00') AS "Wartość po kontroli",
max(R."Kontrola po naprawie w Serat - CDR")::text AS "CDR po kontroli",
min(A."KodBłędu")::text AS KodBłędu,
Max(to_char(R."Data kontroli",'YYYY-MM-DD')) AS "Ostatnia Kontrola"
, max("Skutek wprowadzenia błednej ewidencji w Serat") as "Skutek"
FROM
ONLY "NumeryA" A
LEFT JOIN
(select * from "Rejestr stacji do naprawy" where "Data weryfikacji"
>= current_date-3*30) R
ON
A."NKA" = R."Numer kierunkowy"
and substr(A."NTA",1,5) = substr(R."Numer stacji",1,5)
and A."NTA" like R."Numer stacji"
and A."KodBłędu" = R."Kod Błędu"
WHERE
A."DataPliku" >= current_date-3*30
and A."KodBłędu" similar to '74'
GROUP
BY R."Osoba weryfikująca",R."LP",A."NKA", A."NTA"
ORDER
BY Sum("Ile") DESC
LIMIT 4000

This query has plan:

----------------------------------------------------------------
Limit (cost=9656.43..9666.43 rows=4000 width=96) (actual
time=2149.383..2174.363 rows=4000 loops=1)
-> Sort (cost=9656.43..9716.86 rows=24175 width=96) (actual
time=2149.373..2158.355 rows=4000 loops=1)
Sort Key: (sum(a.Ile"))"
Sort Method: top-N heapsort Memory: 1028kB
-> HashAggregate (cost=6711.21..8089.19 rows=24175 width=96)
(actual time=2040.721..2110.075 rows=9080 loops=1)
-> Merge Left Join (cost=5338.65..5925.53 rows=24175
width=96) (actual time=1180.490..1717.727 rows=33597 loops=1)
Merge Cond: (((a.NKA")::text = ("Rejestr stacji do
naprawy"."Numer kierunkowy")::text) AND ((substr((a."NTA")::text, 1,
5)) = (substr(("Rejestr stacji do naprawy"."Numer stacji")::text, 1,
5))) AND ((a."KodBłędu")::text = ("Rejestr stacji do naprawy"."Kod
Błędu")::text))"
Join Filter: ((a.NTA")::text ~~ ("Rejestr stacji
do naprawy"."Numer stacji")::text)"
-> Sort (cost=3565.16..3625.60 rows=24175
width=42) (actual time=819.034..900.141 rows=33597 loops=1)
Sort Key: a.NKA", (substr((a."NTA")::text,
1, 5)), a."KodBłędu""
Sort Method: quicksort Memory: 5487kB
-> Index Scan using dp_kb on NumeryA" a
(cost=0.01..1805.07 rows=24175 width=42) (actual time=0.295..197.627
rows=33597 loops=1)"
Index Cond: (DataPliku" >=
(('now'::text)::date - 90))"
Filter: ((KodBłędu")::text ~
'***:^(?:74)$'::text)"
-> Sort (cost=1773.49..1811.23 rows=15096
width=67) (actual time=361.430..434.675 rows=32948 loops=1)
Sort Key: Rejestr stacji do naprawy"."Numer
kierunkowy", (substr(("Rejestr stacji do naprawy"."Numer
stacji")::text, 1, 5)), "Rejestr stacji do naprawy"."Kod Błędu""
Sort Method: quicksort Memory: 2234kB
-> Bitmap Heap Scan on Rejestr stacji do
naprawy" (cost=141.75..725.68 rows=15096 width=67) (actual
time=2.604..51.567 rows=14893 loops=1)"
Recheck Cond: (Data weryfikacji" >=
(('now'::text)::date - 90))"
-> Bitmap Index Scan on Data
weryfikacji_Kod Błędu" (cost=0.00..137.98 rows=15096 width=0) (actual
time=2.463..2.463 rows=15462 loops=1)"
Index Cond: (Data weryfikacji"
>= (('now'::text)::date - 90))"
Total runtime: 2186.011 ms

When i delete index dp_kb, query runs faster:

-------------------------------------------------------------------------
Limit (cost=15221.69..15231.69 rows=4000 width=96) (actual
time=1296.896..1322.144 rows=4000 loops=1)
-> Sort (cost=15221.69..15282.13 rows=24175 width=96) (actual
time=1296.887..1305.993 rows=4000 loops=1)
Sort Key: (sum(a.Ile"))"
Sort Method: top-N heapsort Memory: 1028kB
-> HashAggregate (cost=12276.48..13654.45 rows=24175
width=96) (actual time=1188.706..1257.669 rows=9080 loops=1)
-> Merge Left Join (cost=0.01..11490.79 rows=24175
width=96) (actual time=0.220..840.102 rows=33597 loops=1)
Merge Cond: (((a.NKA")::text = ("Rejestr stacji do
naprawy"."Numer kierunkowy")::text) AND (substr((a."NTA")::text, 1, 5)
= substr(("Rejestr stacji do naprawy"."Numer stacji")::text, 1, 5))
AND ((a."KodBłędu")::text = ("Rejestr stacji do naprawy"."Kod
Błędu")::text))"
Join Filter: ((a.NTA")::text ~~ ("Rejestr stacji
do naprawy"."Numer stacji")::text)"
-> Index Scan using NTA_5" on "NumeryA" a
(cost=0.01..10016.75 rows=24175 width=42) (actual time=0.132..308.018
rows=33597 loops=1)"
Index Cond: (((KodBłędu")::text =
'74'::text) AND ("DataPliku" >= (('now'::text)::date - 90)))"
Filter: ((KodBłędu")::text ~ '***:^(?:74)$'::text)"
-> Index Scan using 3" on "Rejestr stacji do
naprawy" (cost=0.01..1002.73 rows=15096 width=67) (actual
time=0.047..129.840 rows=32948 loops=1)"
Index Cond: (Rejestr stacji do
naprawy"."Data weryfikacji" >= (('now'::text)::date - 90))"
Total runtime: 1333.347 ms

How to tune settings to use good index ?
Include definitions of indexes:

CREATE TABLE "NumeryA"
(
"Plik" character varying(254) NOT NULL,
"DataPliku" date,
"KodBłędu" character varying(254) NOT NULL,
"NKA" character varying(254) NOT NULL,
"NTA" character varying(254) NOT NULL,
"Ile" integer,
"PołączeniaMin" character varying,
"PołączeniaMax" character varying,
"Wycofane" "char",
"Data" character varying[],
"ID Kobat" character varying[],
"NRB" character varying[],
"LP" integer,
CONSTRAINT "NumeryA_1_pkey" PRIMARY KEY ("NTA", "NKA", "KodBłędu", "Plik")
)
WITH (
OIDS=FALSE
);

CREATE INDEX "NTA_5"
ON "NumeryA"
USING btree
("NKA", substr("NTA"::text, 1, 5), "KodBłędu", "DataPliku");

CREATE INDEX dp_kb
ON "NumeryA"
USING btree
("DataPliku");

CREATE INDEX nka_nta
ON "NumeryA"
USING btree
("NKA", "NTA");

Here my planner settings:

-----------------------------------------------------------
# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

seq_page_cost = 0.3 # measured on an arbitrary scale
random_page_cost = 0.5 # same scale as above
cpu_tuple_cost = 0.007 # same scale as above
#cpu_index_tuple_cost = 0.005 # same scale as above
#cpu_operator_cost = 0.0025 # same scale as above
#effective_cache_size = 128MB

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5 # range 1-10
#geqo_pool_size = 0 # selects default based on effort
#geqo_generations = 0 # selects default based on effort
#geqo_selection_bias = 2.0 # range 1.5-2.0

# - Other Planner Options -

default_statistics_target = 25 # range 1-10000
constraint_exclusion = partition # on, off, or partition
cursor_tuple_fraction = 0.05 # range 0.0-1.0
from_collapse_limit = 8
join_collapse_limit = 8 # 1 disables collapsing of explicit
# JOIN clauses

------------
pasman

Responses

Re: Query uses incorrect index at 2010-12-21 15:44:15 from Kevin Grittner

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Michael Ben-Nes	2010-12-21 15:17:07	Re: MySQL HandlerSocket - Is this possible in PG?
Previous Message	Mladen Gogala	2010-12-21 11:34:27	Re: CPU bound