hashjoin chosen over 1000x faster plan

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: hashjoin chosen over 1000x faster plan
Date: 2007-10-09 20:09:51
Message-ID: 470B99BE.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I have a situation where a query is running much slower than I would
expect. The ANALYZE showed that it is hashing some information which
is rarely needed. When I set enable_hashjoin = off for the
connection the query run in 1/1000 the time.

This isn't a debilitating level of performance, but it would be nice
to clean it up, and we haven't yet come up with a viable solution.

The runs below are after several identical runs to create a fully
cached situation. Autovacuum is aggressive and there is a nightly
vacuum analyze of the whole database. This box has 4 x 2 GHz Xeon
CPUs, 6 GB RAM, RAID 5 with 13 spindles on 256 MB BBU controller.

I simplified the original a bit; sorry it's still kinda big.

-Kevin

listen_addresses = '*'
max_connections = 200
shared_buffers = 160MB
temp_buffers = 50MB
work_mem = 10MB
maintenance_work_mem = 160MB
max_fsm_pages = 800000
bgwriter_lru_percent = 20.0
bgwriter_lru_maxpages = 200
bgwriter_all_percent = 10.0
bgwriter_all_maxpages = 600
wal_buffers = 160kB
checkpoint_segments = 10
random_page_cost = 2.0
effective_cache_size = 5GB
redirect_stderr = on
log_line_prefix = '[%m] %p %q<%u %d %r> '
stats_block_level = on
stats_row_level = on
autovacuum = on
autovacuum_naptime = 10s
autovacuum_vacuum_threshold = 1
autovacuum_analyze_threshold = 1
datestyle = 'iso, mdy'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
escape_string_warning = off
standard_conforming_strings = on
sql_inheritance = off

bigbird=> select version();
version
-------------------------------------------------------------------------------------
PostgreSQL 8.2.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.3 (SuSE Linux)
(1 row)

bigbird=> explain analyze
SELECT
"CH"."caseNo",
"CH"."countyNo",
"CH"."chargeNo",
"CH"."statuteCite",
"CH"."sevClsCode",
"CH"."modSevClsCode",
"CH"."descr",
"CH"."offenseDate",
"CH"."pleaCode",
"CH"."pleaDate",
"CH"."chargeSeqNo",
"CHST"."eventDate" AS "reopEventDate",
"CTHE"."descr" AS "reopEventDescr"
FROM "Charge" "CH"
LEFT OUTER JOIN "CaseHist" "CHST"
ON ( "CHST"."countyNo" = "CH"."countyNo"
AND "CHST"."caseNo" = "CH"."caseNo"
AND "CHST"."histSeqNo" = "CH"."reopHistSeqNo"
)
LEFT OUTER JOIN "CaseTypeHistEvent" "CTHE"
ON ( "CHST"."eventType" = "CTHE"."eventType"
AND "CHST"."caseType" = "CTHE"."caseType"
AND "CHST"."countyNo" = "CTHE"."countyNo"
)
WHERE (
("CH"."caseNo" = '2005CF000001')
AND ("CH"."countyNo" = 13))
ORDER BY
"chargeNo",
"chargeSeqNo"
;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=2554.50..2554.52 rows=7 width=146) (actual time=443.068..443.070 rows=3 loops=1)
Sort Key: "CH"."chargeNo", "CH"."chargeSeqNo"
-> Hash Left Join (cost=2318.91..2554.40 rows=7 width=146) (actual time=443.004..443.039 rows=3 loops=1)
Hash Cond: ((("CHST"."eventType")::bpchar = ("CTHE"."eventType")::bpchar) AND (("CHST"."caseType")::bpchar = ("CTHE"."caseType")::bpchar))
-> Nested Loop Left Join (cost=0.00..208.13 rows=7 width=131) (actual time=0.062..0.093 rows=3 loops=1)
-> Index Scan using "Charge_pkey" on "Charge" "CH" (cost=0.00..15.37 rows=7 width=112) (actual time=0.052..0.059 rows=3 loops=1)
Index Cond: ((("countyNo")::smallint = 13) AND (("caseNo")::bpchar = '2005CF000001'::bpchar))
-> Index Scan using "CaseHist_pkey" on "CaseHist" "CHST" (cost=0.00..27.46 rows=6 width=41) (actual time=0.002..0.002 rows=0 loops=3)
Index Cond: ((("CHST"."countyNo")::smallint = 13) AND (("CHST"."caseNo")::bpchar = '2005CF000001'::bpchar) AND (("CHST"."histSeqNo")::smallint = ("CH"."reopHistSeqNo")::smallint))
-> Hash (cost=2084.80..2084.80 rows=15607 width=98) (actual time=442.919..442.919 rows=15607 loops=1)
-> Subquery Scan "CTHE" (cost=1630.43..2084.80 rows=15607 width=98) (actual time=331.665..411.390 rows=15607 loops=1)
-> Merge Right Join (cost=1630.43..1928.73 rows=15607 width=89) (actual time=331.661..391.999 rows=15607 loops=1)
Merge Cond: (((d."countyNo")::smallint = "inner"."?column9?") AND ((d."caseType")::bpchar = "inner"."?column10?") AND ((d."eventType")::bpchar = "inner"."?column11?"))
-> Index Scan using "CaseTypeHistEventD_pkey" on "CaseTypeHistEventD" d (cost=0.00..87.77 rows=2051 width=21) (actual time=0.026..0.730 rows=434 loops=1)
-> Sort (cost=1630.43..1669.45 rows=15607 width=76) (actual time=331.022..341.450 rows=15607 loops=1)
Sort Key: (c."countyNo")::smallint, (b."caseType")::bpchar, (b."eventType")::bpchar
-> Nested Loop (cost=0.00..543.41 rows=15607 width=76) (actual time=0.035..47.206 rows=15607 loops=1)
-> Index Scan using "ControlRecord_pkey" on "ControlRecord" c (cost=0.00..4.27 rows=1 width=2) (actual time=0.010..0.017 rows=1 loops=1)
Index Cond: (("countyNo")::smallint = 13)
-> Seq Scan on "CaseTypeHistEventB" b (cost=0.00..383.07 rows=15607 width=74) (actual time=0.019..14.634 rows=15607 loops=1)
Total runtime: 444.452 ms
(21 rows)

bigbird=> set enable_hashjoin = off;
SET
bigbird=> [same query]
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=3497.26..3497.28 rows=7 width=146) (actual time=0.115..0.117 rows=3 loops=1)
Sort Key: "CH"."chargeNo", "CH"."chargeSeqNo"
-> Merge Left Join (cost=3380.05..3497.17 rows=7 width=146) (actual time=0.091..0.097 rows=3 loops=1)
Merge Cond: (("outer"."?column16?" = "inner"."?column5?") AND ("outer"."?column17?" = "inner"."?column6?"))
-> Sort (cost=208.23..208.25 rows=7 width=131) (actual time=0.087..0.089 rows=3 loops=1)
Sort Key: ("CHST"."caseType")::bpchar, ("CHST"."eventType")::bpchar
-> Nested Loop Left Join (cost=0.00..208.13 rows=7 width=131) (actual time=0.053..0.070 rows=3 loops=1)
-> Index Scan using "Charge_pkey" on "Charge" "CH" (cost=0.00..15.37 rows=7 width=112) (actual time=0.043..0.048 rows=3 loops=1)
Index Cond: ((("countyNo")::smallint = 13) AND (("caseNo")::bpchar = '2005CF000001'::bpchar))
-> Index Scan using "CaseHist_pkey" on "CaseHist" "CHST" (cost=0.00..27.46 rows=6 width=41) (actual time=0.001..0.001 rows=0 loops=3)
Index Cond: ((("CHST"."countyNo")::smallint = 13) AND (("CHST"."caseNo")::bpchar = '2005CF000001'::bpchar) AND (("CHST"."histSeqNo")::smallint = ("CH"."reopHistSeqNo")::smallint))
-> Sort (cost=3171.82..3210.84 rows=15607 width=98) (never executed)
Sort Key: ("CTHE"."caseType")::bpchar, ("CTHE"."eventType")::bpchar
-> Subquery Scan "CTHE" (cost=1630.43..2084.80 rows=15607 width=98) (never executed)
-> Merge Right Join (cost=1630.43..1928.73 rows=15607 width=89) (never executed)
Merge Cond: (((d."countyNo")::smallint = "inner"."?column9?") AND ((d."caseType")::bpchar = "inner"."?column10?") AND ((d."eventType")::bpchar = "inner"."?column11?"))
-> Index Scan using "CaseTypeHistEventD_pkey" on "CaseTypeHistEventD" d (cost=0.00..87.77 rows=2051 width=21) (never executed)
-> Sort (cost=1630.43..1669.45 rows=15607 width=76) (never executed)
Sort Key: (c."countyNo")::smallint, (b."caseType")::bpchar, (b."eventType")::bpchar
-> Nested Loop (cost=0.00..543.41 rows=15607 width=76) (never executed)
-> Index Scan using "ControlRecord_pkey" on "ControlRecord" c (cost=0.00..4.27 rows=1 width=2) (never executed)
Index Cond: (("countyNo")::smallint = 13)
-> Seq Scan on "CaseTypeHistEventB" b (cost=0.00..383.07 rows=15607 width=74) (never executed)
Total runtime: 0.437 ms
(24 rows)

bigbird=> \d "Charge"
Table "public.Charge"
Column | Type | Modifiers
--------------------+---------------------+-----------
caseNo | "CaseNoT" | not null
chargeSeqNo | "ChargeSeqNoT" | not null
countyNo | "CountyNoT" | not null
areSentCondsMet | boolean | not null
caseType | "CaseTypeT" | not null
chargeNo | "ChargeNoT" | not null
descr | "StatuteDescrT" | not null
isPartyTo | boolean | not null
lastChargeModSeqNo | integer | not null
lastJdgmtSeqNo | "JdgmtSeqNoT" | not null
ordStatuteFlag | character(1) | not null
plntfAgencyNo | "PlntfAgencyNoT" | not null
statuteAgencyNo | "PlntfAgencyNoT" | not null
statuteCite | "StatuteCiteT" | not null
statuteEffDate | "DateT" |
arrestCaseNo | "ArrestCaseNoT" |
arrestDate | "DateT" |
arrestTrackingNo | "ArrestTrackingNoT" |
bookAgencyNo | "IssAgencyNoT" |
bookCaseNo | "BookCaseNoT" |
chargeId | "ChargeIdT" |
dispoCode | "DispoCodeT" |
issAgencyNo | "IssAgencyNoT" |
modSevClsCode | "SevClsCodeT" |
offenseDate | "DateT" |
offenseDateRange | "OffenseDateRangeT" |
pleaCode | "PleaCodeT" |
pleaDate | "DateT" |
reopHistSeqNo | "HistSeqNoT" |
sevClsCode | "SevClsCodeT" |
statuteSevSeqNo | "StatuteSevSeqNoT" |
wcisClsCode | "WcisClsCodeT" |
pleaHistSeqNo | "HistSeqNoT" |
chargeStatusCode | "ChargeStatusCodeT" |
Indexes:
"Charge_pkey" PRIMARY KEY, btree ("countyNo", "caseNo", "chargeSeqNo")
"Charge_ArrestTrackingNo" UNIQUE, btree ("arrestTrackingNo", "countyNo", "caseNo", "chargeSeqNo")
"Charge_OffenseDate" btree ("offenseDate", "countyNo", "issAgencyNo")

bigbird=> \d "CaseHist"
Table "public.CaseHist"
Column | Type | Modifiers
---------------+------------------+-----------
caseNo | "CaseNoT" | not null
histSeqNo | "HistSeqNoT" | not null
countyNo | "CountyNoT" | not null
caseType | "CaseTypeT" |
eventAmt | "MoneyT" |
eventDate | "DateT" |
eventType | "EventTypeT" |
userId | "UserIdT" |
courtRptrCode | "CtofcNoT" |
ctofcNo | "CtofcNoT" |
dktTxt | "TextT" |
prevRespCtofc | "CtofcNoT" |
tag | "TagTypeT" |
tapeCounterNo | "TapeCounterNoT" |
tapeLoc | "TapeLocT" |
wcisReported | "DateT" |
weightPd | "PdCodeT" |
weightTime | "CalDurationT" |
sealCtofcNo | "CtofcNoT" |
sccaCaseNo | "SccaCaseNoT" |
Indexes:
"CaseHist_pkey" PRIMARY KEY, btree ("countyNo", "caseNo", "histSeqNo")
"CaseHist_CaseHistCibRpt" btree ("countyNo", "eventDate", "eventType", "caseType")

bigbird=> \d "CaseTypeHistEvent"
View "public.CaseTypeHistEvent"
Column | Type | Modifiers
----------------+---------------+-----------
caseType | "CaseTypeT" |
eventType | "EventTypeT" |
descr | "EventDescrT" |
isActive | boolean |
isKeyEvent | boolean |
isMoneyEnabled | boolean |
keyEventSeqNo | integer |
countyNo | "CountyNoT" |
View definition:
SELECT b."caseType", b."eventType", b.descr, b."isActive",
CASE
WHEN d."eventType" IS NOT NULL THEN d."isKeyEvent"
ELSE b."isKeyEvent"
END AS "isKeyEvent",
CASE
WHEN d."eventType" IS NOT NULL THEN d."isMoneyEnabled"
ELSE b."isMoneyEnabled"
END AS "isMoneyEnabled", COALESCE(
CASE
WHEN d."eventType" IS NOT NULL THEN d."keyEventSeqNo"::smallint
ELSE b."keyEventSeqNo"::smallint
END::integer, 0) AS "keyEventSeqNo", c."countyNo"
FROM ONLY "CaseTypeHistEventB" b
JOIN ONLY "ControlRecord" c ON 1 = 1
LEFT JOIN ONLY "CaseTypeHistEventD" d ON d."caseType"::bpchar = b."caseType"::bpchar AND d."eventType"::bpchar = b."eventType"::bpchar AND d."countyNo"::smallint = c."countyNo"::smallint;

bigbird=> \d "CaseTypeHistEventB"
Table "public.CaseTypeHistEventB"
Column | Type | Modifiers
----------------+----------------+-----------
caseType | "CaseTypeT" | not null
eventType | "EventTypeT" | not null
descr | "EventDescrT" | not null
isActive | boolean | not null
isKeyEvent | boolean | not null
isMoneyEnabled | boolean | not null
keyEventSeqNo | "KeyEventSeqT" |
Indexes:
"CaseTypeHistEventB_pkey" PRIMARY KEY, btree ("caseType", "eventType") CLUSTER

bigbird=> \d "CaseTypeHistEventD"
Table "public.CaseTypeHistEventD"
Column | Type | Modifiers
----------------+----------------+-----------
countyNo | "CountyNoT" | not null
caseType | "CaseTypeT" | not null
eventType | "EventTypeT" | not null
isMoneyEnabled | boolean | not null
isKeyEvent | boolean | not null
keyEventSeqNo | "KeyEventSeqT" |
Indexes:
"CaseTypeHistEventD_pkey" PRIMARY KEY, btree ("countyNo", "caseType", "eventType")
"CaseTypeHistEventD_CaseType" btree ("caseType", "eventType")

bigbird=> select count(*), count("reopHistSeqNo") from "Charge";
count | count
----------+--------
14041511 | 141720
(1 row)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Radhika S 2007-10-09 20:12:56 Shared Buffer setting in postgresql.conf
Previous Message Radhika S 2007-10-09 20:00:04 Postgres running Very slowly