Re: Hash join on int takes 8..114 seconds

From: "Andrus" <kobruleht2(at)hot(dot)ee>
To: "Richard Huxton" <dev(at)archonet(dot)com>, "PFC" <lists(at)peufeu(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Hash join on int takes 8..114 seconds
Date: 2008-11-21 15:12:27
Message-ID: 75A918DFCF25470FAF96B3FE3D25A38B@andrusnotebook
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

PFC,

thank you.

> OK so vmstat says you are IO-bound, this seems logical if the same plan
> has widely varying timings...
>
> Let's look at the usual suspects :
>
> - how many dead rows in your tables ? are your tables data, or bloat ?
> (check vacuum verbose, etc)

set search_path to firma2,public;
vacuum verbose dok; vacuum verbose rid

INFO: vacuuming "firma2.dok"
INFO: index "dok_pkey" now contains 1235086 row versions in 9454 pages
DETAIL: 100 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.16s/0.38u sec elapsed 0.77 sec.
INFO: index "dok_dokumnr_idx" now contains 1235086 row versions in 9454
pages
DETAIL: 100 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.14s/0.40u sec elapsed 0.78 sec.
INFO: index "dok_klient_idx" now contains 1235086 row versions in 18147
pages
DETAIL: 887 index row versions were removed.
3265 index pages have been deleted, 3033 are currently reusable.
CPU 0.36s/0.46u sec elapsed 31.87 sec.
INFO: index "dok_krdokumnr_idx" now contains 1235086 row versions in 11387
pages
DETAIL: 119436 index row versions were removed.
1716 index pages have been deleted, 1582 are currently reusable.
CPU 0.47s/0.55u sec elapsed 63.38 sec.
INFO: index "dok_kuupaev_idx" now contains 1235101 row versions in 10766
pages
DETAIL: 119436 index row versions were removed.
659 index pages have been deleted, 625 are currently reusable.
CPU 0.62s/0.53u sec elapsed 40.20 sec.
INFO: index "dok_tasudok_idx" now contains 1235104 row versions in 31348
pages
DETAIL: 119436 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.18s/1.08u sec elapsed 118.97 sec.
INFO: index "dok_tasudok_unique_idx" now contains 99 row versions in 97
pages
DETAIL: 98 index row versions were removed.
80 index pages have been deleted, 80 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.48 sec.
INFO: index "dok_tasumata_idx" now contains 1235116 row versions in 11663
pages
DETAIL: 119436 index row versions were removed.
5340 index pages have been deleted, 5131 are currently reusable.
CPU 0.43s/0.56u sec elapsed 53.96 sec.
INFO: index "dok_tellimus_idx" now contains 1235122 row versions in 11442
pages
DETAIL: 119436 index row versions were removed.
1704 index pages have been deleted, 1569 are currently reusable.
CPU 0.45s/0.59u sec elapsed 76.91 sec.
INFO: index "dok_yksus_pattern_idx" now contains 1235143 row versions in
5549 pages
DETAIL: 119436 index row versions were removed.
529 index pages have been deleted, 129 are currently reusable.
CPU 0.19s/0.46u sec elapsed 2.72 sec.
INFO: index "dok_doktyyp" now contains 1235143 row versions in 3899 pages
DETAIL: 119436 index row versions were removed.
188 index pages have been deleted, 13 are currently reusable.
CPU 0.14s/0.44u sec elapsed 1.40 sec.
INFO: index "dok_sihtyksus_pattern_idx" now contains 1235143 row versions
in 5353 pages
DETAIL: 119436 index row versions were removed.
286 index pages have been deleted, 5 are currently reusable.
CPU 0.13s/0.45u sec elapsed 3.01 sec.
INFO: "dok": removed 119436 row versions in 13707 pages
DETAIL: CPU 0.80s/0.37u sec elapsed 14.15 sec.
INFO: "dok": found 119436 removable, 1235085 nonremovable row versions in
171641 pages
DETAIL: 2 dead row versions cannot be removed yet.
There were 1834279 unused item pointers.
0 pages are entirely empty.
CPU 6.56s/6.88u sec elapsed 450.54 sec.
INFO: vacuuming "pg_toast.pg_toast_40595"
INFO: index "pg_toast_40595_index" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_40595": found 0 removable, 0 nonremovable row versions in 0
pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "firma2.rid"
INFO: index "rid_pkey" now contains 3275197 row versions in 13959 pages
DETAIL: 38331 index row versions were removed.
262 index pages have been deleted, 262 are currently reusable.
CPU 0.42s/1.05u sec elapsed 58.56 sec.
INFO: index "rid_dokumnr_idx" now contains 3275200 row versions in 14125
pages
DETAIL: 38331 index row versions were removed.
572 index pages have been deleted, 571 are currently reusable.
CPU 0.49s/1.14u sec elapsed 71.57 sec.
INFO: index "rid_inpdokumnr_idx" now contains 3275200 row versions in 15103
pages
DETAIL: 38331 index row versions were removed.
579 index pages have been deleted, 579 are currently reusable.
CPU 0.66s/1.03u sec elapsed 68.38 sec.
INFO: index "rid_toode_idx" now contains 3275224 row versions in 31094
pages
DETAIL: 38331 index row versions were removed.
2290 index pages have been deleted, 2290 are currently reusable.
CPU 1.39s/1.58u sec elapsed 333.82 sec.
INFO: index "rid_rtellimus_idx" now contains 3275230 row versions in 7390
pages
DETAIL: 18591 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.18s/0.66u sec elapsed 1.78 sec.
INFO: index "rid_toode_pattern_idx" now contains 3275230 row versions in
16310 pages
DETAIL: 17800 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.44s/1.04u sec elapsed 6.55 sec.
INFO: "rid": removed 38331 row versions in 3090 pages
DETAIL: CPU 0.20s/0.10u sec elapsed 5.49 sec.
INFO: "rid": found 38331 removable, 3275189 nonremovable row versions in
165282 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 1878923 unused item pointers.
0 pages are entirely empty.
CPU 5.06s/7.27u sec elapsed 607.59 sec.

Query returned successfully with no result in 1058319 ms.

> - what's the size of the dataset relative to the RAM ?

Db size is 7417 MB
relevant table sizes in desc by size order:

1 40595 dok 2345 MB
2 1214 pg_shdepend 2259 MB
3 40926 rid 2057 MB
6 1232 pg_shdepend_depender_index 795 MB
7 1233 pg_shdepend_reference_index 438 MB
8 44286 dok_tasudok_idx 245 MB
9 44299 rid_toode_idx 243 MB
10 44283 dok_klient_idx 142 MB
11 19103791 rid_toode_pattern_idx 127 MB
14 44298 rid_inpdokumnr_idx 118 MB
15 44297 rid_dokumnr_idx 110 MB
16 43573 rid_pkey 109 MB
18 44288 dok_tasumata_idx 91 MB
19 44289 dok_tellimus_idx 89 MB
20 44284 dok_krdokumnr_idx 89 MB
21 44285 dok_kuupaev_idx 84 MB
23 43479 dok_pkey 74 MB
24 44282 dok_dokumnr_idx 74 MB
25 19076304 rid_rtellimus_idx 58 MB
26 18663923 dok_yksus_pattern_idx 43 MB
27 18801591 dok_sihtyksus_pattern_idx 42 MB
29 18774881 dok_doktyyp 30 MB
46 40967 toode 13 MB

server is HP Proliant DL320 G3
http://h18000.www1.hp.com/products/quickspecs/12169_ca/12169_ca.HTML
CPU is 2.93Ghz Celeron 256kb cache.

Server has 2 GB RAM.
It has SATA RAID 0,1 integrated controller (1.5Gbps) and SAMSUNG HD160JJ
mirrored disks.

> Now let's look more closely at the query :
>
> explain analyze
> SELECT sum(1)
> FROM dok JOIN rid USING (dokumnr)
> JOIN toode USING (toode)
> LEFT JOIN artliik using(grupp,liik)
> WHERE rid.toode='X05' AND dok.kuupaev>='2008-09-01'
>
>
> I presume doing the query without artliik changes nothing to the runtime,
> yes ?

Yes. After removing artkliik from join I got response times 65 and 50
seconds, so this does not make difference.

> Your problem here is that, no matter what, postgres will have to examine
> - all rows where dok.kuupaev>='2008-09-01',
> - and all rows where rid.toode = 'X05'.
> If you use dok.kuupaev>='2007-09-01' (note : 2007) it will probably have
> to scan many, many more rows.

Probably yes, since then it reads one year more sales data.

> If you perform this query often you could CLUSTER rid on (toode) and dok
> on (kuupaev), but this can screw other queries.

Some reports are by sales date (dok.kuupaev) and customers.
CLUSTER rid on (toode) slows them down. Also autovacuum cannot do
clustering.

> What is the meaning of the columns ?

This is typical sales data:

-- Receipt headers:
DOK ( dokumnr INT SERIAL PRIMARY KEY,
kuupaev DATE --- sales date
)

-- Receipt details
RID ( dokumnr INT,
toode CHAR(20), -- item code
CONSTRAINT rid_dokumnr_fkey FOREIGN KEY (dokumnr) REFERENCES dok
(dokumnr),
CONSTRAINT rid_toode_fkey FOREIGN KEY (toode)
REFERENCES firma2.toode (toode)
)

-- Products
TOODE (
toode CHAR(20) PRIMARY KEY
)

> To make this type of query faster I would tend to think about :

> - denormalization (ie adding a column in one of your tables and a
> multicolumn index)

For this query it is possible to duplicate kuupaev column to rid table.
However most of the this seems to go to scanning rid table, so I suspect
that this will help.

> - materialized views
> - materialized summary tables (ie. summary of sales for last month, for
> instance)

There are about 1000 items and reports are different.

Andrus.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message tv 2008-11-21 16:15:10 Re: Hash join on int takes 8..114 seconds
Previous Message Andrus 2008-11-21 13:49:24 Re: Hash join on int takes 8..114 seconds