Re: Hash join on int takes 8..114 seconds

From: "Andrus" <kobruleht2(at)hot(dot)ee>
To: <tv(at)fuzzy(dot)cz>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Scott Carey" <scott(at)richrelevance(dot)com>, "PFC" <lists(at)peufeu(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Hash join on int takes 8..114 seconds
Date: 2008-11-24 21:16:27
Message-ID: B0B2992D8A2A4EB4BAFC4663EEC3EB60@andrusnotebook
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Tomas,

> OK, what was the number of unused pointer items in the VACUUM output?

I posted it in this thread:

VACUUM FULL ANALYZE VERBOSE;
...
INFO: free space map contains 14353 pages in 314 relations
DETAIL: A total of 20000 page slots are in use (including overhead).
89664 page slots are required to track all free space.
Current limits are: 20000 page slots, 1000 relations, using 182 KB.
NOTICE: number of page slots needed (89664) exceeds max_fsm_pages (20000)
HINT: Consider increasing the configuration parameter "max_fsm_pages" to a
value over 89664.

Query returned successfully with no result in 10513335 ms.

> The query performance is still the same as when the tables were bloated?

Seems to be the same.
However I improved yesterday after previous message other queries not to
scan whole
product orders (rid) table.
Now there is only few disk activity after 5-300 seconds which seems not to
affect to those query results. So issue in this thread has been gone away.

Now this query runs using constant time 8 seconds:

explain analyze
SELECT sum(1)
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
WHERE rid.toode = 'X05' AND dok.kuupaev>='2008-09-01'
"Aggregate (cost=182210.77..182210.78 rows=1 width=0) (actual
time=8031.600..8031.604 rows=1 loops=1)"
" -> Nested Loop (cost=74226.74..182149.27 rows=24598 width=0) (actual
time=2602.474..7948.121 rows=21711 loops=1)"
" -> Index Scan using toode_pkey on toode (cost=0.00..6.01 rows=1
width=24) (actual time=0.077..0.089 rows=1 loops=1)"
" Index Cond: ('X05'::bpchar = toode)"
" -> Hash Join (cost=74226.74..181897.28 rows=24598 width=24)
(actual time=2602.378..7785.315 rows=21711 loops=1)"
" Hash Cond: ("outer".dokumnr = "inner".dokumnr)"
" -> Bitmap Heap Scan on rid (cost=4084.54..101951.60
rows=270725 width=28) (actual time=1129.925..4686.601 rows=278417 loops=1)"
" Recheck Cond: (toode = 'X05'::bpchar)"
" -> Bitmap Index Scan on rid_toode_idx
(cost=0.00..4084.54 rows=270725 width=0) (actual time=1123.202..1123.202
rows=278426 loops=1)"
" Index Cond: (toode = 'X05'::bpchar)"
" -> Hash (cost=69419.29..69419.29 rows=112766 width=4)
(actual time=1251.496..1251.496 rows=111088 loops=1)"
" -> Bitmap Heap Scan on dok (cost=1492.68..69419.29
rows=112766 width=4) (actual time=70.837..776.249 rows=111088 loops=1)"
" Recheck Cond: (kuupaev >= '2008-09-01'::date)"
" -> Bitmap Index Scan on dok_kuupaev_idx
(cost=0.00..1492.68 rows=112766 width=0) (actual time=64.177..64.177
rows=111343 loops=1)"
" Index Cond: (kuupaev >=
'2008-09-01'::date)"
"Total runtime: 8031.905 ms"

Interestingly using like is 80 times faster:

explain analyze
SELECT sum(1)
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
WHERE rid.toode like 'X05' AND dok.kuupaev>='2008-09-01'
"Aggregate (cost=88178.69..88178.70 rows=1 width=0) (actual
time=115.335..115.339 rows=1 loops=1)"
" -> Hash Join (cost=71136.22..88117.36 rows=24532 width=0) (actual
time=115.322..115.322 rows=0 loops=1)"
" Hash Cond: ("outer".toode = "inner".toode)"
" -> Hash Join (cost=70163.36..86253.20 rows=24598 width=24)
(actual time=0.046..0.046 rows=0 loops=1)"
" Hash Cond: ("outer".dokumnr = "inner".dokumnr)"
" -> Bitmap Heap Scan on rid (cost=21.16..6307.52 rows=270725
width=28) (actual time=0.037..0.037 rows=0 loops=1)"
" Filter: (toode ~~ 'X05'::text)"
" -> Bitmap Index Scan on rid_toode_pattern_idx
(cost=0.00..21.16 rows=1760 width=0) (actual time=0.028..0.028 rows=0
loops=1)"
" Index Cond: (toode ~=~ 'X05'::bpchar)"
" -> Hash (cost=69419.29..69419.29 rows=112766 width=4)
(never executed)"
" -> Bitmap Heap Scan on dok (cost=1492.68..69419.29
rows=112766 width=4) (never executed)"
" Recheck Cond: (kuupaev >= '2008-09-01'::date)"
" -> Bitmap Index Scan on dok_kuupaev_idx
(cost=0.00..1492.68 rows=112766 width=0) (never executed)"
" Index Cond: (kuupaev >=
'2008-09-01'::date)"
" -> Hash (cost=853.29..853.29 rows=13429 width=24) (actual
time=114.757..114.757 rows=13412 loops=1)"
" -> Seq Scan on toode (cost=0.00..853.29 rows=13429
width=24) (actual time=0.014..58.319 rows=13412 loops=1)"
"Total runtime: 115.505 ms"

I posted also a test script in other thread which shows also that like is
magitude faster than equality check.

rid.toode = 'X05'

and

rid.toode like 'X05'

are exactly the same conditions, there are indexes for both conditions.

So I do'nt understand why results are so different.

In other sample which I posted in thread "Increasing pattern index query
speed" like is 4 times slower:

SELECT sum(1)
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
WHERE rid.toode = '99000010' AND dok.kuupaev BETWEEN '2008-11-21' AND
'2008-11-21'
AND dok.yksus LIKE 'ORISSAARE%'

"Aggregate (cost=43.09..43.10 rows=1 width=0) (actual
time=12674.675..12674.679 rows=1 loops=1)"
" -> Nested Loop (cost=29.57..43.08 rows=1 width=0) (actual
time=2002.045..12673.645 rows=19 loops=1)"
" -> Nested Loop (cost=29.57..37.06 rows=1 width=24) (actual
time=2001.922..12672.344 rows=19 loops=1)"
" -> Index Scan using dok_kuupaev_idx on dok (cost=0.00..3.47
rows=1 width=4) (actual time=342.812..9810.627 rows=319 loops=1)"
" Index Cond: ((kuupaev >= '2008-11-21'::date) AND
(kuupaev <= '2008-11-21'::date))"
" Filter: (yksus ~~ 'ORISSAARE%'::text)"
" -> Bitmap Heap Scan on rid (cost=29.57..33.58 rows=1
width=28) (actual time=8.948..8.949 rows=0 loops=319)"
" Recheck Cond: (("outer".dokumnr = rid.dokumnr) AND
(rid.toode = '99000010'::bpchar))"
" -> BitmapAnd (cost=29.57..29.57 rows=1 width=0)
(actual time=8.930..8.930 rows=0 loops=319)"
" -> Bitmap Index Scan on rid_dokumnr_idx
(cost=0.00..2.52 rows=149 width=0) (actual time=0.273..0.273 rows=2
loops=319)"
" Index Cond: ("outer".dokumnr =
rid.dokumnr)"
" -> Bitmap Index Scan on rid_toode_idx
(cost=0.00..26.79 rows=1941 width=0) (actual time=8.596..8.596 rows=15236
loops=319)"
" Index Cond: (toode = '99000010'::bpchar)"
" -> Index Scan using toode_pkey on toode (cost=0.00..6.01 rows=1
width=24) (actual time=0.043..0.048 rows=1 loops=19)"
" Index Cond: ('99000010'::bpchar = toode)"
"Total runtime: 12675.191 ms"

explain analyze SELECT sum(1)
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
WHERE rid.toode like '99000010%' AND dok.kuupaev BETWEEN '2008-11-21' AND
'2008-11-21'
AND dok.yksus LIKE 'ORISSAARE%'

"Aggregate (cost=17.99..18.00 rows=1 width=0) (actual
time=46465.609..46465.613 rows=1 loops=1)"
" -> Nested Loop (cost=0.00..17.98 rows=1 width=0) (actual
time=3033.947..46465.462 rows=19 loops=1)"
" -> Nested Loop (cost=0.00..11.96 rows=1 width=24) (actual
time=3033.890..46464.310 rows=19 loops=1)"
" Join Filter: ("outer".dokumnr = "inner".dokumnr)"
" -> Index Scan using dok_kuupaev_idx on dok (cost=0.00..5.93
rows=1 width=4) (actual time=0.044..5.419 rows=319 loops=1)"
" Index Cond: ((kuupaev >= '2008-11-21'::date) AND
(kuupaev <= '2008-11-21'::date))"
" Filter: (yksus ~~ 'ORISSAARE%'::text)"
" -> Index Scan using rid_toode_pattern_idx on rid
(cost=0.00..6.01 rows=1 width=28) (actual time=0.025..88.046 rows=15322
loops=319)"
" Index Cond: ((toode ~>=~ '99000010'::bpchar) AND (toode
~<~ '99000011'::bpchar))"
" Filter: (toode ~~ '99000010%'::text)"
" -> Index Scan using toode_pkey on toode (cost=0.00..6.01 rows=1
width=24) (actual time=0.034..0.039 rows=1 loops=19)"
" Index Cond: ("outer".toode = toode.toode)"
"Total runtime: 46465.833 ms"

> What are the outputs of iostat/vmstat/dstat/top when running the query?

iostat and dstat are not installed in Gentoo.
For last like query which constantly takes 46 seconds vmstat 5 shows:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id
wa
0 0 216 124780 0 1899236 0 0 10 13 277 78 3 1 95
0
0 0 216 124656 0 1899248 0 0 2 8 267 59 2 0 98
0
1 0 216 124664 0 1899244 0 0 0 7 266 62 2 0 97
0
0 0 216 124788 0 1899248 0 0 0 8 273 73 3 1 96
0
0 0 216 124656 0 1899252 0 0 0 2 265 54 2 0 97
0
1 0 216 124656 0 1899252 0 0 0 22 268 63 14 39 48
0 <-------- start of query
1 0 216 124656 0 1899252 0 0 0 62 267 61 25 75 0
0
1 0 216 124656 0 1899252 0 0 0 2 266 55 28 72 0
0
1 0 216 124664 0 1899256 0 0 0 5 265 56 26 74 0
0
1 0 216 124788 0 1899256 0 0 0 10 271 67 25 75 0
0
1 0 216 124664 0 1899256 0 0 0 3 265 54 25 75 0
0
1 0 216 124160 0 1899260 0 0 0 1 265 57 28 72 0
0
1 0 216 125020 0 1899260 0 0 0 21 272 60 28 72 0
0
1 0 216 124020 0 1899264 0 0 0 0 271 73 29 71 0
0
0 0 216 124260 0 1899268 0 0 0 3 266 61 19 59 22
0 <------ end of query
1 0 216 125268 0 1899260 0 0 0 9 268 59 2 0 97
0
0 0 216 124912 0 1899268 0 0 0 5 270 59 3 0 96
0
0 0 216 124656 0 1899272 0 0 0 5 267 64 2 0 98
0
0 0 216 124664 0 1899272 0 0 0 0 263 50 2 0 98
0

top shows single postmaster process activity:

top - 23:07:49 up 27 days, 4:20, 1 user, load average: 0.25, 0.22, 0.12
Tasks: 50 total, 3 running, 47 sleeping, 0 stopped, 0 zombie
Cpu(s): 29.3% us, 70.7% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 2075828k total, 1951604k used, 124224k free, 0k buffers
Swap: 3911816k total, 216k used, 3911600k free, 1899348k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29687 postgres 25 0 144m 124m 121m R 96.3 6.2 8:18.04 postmaster
8147 root 16 0 4812 1628 1316 S 1.0 0.1 0:00.03 sshd
8141 root 15 0 5080 1892 1528 S 0.7 0.1 0:00.02 sshd
8145 sshd 15 0 4816 1220 912 S 0.3 0.1 0:00.01 sshd
8151 sshd 15 0 4812 1120 816 S 0.3 0.1 0:00.01 sshd
1 root 16 0 1480 508 444 S 0.0 0.0 0:01.89 init
2 root 34 19 0 0 0 S 0.0 0.0 0:00.01 ksoftirqd/0
3 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/0
4 root 10 -5 0 0 0 S 0.0 0.0 0:00.63 khelper
5 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kthread
7 root 10 -5 0 0 0 S 0.0 0.0 2:21.73 kblockd/0
8 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 kacpid
115 root 13 -5 0 0 0 S 0.0 0.0 0:00.00 aio/0
114 root 15 0 0 0 0 S 0.0 0.0 9:21.41 kswapd0
116 root 10 -5 0 0 0 S 0.0 0.0 0:12.06 xfslogd/0
117 root 10 -5 0 0 0 S 0.0 0.0 1:36.43 xfsdatad/0
706 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kseriod
723 root 13 -5 0 0 0 S 0.0 0.0 0:00.00 kpsmoused
738 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 ata/0
740 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_0
741 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_1
742 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_2
743 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_3
762 root 10 -5 0 0 0 S 0.0 0.0 0:18.64 xfsbufd
763 root 10 -5 0 0 0 S 0.0 0.0 0:00.79 xfssyncd
963 root 16 -4 1712 528 336 S 0.0 0.0 0:00.24 udevd
6677 root 15 0 1728 572 400 S 0.0 0.0 0:10.14 syslog-ng
7183 root 15 0 3472 828 672 S 0.0 0.0 0:08.50 sshd
7222 root 16 0 1736 672 556 S 0.0 0.0 0:00.03 cron
7237 root 16 0 1620 712 608 S 0.0 0.0 0:00.00 agetty
7238 root 17 0 1616 712 608 S 0.0 0.0 0:00.00 agetty
7239 root 16 0 1616 712 608 S 0.0 0.0 0:00.00 agetty
7240 root 16 0 1616 708 608 S 0.0 0.0 0:00.00 agetty
7241 root 16 0 1616 708 608 S 0.0 0.0 0:00.00 agetty
31873 root 15 0 1616 712 608 S 0.0 0.0 0:00.00 agetty
14908 postgres 16 0 141m 10m 9936 S 0.0 0.5 0:01.44 postmaster
14915 postgres 16 0 8468 1360 896 S 0.0 0.1 0:00.36 postmaster

Andrus.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2008-11-24 21:24:47 Re: Monitoring buffercache...
Previous Message Andrus 2008-11-24 20:33:37 Re: Increasing pattern index query speed