Re: Hash join on int takes 8..114 seconds

From: "Andrus" <kobruleht2(at)hot(dot)ee>
To: "Scott Carey" <scott(at)richrelevance(dot)com>, "PFC" <lists(at)peufeu(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Hash join on int takes 8..114 seconds
Date: 2008-11-23 22:43:09
Message-ID: 8F4674DDA1F24158A615BE304697988B@andrusnotebook
Lists: pgsql-performance

Scott,

thank you.

>> > work_mem = 512
>This is very easy to try. You can change work_mem for just a single
>session, and this can in some cases help performance quite a bit, and in
>others not at all.
>I would not recommend having it lower than at least 4MB on a server like
>that unless you have a lot of concurrently active queries / connections.
>To try it, simply use the SET command. To try out 32MB, just do:
>SET work_mem = '32MB';

8.1.4 requires an integer value, and the unit is kilobytes.
SET work_mem = 33554432;
causes:
ERROR: 33554432 is outside the valid range for parameter "work_mem" (64 .. 2097151)
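
The valid range can also be read straight from pg_settings (just a sanity check; min_val/max_val should show the same 64 .. 2097151 range as the error message):

SELECT name, setting, min_val, max_val
FROM pg_settings
WHERE name = 'work_mem';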

So the maximum allowed value seems to be 2097151 kB, i.e. roughly 2 GB.
I tested this while the server was idle, running both queries several times
after VACUUM FULL had finished.

SET work_mem = 2097151;
set search_path to firma2,public;
explain analyze
SELECT sum(1)
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
WHERE rid.toode='X05' AND dok.kuupaev>='2008-09-01'
"Aggregate (cost=177291.36..177291.37 rows=1 width=0) (actual
time=5153.856..5153.859 rows=1 loops=1)"
" -> Nested Loop (cost=73607.45..177229.96 rows=24561 width=0) (actual
time=1395.935..5071.247 rows=21541 loops=1)"
" -> Index Scan using toode_pkey on toode (cost=0.00..6.01 rows=1
width=24) (actual time=0.078..0.087 rows=1 loops=1)"
" Index Cond: ('X05'::bpchar = toode)"
" -> Hash Join (cost=73607.45..176978.33 rows=24561 width=24)
(actual time=1395.836..4911.425 rows=21541 loops=1)"
" Hash Cond: ("outer".dokumnr = "inner".dokumnr)"
" -> Bitmap Heap Scan on rid (cost=4083.10..101802.05
rows=270316 width=28) (actual time=238.578..2367.189 rows=278247 loops=1)"
" Recheck Cond: (toode = 'X05'::bpchar)"
" -> Bitmap Index Scan on rid_toode_idx
(cost=0.00..4083.10 rows=270316 width=0) (actual time=150.868..150.868
rows=278248 loops=1)"
" Index Cond: (toode = 'X05'::bpchar)"
" -> Hash (cost=69242.72..69242.72 rows=112651 width=4)
(actual time=1146.081..1146.081 rows=110170 loops=1)"
" -> Bitmap Heap Scan on dok (cost=1492.28..69242.72
rows=112651 width=4) (actual time=46.210..696.803 rows=110170 loops=1)"
" Recheck Cond: (kuupaev >= '2008-09-01'::date)"
" -> Bitmap Index Scan on dok_kuupaev_idx
(cost=0.00..1492.28 rows=112651 width=0) (actual time=33.938..33.938
rows=110232 loops=1)"
" Index Cond: (kuupaev >=
'2008-09-01'::date)"
"Total runtime: 5154.911 ms"

SET work_mem to default;
set search_path to firma2,public;
explain analyze
SELECT sum(1)
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
WHERE rid.toode='X05' AND dok.kuupaev>='2008-09-01'
"Aggregate (cost=181869.36..181869.37 rows=1 width=0) (actual
time=7807.867..7807.871 rows=1 loops=1)"
" -> Nested Loop (cost=74048.45..181807.96 rows=24561 width=0) (actual
time=2607.429..7728.138 rows=21541 loops=1)"
" -> Index Scan using toode_pkey on toode (cost=0.00..6.01 rows=1
width=24) (actual time=0.079..0.091 rows=1 loops=1)"
" Index Cond: ('X05'::bpchar = toode)"
" -> Hash Join (cost=74048.45..181556.33 rows=24561 width=24)
(actual time=2607.332..7569.612 rows=21541 loops=1)"
" Hash Cond: ("outer".dokumnr = "inner".dokumnr)"
" -> Bitmap Heap Scan on rid (cost=4083.10..101802.05
rows=270316 width=28) (actual time=1147.071..4528.845 rows=278247 loops=1)"
" Recheck Cond: (toode = 'X05'::bpchar)"
" -> Bitmap Index Scan on rid_toode_idx
(cost=0.00..4083.10 rows=270316 width=0) (actual time=1140.337..1140.337
rows=278248 loops=1)"
" Index Cond: (toode = 'X05'::bpchar)"
" -> Hash (cost=69242.72..69242.72 rows=112651 width=4)
(actual time=1240.988..1240.988 rows=110170 loops=1)"
" -> Bitmap Heap Scan on dok (cost=1492.28..69242.72
rows=112651 width=4) (actual time=68.053..769.448 rows=110170 loops=1)"
" Recheck Cond: (kuupaev >= '2008-09-01'::date)"
" -> Bitmap Index Scan on dok_kuupaev_idx
(cost=0.00..1492.28 rows=112651 width=0) (actual time=61.358..61.358
rows=110232 loops=1)"
" Index Cond: (kuupaev >=
'2008-09-01'::date)"
"Total runtime: 7808.174 ms"

In both cases, vmstat 2 shows only CPU activity while the queries are running:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b  swpd    free  buff   cache   si  so   bi   bo   in   cs us sy  id wa
...
 0  0   232  123692     0 1891044    0   0    0    0  252   17  0  0 100  0
 0  0   232  123692     0 1891044    0   0    0    0  252   17  0  0 100  0
 0  0   232  123684     0 1891044    0   0    0  162  254   22  0  0 100  0
 0  0   232  123684     0 1891044    0   0    0    0  252   18  0  0 100  0
 1  0   232  123056     0 1891444    0   0    0   13  254   21 62  5  34  0  <---- start of slower query
 1  0   232  102968     0 1911060    0   0    0   16  252   18 26 75   0  0
 1  0   232   77424     0 1936996    0   0    0    0  252   18 37 63   0  0
 1  0   232   71464     0 1941928    0   0    0   73  260   34 38 62   0  0
 0  0   232  123420     0 1891044    0   0    0   32  257   31  8 15  77  0  <-------- end of slower query
 0  0   232  123420     0 1891044    0   0    0   25  255   24  0  0 100  0
 0  0   232  123420     0 1891044    0   0    0   28  255   27  0  0 100  0

Is it safe to set

work_mem = 2097151

in the postgresql.conf file?
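
If a global value that high is risky, I suppose the per-session alternative would look like the sketch below (the 32 MB figure is only a placeholder, not something I have benchmarked):

SET work_mem = 32768;   -- 32768 kB = 32 MB, for this session only
-- run the report query here
RESET work_mem;         -- back to the postgresql.conf default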

>First, demonstrate that it is all or mostly in memory -- use iostat or
>other tools to ensure that there is not much disk activity during the
>query. If your system doesn't have iostat installed, it should be
>installed. It is a very useful tool.

# iostat
bash: iostat: command not found
# locate iostat
/usr/src/linux-2.6.16-gentoo-r9/Documentation/iostats.txt

I have little experience with Linux, and no idea how to install or compile
iostat on this system.

>If it is all cached in memory, you may want to ensure that your
>shared_buffers is a reasonalbe size so that there is less shuffling of data
>from the kernel to postgres and back. Generally, shared_buffers works best
>between 5% and 25% of system memory.

currently shared_buffers = 15000
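
If I understand the 8.1 docs correctly, shared_buffers is a count of 8 kB buffers, so 15000 works out to roughly 117 MB (the arithmetic below is only a sanity check):

SHOW shared_buffers;                           -- 15000
SELECT 15000 * 8 / 1024 AS shared_buffers_mb;  -- about 117 MB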

>If it is completely CPU bound then the work done for the query has to be
>reduced by altering the plan to a more optimal one or making the work it
>has to do at each step easier. Most of the ideas in this thread revolve
>around those things.

When running on a loaded server, even after VACUUM FULL, the response time
with the original work_mem is longer, probably because blocks must be fetched
from disk.
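
I suppose the block-level statistics could confirm that; a minimal check (assuming stats_block_level is enabled, which I have not verified on this server) would be:

SELECT relname, heap_blks_read, heap_blks_hit
FROM pg_statio_user_tables
WHERE relname IN ('dok', 'rid');
-- heap_blks_read grows when blocks come from the kernel/disk,
-- heap_blks_hit when they are found in shared_buffers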

>Based on the time it took to do the vacuum, I suspect your disk subsystem
>is a bit slow. If it can be determined that there is much disk I/O in your
>use cases, there are generally several things that can be done to tune
>Linux I/O. The main ones in my experience are the 'readahead' value for
>each disk which helps sequential reads significantly, and trying out the
>linux 'deadline' scheduler and comparing it to the more commonly used 'cfq'
>scheduler. If the system is configured with the anticipatory scheduler,
>absolutely switch to cfq or deadline as the anticipatory scheduler will
>perform horribly poorly for a database.

This is a 3-year-old cheap server, and I have no idea what to configure there.

There is also another similar server with 1.2 GB more usable memory. I am not
sure whether it is worth switching to it; after a few years the sales data
will exceed that extra memory anyway.

Andrus.
