Re: Problem with grouping, uses Sort and GroupAggregate, HashAggregate is better(?)

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Rolf Østvik (HA/EXA) <rolf(dot)ostvik(at)ericsson(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Problem with grouping, uses Sort and GroupAggregate, HashAggregate is better(?)
Date: 2007-01-14 16:52:52
Message-ID: 909975DE-A4C1-4ECE-9A35-162C413E4137@fastcrypt.com
Lists: pgsql-performance


On 14-Jan-07, at 10:34 AM, Rolf Østvik (HA/EXA) wrote:

> Computer:
> Dell PowerEdge 2950
> openSUSE Linux 10.1
> Intel(R) Xeon 3.00GHz
> 4GB memory
> xfs filesystem on SAS disks
>
>> -----Original Message-----
>> From: pgsql-performance-owner(at)postgresql(dot)org
>> [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of
>> Rolf Østvik (HA/EXA)
>> Sent: Sunday, January 14, 2007 1:44 PM
>> To: pgsql-performance(at)postgresql(dot)org
>> Subject: [PERFORM] Problem with grouping, uses Sort and
>> GroupAggregate, HashAggregate is better(?)
>>
>> (now with a more sensible subject)
>>
>> I have been trying to change many parameters on server versions
>> 7.4.15, 8.1.4, 8.2.0 and 8.2.1. I still hope I have managed to keep
>> my head straight and that I do not present too much faulty
>> information.
>>
>> The cost estimates generated by the different server versions differ.
>> I have a query which (as far as I can tell) has some strange
>> differences between 8.2.0 and 8.2.1. I can provide information
>> about that if anyone wants it.
>>
>> Generally these parameters are used:
>> default_statistics_target = 10
>> (4 selected columns are set to 1000)
>> (I have tested with 1000 as the default value
>> but that did not have an impact)
>> (analyzed whenever a value was changed)
>> shared_buffers = 64000 (512MB)

Try doubling shared_buffers.
>> work_mem/sort_mem = variable, see different run's
>> effective_cache_size = 128000 (1G)

Try tripling effective_cache_size (this setting does not actually
allocate memory; it only tells the planner how much data it should
expect to find already cached in the buffers).
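The two suggestions above can be tried before touching postgresql.conf permanently. A minimal sketch, assuming the 4GB machine described above and the 8.x convention of specifying these settings in 8kB pages (the doubled/tripled values are illustrative, not tuned):

```sql
-- Sketch only: values assume the 4GB box above; units are 8kB pages on 8.2.
-- shared_buffers cannot be changed per session; it needs a postgresql.conf
-- edit and a server restart:
--   shared_buffers = 128000          -- ~1GB, double the original 512MB
-- effective_cache_size can be tried per session first; it allocates no
-- memory and only informs the planner:
SET effective_cache_size = 384000;    -- ~3GB, triple the original 1GB
-- then re-run the EXPLAIN ANALYZE from the original post to compare plans
```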

>> random_page_cost = 2
>> cpu_index_tuple_cost = 0.001
>> cpu_operator_cost = 0.025
>> cpu_tuple_cost = 0.01
>>
>> I have tested with different values for random_page_cost and
>> cpu_*_cost but it has not made a difference.
>> (I have tried random_page_cost between 1 and 8,
>> and cpu_*_cost at the standard value and 50x bigger.)

This is a dubious setting to play with. random_page_cost is the cost
of a random page fetch relative to a sequential one; 4 is generally
the right number, unless you are using a *very* fast disk or a RAM disk.
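For what it's worth, these planner costs can be experimented with per session rather than in postgresql.conf. A hedged sketch against the tables from the original post (the abbreviated query is illustrative, not the full query):

```sql
-- Session-local experiment: nothing is written to postgresql.conf.
SET random_page_cost = 4;    -- back to the default random/sequential ratio
SET work_mem = 8700;         -- in kB; the HashAggregate threshold seen in Run 2
EXPLAIN ANALYZE
SELECT ur.id, count(*) AS num_test    -- abbreviated version of the query below
FROM uut_result AS ur
INNER JOIN step_result AS sr ON ur.id = sr.uut_result
WHERE ur.id BETWEEN 174000 AND 174999
GROUP BY ur.id;
RESET random_page_cost;      -- restore the server defaults afterwards
RESET work_mem;
```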
>>

>> Query is:
>> explain
>> analyze
>> select
>> ur.id as ur_id,
>> ur.unit_ref,
>> ur.execution_time,
>> u.serial_number,
>> to_char(ur.start_date_time, 'YYYY-MM-DD'),
>> count(*) as num_test
>> from
>> uut_result as ur
>> inner join units as u
>> on ur.unit_ref=u.ref
>> inner join step_result as sr
>> on ur.id=sr.uut_result
>> where
>> ur.id between 174000 and 174000+999
>> group by
>> ur.id,
>> ur.unit_ref,
>> ur.execution_time,
>> u.serial_number,
>> ur.start_date_time
>> -- order by
>> -- ur.start_date_time
>> ;
>> NB: order by clause is used in some results below.
>>
>> === Run 1:
>> Detect work_mem setting influence (See also Run 2)
>> - server version 8.2.1
>> - Query executed without "order by" clause
>> - work_mem = 8600;
>> QUERY PLAN
>>
>> ---------------------------------------------
>> GroupAggregate (cost=44857.70..47976.79 rows=95972
>> width=37) (actual time=1802.716..2017.337 rows=1000 loops=1)
>> -> Sort (cost=44857.70..45097.63 rows=95972 width=37)
>> (actual time=1802.461..1892.743 rows=138810 loops=1)
>> Sort Key: ur.id, ur.unit_ref, ur.execution_time,
>> u.serial_number, ur.start_date_time
>> -> Nested Loop (cost=0.00..36915.87 rows=95972
>> width=37) (actual time=0.063..268.186 rows=138810 loops=1)
>> -> Nested Loop (cost=0.00..5017.65 rows=981
>> width=37) (actual time=0.047..11.919 rows=1000 loops=1)
>> -> Index Scan using uut_result_pkey on
>> uut_result ur (cost=0.00..1538.77 rows=1000 width=24)
>> (actual time=0.029..1.727 rows=1000 loops=1)
>> Index Cond: ((id >= 174000) AND
>> (id <= 174999))
>> -> Index Scan using units_pkey on units
>> u (cost=0.00..3.47 rows=1 width=17) (actual
>> time=0.006..0.007 rows=1 loops=1000)
>> Index Cond: (ur.unit_ref = u.ref)
>> -> Index Scan using uut_result_key on
>> step_result sr (cost=0.00..30.82 rows=136 width=4) (actual
>> time=0.011..0.125 rows=139 loops=1000)
>> Index Cond: (ur.id = sr.uut_result)
>> Total runtime: 2021.833 ms
>> (12 rows)
>>
>>
>> === Run 2:
>> Detect work_mem setting influence (See also Run 1)
>> - server version 8.2.1
>> - Query executed without "order by" clause
>> - work_mem = 8700;
>> QUERY PLAN
>>
>> ---------------------------------------------
>> HashAggregate (cost=38355.45..39795.03 rows=95972 width=37)
>> (actual time=436.406..439.867 rows=1000 loops=1)
>> -> Nested Loop (cost=0.00..36915.87 rows=95972 width=37)
>> (actual time=0.066..256.235 rows=138810 loops=1)
>> -> Nested Loop (cost=0.00..5017.65 rows=981
>> width=37) (actual time=0.049..10.858 rows=1000 loops=1)
>> -> Index Scan using uut_result_pkey on
>> uut_result ur (cost=0.00..1538.77 rows=1000 width=24)
>> (actual time=0.031..1.546 rows=1000 loops=1)
>> Index Cond: ((id >= 174000) AND (id <= 174999))
>> -> Index Scan using units_pkey on units u
>> (cost=0.00..3.47 rows=1 width=17) (actual time=0.005..0.006
>> rows=1 loops=1000)
>> Index Cond: (ur.unit_ref = u.ref)
>> -> Index Scan using uut_result_key on step_result
>> sr (cost=0.00..30.82 rows=136 width=4) (actual
>> time=0.011..0.123 rows=139 loops=1000)
>> Index Cond: (ur.id = sr.uut_result)
>> Total runtime: 441.193 ms
>> (10 rows)
>>
>> === Comment on Run 1 versus Run 2 (adjusted work_mem) ===
>> The difference in setup is value of work_mem. Bigger work_mem
>> gave different
>> cost estimates and selected HashAggregate instead of GroupAggregate.
>> Result was a reduced runtime. I guess that is as expected.
>>
>> (One remark: on version 8.1.5 the switchover between the
>> different plans happened between work_mem values of 6800
>> and 6900.)
>>
>> === Run 3 (with order by clause):
>> Test "group by" and "order by" (See also Run 1 and Run 4)
>> - server version 8.2.1
>> - Query executed with "order by" clause
>> - work_mem = 8700
>> (tried values from 2000 to 128000 with same cost and plan
>> as result)
>> QUERY PLAN
>>
>> ---------------------------------------------
>> GroupAggregate (cost=44857.70..47976.79 rows=95972
>> width=37) (actual time=1891.464..2114.462 rows=1000 loops=1)
>> -> Sort (cost=44857.70..45097.63 rows=95972 width=37)
>> (actual time=1891.263..1982.137 rows=138810 loops=1)
>> Sort Key: ur.start_date_time, ur.id, ur.unit_ref,
>> ur.execution_time, u.serial_number
>> -> Nested Loop (cost=0.00..36915.87 rows=95972
>> width=37) (actual time=0.064..264.358 rows=138810 loops=1)
>> -> Nested Loop (cost=0.00..5017.65 rows=981
>> width=37) (actual time=0.047..12.253 rows=1000 loops=1)
>> -> Index Scan using uut_result_pkey on
>> uut_result ur (cost=0.00..1538.77 rows=1000 width=24)
>> (actual time=0.029..1.743 rows=1000 loops=1)
>> Index Cond: ((id >= 174000) AND
>> (id <= 174999))
>> -> Index Scan using units_pkey on units
>> u (cost=0.00..3.47 rows=1 width=17) (actual
>> time=0.006..0.007 rows=1 loops=1000)
>> Index Cond: (ur.unit_ref = u.ref)
>> -> Index Scan using uut_result_key on
>> step_result sr (cost=0.00..30.82 rows=136 width=4) (actual
>> time=0.011..0.124 rows=139 loops=1000)
>> Index Cond: (ur.id = sr.uut_result)
>> Total runtime: 2118.986 ms
>> (12 rows)
>>
>> === Run 4 (with order by clause, on server 8.1.4):
>> Test "group by" and "order by" (See also Run 1 and Run 3)
>> - server version 8.1.4
>> - Query executed with "order by" clause
>> - work_mem = 6900
>> (same plan select for all work_mem values above 6900)
>> QUERY PLAN
>>
>> ------------------------------------------------------------
>> Sort (cost=46578.83..46820.66 rows=96734 width=37) (actual
>> time=505.562..505.988 rows=1000 loops=1)
>> Sort Key: ur.start_date_time
>> -> HashAggregate (cost=37117.40..38568.41 rows=96734
>> width=37) (actual time=498.697..502.374 rows=1000 loops=1)
>> -> Nested Loop (cost=0.00..35666.39 rows=96734
>> width=37) (actual time=0.058..288.270 rows=138810 loops=1)
>> -> Nested Loop (cost=0.00..5342.20 rows=984
>> width=37) (actual time=0.042..11.773 rows=1000 loops=1)
>> -> Index Scan using uut_result_pkey on
>> uut_result ur (cost=0.00..1626.46 rows=1003 width=24)
>> (actual time=0.020..1.868 rows=1000 loops=1)
>> Index Cond: ((id >= 174000) AND
>> (id <= 174999))
>> -> Index Scan using units_pkey on units
>> u (cost=0.00..3.69 rows=1 width=17) (actual
>> time=0.006..0.007 rows=1 loops=1000)
>> Index Cond: ("outer".unit_ref = u.ref)
>> -> Index Scan using uut_result_key on
>> step_result sr (cost=0.00..29.09 rows=138 width=4) (actual
>> time=0.006..0.146 rows=139 loops=1000)
>> Index Cond: ("outer".id = sr.uut_result)
>> Total runtime: 507.452 ms
>> (12 rows)
>>
>> === Comment on selected plan for 8.2.1 when using "order by" ===
>> Run 3 (8.2.1 with order by) selects the same plan as Run 1
>> (without order by).
>> There does however exist a better plan for Run 3, and 8.1.5
>> manages to select that plan (shown in Run 4).
>> Both versions (8.1.5 and 8.2.1) use the same plan up to the
>> uppermost Nested Loop.
>> The big difference is that 8.1.5 will then do a HashAggregate
>> and then a sort, while 8.2.1 does a sort and then a GroupAggregate.
>>
>> I have tried different combinations of statistics_target,
>> cpu_*_cost, work_mem and random_page_cost without finding a
>> solution.
>>
>> Anyone with an idea of what to do? Feel free to suggest one
>> of the above parameters; I might have overlooked some
>> combination.
>>
>> I am a little unsure of how much extra information is
>> necessary, but I will provide some:
>>
>> The three tables are:
>> units List of produced items
>> uut_result Summary of test results
>> step_result Individual test results
>> The system is a production test log. (There are a lot of units
>> which do not have an entry in uut_result.)
>>
>> Table "public.units"
>> Column | Type |
>> Modifiers
>> ------------------+-----------------------+-------------------
>> ----------
>> ------------------+-----------------------+-------------------
>> ----------
>> ------------------+-----------------------+----------
>> ref | integer | not null default
>> nextval(('public.units_ref_seq'::text)::regclass)
>> serial_number | character varying(30) | not null
>> product_ref | integer | not null
>> week | integer | not null
>> status | integer | not null
>> comment | text |
>> last_user | text | default "current_user"()
>> last_date | date | default
>> ('now'::text)::date
>> product_info_ref | integer | not null
>> Indexes:
>> "units_pkey" PRIMARY KEY, btree (ref)
>> "units_no_sno_idx" UNIQUE, btree (product_ref, week)
>> WHERE serial_number::text = ''::text
>> "units_serial_number_idx" UNIQUE, btree (serial_number,
>> product_info_ref) WHERE serial_number::text <> ''::text
>> "units_product_ref_key" btree (product_ref)
>> Triggers:
>> ct_unit_update_log AFTER UPDATE ON units FOR EACH ROW
>> EXECUTE PROCEDURE cf_unit_update_log()
>> ct_units_update_product_info_ref BEFORE INSERT OR UPDATE
>> ON units FOR EACH ROW EXECUTE PROCEDURE
>> cf_units_update_product_info_ref()
>> select count(*) from units => 292 676 rows
>>
>> Table "public.uut_result"
>> Column | Type |
>> Modifiers
>> -------------------+-----------------------------+------------
>> ----------
>> -------------------+-----------------------------+------------
>> ----------
>> -------------------+-----------------------------+--------
>> id | integer | not null
>> uut_serial_number | text |
>> unit_ref | integer |
>> order_unit_ref | integer |
>> user_login_name | text |
>> start_date_time | timestamp without time zone |
>> execution_time | double precision |
>> uut_status | text |
>> uut_error_code | integer |
>> uut_error_message | text |
>> last_user | text | default
>> "current_user"()
>> last_timestamp | timestamp with time zone | default
>> ('now'::text)::timestamp(6) with time zone
>> test_name | text |
>> teststation_name | text |
>> teststation_ref | integer |
>> process_step_ref | integer |
>> Indexes:
>> "uut_result_pkey" PRIMARY KEY, btree (id)
>> "uut_result_start_date_time_idx" btree (start_date_time)
>> "uut_result_test_name" btree (test_name)
>> Triggers:
>> ct_set_process_step_ref BEFORE INSERT OR UPDATE ON
>> uut_result FOR EACH ROW EXECUTE PROCEDURE
>> cf_set_process_step_ref()
>> select count(*) from uut_result => 180 111 rows
>>
>> Table "public.step_result"
>> Column | Type | Modifiers
>> --------------------+------------------+-----------
>> id | integer | not null
>> uut_result | integer |
>> step_parent | integer |
>> step_name | text |
>> step_extra_info | text |
>> step_type | text |
>> status | text |
>> report_text | text |
>> error_code | integer |
>> error_message | text |
>> module_time | double precision |
>> total_time | double precision |
>> num_loops | integer |
>> num_passed | integer |
>> num_failed | integer |
>> ending_loop_index | integer |
>> loop_index | integer |
>> interactive_exenum | integer |
>> step_group | text |
>> step_index | integer |
>> order_number | integer |
>> pass_fail | integer |
>> numeric_value | double precision |
>> high_limit | double precision |
>> low_limit | double precision |
>> comp_operator | text |
>> string_value | text |
>> string_limit | text |
>> button_pressed | integer |
>> response | text |
>> exit_code | integer |
>> num_limits_in_file | integer |
>> num_rows_in_file | integer |
>> num_limits_applied | integer |
>> sequence_name | text |
>> sequence_file_path | text |
>> Indexes:
>> "step_result_pkey" PRIMARY KEY, btree (id)
>> "step_parent_key" btree (step_parent)
>> "temp_index_idx" btree (sequence_file_path)
>> "uut_result_key" btree (uut_result)
>> select count(*) from step_result => 17 624 657 rows
>>
>> Best regards
>> Rolf Østvik
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 5: don't forget to increase your free space map settings
>>
>
