Re: Planner cost adjustments

From: Daniel Begin <jfd553(at)hotmail(dot)com>
To: "'PT'" <wmoran(at)potentialtech(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>, "'Tomas Vondra'" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "'Melvin Davidson'" <melvin6925(at)gmail(dot)com>
Subject: Re: Planner cost adjustments
Date: 2015-06-02 18:01:35
Message-ID: COL129-DS143FA8EFA9B7BD99AED00594B50@phx.gbl
Lists: pgsql-general

Here is a follow-up on the step-by-step procedure proposed by PT.

#1 - Set up the PostgreSQL planner's cost estimate settings for my hardware
----------------------------------------------------------------------------
The current parameter values described in section 18.7.2 haven't been changed,
except for effective_cache_size:
seq_page_cost = 1
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.005
cpu_operator_cost = 0.0025
effective_cache_size = 10GB
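
For reference, here is how these settings can be inspected and tried out (a sketch; session-level SET shown, while ALTER SYSTEM, available since 9.4, would persist a change):

-- Current planner cost settings
SELECT name, setting
FROM pg_settings
WHERE name IN ('seq_page_cost', 'random_page_cost', 'cpu_tuple_cost',
               'cpu_index_tuple_cost', 'cpu_operator_cost',
               'effective_cache_size');

-- Try a different value for this session only
SET random_page_cost = 2;

-- Or persist a change, then reload the configuration
ALTER SYSTEM SET effective_cache_size = '10GB';
SELECT pg_reload_conf();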

I did a bunch of tests on frequently used queries to see how well they
perform, using SET enable_seqscan = ON/OFF.
As described earlier in this thread, the planner uses a Seq Scan on the table
even though an Index Scan is, in this case, 5 times faster!
Here are the logs of EXPLAIN ANALYSE on one such query...

osmdump=# SET enable_seqscan = ON;
osmdump=# EXPLAIN ANALYSE SELECT user_id, id AS changeset_id, closed FROM changesets WHERE changesets.user_id IN (SELECT id FROM sample.users);
---------------------------------------------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=21.50..819505.27 rows=726722 width=24) (actual time=1574.914..7444.938 rows=338568 loops=1)
   Hash Cond: (changesets.user_id = users.id)
   ->  Seq Scan on changesets  (cost=0.00..745407.22 rows=25139722 width=24) (actual time=0.002..4724.578 rows=25133929 loops=1)
   ->  Hash  (cost=14.00..14.00 rows=600 width=8) (actual time=0.165..0.165 rows=600 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 24kB
         ->  Seq Scan on users  (cost=0.00..14.00 rows=600 width=8) (actual time=0.003..0.073 rows=600 loops=1)
 Total runtime: 7658.715 ms
(7 rows)

osmdump=# SET enable_seqscan = OFF;
osmdump=# EXPLAIN ANALYSE SELECT user_id, id AS changeset_id, closed FROM changesets WHERE changesets.user_id IN (SELECT id FROM sample.users);
---------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=10000000015.94..10001072613.45 rows=726722 width=24) (actual time=0.268..1490.515 rows=338568 loops=1)
   ->  HashAggregate  (cost=10000000015.50..10000000021.50 rows=600 width=8) (actual time=0.207..0.531 rows=600 loops=1)
         ->  Seq Scan on users  (cost=10000000000.00..10000000014.00 rows=600 width=8) (actual time=0.003..0.037 rows=600 loops=1)
   ->  Index Scan using changesets_useridndx on changesets  (cost=0.44..1775.54 rows=1211 width=24) (actual time=0.038..2.357 rows=564 loops=600)
         Index Cond: (user_id = users.id)
 Total runtime: 1715.517 ms
(6 rows)
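
Rather than forcing the issue with enable_seqscan (which is what adds the 10000000000.00 disable cost visible above), a gentler experiment would be to lower random_page_cost for the session only and check whether the planner then picks the index plan by itself. A sketch, not something I have run yet:

-- Make random I/O look cheaper to the planner for this session only
SET random_page_cost = 1.5;  -- down from the default of 4
EXPLAIN ANALYSE SELECT user_id, id AS changeset_id, closed FROM changesets WHERE changesets.user_id IN (SELECT id FROM sample.users);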

#2 - Run ANALYZE DATABASE and look at performance/planning improvement
----------------------------------------------------------------------------
I ran ANALYZE DATABASE and then reran the query. It did not produce any
significant improvement, according to the EXPLAIN ANALYSE below...

osmdump=# SET enable_seqscan = ON;
osmdump=# EXPLAIN ANALYSE SELECT user_id, id AS changeset_id, closed FROM changesets WHERE changesets.user_id IN (SELECT id FROM sample.users);
---------------------------------------------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=21.50..819511.42 rows=729133 width=24) (actual time=1538.100..7307.743 rows=338568 loops=1)
   Hash Cond: (changesets.user_id = users.id)
   ->  Seq Scan on changesets  (cost=0.00..745390.84 rows=25138084 width=24) (actual time=0.027..4620.691 rows=25133929 loops=1)
   ->  Hash  (cost=14.00..14.00 rows=600 width=8) (actual time=0.300..0.300 rows=600 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 24kB
         ->  Seq Scan on users  (cost=0.00..14.00 rows=600 width=8) (actual time=0.022..0.187 rows=600 loops=1)
 Total runtime: 7519.254 ms
(7 rows)

osmdump=# SET enable_seqscan = OFF;
osmdump=# EXPLAIN ANALYSE SELECT user_id, id AS changeset_id, closed FROM changesets WHERE changesets.user_id IN (SELECT id FROM sample.users);
---------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=10000000015.94..10001090810.49 rows=729133 width=24) (actual time=0.268..1466.248 rows=338568 loops=1)
   ->  HashAggregate  (cost=10000000015.50..10000000021.50 rows=600 width=8) (actual time=0.205..0.530 rows=600 loops=1)
         ->  Seq Scan on users  (cost=10000000000.00..10000000014.00 rows=600 width=8) (actual time=0.003..0.035 rows=600 loops=1)
   ->  Index Scan using changesets_useridndx on changesets  (cost=0.44..1805.83 rows=1215 width=24) (actual time=0.036..2.314 rows=564 loops=600)
         Index Cond: (user_id = users.id)
 Total runtime: 1677.447 ms
(6 rows)
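
A side note on the numbers above: the 10000000000.00 component of the costs is just the artificial penalty that enable_seqscan = OFF adds to each seq scan node. Subtracting it, the planner prices the index-driven plan at about 1090810 (10001090810.49 - 10000000000.00) against 819511.42 for the hash semi join, i.e. roughly 1.3x more expensive, yet it actually runs about 4.5x faster (1677 ms vs. 7519 ms). So the cost model itself is mispricing the index scan here, not only the row estimates.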

#3 - Run EXPLAIN ANALYZE and look for discrepancies between the estimated and actual times
----------------------------------------------------------------------------
Looking at the above results, there are obvious discrepancies between the
expected and actual rows and times!
I dug a bit, exploring and trying to understand the concepts explained in...

http://www.postgresql.org/docs/9.4/static/planner-stats.html
http://www.postgresql.org/docs/8.1/static/planner-stats-details.html
http://www.postgresql.org/docs/9.2/static/view-pg-stats.html

Concerning the discrepancies between the actual and predicted number of rows,
I looked at what pg_stats says about user_id in the changesets table.
Here are the values provided to the planner...
avg_width = 8
histogram_bounds: the bin sizes vary between 50 and 150000, which makes sense, because if I had divided the column's values into groups of approximately equal population, I would have produced bins between 1 and 100000 (if sorted by frequency)
n_distinct = 20686 (there are actually 464858 distinct values for user_id in the table)
most_common_vals: the values make sense (I checked the frequency count of a couple of the most common user_id values)
correlation = 0.617782 (?)
most_common_elems, most_common_elem_freqs and elem_count_histogram were empty
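
For the record, the values above can be read straight from pg_stats with something like this (a sketch; 'public' as the schema name is an assumption):

-- Inspect the statistics the planner has for changesets.user_id
SELECT attname, avg_width, n_distinct, correlation,
       most_common_vals, most_common_freqs, histogram_bounds
FROM pg_stats
WHERE schemaname = 'public'   -- assumption: changesets lives in public
  AND tablename = 'changesets'
  AND attname = 'user_id';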

At this point, I wonder whether the assumptions behind the planner's
statistics produce such problems because the distribution of my data is not
uniform but follows a power law (some user_id values return millions of
records while others return only one).
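
If the skewed distribution is indeed the problem, two knobs might help (untested assumptions on my part, though both are standard features): raising the per-column statistics target so ANALYZE samples more of the tail, and overriding n_distinct with the known true value (possible since 9.0):

-- Sample user_id more aggressively than the default_statistics_target of 100
ALTER TABLE changesets ALTER COLUMN user_id SET STATISTICS 1000;
-- Pin n_distinct to the true count so the planner stops underestimating it
ALTER TABLE changesets ALTER COLUMN user_id SET (n_distinct = 464858);
-- Re-collect statistics so both changes take effect
ANALYZE changesets;
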
This is as far as I can go on my own. Maybe someone can provide more
explanation of the planner's behavior, and ways to go further to make it
work properly?

Best regards,
Daniel
