From: | Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com> |
---|---|
To: | Mark <mwchambers(at)gmail(dot)com> |
Cc: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2 |
Date: | 2018-12-23 15:28:11 |
Message-ID: | CAK-MWwSLPp0fcQXgmiFPLxjG7sD04YhVMd5_Bmq7kAWXXr2jTw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
Hi Mark,
It's look very weird.
Can you try something like this (check that you have enough disk space for
second copy of common_student before):
create table test_table AS SELECT * from common_student;
Vacuum analyze test_table;
explain analyze select * from test_table where school_id = 36;
drop table test_table;
create table test_table AS SELECT * from common_student ORDER BY school_id;
Vacuum analyze test_table;
explain analyze select * from test_table where school_id = 36;
drop table test_table;
And provide results of both explain analyze queries.
On Sat, Dec 22, 2018 at 3:39 AM Mark <mwchambers(at)gmail(dot)com> wrote:
> Hi All,
>
> I'm seeing some very slow queries and it looks like the query planner is
> deciding to do a 'Nested Loop Left Join'
> which is slow. When I SET enable_nestloop=OFF for testing it does a 'Hash
> Left Join' which is much faster.
>
> I think I can see the cause of the problem in the examples below. I can't
> make sense of the statistics being written
> by ANALYSE or the planners row estimates.
>
> Can anyone help me understand....
>
> - Why the row estimate I get in the query below for school_id = 36 is 1 ?
> ( I expect it to be higher)
> - Why does '1' appear in the most_common_vals when it is actually the
> least common value.
> - Why doesn't 36 appear in the most_common_vals (it is more common than 1)
> - Does the analyse output below mean that it only scanned 51538 of 65463
> rows in the table? Is school_id 36 just being missed in the sample? (This
> happens when the analyse is repeated )
>
> Any help with understanding what's happening here would be much
> appreciated.
>
> I hope I've provided enough information below.
>
> Thanks,
>
> Mark
>
> db=> explain analyse select * from common_student where school_id = 36 ;
> QUERY
> PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using idx_common_student_sid on common_student
> (cost=0.41..8.39 rows=1 width=385) (actual time=0.264..1.691 rows=1388
> loops=1)
> Index Cond: (school_id = 36)
> Planning time: 0.087 ms
> Execution time: 2.706 ms
> (4 rows)
>
> db=> select tablename,attname,most_common_vals,histogram_bounds,n_distinct
> from pg_stats where attname='school_id' and tablename='common_stude
> nt';
> tablename | attname |
> most_common_vals
> | histogram_bounds | n_distinct
>
> ----------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------+------------------+------------
> common_student | school_id |
> {79,60,25,61,59,69,86,77,64,33,56,78,58,81,41,97,22,67,38,23,3,72,92,93,48,24,96,26,75,90,70,52,51,21,14,91,83,54,85,11,68,94,53,88,1}
> | | 45
> (1 row)
>
> db=> select count(distinct(school_id)) from common_student ;
> count
> -------
> 55
> (1 row)
>
> db=> alter table common_student alter column school_id set statistics
> 100000;
> WARNING: lowering statistics target to 10000
> ALTER TABLE
> db=> analyse verbose common_student(school_id);
> INFO: analyzing "public.common_student"
> INFO: "common_student": scanned 7322 of 7322 pages, containing 65463 live
> rows and 49026 dead rows; 51538 rows in sample, 65463 estimated total rows
> ANALYZE
> db=> select tablename,attname,most_common_vals,histogram_bounds,n_distinct
> from pg_stats where attname='school_id' and tablename='common_stude
> nt';
> tablename | attname |
> most_common_vals
> | histogram_bounds | n_distinct
>
> ----------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------+------------------+------------
> common_student | school_id |
> {79,60,25,61,59,69,86,77,64,33,56,78,58,81,41,97,22,67,38,23,3,72,92,93,48,24,96,26,75,90,70,52,51,21,14,91,83,54,85,11,68,94,53,88,1}
> | | 45
> (1 row)
>
> db=> explain analyse select * from common_student where school_id = 36 ;
> QUERY
> PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using idx_common_student_sid on common_student
> (cost=0.41..8.39 rows=1 width=385) (actual time=0.542..4.022 rows=1388
> loops=1)
> Index Cond: (school_id = 36)
> Planning time: 0.334 ms
> Execution time: 6.542 ms
> (4 rows)
>
> db=> select school_id, count(*) from common_student group by school_id
> order by count(*) limit 6 ;
> school_id | count
> -----------+-------
> 1 | 50
> 88 | 161
> 53 | 252
> 94 | 422
> 31 | 434
> 68 | 454
> (6 rows)
>
> dvpjxbzc=> select school_id, count(*) from common_student where school_id
> = 36 group by school_id ;
> school_id | count
> -----------+-------
> 36 | 1388
> (1 row)
>
> db=> explain analyse select * from common_student where school_id = 1 ;
> QUERY PLAN
>
>
> ----------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on common_student (cost=4.91..243.76 rows=64 width=385)
> (actual time=0.128..0.267 rows=50 loops=1)
> Recheck Cond: (school_id = 1)
> Heap Blocks: exact=16
> -> Bitmap Index Scan on "unique common_student" (cost=0.00..4.90
> rows=64 width=0) (actual time=0.110..0.110 rows=50 loops=1)
> Index Cond: (school_id = 1)
> Planning time: 0.177 ms
> Execution time: 0.414 ms
> (7 rows)
>
> db=> select VERSION();
> version
>
>
> -----------------------------------------------------------------------------
> PostgreSQL 10.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2,
> 64-bit
> (1 row)
>
>
--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/
Phone RU: +7 985 433 0000
Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно
когда я так делаю ещё раз?"
From | Date | Subject | |
---|---|---|---|
Next Message | Mitar | 2018-12-23 18:10:50 | Re: Watching for view changes |
Previous Message | Istvan Soos | 2018-12-23 13:39:14 | logical replication resiliency |
From | Date | Subject | |
---|---|---|---|
Next Message | Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?= | 2018-12-23 15:33:34 | Re: [PATCH] Improve tab completion for CREATE TABLE |
Previous Message | Alvaro Herrera | 2018-12-23 14:59:22 | Re: CF app feature request |