Re: Incorrect estimates on columns

From: Chris Kratz <chris(dot)kratz(at)vistashare(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Incorrect estimates on columns
Date: 2007-10-17 19:43:40
Message-ID: 200710171543.40840.chris.kratz@vistashare.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wednesday 17 October 2007 14:49, Tom Lane wrote:
> Chris Kratz <chris(dot)kratz(at)vistashare(dot)com> writes:
> > I'm struggling to get postgres to run a particular query quickly.
>
> The key problem seems to be the join size misestimate here:
> > -> Hash Join (cost=45.92..1251.07 rows=21 width=8)
> > (actual time=0.948..10.439 rows=1199 loops=1) Hash Cond: (par.activity =
> > a.activityid)
> > -> Bitmap Heap Scan on
> > rpt_agencyquestioncache_171_0 par (cost=21.92..1222.19 rows=1247
> > width=8) (actual time=0.415..3.081 rows=1199 loops=1) -> Hash
> > (cost=19.21..19.21 rows=383 width=4) (actual time=0.513..0.513 rows=383
> > loops=1)
>
> Evidently it's not realizing that every row of par will have a join
> partner, but why not? I suppose a.activityid is unique, and in most
> cases that I've seen the code seems to get that case right.
>
> Would you show us the pg_stats rows for par.activity and a.activityid?
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

Thanks for the reply and help Tom,

activityid is unique on the activity table.
activity on par is a child table to activity, with multiple rows per activityid.

Here are the pg_stats rows for par.activity and a.activityid.

# select * from pg_stats where tablename='activity' and attname='activityid';
schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+-----------+------------+-----------+-----------+------------+------------------+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------
public | activity | activityid | 0 | 4 | -1 | | | {232,2497,3137,3854,4210,5282,9318,11396,12265,12495,12760,13509,13753,15000,15379,15661,16791,17230,17703,18427,18987,19449,19846,20322,20574,20926,21210,21501,21733,22276,22519,23262,24197,24512,24898,25616,25893,26175,26700,27141,27509,27759,29554,29819,30160,30699,32343,32975,33227,33493,33753,33980,34208,34534,34780,35007,35235,35641,35922,36315,36678,37998,38343,38667,39046,39316,39778,40314,40587,40884,41187,41860,42124,42399,42892,43313,43546,43802,45408,45740,46030,46406,46655,47031,47556,47881,48190,48528,48810,49038,49319,49704,49978,50543,50916,51857,52134,52380,52691,53011,53356} | 0.703852
(1 row)

# select * from pg_stats where tablename='rpt_agencyquestioncache_171_0' and attname='activity';
schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+-------------------------------+----------+-----------+-----------+------------+---------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------
public | rpt_agencyquestioncache_171_0 | activity | 0 | 4 | 248 | {32905,32910,32912,32909,33530,32906,32931,33839,33837,32943,35267,33832,35552,35550,42040,39167} | {0.1471,0.125533,0.1114,0.0935667,0.0903667,0.0538,0.0378,0.0347667,0.0342667,0.0292667,0.0256333,0.0245333,0.0142333,0.0128333,0.0110333,0.00883333} | {32911,32953,32955,33745,33791,33811,33812,33813,33817,33820,33825,33827,33836,33838,33838,33843,33852,33859,33860,33862,33868,33869,33870,33872,33872,33872,33874,33875,33877,33879,33880,33881,33884,33885,33886,33886,33894,33899,33899,33905,33907,33911,33912,33915,33926,35549,35551,35551,35715,35716,35716,35717,35727,35734,39262,42010,42015,42015,42015,42015,42032,42032,42032,42042,42042,42045,43107,43108,43110,43111,43114,44017,44017,44017,44017,45824,46370,46370,46371,46371,46372,46372,46373,46373,46374,46375,46376,46377,46377,46378,46379,46387,52175,52177,52195,52204,52229,52447,52451,52454,53029} | -0.44304
(1 row)

-Chris

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Alvaro Herrera 2007-10-17 20:40:58 Re: Huge amount of memory consumed during transaction
Previous Message Tom Lane 2007-10-17 18:49:34 Re: Incorrect estimates on columns