From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | tgl(at)sss(dot)pgh(dot)pa(dot)us, alvherre(at)2ndquadrant(dot)com, petr(at)2ndquadrant(dot)com, jeff(dot)janes(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: multivariate statistics v14 |
Date: | 2016-03-28 08:42:28 |
Message-ID: | 95089064-e388-2cd1-ab62-7c88890eaf67@2ndquadrant.com |
Lists: | pgsql-hackers |
Hi,
On 03/26/2016 10:18 AM, Tatsuo Ishii wrote:
>> Fair point. Attached is v18 of the patch, after pgindent cleanup.
>
> Here is some feedback on the v18 patch.
>
> 1) regarding examples in create_statistics manual
>
> Here are the numbers I got. "with statistics" refers to the case where
> multivariate statistics are used. "without statistics" refers to the
> case where multivariate statistics are not used. The numbers denote
> estimated_rows/actual_rows, so closer to 1.0 is better. Some numbers
> are shown as a fraction to avoid division by zero. As I understand it,
> cases 1, 3, and 4 show that multivariate statistics are superior.
>
>        with statistics   without statistics
> case1  0.98              0.01
> case2  98/0              1/0
Case 2 shows that functional dependencies assume the conditions used in
a query are compatible with each other. When they are not, that's
something this type of statistics can't fix.
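
To illustrate the point about incompatible conditions, here is a minimal Python sketch. It is not the patch's code: the table contents, the columns a and b, and all function names are hypothetical, and the "dependency" estimator is a simplified assumption in which the clause on the dependent column is treated as implied and ignored.

```python
# Hypothetical table: column a functionally determines column b (b == a // 5).
rows = [(i // 100, i // 500) for i in range(10_000)]


def selectivity(pred):
    """Fraction of rows that satisfy the predicate."""
    return sum(1 for r in rows if pred(r)) / len(rows)


def estimate_independent(a_val, b_val):
    """Estimate assuming the two clauses are independent (no multivariate stats)."""
    sel_a = selectivity(lambda r: r[0] == a_val)
    sel_b = selectivity(lambda r: r[1] == b_val)
    return sel_a * sel_b * len(rows)


def estimate_with_dependency(a_val, b_val):
    """Simplified assumption: with a -> b known, the clause on b is
    considered implied by the clause on a, so only sel(a) is used."""
    return selectivity(lambda r: r[0] == a_val) * len(rows)


def actual(a_val, b_val):
    """Number of rows that really match both clauses."""
    return sum(1 for r in rows if r == (a_val, b_val))


# Compatible clauses (a = 1 implies b = 0) versus incompatible ones (a = 1, b = 1).
for a_val, b_val in [(1, 0), (1, 1)]:
    print(a_val, b_val,
          round(estimate_with_dependency(a_val, b_val)),
          round(estimate_independent(a_val, b_val)),
          actual(a_val, b_val))
```

For the compatible pair the dependency-based estimate matches the actual row count, while for the incompatible pair it stays the same even though no rows match, which is the kind of overestimate seen in case 2.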
> case3  1.05              0.01
> case4  1/0               103/0
> case5  18.50             18.33
> case6  111123/0          1111123/0
The last two lines (case5 + case6) seem a bit suspicious. I believe
those are for the histogram data, and I do get these numbers:
case5  0.93 (5517 / 5949)   42.0 (249943 / 5949)
case6  100/0                100/0
Perhaps you've been using the version before the bugfix, with ANALYZE on
the wrong table?
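
For reference, the estimated/actual figures used in this thread can be reproduced with a small helper like the one below. This is only a sketch: the function name and the two-decimal formatting are my own assumptions, and it follows the convention from the quoted table of printing a fraction when the actual row count is zero.

```python
def estimate_ratio(estimated, actual):
    """Format estimated_rows/actual_rows; closer to 1.0 is better.

    When actual is zero the ratio is undefined, so the raw fraction
    is shown instead of dividing.
    """
    if actual == 0:
        return f"{estimated}/0"
    return f"{estimated / actual:.2f}"


# Row counts taken from the case5 and case6 numbers above.
print(estimate_ratio(5517, 5949))
print(estimate_ratio(249943, 5949))
print(estimate_ratio(100, 0))
```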
>
> 2) The following comments of mine are not addressed in the v18 patch.
>
>> - There's no docs for pg_mv_statistic (should be added to "49. System
>> Catalogs")
>>
>> - The term "multivariate statistics" or something like that should
>> appear in the index.
>>
>> - There should be some explanation of how to deal with multivariate
>> statistics in the "14.1 Using Explain" and "14.2 Statistics used by
>> the Planner" sections.
Yes, those are valid omissions. I plan to address them, and I'm also
considering adding a section to 65.1 (How the Planner Uses Statistics),
explaining more thoroughly how the planner uses multivariate stats.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services