Re: More stable query plans via more predictable column statistics

From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Joel Jacobson <joel(at)trustly(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Stefan Litsche <stefan(dot)litsche(at)zalando(dot)de>
Subject: Re: More stable query plans via more predictable column statistics
Date: 2016-03-09 09:58:20
Message-ID: CACACo5T=a+GmEuvHq9R5bYJQPVJe8Fio-qyYtpnyzs_wrTkY5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Tue, Mar 8, 2016 at 9:10 PM, Joel Jacobson <joel(at)trustly(dot)com> wrote:

> On Wed, Mar 9, 2016 at 1:25 AM, Shulgin, Oleksandr
> <oleksandr(dot)shulgin(at)zalando(dot)de> wrote:
> > Thank you for spending your time to run these :-)
>
> n/p, it took like 30 seconds :-)
>

Great! I'm glad to hear it was as easy to use as I hoped for :-)

> I don't want to be asking for too much here, but is there a chance you
> could
> > try the effects of the proposed patch on an offline copy of your
> database?
>
> Yes, I think that should be possible.
>
> > Do you envision or maybe have experienced problems with query plans
> > referring to the columns that are near the top of the above hist_ratio
> > report? In other words: what are the practical implications for you with
> > the values being duplicated rather badly throughout the histogram like in
> > the example you shown?
>
> I don't know much about the internals of query planner,
> I just read the "57.1. Row Estimation Examples" to get a basic
> understanding.
>
> If I understand it correctly, if the histogram_bounds contains a lot
> of duplicated values,
> then the row estimation will be inaccurate, which in turn will trick
> the query planner
> into a sub-optimal plan?
>

Yes, basically it should matter the most for the equality comparison
operator, such that a MCV entry would provide more accurate selectivity
estimate (and the histogram is not used at all in this case anyway). For
the "less/greater-than" comparison both MCV list and histogram are used, so
the drawback of having repeated values in the histogram, in my
understanding is the same: less accurate selectivity estimates for the
values that could fall precisely into a bin which didn't make it into the
histogram.

We've had some problems lately with the query planner, or actually we've
> always
> had them but never noticed them nor cared about them, but now during peak
> times
> we've had short periods where we haven't been able to fully cope up
> with the traffic.
>
> I tracked down the most self_time-consuming functions and quickly saw
> how to optimize them.
> Many of them where on the form:
> SELECT .. FROM SomeBigTable WHERE Col1 = [some dynamic value] AND Col2
> = [some constant value] AND Col3 = [some other constant value]
> The number of rows matching the WHERE clause were very tiny, perfect
> match for a partial index:
> CREATE INDEX .. ON SomeBigTable USING btree (Col1) WHERE Col2 = [some
> constant value] AND Col3 = [some other constant value];
>
> Even though the new partial index matched the query perfectly, the
> query planner didn't want to use it. Instead it continued to use some
> other sub-optimal index.
>
> The only way to force it to use the correct index was to use the
> "+0"-trick which I recently learned from one of my colleagues:
> SELECT .. FROM SomeBigTable WHERE Col1 = [some dynamic value] AND
> Col2+0 = [some constant value] AND Col3+0 = [some other constant
> value]
> CREATE INDEX .. ON SomeBigTable USING btree (Col1) WHERE Col2+0 =
> [some constant value] AND Col3+0 = [some other constant value];
>
> By adding +0 to the columns, the query planner will as I understand it
> be extremely motivated to use the correct index, as otherwise it would
> have to do a seq scan on the entire big table, which would be very
> costly.
>
> I'm glad the trick worked, now the system is fast again.
>
> We're still on 9.1, so maybe these problems will go away once we upgrade
> to 9.5.
>

Hm... sounds like a planner bug to me. I'm not exceptionally aware of the
changes in partial index handling that were made after 9.1, though grepping
the commit log for "partial index" produces a number of hits after the date
of 9.1 release.

I don't know if these problems I described can be fixed by your patch,
> but I wanted to share this story since I know our systems (Trustly's
> and Zalando's) are quite similar in design,
> so maybe you have experienced something similar.
>

I would not expect this type of problem to be affected by the patch in any
way, though maybe I'm missing the complete picture here.

Also, I'm not aware of similar problems in our systems, but I can ask
around. :-)

Thank you.
--
Alex

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message 李海龙 2016-03-09 10:13:43 the include-timestamp data returned by pg_logical_slot_peek_changes() is always 2000-01-01 in 9.5.1
Previous Message Tomas Vondra 2016-03-09 09:54:09 Re: multivariate statistics v11