Re: Scanning all partition when more than 100 items in "where id in ()" clause

From: Soni M <diptatapa(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Postgres Bug <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Scanning all partition when more than 100 items in "where id in ()" clause
Date: 2018-07-27 06:11:39
Message-ID: CAAMgDXk1aPqhRbfowJPWY7GGnUZhhBRHnqTX4+1e5cLwbcC_Cg@mail.gmail.com
Lists: pgsql-bugs

After loading the relation information into the cache, the best planning time
for the 100-item case is around 2.789 ms, and the best for the 101-item case
is around 3.159 ms.
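For reference, the priming procedure Stephen suggested might look like the
following (the table name and id values are hypothetical placeholders for the
real schema):

```sql
-- Hypothetical partitioned table; substitute the real table and ids.
-- Prime the backend's relation cache for all partitions once:
EXPLAIN ANALYZE SELECT * FROM parent_table WHERE id IN (1, 2, /* ... */ 100);

-- Then compare "Planning time" across several repeated runs:
EXPLAIN ANALYZE SELECT * FROM parent_table WHERE id IN (1, 2, /* ... */ 100);  -- at the 100-entry limit
EXPLAIN ANALYZE SELECT * FROM parent_table WHERE id IN (1, 2, /* ... */ 101);  -- over the limit
```

On 9.6, the 101-item run exceeds MAX_SAOP_ARRAY_SIZE, so the planner skips the
constraint-exclusion proofs and plans a scan of every partition.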

On Fri, Jul 27, 2018 at 9:23 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Greetings,
>
> * Soni M (diptatapa(at)gmail(dot)com) wrote:
> > Hello All, I think I found bug here. Postgres 9.6.9 on Red Hat Enterprise
> > Linux Server release 7.5 (Maipo).
>
> [... the 99 item query ...]
>
> > Planning time: 12.969 ms
> > Execution time: 7.062 ms
>
> [... the 100 item query ...]
>
> > Planning time: 60.537 ms
> > Execution time: 51.401 ms
>
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > I think it's a consequence of the limitation in predtest.c that it won't
> > try to do predicate proofs using ScalarArrayOps with more than 100 array
> > entries:
>
> This is far from the first time we (or, at least, I) have seen
> complaints about that particular constant.
>
> > /*
> > * Proof attempts involving large arrays in ScalarArrayOpExpr nodes are
> > * likely to require O(N^2) time, and more often than not fail anyway.
> > * So we set an arbitrary limit on the number of array elements that
> > * we will allow to be treated as an AND or OR clause.
> > * XXX is it worth exposing this as a GUC knob?
> > */
> > #define MAX_SAOP_ARRAY_SIZE 100
>
> Which certainly makes me think that comment in there might be worth
> something -- perhaps we should make this a GUC and let users see just what
> would end up happening with a different choice. There could certainly
> be cases where it'd be better to work it out.
>
> > Not a bug, but a tradeoff. You'd be unhappy if the planner spent longer
> > devising the plan than it saved to eliminate the extra partitions.
>
> While I agree in concept, I'm awful curious how the "simpler" approach
> used when we hit the limit resulted in a 5x increase in planning time.
> Looks a bit like the extra time required to perform that elimination for
> at least a few more items would be saving us cycles somewhere else that
> are apparently pretty expensive.
>
> Soni, any chance that this query was the first time all of those
> partitions were hit in this backend, meaning there was a lot of time
> required to load the relation information for them? What happens if you
> prime the backend by running the 100-item case once, and then do 5 runs
> of the 99-item and then 5 of the 100-item case?
>
> Thanks!
>
> Stephen
>

--
Regards,

Soni Maula Harriz
