Re: pg9.6 segfault using simple query (related to use fk for join estimates)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Stefan Huehner <stefan(at)huehner(dot)org>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg9.6 segfault using simple query (related to use fk for join estimates)
Date: 2016-05-03 14:10:01
Message-ID: 8344835e-18af-9d40-aed7-bd2261be9162@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 04/30/2016 07:35 PM, Tom Lane wrote:
> Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com> writes:
>> On 29/04/2016 18:05, Tom Lane wrote:
>>> Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com> writes:
>>>> The segfault is caused by quals_match_foreign_key() calling get_leftop()
>>>> and get_rightop() on a ScalarArrayOpExpr node.
>>>>
>>>> I'm not sure that assuming this compatibility is the right way to fix
>>>> this though.
>
>>> It certainly isn't.
>
>> Agreed. New attached patch handles explicitly each node tag.
>
> No, this is completely nuts. The problem is that quals_match_foreign_key
> is simply assuming that every clause in the list it's given is an OpExpr,
> which is quite wrong. It should just ignore non-OpExpr quals, since it
> cannot do anything useful with them anyway. There's a comment claiming
> that non-OpExpr quals were already rejected:
>
> * Here since 'usefulquals' only contains bitmap indexes for quals
> * of type "var op var" we can safely skip checking this.
>
> but that doesn't appear to have anything to do with current reality.

Yes, I agree - there should be a check in quals_match_foreign_key that
the node actually is an OpExpr. This is clearly a bug :-(

>
> While this in itself is about a two-line fix, after reviewing
> 137805f89acb3611 I'm pretty unhappy that it got committed at all.
> I think this obvious bug is a good sign that it wasn't ready.
> Other unfinished aspects like invention of an undocumented GUC
> don't leave a good impression either.

The GUC is meant to make testing during development easier. I hadn't
realized it got committed, but I assume Simon plans to remove it before
the final release. I can't check with Simon right now, though, as he is
out of office this week.

In any case, I do agree the GUC should have been documented better, and
we should probably put it on the TODO so that it gets removed before the
actual release.

>
> Moreover, it looks to me like this will add quite a lot of overhead,
> probably far more than is justified, because
> clauselist_join_selectivity is basically O(N^2) in the
> relation-footprint of the current join --- and not with a real small
> multiplier, either, as the functions it calls contain about four
> levels of nested loops themselves. Maybe that's unmeasurable on
> trivial test cases but it's going to be disastrous in large queries,
> or for relations with large numbers of foreign keys.
>

That depends on a lot of factors, I guess. Attached are two SQL scripts
that create 5 tables (f1, f2, f3, f4, f5) and then create N foreign keys
on each of them. The foreign keys reference other tables, which means
the loops won't terminate early (after finding the first match) when
joining the first 5 tables, which makes this the worst case. And then we
join different numbers of tables (2, 3, 4 or 5) and do explain analyze
to measure planning time.

The first script (fk-test.sql) does joins on 2 columns, fk-test-6.sql
does joins on 6 columns (so that we also know how the number of columns
affects the planning time).

Sum of planning time for 100 queries (in milliseconds) with 2 columns,
where "2/on" means "join on 2 tables with enable_fk_estimates=on":

     N   2/on  2/off   3/on  3/off   4/on  4/off   5/on  5/off
    10      6      4      9      8     22     20     77     68
   100     15     11     26     19     64     36    177     93
  1000     97     84    217    133    516    233   1246    342

and comparison of the timings:

     N      2      3      4      5
    10   155%   115%   114%   113%
   100   142%   138%   177%   190%
  1000   116%   163%   221%   364%

And with the 6 columns:

     N   2/on  2/off   3/on  3/off   4/on  4/off   5/on  5/off
    10     25      7     23     20     96     82    344    297
   100     21     14     65     33    233    104    735    328
  1000    151     99    492    153   1603    320   4627    593

and comparison:

     N      2      3      4      5
    10   371%   120%   117%   116%
   100   149%   196%   224%   224%
  1000   152%   322%   502%   780%

So yes, the overhead is clearly measurable, no doubt about that.

>
> I think this should be reverted and pushed out to 9.7 development.
> It needs a significant amount of rewriting to fix the performance
> issue, and now's not the time to be doing that.
>

There are probably a few reasonably simple things we could do - e.g.
ignore foreign keys with just a single column, as the primary goal of
the patch is improving estimates with multi-column foreign keys. I
believe single-column keys account for the vast majority of foreign keys
in the wild, so skipping them would eliminate most of the overhead.

If that's deemed insufficient, we'll have to resort to a more complex
improvement, perhaps something akin to the cache proposed in the unijoin
patch. But if that's required, it's 9.7 material.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
fk-test-6.sql application/sql 4.6 KB
fk-test.sql application/sql 2.9 KB
