Re: PATCH: use foreign keys to improve join estimates v1

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: use foreign keys to improve join estimates v1
Date: 2015-09-25 13:57:42
Message-ID: 560552D6.3040107@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 09/25/2015 03:39 AM, David Rowley wrote:
> On 24 September 2015 at 23:57, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
>
> 2) find_best_match_foreign_key
> ------------------------------
>
> I think the comment before the function needs rephrasing (seems a
> bit broken to me). I do like the approach in general, although it
> changes the semantics a bit. The original code only considered
> "fully matching" fkeys, while the new code simply takes the longest
> match.
>
>
>
> Oops, I did not code this at all the way I had originally pictured it. I
> guess the evidence of that is in the function comment, which I wrote
> before coding the function.
> I of course intended to only allow full matches.
> A full patch which fixes this is attached. This also should fix the
> clause duplication trick that I talked about.

OK, thanks. Let's leave partial FK matches for the future.

>
>> The comment before the function mentions it's possible to confuse
>> the code with duplicate quals. Can you give an example?
>
> Something like: SELECT * FROM a LEFT JOIN b ON a.id=b.id
> AND a.id=b.id AND a.id=b.id AND a.id2 = b.id2 AND
> a.id3 = b.id3;
>
> Note a.id=b.id repeated 3 times.
>
> Where a foreign key exists on (a.id) ref (b.id), and also
> (a.id2, a.id3) ref (b.id2, b.id3). In this case we match 3 quals for
> the FK with 1 key, and 2 quals with the FK with 2 keys. But this is
> now fixed in the attached.
>
> I used an outer join here as they won't be transformed into eclass
> members and back to quals again. INNER JOIN wouldn't allow the
> duplicates to materialise again.
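
The fix David describes (count only fully matched foreign keys, and don't let duplicate quals inflate the match) can be sketched abstractly. This is an illustrative Python model of the idea, not the actual find_best_match_foreign_key() code from the patch:

```python
# Illustrative model of full-match FK selection (not the patch's C code):
# a foreign key is usable only if *every* one of its key pairs appears
# among the join quals, and duplicate quals are collapsed before matching.

def best_full_fk_match(fkeys, quals):
    """fkeys: list of sets of (referencing_col, referenced_col) pairs.
    quals: list of (left_col, right_col) equality join clauses."""
    unique_quals = set(quals)          # collapse duplicate quals first
    best = None
    for fk in fkeys:
        if fk <= unique_quals:         # full match required
            if best is None or len(fk) > len(best):
                best = fk              # prefer the longest full match
    return best

# David's example: a.id=b.id repeated three times, plus a two-column FK.
quals = [("a.id", "b.id")] * 3 + [("a.id2", "b.id2"), ("a.id3", "b.id3")]
fkeys = [{("a.id", "b.id")},
         {("a.id2", "b.id2"), ("a.id3", "b.id3")}]

# With duplicates collapsed, the two-column FK wins as the longest full
# match; counting raw quals would wrongly favour the single-key FK.
print(best_full_fk_match(fkeys, quals))
```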

Ah, OK. Didn't think about outer joins.

>
> ...
>
> I looked at this again, and I'm not all that sure it's possible to
> assume that 1.0 / <tuples> is valid when there's more than one
> relation at either side of the join.
>
> My reasoning for this is that the whole basis for the patch is that
> if we find a foreign key match, then we can be sure enough, as far as
> row estimations go, that exactly 1 tuple will exist matching that
> condition. This assumption of the 1 tuple no longer holds when 2
> relations have already been joined, as this can either cause tuple
> duplication, or elimination.

I don't see why that would be the case. Of course, you need to take the
right <tuples>, i.e. the "target" of the foreign key (the table with
UNIQUE constraint) so that the selectivity matches the fact that exactly
1 tuple (on the PK side) matches.

Let's say we have 3 tables

A (1000 rows)
B (333 rows)
C (222 rows)

and let's assume A references the other two tables

A (b_id) REFERENCES B(id)
A (c_id) REFERENCES C(id)

Now, let's estimate the join

SELECT * FROM A JOIN B ON (a.b_id = b.id)
JOIN C ON (a.c_id = c.id)

Which will do this

card(join) = card(A) * card(B) * card(C) * sel(b_id=id) * sel(c_id=id)

where card(T) is the cardinality of a table, and sel(C) is the
selectivity of a condition. We do know that

card(A) = 1000
card(B) = 333
card(C) = 222

and we do know that the selectivity of each condition is 1/card(T), where
T is the table with the UNIQUE constraint referenced by the FK. So

sel(b_id=id) = 1/card(B) = 1/333
sel(c_id=id) = 1/card(C) = 1/222

The fact is that these selectivities perfectly eliminate the impact of
cardinality of the table. So

card(join) = 1000 * (333 * (1/333)) * (222 * (1/222))

so the expected cardinality is 1000.

Of course, this estimation effectively happens in steps, i.e. we first
join 2 tables - say A and B, then (A,B) and C. So in the first step we
do this:

card(A,B) = card(A) * card(B) * sel(b_id=id) = 1000

card((A,B),C) = card(A,B) * card(C) * sel(c_id=id) = 1000

The join order does not matter - we could easily join B,C first, and
then join A.

card(B,C) = card(B) * card(C) * sel(NULL) = 73926

and then

card((B,C),A) = card(B,C) * card(A) * sel(b_id=id) * sel(c_id=id)
= 1000

Notice that in this case, the singleton table (A) actually references
primary keys within the join relation - obviously the whole join
relation does not have unique keys or anything, but that does not matter.

The crucial part here is that selectivity of the condition needs to use
the number of tuples of the base relation, because that's the right
selectivity for the join clause.
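
To make the cancellation explicit, the two join orders can be checked numerically. This is just plain Python re-doing the arithmetic above, nothing from the patch itself:

```python
# FK selectivities cancel the cardinality of the referenced table,
# so the join order does not change the final estimate.

card_A, card_B, card_C = 1000, 333, 222
sel_b = 1.0 / card_B   # sel(b_id = id); B holds the UNIQUE side
sel_c = 1.0 / card_C   # sel(c_id = id); C holds the UNIQUE side

# Order 1: (A join B), then join C
card_AB = card_A * card_B * sel_b             # 1000
card_ABC = card_AB * card_C * sel_c           # 1000

# Order 2: (B, C) first with no join clause (sel = 1), then join A
card_BC = card_B * card_C                     # 73926
card_BCA = card_BC * card_A * sel_b * sel_c   # 1000

print(round(card_ABC), round(card_BCA))
```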

>
> It's a little hard to force the join order so that it happens this way,
> but say the join order in the following example was, from left to right:
>
> a CROSS JOIN b JOIN c ON a.id=c.id
>
> Of course, the planner would perform the cross join last in reality, but
> it's possible to trick it into not doing so.
>
> Let's say a foreign key exists on c (id) references a(id). If we
> assume that a.id = c.id produces 1 tuple in the (a,b)
> rel, then we'd be very wrong, as it's not 1, it's the number of
> rows in b! Which could be millions.

I think this is the part where you're wrong. What needs to happen in the
estimation is this:

card(A,B) = card(A) * card(B) /* cross join */

card((A,B),C) = card(A,B) * card(C) * sel(a.id=c.id)
= card(A,B) * card(C) * (1 / card(A))
= card(B) * card(C)

Which is what the v2 of the patch seems to be getting right. Let me
demonstrate - I'll try to replicate the example you're presenting:

create table a as select i as a_id1, i as a_id2
from generate_series(1,1000) s(i);

create table b as select i as b_id1, i as b_id2
from generate_series(0,332) s(i);
alter table b add unique (b_id1, b_id2);

create table c as select i/3 as c_id1, i/3 as c_id2
from generate_series(0,998) s(i);

alter table c add foreign key (c_id1, c_id2)
references b (b_id1, b_id2);

analyze;

Let's force the join order:

set join_collapse_limit = 1;

and run a bunch of queries joining the tables in different orders. I'll
only present the EXPLAIN output here, but the estimates are perfectly
accurate:

test=# explain select * from (a cross join b)
join c on (b_id1 = c_id1 AND b_id2 = c_id2);

QUERY PLAN
-----------------------------------------------------------------------------------------------
Merge Join (cost=64.91..5981.72 rows=999000 width=24)
Merge Cond: ((b.b_id1 = c.c_id1) AND (b.b_id2 = c.c_id2))
-> Nested Loop (cost=0.15..4199.45 rows=333000 width=16)
-> Index Only Scan using b_b_id1_b_id2_key on b
(cost=0.15..19.45 rows=333 width=8)
-> Materialize (cost=0.00..20.00 rows=1000 width=8)
-> Seq Scan on a (cost=0.00..15.00 rows=1000 width=8)
-> Sort (cost=64.76..67.26 rows=999 width=8)
Sort Key: c.c_id1, c.c_id2
-> Seq Scan on c (cost=0.00..14.99 rows=999 width=8)
(9 rows)

test=# explain select * from (a cross join c)
join b on (b_id1 = c_id1 and b_id2 = c_id2);

QUERY PLAN
----------------------------------------------------------------------
Hash Join (cost=10.32..20052.81 rows=999000 width=24)
Hash Cond: ((c.c_id1 = b.b_id1) AND (c.c_id2 = b.b_id2))
-> Nested Loop (cost=0.00..12519.99 rows=999000 width=16)
-> Seq Scan on a (cost=0.00..15.00 rows=1000 width=8)
-> Materialize (cost=0.00..19.98 rows=999 width=8)
-> Seq Scan on c (cost=0.00..14.99 rows=999 width=8)
-> Hash (cost=5.33..5.33 rows=333 width=8)
-> Seq Scan on b (cost=0.00..5.33 rows=333 width=8)
(8 rows)

test=# explain select * from (b join c on (b_id1=c_id1 and
b_id2=c_id2)) cross join a;

QUERY PLAN
---------------------------------------------------------------------------
Nested Loop (cost=10.32..12537.84 rows=999000 width=24)
-> Seq Scan on a (cost=0.00..15.00 rows=1000 width=8)
-> Materialize (cost=10.32..37.83 rows=999 width=16)
-> Hash Join (cost=10.32..32.84 rows=999 width=16)
Hash Cond: ((c.c_id1 = b.b_id1) AND (c.c_id2 = b.b_id2))
-> Seq Scan on c (cost=0.00..14.99 rows=999 width=8)
-> Hash (cost=5.33..5.33 rows=333 width=8)
-> Seq Scan on b (cost=0.00..5.33 rows=333 width=8)
(8 rows)
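
The rows=999000 estimates in all three plans follow from the same arithmetic. As a sanity check (plain Python, with the cardinalities from the EXPLAIN output above):

```python
# Sanity-check of the planner's estimates above: the FK from c to b
# gives sel = 1/card(b) for the two-column join clause, regardless of
# where the cross join with a happens in the join order.

card_a, card_b, card_c = 1000, 333, 999
sel_fk = 1.0 / card_b   # c references b's unique (b_id1, b_id2)

estimate = round(card_a * card_b * card_c * sel_fk)
print(estimate)   # 999000, matching rows=999000 in all three plans
```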

> I've attempted to create a test case to demo this against your v2
> patch, and it does fall for it. It happens to actually give better
> estimates than without the patch, but I'm sure there must be some
> case to be found where it does not.
>
> create table a (id1 int, id2 int, c_id1 int not null, c_id2 int not
> null, primary key(id1, id2));
> create table b (id int primary key, a_id1 int not null, a_id2 int not null);
> create table c (id1 int, id2 int);
>
> alter table b add constraint b_a_id1_a_id2_fkey foreign key (a_id1,
> a_id2) references a;
>
> insert into a select x,x,x,x from generate_series(1,100) s(x);
> insert into b select id1,id2,id1 from a;
> insert into c select c_id1,c_id2 from a, generate_series(1,100);
>
> analyze a;
> analyze b;
> analyze c;
>
> explain analyze select * from a inner join b on a.id1=b.a_id1 and a.id2
> = b.a_id2 join c on c.id1%2 = a.c_id1%2 and c.id2%2 = a.c_id2%2;

No, I don't think it 'falls for it'. Table C is not connected to the
other two tables by a foreign key, and even if it was, the modulo
expression would make it impossible to detect the join as an FK join.
Which makes the example rather irrelevant for this patch, I think.

If you try to fix those issues, you should get basically the example
I've presented.

It's possible to confuse the patch - e.g. by creating foreign keys in
both directions, or a 3-way join with each table referencing the other
two. But it should still give better estimates than the current
estimation that does not consider foreign keys at all.

> I'm not saying that my version of the patch does any better... I've
> actually not tested, but I think we could only use the FK to test
> this if we happened to back that up with something like UNIQUE join
> detection. My unique join patch aims to add some of the
> infrastructure which might be required to make that possible.

I'm not sure where you see the problem in the current patch, so I can't
really say whether this is good or bad.

> Perhaps the patch cannot work well without this, as since we are
> trying to fix a row underestimation, then we might just succeed in
> having the planner switch the join order to some order that is not
> backed up by foreign keys, because the row estimates look more
> favourable that way. Although perhaps that could be fixed by changing
> the 1.0 / <tuples> to <something else> / <tuples>

I don't think the join order should matter (and I tried to demonstrate
that with the example above).

It might matter if we hit the issue with equivalence classes, because
then the planner comes up with new join clauses that may not be backed
by foreign keys. But even in that case we should not do worse that now.

regards

--
Tomas Vondra www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
