Re: pg9.6 segfault using simple query (related to use fk for join estimates)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Stefan Huehner <stefan(at)huehner(dot)org>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg9.6 segfault using simple query (related to use fk for join estimates)
Date: 2016-06-06 17:23:42
Message-ID: 2a8adc56-6eb1-123e-8914-0cf025797262@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06/06/2016 06:34 PM, Tom Lane wrote:
> ... BTW, another thought occurred to me yesterday: it seems like the
> existing code hasn't thought through its behavior for multiple foreign
> keys very carefully. That is, suppose we have both "A.J references B.K"
> and "A.X references B.Y", as separate FKs not a single multicolumn FK.
> The current code goes to some lengths to decide that one of these is
> better than the other and then ignore the other. Why? Seems to me
> that in such a case you want to behave more nearly as you would for a
> multicolumn FK, that is discard all the join quals matched to either FK
> in favor of a single selectivity estimate based on the number of rows in
> the referenced table. Discarding only the A.J = B.K clause and then
> multiplying by the independent selectivity of A.X = B.Y is surely just as
> wrong as what we've historically done for multicolumn FKs. (Correcting
> for nulls would take a bit of thought, but I wouldn't be surprised if it
> ends up being the same as for the multicolumn-FK case, at least to within
> the precision we can hope to get with the available stats for nulls.)

Yes, that can be improved. The plan was to improve the common case
first, and then look at the more complicated cases. It might seem like a
hand-waving but I'd bet 99% tables are joined on a single FK, so this
seems like a reasonable approach.

When it comes to improving multiple (multi-column) foreign keys, I think
it may get way more complicated that it might seem. What if the foreign
keys overlap, for example? Or what if the keys go in opposite directions
(cycle). And so on ...

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-06-06 17:40:19 Re: pg9.6 segfault using simple query (related to use fk for join estimates)
Previous Message Tomas Vondra 2016-06-06 17:10:39 Re: pg9.6 segfault using simple query (related to use fk for join estimates)