Re: Assertion failure with LEFT JOINs among >500 relations

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Onder Kalaci <onderk(at)microsoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Assertion failure with LEFT JOINs among >500 relations
Date: 2020-10-14 03:26:45
Message-ID: 3531941.1602646005@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> On Wed, 14 Oct 2020 at 04:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> So now I'm imagining something like
>> #define MAXIMUM_ROWCOUNT 1e100

> That seems more reasonable. We likely could push it a bit higher, but
> I'm not all that motivated to since if that was true, then you could
> expect the heat death of the universe to arrive before your query
> results. In which case the user would likely struggle to find
> electrons to power their computer.

Right. But I'm thinking about joins in which both inputs are clamped to
that maximum estimate. If we allowed it to be as high as 1e200, then
multiplying the two input rowcounts together would itself overflow.
At 1e100, we can do that and also multiply in a ridiculous per-row cost,
and we're still well below the overflow threshold. So this should go
pretty far towards preventing internal overflows in any one plan step's
cost & rows calculations.

(For comparison's sake, I believe the number of atoms in the observable
universe is thought to be somewhere on the order of 1e80. So we are
pretty safe in thinking that no practically-useful rowcount estimate
will exceed 1e100; there is no need to make it higher.)

> I've ended up leaving the NaN checks in the join costing functions.
> There was no case mentioned in [1] that showed how we hit that
> reported test case, so I'm not really confident enough to know I'm not
> just reintroducing the same problem again by removing that. The path
> row estimate that had the NaN might not have been through
> clamp_row_est(). Many don't.

Hmm, I will try to find some time tomorrow to reconstruct that.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-10-14 03:29:35 Re: scram-sha-256 broken with FIPS and OpenSSL 1.0.2
Previous Message Andy Fan 2020-10-14 03:26:33 Re: Wired if-statement in gen_partprune_steps_internal