Re: Failed assertion clauses != NIL

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Manuel Rigger <rigger(dot)manuel(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Failed assertion clauses != NIL
Date: 2019-11-26 17:41:31
Message-ID: 20191126174131.p3zv5t2oeqkinx4d@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Nov 26, 2019 at 10:56:33AM +0000, Dean Rasheed wrote:
>On Sun, 24 Nov 2019 at 23:40, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>> Hi,
>>
>> Attached are two patched, related to this bug report.
>>
>>
>> 0001 - Fix choose_best_statistics to check clauses individually
>> ---------------------------------------------------------------
>>
>> This modifies the choose_best_statistics function to properly check
>> which clauses are actually covered by each statistic object, and only
>> use attnums from those.
>>
>> The patch ended up pretty small, because we already have all the
>> necessary info (per-clause attnums) precalculated. Which means this
>> should not be much more expensive than before.
>>
>> The main drawback is that this does change signature of a function
>> defined in statistics.h - we have to pass more info (per-clause bitmaps
>> and info which clauses are already estimated). Which means ABI break.
>>
>> I'm not sure how likely it is that external code is calling this
>> function, but the probability is non-zero. So maybe even if the patch is
>> fairly small, in backbranches we should use the simple fix with just
>> returning if the list is NIL.
>>
>
>On a quick read-through that algorithm makes a lot more sense. It
>seems pretty unlikely that anyone would be using
>choose_best_statistics() anywhere else, so I think maybe it's fine to
>change that in back-branches.
>
>A couple of comments:
>
>1). I think you should pass estimatedClauses to
>choose_best_statistics() as a pure input parameter (i.e., remove one
>"*"), since it doesn't (and must not) modify that set.

Yeah, good point.

>In fact, on closer inspection, I don't think you need to pass it to
>choose_best_statistics() at all, since its callers already check
>clauses against estimatedClauses. Therefore, in
>choose_best_statistics(), incompatible and already-estimated clauses
>both appear the same (as NULL/empty attribute sets), and therefore the
>estimatedClauses check will never be tripped.
>

Right, but I'm thinking about the patch that allows applying multiple
statistics. With that applied, this changes to a while loop - and we'll
either have to rebuild the list_attnums or pass the bitmap.

>2). The new parameter "clauses" should probably be called something
>like "clause_attnums" or some such, to better reflect what it actually
>is. And it should be documented that NULL values represent
>incompatible/already-estimated clauses.
>

Yes, agreed.

>3). In statext_mcv_clauselist_selectivity(), the check for at least 2
>attributes is no longer needed, because choose_best_statistics() now
>does that, so there's also no need to compute clauses_attnums there.
>

Good point. I'll replace that with an assert.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tomas Vondra 2019-11-26 18:27:52 Re: BUG #16125: Crash of PostgreSQL's wal sender during logical replication
Previous Message Alvaro Herrera 2019-11-26 17:10:54 Re: BUG #16125: Crash of PostgreSQL's wal sender during logical replication