Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, benoit <benoit(at)hopsandfork(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date: 2023-07-31 16:24:52
Message-ID: c8bb964c-b1ad-534e-79a8-b617577cc555@yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, all!

> CNF -> DNF conversion
> =====================
>
> Like many great papers, the MDAM paper takes one core idea, and finds
> ways to leverage it to the hilt. Here the core idea is to take
> predicates in conjunctive normal form (an "AND of ORs"), and convert
> them into disjunctive normal form (an "OR of ANDs"). DNF quals are
> logically equivalent to CNF quals, but ideally suited to SAOP-array
> style processing by an ordered B-Tree index scan -- they reduce
> everything to a series of non-overlapping primitive index scans, that
> can be processed in keyspace order. We already do this today in the
> case of SAOPs, in effect. The nbtree "next array keys" state machine
> already materializes values that can be seen as MDAM style DNF single
> value predicates. The state machine works by outputting the cartesian
> product of each array as a multi-column index is scanned, but that
> could be taken a lot further in the future. We can use essentially the
> same kind of state machine to do everything described in the paper --
> ultimately, it just needs to output a list of disjuncts, like the DNF
> clauses that the paper shows in "Table 3".
>
> In theory, anything can be supported via a sufficiently complete CNF
> -> DNF conversion framework. There will likely always be the potential
> for unsafe/unsupported clauses and/or types in an extensible system
> like Postgres, though. So we will probably need to retain some notion
> of safety. It seems like more of a job for nbtree preprocessing (or
> some suitably index-AM-agnostic version of the same idea) than the
> optimizer, in any case. But that's not entirely true, either (that
> would be far too easy).
>
> The optimizer still needs to optimize. It can't very well do that
> without having some kind of advanced notice of what is and is not
> supported by the index AM. And, the index AM cannot just unilaterally
> decide that index quals actually should be treated as filter/qpquals,
> after all -- it doesn't get a veto. So there is a mutual dependency
> that needs to be resolved. I suspect that there needs to be a two way
> conversation between the optimizer and nbtree code to break the
> dependency -- a callback that does some of the preprocessing work
> during planning. Tom said something along the same lines in passing,
> when discussing the MDAM paper last year [2]. Much work remains here.
>
Honestly, I'm just reading and delving into this thread and other topics
related to it, so excuse me if I ask you a few obvious questions.

I noticed that you are going to add CNF->DNF transformation at the index
construction stage. If I understand correctly, you will rewrite
restrictinfo node,
change boolean "AND" expressions to "OR" expressions, but would it be
possible to apply such a procedure earlier? Otherwise I suppose you
could face the problem of
incorrect selectivity of the calculation and, consequently, the
cardinality calculation?
I can't clearly understand at what stage it is clear that the such a
transformation needs to be applied?

--
Regards,
Alena Rybakina
Postgres Professional

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Cramer 2023-07-31 16:27:45 Re: Request for comment on setting binary format output per session
Previous Message Jelte Fennema 2023-07-31 15:46:25 Re: proposal: psql: show current user in prompt