Re: Memory consumed by paths during partitionwise join planning

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory consumed by paths during partitionwise join planning
Date: 2024-02-19 12:25:28
Message-ID: CAExHW5s_y_VxHcGZ1XX5_J5Yf34DWHU=Czj6BRf23Qx3LhE6jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 16, 2024 at 8:42 AM Andrei Lepikhov
<a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> Live example: right now, I am working on the code like MSSQL has - a
> combination of NestLoop and HashJoin paths and switching between them in
> real-time. It requires both paths in the path list at the moment when
> extensions are coming. Even if one of them isn't referenced from the
> upper pathlist, it may still be helpful for the extension.

There is no guarantee that every path presented to add_path will be
preserved. Suboptimal paths are freed as and when add_path discovers
that they are suboptimal. So I don't think an extension can rely on
existence of a path. But having a refcount makes it easy to preserve
the required paths by referencing them.

>
> >> About partitioning. As I discovered planning issues connected to
> >> partitions, the painful problem is a rule, according to which we are
> >> trying to use all nomenclature of possible paths for each partition.
> >> With indexes, it quickly increases optimization work. IMO, this can help
> >> a 'symmetrical' approach, which could restrict the scope of possible
> >> pathways for upcoming partitions if we filter some paths in a set of
> >> previously planned partitions.
> >
> > filter or free?
> Filter.
> I meant that Postres tries to apply IndexScan, BitmapScan,
> IndexOnlyScan, and other strategies, passing throughout the partition
> indexes. The optimizer spends a lot of time doing that. So, why not
> introduce a symmetrical strategy and give away from the search some
> indexes of types of scan based on the pathifying experience of previous
> partitions of the same table: if you have dozens of partitions, Is it
> beneficial for the system to find a bit more optimal IndexScan on one
> partition having SeqScans on 999 other?
>
IIUC, you are suggesting that instead of planning each
partition/partitionwise join, we only create paths with the strategies
which were found to be optimal with previous partitions. That's a good
heuristic but it won't work if partition properties - statistics,
indexes etc. differ between groups of partitions.

--
Best Wishes,
Ashutosh Bapat

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2024-02-19 12:31:03 Re: numeric_big in make check?
Previous Message Ashutosh Bapat 2024-02-19 12:17:34 Re: partitioning and identity column