Re: [HACKERS] Runtime Partition Pruning

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, amul sul <sulamul(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: [HACKERS] Runtime Partition Pruning
Date: 2020-10-12 12:22:45
Message-ID: CAKU4AWpFe_W+tJ_dKw9wAq-5vUsi7h7-HO36siippLHxiu=3xg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 12, 2020 at 5:48 PM Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
wrote:

> On Mon, Oct 12, 2020 at 7:59 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
> >
> > Sorry for the late reply! Suppose we have partition defined like this:
> > p
> > - p1
> > - p2
> >
> > When you talk about "the statistics from the partitioned table", do you
> > mean the statistics from p or p1/p2? I just confirmed there is no
> > statistics for p (at least pg_class.reltuples = -1), so I think you are
> > talking about p1/p2.
>
> I am talking about p when I say statistics from the partitioned table.
> I see that pg_statistic row from p is well populated.
> pg_class.reltuples = -1 indicates that the heap doesn't have any rows.
> set_rel_size() sets the number of rows in the partitioned table based
> on the rows in individual unpruned partitions.
>
>
Glad to know that. Thanks for the info!

> >
> > Here we are talking about partkey = $1 or partkey = RunTimeValue.
> > so even the value can hit 1 partition only, but since we don't know
> > the exact value, so we treat all the partition equally. so looks
> > nothing wrong with partition level estimation. However when we cost
> > the Append path, we need know just one of them can be hit, then
> > we need do something there. Both AppendPath->rows/total_cost
> > should be adjusted (That doesn't happen now).
>
> I think in this case we can safely assume that only one partition will
> remain so normalize costs considering that only one partition will
> survive.
>

Exactly. What I am trying to do is fix this in create_append_path. Do
you have a different suggestion? As for the pkey > $1 case, I think
that even if we use partition-level statistics, the estimate would
still have to be hard-coded, since we don't know the value of $1.
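
To illustrate what "hard-coded" could mean here, below is a toy model in
Python (not PostgreSQL code and not the patch). The function name is
hypothetical, and reusing the planner's default inequality selectivity
(DEFAULT_INEQ_SEL, 1/3) as the fraction of surviving partitions is only
an assumption made for the sake of the example.

```python
import math

# Assumption: reuse the value of PostgreSQL's DEFAULT_INEQ_SEL as the
# fraction of partitions expected to survive "partkey > $1".
DEFAULT_INEQ_SEL = 1.0 / 3.0


def assumed_surviving_partitions(n_parts: int, op: str) -> int:
    """Guess how many partitions survive run-time pruning (hypothetical helper).

    The value of $1 is unknown at plan time, so the guess depends only on
    the operator and the partition count, never on statistics.
    """
    if n_parts <= 0:
        return 0
    if op == "=":
        # An equality on the partition key can match at most one partition.
        return 1
    # Inequality: fall back to a fixed fraction, but keep at least one.
    return max(1, math.ceil(n_parts * DEFAULT_INEQ_SEL))
```

With 10 partitions this guesses 1 survivor for "=" and 4 for ">",
whatever the data distribution is, which is exactly why it is a
hard-coded estimate.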

I have gone through the main part of the run-time partition pruning
code and hope to post a runnable patch soon. The main idea is to fix
the rows/costs at the create_append_path stage, so any suggestion in a
different direction would be very useful.
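
As a rough picture of the adjustment, here is a minimal sketch in Python
(a toy model, not the planner's C code and not the patch). SubPath and
append_estimate are hypothetical names. It assumes the unadjusted Append
estimate is the plain sum over the subpaths, and that all partitions are
treated equally, so the sums are scaled by surviving / total.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SubPath:
    """Per-partition estimate (hypothetical stand-in for a planner Path)."""
    rows: float
    total_cost: float


def append_estimate(subpaths: List[SubPath],
                    surviving: Optional[int] = None) -> Tuple[float, float]:
    """Return (rows, total_cost) for an Append over subpaths.

    surviving=None gives the unadjusted sum over all partitions.
    Otherwise the sums are scaled by the fraction of partitions assumed
    to survive run-time pruning, treating every partition equally.
    """
    n = len(subpaths)
    rows = sum(p.rows for p in subpaths)
    total_cost = sum(p.total_cost for p in subpaths)
    if surviving is None or n == 0:
        return rows, total_cost
    frac = min(surviving, n) / n
    return rows * frac, total_cost * frac
```

For partkey = $1 the call would be append_estimate(subpaths, surviving=1),
so with two partitions both the row count and the cost are halved.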

--
Best Regards
Andy Fan
