Re: using memoize in in paralel query decreases performance

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: using memoize in in paralel query decreases performance
Date: 2023-03-07 09:50:20
Message-ID: CAFj8pRCO3UbXVCPT=QiSyQHSRHnbLvtbdouwZADPo=vT694h2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

út 7. 3. 2023 v 10:46 odesílatel David Rowley <dgrowleyml(at)gmail(dot)com> napsal:

> On Tue, 7 Mar 2023 at 22:09, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
> > I can live with it. This is an analytical query and the performance is
> not too important for us. I was surprised that the performance was about
> 25% worse, and so the hit ratio was almost zero. I am thinking, but I am
> not sure if the estimation of the effectiveness of memoization can depend
> (or should depend) on the number of workers? In this case the number of
> workers is high.
>
> The costing for Memoize takes the number of workers into account by
> way of the change in expected input rows. The number of estimated
> input rows is effectively just divided by the number of parallel
> workers, so if we expect 1 million rows from the outer side of the
> join and 4 workers, then we'll assume the memorize will deal with
> 250,000 rows per worker. If the n_distinct estimate for the cache key
> is 500,000, then it's not going to look very attractive to Memoize
> that. In reality, estimate_num_groups() won't say the number of
> groups is higher than the input rows, but Memoize, with all the other
> overheads factored into the costs, it would never look favourable if
> the planner thought there was never going to be any repeated values.
> The expected cache hit ratio there would be zero.
>

Thanks for the explanation.

Pavel

> David
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2023-03-07 09:51:20 Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Previous Message David Rowley 2023-03-07 09:46:35 Re: using memoize in in paralel query decreases performance