Re: using memoize in in paralel query decreases performance

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: using memoize in in paralel query decreases performance
Date: 2023-03-07 08:58:03
Message-ID: CAApHDvqatKK5zxPpQa8D+u2oJRm37ue8azUQ6tO6peBwmJFLVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

/On Tue, 7 Mar 2023 at 21:09, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
> po 6. 3. 2023 v 22:52 odesílatel David Rowley <dgrowleyml(at)gmail(dot)com> napsal:
>> I wonder if the additional work_mem required for Memoize is just doing
>> something like causing kernel page cache evictions and leading to
>> fewer buffers for ixfk_ite_itemcategoryid and the item table being
>> cached in the kernel page cache.
>>
>> Maybe you could get an idea of that if you SET track_io_timing = on;
>> and EXPLAIN (ANALYZE, BUFFERS) both queries.
>
>
> https://explain.depesz.com/s/vhk0

This is the enable_memoize=on one. The I/O looks like:

Buffers: shared hit=105661309 read=15274264 dirtied=15707 written=34863
I/O Timings: shared/local read=2671836.341 write=1286.869

2671836.341 / 15274264 = ~0.175 ms per read.

> https://explain.depesz.com/s/R5ju

This is the faster enable_memoize = off one. The I/O looks like:

Buffers: shared hit=44542473 read=18541899 dirtied=11988 written=18625
I/O Timings: shared/local read=1554838.583 write=821.477

1554838.583 / 18541899 = ~0.084 ms per read.

That indicates that the enable_memoize=off version is just finding
more pages in the kernel's page cache than the slower query. The
slower query just appears to be under more memory pressure causing the
kernel to have less free memory to cache useful pages. I don't see
anything here that indicates any problems with Memoize. Sure the
statistics could be better as, ideally, the Memoize wouldn't have
happened for the i_2 relation. I don't see anything that indicates any
bugs with this, however. It's pretty well known that Memoize puts
quite a bit of faith into ndistinct estimates. If it causes too many
issues the enable_memoize switch can be turned to off.

You might want to consider experimenting with smaller values of
work_mem and/or hash_mem_multiplier for this query or just disabling
memoize altogether.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sandro Santilli 2023-03-07 09:00:13 Re: [PATCH] Support % wildcard in extension upgrade filenames
Previous Message Damir Belyalov 2023-03-07 08:35:32 Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)