| From: | Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com> |
|---|---|
| To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
| Cc: | Lukas Fittl <lukas(at)fittl(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment |
| Date: | 2025-03-20 12:04:43 |
| Message-ID: | fbf77241-e4c5-4f70-ac71-8bf8cff89f39@tantorlabs.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 20.03.2025 13:37, David Rowley wrote:
> est_entries is a uint32, so %u is the correct format character for that type.
>
> I don't think you need to prefix all these properties with "Cache"
> just because the other two properties have that prefix. I also don't
> think the names you've chosen really reflect the meaning. How about
> something like: "Estimated Distinct Lookup Keys: 721 Estimated
> Capacity: 655", in that order. I think maybe having that as one line
> for format=text is better than 2 lines. The EXPLAIN output is already
> often taking up more lines in v18 than in v17, would be good to not
> make that even worse unnecessarily.
LGTM
> I see the existing code there could use ExplainPropertyText rather
> than have a special case for text and non-text formats. That's likely
> my fault. If we're touching this code, then we should probably tidy
> that up. Do you want to create a precursor fixup patch for that?
I fully agree with this suggestion. Then I'll begin with this on another
new thread.
> + double ndistinct; /* Estimated number of distinct memoization keys,
> + * used for cache size evaluation. Kept for EXPLAIN */
>
> Maybe this field in MemoizePath needs a better name. How about
> "est_unique_keys"? and also do the rename in struct Memoize.
LGTM
> I'm also slightly concerned about making struct Memoize bigger. I had
> issues with a performance regression [1] for 908a96861 when increasing
> the WindowAgg struct size last year and the only way I found to make
> it go away was to shuffle the fields around so that the struct size
> didn't increase. I think we'll need to see a benchmark of a query that
> hits Memoize quite hard with a small cache size to see if the
> performance decreases as a result of adding the ndistinct field. It's
> unfortunate that we'll not have the luxury of squeezing this double
> into padding if we do see a slowdown.
I tried rearranging the fields in the structure with 'ndistinct' field,
but unfortunately, sizeof(Memoize) did not remain the same. Therefore,
benchmarking Memoize is definitely necessary. Thanks for the information!
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ashutosh Bapat | 2025-03-20 12:19:57 | Re: Support NOT VALID / VALIDATE constraint options for named NOT NULL constraints |
| Previous Message | Peter Eisentraut | 2025-03-20 11:59:29 | Re: Index AM API cleanup |