Re: [PATCH] Add extra statistics to explain for Nested Loop

From: Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Greg Stark <stark(at)mit(dot)edu>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Lukas Fittl <lukas(at)fittl(dot)com>, pryzby(at)telsasoft(dot)com
Subject: Re: [PATCH] Add extra statistics to explain for Nested Loop
Date: 2022-06-24 17:16:06
Message-ID: 420960372f05563984984f195522ff01@postgrespro.ru
Lists: pgsql-hackers

Hi, hackers!

We started a discussion about the overhead and how to measure it correctly.

Julien Rouhaud wrote:
> Can you give a bit more details on your bench scenario? I see contradictory
> results, where the patched version with more code is sometimes way faster,
> sometimes way slower. If you're using pgbench default queries (including
> write queries) I don't think that any of them will hit the loop code, so
> it's really a best case scenario. Also write queries will make tests less
> stable for no added value wrt. this code.
>
> Ideally you would need a custom scenario with a single read-only query
> involving a nested loop or something like that to check how much overhead
> you really get when you cumulate those values.
I created two custom scenarios. The first one uses the VERBOSE flag, so it
exercises the extra statistics. The second one does not use the new feature
but does not disable it either, so the data is still collected.
The pgbench scripts for both scenarios are attached to this message.

The main conclusions are:
1) using the additional statistics affects performance by no more than 4.5%;
2) data collection alone affects performance by no more than 1.5%.
I think testing on another machine would be very helpful, so if you get
a chance, I'd be happy if you shared your observations.
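
For anyone repeating the measurement on another machine, here is a minimal
sketch of one way to turn pgbench TPS figures into an overhead percentage.
It is an assumption about the arithmetic, not necessarily the method behind
the numbers above: the function names are hypothetical, the median over
repeated runs is just one reasonable choice, and the TPS values are
placeholders rather than measurements from this thread.

```python
from statistics import median


def overhead_percent(baseline_tps, patched_tps):
    """Relative slowdown of the patched build, as a percentage of baseline TPS.

    A positive result means the patched build is slower; a negative result
    means it was faster (which usually points to noise in the benchmark).
    """
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (baseline_tps - patched_tps) / baseline_tps * 100.0


def overhead_from_runs(baseline_runs, patched_runs):
    """Compare the median TPS of several runs to damp run-to-run noise."""
    return overhead_percent(median(baseline_runs), median(patched_runs))


if __name__ == "__main__":
    # Placeholder TPS values, NOT results from this thread.
    baseline = [1000.0, 1010.0, 990.0]
    patched = [985.0, 995.0, 975.0]
    print(f"overhead: {overhead_from_runs(baseline, patched):.2f}%")
```

Running each scenario several times and comparing medians should make
"sometimes faster, sometimes slower" results easier to interpret.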

Some fixes:

> Sure, but if we're going to have a branch for nloops == 0, I think it
> would be better to avoid redundant / useless instructions
Right. I have done that.

Justin Pryzby wrote:
> Maybe set parallel_leader_participation=no for this test.
Thanks for reporting the issue and for the advice. I set
parallel_leader_participation = off. I hope this helps resolve the
inconsistencies in the output.

If you have any comments on this topic or want to share your
impressions, please write to me.
Thank you very much for your contribution to the development of this
patch.

--
Ekaterina Sokolova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment                                    Content-Type  Size
0001-explain.c-refactor-ExplainNode_v3.patch  text/x-diff   4.8 KB
0002-extra_statistics_v8.patch                text/x-diff   18.6 KB
pgbench_loop                                  text/plain    907 bytes
pgbench_loop_without_verbose                  text/plain    920 bytes
overhead_v3.txt                               text/plain    155 bytes
