Re: Detoasting optionally to make Explain-Analyze less misleading

From: stepan rutz <stepan(dot)rutz(at)gmx(dot)de>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Detoasting optionally to make Explain-Analyze less misleading
Date: 2023-09-12 15:16:00
Message-ID: 2f57ca27-6828-eefc-9dd3-6e2d4578a8fa@gmx.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Matthias,

thanks for your feedback.

I wasn't sure on the memory-contexts. I was actually also unsure on
whether the array

  TupleTableSlot.tts_isnull

is also set up correctly by the previous call to slot_getallattrs(slot).
This would allow to get rid of even more code from this patch, which is
in the loop and determines whether a field is null or not. I switched to
using tts_isnull from TupleTableSlot now, seems to be ok and the docs
say it is save to use.

Also I reuse the MemoryContext throughout the livetime of the receiver.
Not sure if that makes a difference. One more thing I noticed. During
explain.c the DestReceiver's destroy callback was never invoked. I added
a line to do that, however I am unsure whether this is the right place
or a good idea in the first place. This potentially affects plain
analyze calls as well, though seems to behave nicely. Even when I
explain (analyze) select * into ....

This is the call I am talking about, which was missing from explain.c

  dest->rDestroy(dest);

Attached a new patch. Hoping for feedback,

Greetings, Stepan

On 12.09.23 14:25, Matthias van de Meent wrote:
> On Tue, 12 Sept 2023 at 12:56, stepan rutz<stepan(dot)rutz(at)gmx(dot)de> wrote:
>> Hi,
>>
>> I have fallen into this trap and others have too. If you run
>> EXPLAIN(ANALYZE) no de-toasting happens. This makes query-runtimes
>> differ a lot. The bigger point is that the average user expects more
>> from EXPLAIN(ANALYZE) than what it provides. This can be suprising. You
>> can force detoasting during explain with explicit calls to length(), but
>> that is tedious. Those of us who are forced to work using java stacks,
>> orms and still store mostly documents fall into this trap sooner or
>> later. I have already received some good feedback on this one, so this
>> is an issue that bother quite a few people out there.
> Yes, the lack of being able to see the impact of detoasting (amongst
> others) in EXPLAIN (ANALYZE) can hide performance issues.
>
>> It would be great to get some feedback on the subject and how to address
>> this, maybe in totally different ways.
> Hmm, maybe we should measure the overhead of serializing the tuples instead.
> The difference between your patch and "serializing the tuples, but not
> sending them" is that serializing also does the detoasting, but also
> includes any time spent in the serialization functions of the type. So
> an option "SERIALIZE" which measures all the time the server spent on
> the query (except the final step of sending the bytes to the client)
> would likely be more useful than "just" detoasting.
>
>> 0001_explain_analyze_and_detoast.patch
> I notice that this patch creates and destroys a memory context for
> every tuple that the DestReceiver receives. I think that's quite
> wasteful, as you should be able to create only one memory context and
> reset it before (or after) each processed tuple. That also reduces the
> differences in measurements between EXPLAIN and normal query
> processing of the tuples - after all, we don't create new memory
> contexts for every tuple in the normal DestRemote receiver either,
> right?
>
> Kind regards,
>
> Matthias van de Meent

Attachment Content-Type Size
0002_explain_analyze_and_detoast.patch text/x-patch 9.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-09-12 15:22:56 Re: Document that PG_TRY block cannot have a return statement
Previous Message Dagfinn Ilmari Mannsåker 2023-09-12 14:53:28 Re: Adding a pg_get_owned_sequence function?