Re: Doc update for pg_stat_statements normalization

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Doc update for pg_stat_statements normalization
Date: 2023-02-27 22:53:26
Message-ID: 10015573-0539-43BC-B2EB-14E39AB7BC46@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Sat, Feb 25, 2023 at 01:59:04PM +0000, Imseih (AWS), Sami wrote:>
> > The overhead of storing this additional private data for the life of the query
> > execution may not be desirable.

> Okay, but why?

Additional memory to maintain the JumbleState data in other structs, and
the additional copy operations.

> This seems like the key point to me here. If we copy more information
> into the Query structures, then we basically have no need for sticky
> entries, which could be an advantage on its own as it simplifies the
> deallocation and lookup logic.

Removing the sticky entry logic will be a big plus, and of course we can
eliminate a query not showing up properly normalized.

> For a DML or a SELECT, the manipulation of the hash table would still
> be a three-step process (post-analyze, planner and execution end), but
> the first step would have no need to use an exclusive lock on the hash
> table because we could just read and copy over the Query the
> normalized query if an entry exists, meaning that we could actually
> relax things a bit?

No lock is held while normalizing, and a shared lock is held while storing,
so there is no apparent benefit from that aspect.

> This relaxation has as cost the extra memory used
> to store more data to allow the insertion to use a proper state of the
> Query[Desc] coming from the JumbleState (this extra data has no need
> to be JumbleState, just the results we generate from it aka the
> normalized query).

Wouldn't be less in terms of memory usage to just store the required
JumbleState fields in Query[Desc]?

clocations_count,
highest_extern_param_id,
clocations_locations,
clocations_length

> > In v14, we added a dealloc metric to pg_stat_statements_info, which is helpful.
> > However, this only deals with the pgss_hash entry deallocation.
> > I think we should also add a metric for the text file garbage collection.

> This sounds like a good idea on its own.

I can create a separate patch for this.

Regards,

--
Sami Imseih
Amazon Web services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-02-27 23:18:30 Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Previous Message Kirk Wolak 2023-02-27 22:26:01 Re: Proposal: :SQL_EXEC_TIME (like :ROW_COUNT) Variable (psql)