From: | "Imseih (AWS), Sami" <simseih(at)amazon(dot)com> |
---|---|
To: | Julien Rouhaud <rjuju123(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Doc update for pg_stat_statements normalization |
Date: | 2023-02-25 13:59:04 |
Message-ID: | E047191C-E0C6-4BA5-8E8E-1A318858FCE6@amazon.com |
Lists: | pgsql-hackers |
> > Could things be done in a more stable way? For example, imagine that
> > we have an extra Query field called void *private_data that extensions
> > can use to store custom data associated to a query ID, then we could
> > do something like that:
> > - In the post-analyze hook, check if an entry with the query ID
> > calculated exists.
> > -- If the entry exists, grab a copy of the existing query string,
> > which may be normalized or not, and save it into Query->private_data.
> > -- If the entry does not exist, normalize the query, store it in
> > Query->private_data but do not yet create an entry in the hash table.
> > - In the planner/utility hook, fetch the normalized query from
> > private_data, then use it if an entry needs to be created in the hash
> > table. The entry may have been deallocated since the post-analyze
> > hook, in which case it is re-created with the normalized copy saved in
> > the first phase.
> I think the idea of a "private_data" like thing has been discussed before and
> rejected IIRC, as it could be quite expensive and would also need to
> accommodate for multiple extensions and so on.
Storing this additional private data for the lifetime of the query
execution adds overhead that may not be desirable. I think we would also
need to copy the private data into QueryDesc to make it available to the
planner, utility, and executor hooks.
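To make the two-phase flow quoted above concrete, here is a minimal,
self-contained sketch in C. It is a toy model under stated assumptions, not
pg_stat_statements code: ToyQuery, ToyEntry, toy_hash, and all toy_*
functions are hypothetical names, private_data is the proposed field (it does
not exist in Query today), and the "normalization" just replaces each run of
digits with '?'.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_ENTRIES 4       /* stand-in for pg_stat_statements.max */

/* Hypothetical, simplified stand-ins; NOT the real PostgreSQL structs. */
typedef struct ToyQuery {
    uint64_t    query_id;
    const char *source_text;
    char       *private_data;   /* the proposed extra field (assumption) */
} ToyQuery;

typedef struct ToyEntry {
    int         used;
    uint64_t    query_id;
    char       *query_text;
} ToyEntry;

static ToyEntry toy_hash[TOY_MAX_ENTRIES];

static char *toy_strdup(const char *s)
{
    size_t  n = strlen(s) + 1;
    char   *copy = malloc(n);

    assert(copy != NULL);
    memcpy(copy, s, n);
    return copy;
}

static ToyEntry *toy_lookup(uint64_t query_id)
{
    for (int i = 0; i < TOY_MAX_ENTRIES; i++)
        if (toy_hash[i].used && toy_hash[i].query_id == query_id)
            return &toy_hash[i];
    return NULL;
}

/* Toy normalization: each run of digits becomes a single '?'. */
static char *toy_normalize(const char *text)
{
    char   *out = malloc(strlen(text) + 1);
    size_t  j = 0;
    int     in_digits = 0;

    assert(out != NULL);
    for (size_t i = 0; text[i] != '\0'; i++) {
        if (isdigit((unsigned char) text[i])) {
            if (!in_digits)
                out[j++] = '?';
            in_digits = 1;
        } else {
            out[j++] = text[i];
            in_digits = 0;
        }
    }
    out[j] = '\0';
    return out;
}

/* Phase 1 (post-analyze): save a query string, but create no entry. */
static void toy_post_analyze(ToyQuery *q)
{
    ToyEntry *e = toy_lookup(q->query_id);

    q->private_data = e ? toy_strdup(e->query_text)
                        : toy_normalize(q->source_text);
}

/* Phase 2 (planner/utility): create the entry from the saved copy if needed. */
static ToyEntry *toy_planner(ToyQuery *q)
{
    ToyEntry *e = toy_lookup(q->query_id);

    assert(q->private_data != NULL);
    if (e != NULL)
        return e;
    for (int i = 0; i < TOY_MAX_ENTRIES; i++) {
        if (!toy_hash[i].used) {
            toy_hash[i].used = 1;
            toy_hash[i].query_id = q->query_id;
            toy_hash[i].query_text = toy_strdup(q->private_data);
            return &toy_hash[i];
        }
    }
    return NULL;                /* table full; a real version would evict */
}

/* Simulates an entry being deallocated between the two phases. */
static void toy_dealloc(uint64_t query_id)
{
    ToyEntry *e = toy_lookup(query_id);

    if (e != NULL) {
        free(e->query_text);
        e->used = 0;
    }
}
```

In this model, running toy_post_analyze() and then toy_planner() on a query
whose text is "SELECT 42" creates an entry holding "SELECT ?", and the entry
is re-created with the same normalized text if toy_dealloc() removes it in
between. The copy held in private_data for every query is exactly the extra
cost in question.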
> Overall, I think that if the pgss eviction rate is high enough that it's
> problematic for doing performance analysis, the performance overhead will be so
> bad that simply removing pg_stat_statements will give you a ~ x2 performance
> increase. I don't see much point trying to make such a performance killer
> scenario more usable.
In v14, we added a dealloc metric to pg_stat_statements_info, which is helpful.
However, it only counts deallocations of pgss_hash entries.
I think we should also add a metric for garbage collection of the query text file.
Regards
--
Sami Imseih
Amazon Web Services