Re: pg_stat_statements oddity with track = all

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Sergei Kornilov <sk(at)zsrv(dot)org>, legrand legrand <legrand_legrand(at)hotmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements oddity with track = all
Date: 2021-03-07 07:39:36
Message-ID: 20210307073936.mvhwxsol4rkt3c5v@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 06, 2021 at 06:56:49PM +0100, Magnus Hagander wrote:
> On Sun, Dec 27, 2020 at 9:39 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> >
> - *
> - * Right now, this structure contains no padding. If you add any, make sure
> - * to teach pgss_store() to zero the padding bytes. Otherwise, things will
> - * break, because pgss_hash is created using HASH_BLOBS, and thus tag_hash
> - * is used to hash this.
>
> I don't think removing this comment completely is a good idea. Right
> now it ends up there, yes, but eventually it might reach the same
> state again. I think it's better to reword it based on the current
> situation while keeping the part about it having to be zeroed for
> padding. And maybe along with a comment at the actual memset() sites
> as well?

Agreed, I'll take care of that.

> AFAICT, it's going to store two copies of the query in the query text
> file (pgss_query_texts.stat)? Can we find a way around that? Maybe by
> looking up the entry with the flag set to the other value, and then
> reusing that?

Yes I was a bit worried about the possible extra text entry. I kept things
simple until now as the general opinion was that entries existing for both top
level and nested level should be very rare so adding extra code and overhead to
spare a few query texts wouldn't be worth it.

I think that using a flag might be a bit expensive, as we would have to make
sure that at least one of the possible two entries has it. So if there are 2
entries and the one with the flag is evicted, we would have to transfer the
flag to the other one, and check the existence of the flag when allocatin a new
entry. And all of that has to be done holding an exclusive lock on pgss->lock.

Maybe having a new hash table (without the toplevel flag) for those query text
might be better, or maybe pgss performance is already so terrible when you have
to regularly evict entries that it wouldn't make any real difference.

Should I try to add some extra code to make sure that we only store the query
text once, or should I document that there might be duplicate, but we expect
that to be very rare?

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2021-03-07 08:06:50 Re: [HACKERS] Custom compression methods
Previous Message Dian M Fay 2021-03-07 07:37:53 Re: [PATCH] postgres-fdw: column option to override foreign types