Re: pg_stat_statements oddity with track = all

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Sergei Kornilov <sk(at)zsrv(dot)org>, legrand legrand <legrand_legrand(at)hotmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements oddity with track = all
Date: 2021-03-08 17:03:59
Message-ID: CABUevEwnTJHSzq5js5NuBtRS=JdhpZymT9AK1LM7uk4K0B1X0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 7, 2021 at 8:39 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Sat, Mar 06, 2021 at 06:56:49PM +0100, Magnus Hagander wrote:
> > On Sun, Dec 27, 2020 at 9:39 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > >
> > - *
> > - * Right now, this structure contains no padding. If you add any, make sure
> > - * to teach pgss_store() to zero the padding bytes. Otherwise, things will
> > - * break, because pgss_hash is created using HASH_BLOBS, and thus tag_hash
> > - * is used to hash this.
> >
> > I don't think removing this comment completely is a good idea. Right
> > now it ends up there, yes, but eventually it might reach the same
> > state again. I think it's better to reword it based on the current
> > situation while keeping the part about it having to be zeroed for
> > padding. And maybe along with a comment at the actual memset() sites
> > as well?
>
> Agreed, I'll take care of that.
>
> > AFAICT, it's going to store two copies of the query in the query text
> > file (pgss_query_texts.stat)? Can we find a way around that? Maybe by
> > looking up the entry with the flag set to the other value, and then
> > reusing that?
>
> Yes I was a bit worried about the possible extra text entry. I kept things
> simple until now as the general opinion was that entries existing for both top
> level and nested level should be very rare so adding extra code and overhead to
> spare a few query texts wouldn't be worth it.
>
> I think that using a flag might be a bit expensive, as we would have to make
> sure that at least one of the possible two entries has it. So if there are 2
> entries and the one with the flag is evicted, we would have to transfer the
> flag to the other one, and check the existence of the flag when allocatin a new
> entry. And all of that has to be done holding an exclusive lock on pgss->lock.

Yeah, we'd certainly want to minimize things. But what if they both
have the flag at that point? Then we wouldn't have to care on
eviction? But yes, for new allications we'd have to look up if the
query existed with the other value of the flag, and copy it over in
that case.

> Maybe having a new hash table (without the toplevel flag) for those query text
> might be better, or maybe pgss performance is already so terrible when you have
> to regularly evict entries that it wouldn't make any real difference.
>
> Should I try to add some extra code to make sure that we only store the query
> text once, or should I document that there might be duplicate, but we expect
> that to be very rare?

If we expect it to be rare, I think it might be reasonable to just
document that. But do we really have a strong argument for it being
rare?

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joel Jacobson 2021-03-08 17:05:15 Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]
Previous Message Tom Lane 2021-03-08 16:58:39 Re: pg_upgrade failing for 200+ million Large Objects