Re: [survey] New "Stable" QueryId based on normalized query text

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Jim Finnerty <jfinnert(at)amazon(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [survey] New "Stable" QueryId based on normalized query text
Date: 2019-08-10 08:00:25
Message-ID: CAOBaU_YPkz_PBJFb6nLHZv2NcPzFeDHAwPPiF0bca+Jn_Q3Kkg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Aug 10, 2019 at 3:27 AM Jim Finnerty <jfinnert(at)amazon(dot)com> wrote:
>
> I missed this thread. I'd be happy to post the code for what we use as the
> stable query identifier, but we could definitely come up with a more
> efficient algorithm if we're willing to assume that the sql statements are
> the same if and only if the parse tree structure is the same.
>
> Currently what we do for the sql hash is to simply replace all the literals
> and then hash the resulting SQL string

Isn't that what pg_store_plan is already doing? Except that it
removes extraneous whitespaces and put identifiers in uppercase so
that you get a reasonable query identifier.

> you could define a stable identifier for each node type, ignore literal
> constants, and hash fully-qualified object names instead of OIDs. That
> should be pretty fast.

This has been discussed already, and resolving all object names and
qualifier names will add a dramatic overhead for many workloads.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2019-08-10 08:20:10 Re: Shrinking tuplesort.c's SortTuple struct (Was: More ideas for speeding up sorting)
Previous Message Pavel Stehule 2019-08-10 07:10:14 Re: [HACKERS] proposal: schema variables