Re: Schema variables - new implementation for Postgres 15

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Joel Jacobson <joel(at)compiler(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Schema variables - new implementation for Postgres 15
Date: 2022-09-01 18:17:31
Message-ID: CAFj8pRAhP32ROovyWxP-QuR+ADDUh8imt_Qr+sRnxMY-Gj8Vsg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

po 29. 8. 2022 v 11:00 odesílatel Julien Rouhaud <rjuju123(at)gmail(dot)com>
napsal:

> Hi,
>
> On Sat, Aug 27, 2022 at 01:17:45PM +0200, Pavel Stehule wrote:
> >
> > after some thinking I think that instead of sequence I can use LSN. The
> > combination oid, LSN should be unique forever
>
> Yeah I was about suggesting doing that instead of a sequence, so +1 for
> that
> approach!
>
> I've been spending a bit of time trying to improve the test coverage on the
> protection for concurrently deleted and recreated variables, and thought
> that a
> new isolation test should be enough. I'm attaching a diff (in .txt
> extension)
> that could be applied to 009-regress-tests-for-session-variables.patch, but
> while working on that I discovered a few problems.
>
> First, the pg_debug_show_used_session_variables() function reports what's
> currently locally known, but there's no guarantee that
> AcceptInvalidationMessages() will be called prior to its execution. For
> instance if you're in a transaction and already hold a lock on the
> function and
> execute it again.
>
> It therefore means that it can display that a locally cached variable isn't
> dropped and still holds a value, while it's not the case. While it may be
> surprising, I think that's still the wanted behavior as you want to know
> what
> is the cache state. FTR this is tested in the last permutation in the
> attached
> patch (although the expected output contains the up-to-date information,
> so you
> can see the failure).
>
> But if invalidation are processed when calling the function, the behavior
> seems
> surprising as far as I can see the cleanup seems to be done in 2 steps:
> mark t
> he hash entry as removed and then remove the hash entry. For instance:
>
> (conn 1) CREATE VARIABLE myvar AS text;
> (conn 1) LET myvar = 'something';
> (conn 2) DROP VARIABLE myvar;
> (conn 1) SELECT schema, name, removed FROM
> pg_debug_show_used_session_variables();
> schema | name | removed
> --------+-------+---------
> public | myvar | t
> (1 row)
>
> (conn 1) SELECT schema, name, removed FROM
> pg_debug_show_used_session_variables();
> schema | name | removed
> --------+------+---------
> (0 rows)
>
> Why are two steps necessary here, and is that really wanted?
>

The value is removed in the first command, but at the end of transaction.
pg_debug_show_used_session_variables is called before, and at this moment
the variable should be in memory.

I enhanced pg_debug_show_used_session_variables about debug output for
start and end, and you can see it.

(2022-08-30 19:38:49) postgres=# set client_min_messages to debug1;
SET
(2022-08-30 19:38:55) postgres=# CREATE VARIABLE myvar AS text;
DEBUG: record for session variable "myvar" (oid:16390) was created in
pg_variable
CREATE VARIABLE
(2022-08-30 19:39:03) postgres=# LET myvar = 'something';
DEBUG: session variable "public.myvar" (oid:16390) has new entry in memory
(emitted by WRITE)
DEBUG: session variable "public.myvar" (oid:16390) has new value
LET
(2022-08-30 19:39:11) postgres=# SELECT schema, name, removed FROM
pg_debug_show_used_session_variables();
DEBUG: pg_variable_cache_callback 84 2941368844
DEBUG: session variable "public.myvar" (oid:16390) should be rechecked
(forced by sinval)
DEBUG: pg_debug_show_used_session_variables start
DEBUG: effective call of sync_sessionvars_all()
DEBUG: pg_debug_show_used_session_variables end
DEBUG: session variable "public.myvar" (oid:16390) is removing from memory
┌────────┬───────┬─────────┐
│ schema │ name │ removed │
╞════════╪═══════╪═════════╡
│ public │ myvar │ t │
└────────┴───────┴─────────┘
(1 row)

(2022-08-30 19:39:32) postgres=# SELECT schema, name, removed FROM
pg_debug_show_used_session_variables();
DEBUG: pg_debug_show_used_session_variables start
DEBUG: pg_debug_show_used_session_variables end
┌────────┬──────┬─────────┐
│ schema │ name │ removed │
╞════════╪══════╪═════════╡
└────────┴──────┴─────────┘
(0 rows)

But I missed call sync_sessionvars_all in the drop variable. If I execute
this routine there I can fix this behavior and the cleaning in
sync_sessionvars_all can be more aggressive.

After change

(2022-08-31 06:25:54) postgres=# let x = 10;
LET
(2022-08-31 06:25:59) postgres=# SELECT schema, name, removed FROM
pg_debug_show_used_session_variables();
┌────────┬──────┬─────────┐
│ schema │ name │ removed │
╞════════╪══════╪═════════╡
│ public │ x │ f │
└────────┴──────┴─────────┘
(1 row)

-- after drop in other session

(2022-08-31 06:26:00) postgres=# SELECT schema, name, removed FROM
pg_debug_show_used_session_variables();
┌────────┬──────┬─────────┐
│ schema │ name │ removed │
╞════════╪══════╪═════════╡
└────────┴──────┴─────────┘
(0 rows)

>
> Finally, I noticed that it's quite easy to get cache lookup failures when
> using
> transactions. AFAICS it's because the current code first checks in the
> local
> cache (which often isn't immediately invalidated when in a transaction),
> returns an oid (of an already dropped variable), then the code acquires a
> lock
> on that non-existent variable, which internally accepts invalidation after
> the
> lock is acquired. The rest of the code can then fail with some "cache
> lookup
> error" in the various functions as the invalidation has now been processed.
> This is also tested in the attached isolation test.
>
> I think that using a retry approach based on SharedInvalidMessageCounter
> change
> detection, like RangeVarGetRelidExtended(), in IdentifyVariable() should be
> enough to fix that class of problem, but maybe some other general functions
> would need similar protection too.
>

I did it, and with this change it passed the isolation test. Thank you for
your important help!

>
> While looking at the testing, I also noticed that the main regression tests
> comments are now outdated since the new (and more permissive) approach for
> dropped variable detection. For instance:
>
> + ALTER TYPE public.svar_test_type DROP ATTRIBUTE c;
> + -- should to fail
> + SELECT public.svar;
> + svar
> + ---------
> + (10,20)
> + (1 row)
> +
> + ALTER TYPE public.svar_test_type ADD ATTRIBUTE c int;
> + -- should to fail too (different type, different generation number);
> + SELECT public.svar;
> + svar
> + ----------
> + (10,20,)
> + (1 row)
>
>
the comments are obsolete, fixed

>
> + CREATE VARIABLE public.avar AS int;
> + -- should to fail
> + SELECT avar FROM xxtab;
> + avar
> + ------
> + 10
> + (1 row)
> +
> + -- should be ok
> + SELECT public.avar FROM xxtab;
> + avar
> + ------
> +
> + (1 row)
>

fixed

>
>
> For reference, with the code as-is I get the following diff when testing
> the
> attached isolation test:
>
> ---
> /Users/rjuju/git/postgresql/src/test/isolation/expected/session-variable.out
> 2022-08-29 15:41:11.000000000 +0800
> +++
> /Users/rjuju/git/pg/pgmaster_debug/src/test/isolation/output_iso/results/session-variable.out
> 2022-08-29 15:42:17.000000000 +0800
> @@ -16,21 +16,21 @@
> step let: LET myvar = 'test';
> step val: SELECT myvar;
> myvar
> -----
> test
> (1 row)
>
> step s1: BEGIN;
> step drop: DROP VARIABLE myvar;
> step val: SELECT myvar;
> -ERROR: column or variable "myvar" does not exist
> +ERROR: cache lookup failed for session variable 16386
> step sr1: ROLLBACK;
>
> starting permutation: let val dbg drop create dbg val
> step let: LET myvar = 'test';
> step val: SELECT myvar;
> myvar
> -----
> test
> (1 row)
>
> @@ -68,20 +68,16 @@
> schema|name |removed
> ------+-----+-------
> public|myvar|f
> (1 row)
>
> step drop: DROP VARIABLE myvar;
> step create: CREATE VARIABLE myvar AS text
> step dbg: SELECT schema, name, removed FROM
> pg_debug_show_used_session_variables();
> schema|name |removed
> ------+-----+-------
> -public|myvar|t
> +public|myvar|f
> (1 row)
>
> step val: SELECT myvar;
> -myvar
> ------
> -
> -(1 row)
> -
> +ERROR: cache lookup failed for session variable 16389
> step sr1: ROLLBACK;
>
>
attached updated patches

Regards

Pavel

Attachment Content-Type Size
v20220901-1-0011-documentation.patch text/x-patch 42.2 KB
v20220901-1-0010-this-patch-changes-error-message-column-doesn-t-exis.patch text/x-patch 27.0 KB
v20220901-1-0007-possibility-to-dump-session-variables-by-pg_dump.patch text/x-patch 19.5 KB
v20220901-1-0008-typedefs.patch text/x-patch 1.6 KB
v20220901-1-0009-regress-tests-for-session-variables.patch text/x-patch 51.0 KB
v20220901-1-0005-DISCARD-VARIABLES-command.patch text/x-patch 3.2 KB
v20220901-1-0006-enhancing-psql-for-session-variables.patch text/x-patch 15.1 KB
v20220901-1-0004-support-of-LET-command-in-PLpgSQL.patch text/x-patch 11.5 KB
v20220901-1-0003-LET-command.patch text/x-patch 39.1 KB
v20220901-1-0002-session-variables.patch text/x-patch 98.8 KB
v20220901-1-0001-catalog-support-for-session-variables.patch text/x-patch 97.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-09-01 18:35:40 Re: Tracking last scan time
Previous Message Tom Lane 2022-09-01 17:13:15 Re: [PATCH v1] fix potential memory leak in untransformRelOptions