| From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> | 
|---|---|
| To: | Alexander Lakhin <exclusion(at)gmail(dot)com> | 
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org | 
| Subject: | Re: Avoid orphaned objects dependencies, take 3 | 
| Date: | 2024-04-25 07:20:05 | 
| Message-ID: | ZioEJafHC95wjfMk@ip-10-97-1-34.eu-west-3.compute.internal | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hi,
On Thu, Apr 25, 2024 at 09:00:00AM +0300, Alexander Lakhin wrote:
> Hi,
> 
> 25.04.2024 08:00, Bertrand Drouvot wrote:
> > 
> > > though concurrent create/drop operations can still produce
> > > the "cache lookup failed" error, which is probably okay, except that it is
> > > an INTERNAL_ERROR, which assumed to be not easily triggered by users.
> > I did not see any of those "cache lookup failed" during my testing with/without
> > your script. During which test(s) did you see those with v3 applied?
> 
> You can try, for example, table-trigger, or other tests that check for
> "cache lookup failed" psql output only (maybe you'll need to increase the
> iteration count). For instance, I've got (with v4 applied):
> 2024-04-25 05:48:08.102 UTC [3638763] ERROR:  cache lookup failed for function 20893
> 2024-04-25 05:48:08.102 UTC [3638763] STATEMENT:  CREATE TRIGGER modified_c1 BEFORE UPDATE OF c ON t
>         FOR EACH ROW WHEN (OLD.c <> NEW.c) EXECUTE PROCEDURE trigger_func('modified_c');
> 
> Or with function-function:
> 2024-04-25 05:52:31.390 UTC [3711897] ERROR:  cache lookup failed for function 32190 at character 54
> 2024-04-25 05:52:31.390 UTC [3711897] STATEMENT:  CREATE FUNCTION f1() RETURNS int LANGUAGE SQL RETURN f() + 1;
> --
> 2024-04-25 05:52:37.639 UTC [3720011] ERROR:  cache lookup failed for function 34465 at character 54
> 2024-04-25 05:52:37.639 UTC [3720011] STATEMENT:  CREATE FUNCTION f1() RETURNS int LANGUAGE SQL RETURN f() + 1;
I see, so this is during object creation.
It's easy to reproduce this kind of issue with gdb. For example set a breakpoint
on SearchSysCache1() and during the create function f1() once it breaks on:
#0  SearchSysCache1 (cacheId=45, key1=16400) at syscache.c:221
#1  0x00005ad305beacd6 in func_get_detail (funcname=0x5ad308204d50, fargs=0x0, fargnames=0x0, nargs=0, argtypes=0x7ffff2ff9cc0, expand_variadic=true, expand_defaults=true, include_out_arguments=false, funcid=0x7ffff2ff9ba0, rettype=0x7ffff2ff9b9c, retset=0x7ffff2ff9b94, nvargs=0x7ffff2ff9ba4,
    vatype=0x7ffff2ff9ba8, true_typeids=0x7ffff2ff9bd8, argdefaults=0x7ffff2ff9be0) at parse_func.c:1622
#2  0x00005ad305be7dd0 in ParseFuncOrColumn (pstate=0x5ad30823be98, funcname=0x5ad308204d50, fargs=0x0, last_srf=0x0, fn=0x5ad308204da0, proc_call=false, location=55) at parse_func.c:266
#3  0x00005ad305bdffb0 in transformFuncCall (pstate=0x5ad30823be98, fn=0x5ad308204da0) at parse_expr.c:1474
#4  0x00005ad305bdd2ee in transformExprRecurse (pstate=0x5ad30823be98, expr=0x5ad308204da0) at parse_expr.c:230
#5  0x00005ad305bdec34 in transformAExprOp (pstate=0x5ad30823be98, a=0x5ad308204e20) at parse_expr.c:990
#6  0x00005ad305bdd1a0 in transformExprRecurse (pstate=0x5ad30823be98, expr=0x5ad308204e20) at parse_expr.c:187
#7  0x00005ad305bdd00b in transformExpr (pstate=0x5ad30823be98, expr=0x5ad308204e20, exprKind=EXPR_KIND_SELECT_TARGET) at parse_expr.c:131
#8  0x00005ad305b96b7e in transformReturnStmt (pstate=0x5ad30823be98, stmt=0x5ad308204ee0) at analyze.c:2395
#9  0x00005ad305b92213 in transformStmt (pstate=0x5ad30823be98, parseTree=0x5ad308204ee0) at analyze.c:375
#10 0x00005ad305c6321a in interpret_AS_clause (languageOid=14, languageName=0x5ad308204c40 "sql", funcname=0x5ad308204ad8 "f100", as=0x0, sql_body_in=0x5ad308204ee0, parameterTypes=0x0, inParameterNames=0x0, prosrc_str_p=0x7ffff2ffa208, probin_str_p=0x7ffff2ffa200, sql_body_out=0x7ffff2ffa210,
    queryString=0x5ad3082040b0 "CREATE FUNCTION f100() RETURNS int LANGUAGE SQL RETURN f() + 1;") at functioncmds.c:953
#11 0x00005ad305c63c93 in CreateFunction (pstate=0x5ad308186310, stmt=0x5ad308204f00) at functioncmds.c:1221
then drop function f() in another session. Then the create function f1() would:
postgres=# CREATE FUNCTION f() RETURNS int LANGUAGE SQL RETURN f() + 1;
ERROR:  cache lookup failed for function 16400
This stuff does appear before we get a chance to call the new depLockAndCheckObject()
function.
I think this is what Tom was referring to in [1]:
"
So the only real fix for this would be to make every object lookup in the entire
system do the sort of dance that's done in RangeVarGetRelidExtended.
"
The fact that those kind of errors appear also somehow ensure that no orphaned
dependencies can be created.
The patch does ensure that no orphaned depencies can occur after those "initial"
look up are done (should the dependent object be dropped).
I'm tempted to not add extra complexity to avoid those kind of errors and keep the
patch as it is. All of those servicing the same goal: no orphaned depencies are
created.
[1]: https://www.postgresql.org/message-id/2872252.1630851337%40sss.pgh.pa.us
Regards,
-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Richard Guo | 2024-04-25 07:20:10 | Re: Short-circuit sort_inner_and_outer if there are no mergejoin clauses | 
| Previous Message | Anthonin Bonnefoy | 2024-04-25 07:17:52 | Re: Fix parallel vacuum buffer usage reporting |