Re: psql: Fix CREATE SCHEMA scanning of nested routine bodies

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: psql: Fix CREATE SCHEMA scanning of nested routine bodies
Date: 2026-06-23 08:13:43
Message-ID: F63EA225-9903-41A6-8CE4-1D1BDB161B24@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Jun 23, 2026, at 07:08, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
>
>
>> On Jun 23, 2026, at 00:04, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> writes:
>>> See this repro:
>>> ```
>>> evantest=# CREATE SCHEMA s CREATE VIEW begin AS SELECT 1;
>>> evantest-# end;
>>> WARNING: there is no transaction in progress
>>> CREATE SCHEMA
>>> COMMIT
>>> ```
>>
>> Meh. Yeah, you can fool psql by using non-reserved keywords as
>> identifiers. I'm not super excited about adding complexity that
>> resolves specific edge cases of that kind, because there will always
>> be more of them. As an example, this fools both HEAD and the back
>> branches:
>>
>> regression=# create function begin() returns int
>> regression-# begin atomic return 2+2; end;
>> regression-#
>>
>> Is it really likely that anyone will use "begin" as an object name?
>>
>> Having said that, I agree that what psqlscan.l is doing now is quite
>> hokey and maybe we should make it smarter. I don't like the details
>> of your patch too much though. It's too obviously bolted on: the
>> added parsing logic looks nothing like what was there before.
>> Also, I think it can still be fooled by
>>
>> create schema s
>> create function f() ... as $$ function body $$
>> create view begin as ...
>>
>> because you don't reset create_schema_routine_body unless you
>> see "end".
>>
>> What I'd think about doing, if you want to pursue this, is to
>> have a buffer similar to cur_state->identifiers[] but it tracks
>> identifiers starting from the most recent CREATE sub-clause
>> start within CREATE SCHEMA (ie, a non-nested CREATE keyword),
>> and then applies basically the same logic as in the original code
>> to decide if we're within a CREATE FUNCTION sub-clause.
>>
>> Also, you could avoid cluttering other productions by resetting
>> all the new state when identifier_count is zero, in the same place
>> where we reset cur_state->identifiers[] today.
>>
>> It'd likely be appropriate to pull all this logic out of the
>> {identifier} production and put it in a subroutine, just because
>> it's getting unduly complicated.
>>
>> regards, tom lane
>
> Thanks for the detailed explanation. I will try to rework the patch along the direction you suggested.
>

Following your suggestion, I moved the "{identifier}" logic into a new helper psqlscan_track_identifier(), and added “create_schema_identifiers" to PsqlScanStateData to track identifiers from the current top-level CREATE element within CREATE SCHEMA in the same way as the top level “identifiers". I also added a few more test cases.

Please see the attached v2 for details.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Attachment Content-Type Size
v2-0001-psql-Fix-CREATE-SCHEMA-scanning-with-object-names.patch application/octet-stream 11.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2026-06-23 08:17:37 Re: Fix variadic argument types for pg_get_xxx_ddl() functions
Previous Message Michael Paquier 2026-06-23 07:53:49 Re: [PATCH] doc: Document that invalid indexes are skipped during ATTACH PARTITION