Re: Avoid corrupting DefElem nodes when parsing publication_names and publish options

From: Henson Choi <assam258(at)gmail(dot)com>
To: sunil s <sunilfeb26(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Avoid corrupting DefElem nodes when parsing publication_names and publish options
Date: 2025-12-28 12:18:12
Message-ID: CAAAe_zDYfsGuLSg=WXwHnLhUBrgqvoaRwu8TDKYEdnomhx_5eQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Analysis of CF 6339: DefElem corruption in publication option parsing

The original report showed that DefElem is corrupted after
SplitIdentifierString()
using a debugger. I looked further to see if this corruption actually
matters,
and found that the publicationcmds.c case is a real bug—not just an API
contract violation.

The problem
-----------

The original commit message says both cases "happen to work in practice".
This is true for pgoutput.c, but not for publicationcmds.c.

In publicationcmds.c, after parse_publication_options() corrupts the DefElem
string, EventTriggerCollectSimpleCommand() copies the statement using
copyObject(). Since copyObject() uses pstrdup() internally, it only copies
up to the first NULL byte—losing everything after the first comma.

Here's the call sequence in AlterPublicationOptions():

parse_publication_options() <- corrupts defel->arg:
"insert\0update\0delete"
...
EventTriggerCollectSimpleCommand(stmt) <- copyObject(stmt) sees
truncated string

Event trigger functions then receive "insert" instead of
"insert,update,delete".

Memory lifetime
---------------

publicationcmds.c:
- DefElem: allocated in MessageContext, freed when command completes
- Copy: allocated in EventTriggerContext by copyObject(), freed when
trigger completes
- Corruption happens before copyObject(), so the copy is already truncated

pgoutput.c:
- DefElem: allocated in cmd_context (child of TopMemoryContext)
- cmd_context is not freed until walsender exits, but the allocation is
small
and happens only once per session, so no memory leak concern
- DefElem itself is never accessed after parsing, so no problem

Reproducing the bug
-------------------

I wrote a small extension that demonstrates this. It provides two functions:
get_publish_option_with_bug() simulates the corruption,
get_publish_option_fixed()
shows the correct behavior.

See attached bug_simulator.tgz.

To test:

$ tar xzf bug_simulator.tgz
$ cd bug_simulator
$ make && sudo make install

$ initdb -D /tmp/pgdata
$ pg_ctl -D /tmp/pgdata start
$ psql -d postgres

CREATE EXTENSION bug_simulator;

CREATE TABLE bug_test_log (
id SERIAL PRIMARY KEY,
command_tag TEXT,
buggy_result TEXT,
fixed_result TEXT
);

CREATE TABLE test_table (id INT PRIMARY KEY);

CREATE FUNCTION test_bug_behavior()
RETURNS event_trigger LANGUAGE plpgsql AS $$
DECLARE obj RECORD;
BEGIN
FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
LOOP
INSERT INTO bug_test_log (command_tag, buggy_result,
fixed_result)
VALUES (
obj.command_tag,
get_publish_option_with_bug(obj.command),
get_publish_option_fixed(obj.command)
);
END LOOP;
END;
$$;

CREATE EVENT TRIGGER bug_test_trigger ON ddl_command_end
WHEN TAG IN ('CREATE PUBLICATION', 'ALTER PUBLICATION')
EXECUTE FUNCTION test_bug_behavior();

CREATE PUBLICATION test_pub FOR TABLE test_table
WITH (publish = 'insert,update,delete');

ALTER PUBLICATION test_pub SET (publish = 'insert,update');

SELECT command_tag, buggy_result, fixed_result FROM bug_test_log;

Result:

command_tag | buggy_result | fixed_result
--------------------+--------------+----------------------
CREATE PUBLICATION | insert | insert,update,delete
ALTER PUBLICATION | insert | insert,update

Conclusion
----------

- publicationcmds.c: real bug, affects event triggers
- pgoutput.c: harmless, just API cleanup

The bug in publicationcmds.c is minor in practice—it only affects users who
have event triggers on publication DDL and use extensions that extract
option
values from pg_ddl_command. This is an uncommon case, but it is a real bug
nonetheless.

The patch should be applied.

2025년 12월 24일 (수) PM 8:42, sunil s <sunilfeb26(at)gmail(dot)com>님이 작성:

> Here is the reproduction of this issue.
>
> As per the official documentation
> <https://www.postgresql.org/docs/current/sql-createpublication.html>, creating
> a publication with the following syntax will corrupt the option
> list('insert, update, delete')
>
> > CREATE PUBLICATION mypublication FOR ALL TABLES WITH (publish ='insert,
> update, delete');
>
>
> By attaching a debugger to *parse_publication_options(), *we can verify
> that the option list is modified after the call to
> *splitIdentifierString().*
>
> NOTE: Using double quotes (" "), the functionality works correctly.
>
> Thanks & Regards,
> Sunil S
>

Attachment Content-Type Size
bug_simulator.tgz application/x-gzip 1.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Henson Choi 2025-12-28 12:56:28 Re: Avoid corrupting DefElem nodes when parsing publication_names and publish options
Previous Message Konstantin Knizhnik 2025-12-28 12:11:39 Re: RFC: PostgreSQL Storage I/O Transformation Hooks