From: | vignesh C <vignesh21(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Invalid pointer access in logical decoding after error |
Date: | 2025-07-02 06:42:17 |
Message-ID: | CALDaNm0x-aCehgt8Bevs2cm=uhmwS28MvbYq1=s2Ekf0aDPkOA@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
I encountered an invalid pointer access in logical decoding. The steps
to reproduce it are:
-- Create table
CREATE TABLE t1(c1 int, c2 int);
-- Create publications with each publication selecting a different column
CREATE PUBLICATION pub1 for TABLE t1(c1);
CREATE PUBLICATION pub2 for TABLE t1(c2);
-- Create slot
SELECT * FROM pg_create_logical_replication_slot('test', 'pgoutput');
-- Insert a couple of records
INSERT INTO t1 VALUES(1,1);
INSERT INTO t1 VALUES(2,2);
-- Call pg_logical_slot_get_binary_changes, which raises an error
-- because the two publications use different column lists
postgres=# SELECT * FROM pg_logical_slot_get_binary_changes('test',
NULL, NULL, 'proto_version', '4', 'publication_names', 'pub1,pub2');
ERROR: cannot use different column lists for table "public.t1" in
different publications
CONTEXT: slot "test", output plugin "pgoutput", in the change
callback, associated LSN 0/14C3C30
-- The second call then tries to free an invalid pointer
postgres=# SELECT * FROM pg_logical_slot_get_binary_changes('test',
NULL, NULL, 'proto_version', '4', 'publication_names', 'pub1,pub2');
ERROR: pfree called with invalid pointer 0x58983541e6b8 (header
0x6563617073656d61)
CONTEXT: slot "test", output plugin "pgoutput", in the change
callback, associated LSN 0/14C3C30
The error occurs because entry->columns is allocated in the entry's
private context (entry->entry_cxt) by pub_collist_to_bitmapset(). This
context is a child of the PortalContext, which is cleared after an
error via:
AbortTransaction -> AtAbort_Portals -> MemoryContextDeleteChildren ->
MemoryContextDelete -> MemoryContextDeleteOnly
As a result, the memory backing entry->columns is freed, but the
RelationSyncCache, which resides in CacheMemoryContext and thus
survives the error, still holds a dangling pointer to this freed
memory. The next call then passes that stale pointer to pfree, which
reports it as invalid.
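To make the lifetime mismatch described above easier to see, here is a
small standalone sketch. It is a toy model, not PostgreSQL code: every
name in it (ToyContext, ToyCacheEntry, toy_reset and so on) is made up
for illustration. A short-lived context owns the allocation, a
longer-lived cache entry keeps a pointer into it, and a generation
counter lets us detect the stale pointer without dereferencing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical toy model; none of these names exist in PostgreSQL. */

/* Plays the role of the short-lived context (entry->entry_cxt). */
typedef struct ToyContext {
    unsigned generation;  /* bumped every time the context is reset */
    void    *chunk;       /* the single allocation this toy context owns */
} ToyContext;

/* Plays the role of a cache entry that outlives the context. */
typedef struct ToyCacheEntry {
    ToyContext *cxt;              /* context the data was allocated in */
    int        *columns;          /* points into cxt, like entry->columns */
    unsigned    alloc_generation; /* context generation at allocation time */
} ToyCacheEntry;

/* Allocate one int in the context; only one allocation per generation. */
int *toy_alloc_int(ToyContext *cxt, int value)
{
    assert(cxt->chunk == NULL);
    int *p = malloc(sizeof *p);
    assert(p != NULL);
    *p = value;
    cxt->chunk = p;
    return p;
}

/* Free everything in the context, as an abort-time cleanup would. */
void toy_reset(ToyContext *cxt)
{
    free(cxt->chunk);
    cxt->chunk = NULL;
    cxt->generation++;
}

/* Fill the cache entry with data allocated in the short-lived context. */
void toy_entry_fill(ToyCacheEntry *e, ToyContext *cxt, int value)
{
    e->cxt = cxt;
    e->columns = toy_alloc_int(cxt, value);
    e->alloc_generation = cxt->generation;
}

/* True if the entry still points at memory from an older generation. */
bool toy_entry_is_stale(const ToyCacheEntry *e)
{
    return e->columns != NULL && e->alloc_generation != e->cxt->generation;
}

/* Drop the pointer so that nothing tries to free it later. */
void toy_entry_invalidate(ToyCacheEntry *e)
{
    e->columns = NULL;
}
```

After toy_entry_fill() followed by toy_reset(), toy_entry_is_stale()
returns true: the entry survived but its data did not, which is the
state the second call above runs into.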
In the normal (non-error) execution flow, pgoutput_shutdown() is called
to clean up the RelationSyncCache. This happens via:
FreeDecodingContext -> shutdown_cb_wrapper -> pgoutput_shutdown
This path is not taken when an error is raised, however. To handle
that case safely, I suggest calling FreeDecodingContext in the PG_CATCH
block so that pgoutput_shutdown is invoked and the stale cache is
cleared. The attached patch implements this.
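The shape of the idea, reduced to a standalone sketch, is below. This
is not the patch and not PostgreSQL code: the toy_* names are
hypothetical, and plain setjmp/longjmp stands in for PG_TRY/PG_CATCH.
Unlike the real code, the sketch returns false instead of re-throwing
the error. It only shows that running the shutdown step on the error
path as well leaves no populated cache behind.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in PostgreSQL. */

static jmp_buf toy_error_jump;              /* target for the simulated error */
static bool    toy_cache_populated = false; /* stands in for the cache state */

/* Plays the role of the shutdown callback: drop the cached state. */
void toy_shutdown(void)
{
    toy_cache_populated = false;
}

/* Plays the role of the change callback: fills the cache, may fail. */
void toy_decode(bool fail)
{
    toy_cache_populated = true;
    if (fail)
        longjmp(toy_error_jump, 1);         /* simulated ERROR */
}

/*
 * Returns true on success, false if decoding raised the simulated error.
 * With cleanup_on_error set, the shutdown step also runs on the error path.
 */
bool toy_get_changes(bool fail, bool cleanup_on_error)
{
    if (setjmp(toy_error_jump) == 0) {
        toy_decode(fail);
        toy_shutdown();                     /* normal path always cleans up */
        return true;
    }

    /* Error path: reached only through longjmp() from toy_decode(). */
    if (cleanup_on_error)
        toy_shutdown();
    return false;
}
```

Calling toy_get_changes(true, false) leaves toy_cache_populated set,
which corresponds to the current behavior after an error, while
toy_get_changes(true, true) clears it.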
Thoughts?
Regards,
Vignesh
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Fix-referencing-invalid-pointer-in-logical-decodi.patch | application/octet-stream | 1.6 KB |