Re: Column Filtering in Logical Replication

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Column Filtering in Logical Replication
Date: 2021-12-17 21:07:18
Message-ID: 202112172107.i5ebyfyrifk7@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

So I've been thinking about this as a "security" item (you can see my
comments to that effect sprinkled all over this thread), in the sense
that if a publication "hides" some column, then the replica just won't
get access to it. But in reality that's mistaken: the filtering that
this patch implements is done based on the queries that *the replica*
executes at its own volition; if the replica decides to ignore the list
of columns, it'll be able to get all columns. All it takes is an
uncooperative replica in order for the lot of data to be exposed anyway.

If the server has a *separate* security mechanism to hide the columns
(per-column privs), it is that feature that will protect the data, not
the logical-replication-feature to filter out columns.

This led me to realize that the replica-side code in tablesync.c is
totally oblivious to what's the publication through which a table is
being received from in the replica. So we're not aware of a replica
being exposed only a subset of columns through some specific
publication; and a lot more hacking is needed than this patch does, in
order to be aware of which publications are being used.

I'm going to have a deeper look at this whole thing.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-12-17 21:24:05 Re: Adding CI to our tree
Previous Message John Naylor 2021-12-17 21:01:37 Re: speed up text_position() for utf-8