Re: Column Filtering in Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Column Filtering in Logical Replication
Date: 2021-12-18 03:52:05
Message-ID: CAA4eK1+yW_RY=4=KDt8Qw1xqxsQg=GPJZz-iWDk2QNggNph9QA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Dec 18, 2021 at 7:04 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2021-Dec-17, Tomas Vondra wrote:
>
> > On 12/17/21 22:07, Alvaro Herrera wrote:
> > > So I've been thinking about this as a "security" item (you can see my
> > > comments to that effect sprinkled all over this thread), in the sense
> > > that if a publication "hides" some column, then the replica just won't
> > > get access to it. But in reality that's mistaken: the filtering that
> > > this patch implements is done based on the queries that *the replica*
> > > executes of its own volition; if the replica decides to ignore the list
> > > of columns, it'll be able to get all columns. All it takes is an
> > > uncooperative replica in order for the lot of data to be exposed anyway.
> >
> > Interesting, I haven't really looked at this as a security feature. And in
> > my experience if something is not carefully designed to be secure from the
> > get go, it's really hard to add that bit later ...
>
> I guess the way to really harden replication is to use the GRANT system
> at the publisher's side to restrict access for the replication user.
> This would provide actual security. So you're right that I seem to be
> barking up the wrong tree ... maybe I need to give a careful look at
> the documentation for logical replication to understand what is being
> offered, and to make sure that we explicitly indicate that limiting the
> column list does not provide any actual security.
>

IIRC, the use cases mentioned by other databases (like Oracle) are
(a) the target table doesn't have the same set of columns, or (b) some
columns contain sensitive information, such as a personal
identification number. I think there is also a side benefit: less data
flows across the network, which could lead to faster replication,
especially when the user filters out large columns.
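
As a rough illustration of those points, here is a toy Python model of
projecting a row onto a column list before it is sent. The names
filter_columns and payload_size are made up for this sketch, JSON is
only a stand-in for the real wire format, and none of this reflects how
the patch or pgoutput actually works.

```python
# Toy model only: not PostgreSQL's actual encoding or the patch's implementation.
import json


def filter_columns(row, column_list):
    """Keep only the columns named in a (hypothetical) publication column list."""
    return {col: row[col] for col in column_list if col in row}


def payload_size(row):
    """Bytes in a JSON encoding of the row, used as a stand-in for the wire size."""
    return len(json.dumps(row).encode("utf-8"))


row = {
    "id": 42,
    "name": "alice",
    "national_id": "123-45-6789",  # sensitive column, case (b)
    "notes": "x" * 10_000,         # large column that dominates the payload
}

# The target table only has (id, name), case (a).
published = filter_columns(row, ["id", "name"])

print(published)
print(payload_size(row), "bytes unfiltered vs", payload_size(published), "bytes filtered")
```

As discussed upthread, leaving a column out like this only helps with a
cooperative subscriber; it is not a substitute for restricting the
replication user's access on the publisher.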

--
With Regards,
Amit Kapila.
