Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-13 04:54:39
Message-ID: CAA4eK1JL2q+HENgiCf1HLRU7nD9jCcttB9sEqV1tech4mMv_0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 12, 2021 at 3:01 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 7/12/21 6:46 AM, Amit Kapila wrote:
> > On Mon, Jul 12, 2021 at 7:19 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> > Now, the other idea I had in mind was to traverse the WHERE clause
> > expression in publication_add_relation and identify if it contains
> > anything other than the ANDed list of 'foo.bar op constant'
> > expressions. OTOH, for index where clause expressions or policy check
> > expressions, we use a technique similar to what we have in the patch
> > to prohibit certain kinds of expressions.
> >
> > Do you have any preference on how this should be addressed?
> >
>
> I don't think this is sufficient, because who knows where "op" comes
> from? It might be from an extension, in which case the problem pointed
> out by Petr Jelinek [1] would apply. OTOH I suppose we could allow
> expressions like (Var op Var), i.e. "a < b" or something like that. And
> then why not allow (a+b < c-10) and similar "more complex" expressions,
> as long as all the operators are built-in?
>

Yeah, and the patch already disallows the user-defined operators in
filters. I think ideally if the operator doesn't refer to UDFs, we can
allow to directly use such an OP in the filter as we can add a
dependency for the same.

> In terms of implementation, I think there are two basic options - either
> we can define a new "expression" type in gram.y, which would be a subset
> of a_expr etc. Or we can do it as some sort of expression walker, kinda
> like what the transform* functions do now.
>

I think it is better to use some form of walker here rather than
extending the grammar for this. However, the question is do we need
some special kind of expression walker here or can we handle all
required cases via transformWhereClause() call as the patch is trying
to do. AFAIU, the main things we want to prohibit in the filter are:
(a) it doesn't refer to any relation other than catalog in where
clause, (b) it doesn't use UDFs in any way (in expressions, in
user-defined operators, user-defined types, etc.), (c) the columns
referred to in the filter should be part of PK or Replica Identity.
Now, if all such things can be detected by the approach patch has
taken then why do we need a special kind of expression walker? OTOH,
if we can't detect some of this then probably we can use a special
walker.

I think in the long run one idea to allow UDFs is probably by
explicitly allowing users to specify whether the function is
publication predicate safe and if so, then we can allow such functions
in the filter clause.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2021-07-13 05:35:00 Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors
Previous Message David Rowley 2021-07-13 04:44:12 Re: Add proper planner support for ORDER BY / DISTINCT aggregates