Re: row filtering for logical replication

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-13 19:07:38
Message-ID: 86dcff6e-9b5f-0f9a-8602-69a5602a30c9@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/13/21 5:44 PM, Jeff Davis wrote:
> On Tue, 2021-07-13 at 10:24 +0530, Amit Kapila wrote:
>> to do. AFAIU, the main things we want to prohibit in the filter are:
>> (a) it doesn't refer to any relation other than catalog in where
>> clause,
>
> Right, because the walsender is using a historical snapshot.
>
>> (b) it doesn't use UDFs in any way (in expressions, in
>> user-defined operators, user-defined types, etc.),
>
> Is this a reasonable requirement? Postgres has a long history of
> allowing UDFs nearly everywhere that a built-in is allowed. It feels
> wrong to make built-ins special for this feature.
>

Well, we can either prohibit UDF or introduce a massive foot-gun.

The problem with functions in general (let's ignore SQL functions) is
that they're black boxes, so we don't know what's inside. And if the
function gets broken after an object gets dropped, the replication is
broken and the only way to fix it is to recover the subscription.

And this is not hypothetical issue, we've seen this repeatedly :-(

So as much as I'd like to see support for UDFs here, I think it's better
to disallow them - at least for now. And maybe relax that restriction
later, if possible.

>> (c) the columns
>> referred to in the filter should be part of PK or Replica Identity.
>
> Why?
>

I'm not sure either.

>
> Also:
>
> * Andres also mentioned that the function should not leak memory.
> * One use case for this feature is when sharding a table, so the
> expression should allow things like "hashint8(x) between ...". I'd
> really like to see this problem solved, as well.
>

I think built-in functions should be fine, because generally don't get
dropped etc. (And if you drop built-in function, well - sorry.)

Not sure about the memory leaks - I suppose we'd free memory for each
row, so this shouldn't be an issue I guess ...

>> I think in the long run one idea to allow UDFs is probably by
>> explicitly allowing users to specify whether the function is
>> publication predicate safe and if so, then we can allow such
>> functions
>> in the filter clause.
>
> This sounds like a better direction. We probably need some kind of
> catalog information here to say what functions/operators are "safe" for
> this kind of purpose. There are a couple questions:
>

Not sure. It's true it's a bit like volatile/stable/immutable categories
where we can't guarantee those labels are correct, and it's up to the
user to keep the pieces if they pick the wrong category.

But we can achieve the same goal by introducing a simple GUC called
dangerous_allow_udf_in_decoding, I think.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-07-13 19:15:48 Re: row filtering for logical replication
Previous Message Ranier Vilela 2021-07-13 18:38:02 Re: printf %s with NULL pointer (was Re: BUG #17098: Assert failed on composing an error message when adding a type to an extension being dropped)