Re: row filtering for logical replication

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-13 15:44:43
Message-ID: 15cdb1b63669b6e6d7a3dcd9a21f156aa1bea8fd.camel@j-davis.com
Lists: pgsql-hackers

On Tue, 2021-07-13 at 10:24 +0530, Amit Kapila wrote:
> to do. AFAIU, the main things we want to prohibit in the filter are:
> (a) it doesn't refer to any relation other than catalog in where
> clause,

Right, because the walsender is using a historical snapshot.

> (b) it doesn't use UDFs in any way (in expressions, in
> user-defined operators, user-defined types, etc.),

Is this a reasonable requirement? Postgres has a long history of
allowing UDFs nearly everywhere that a built-in is allowed. It feels
wrong to make built-ins special for this feature.

> (c) the columns
> referred to in the filter should be part of PK or Replica Identity.

Why?

Also:

* Andres also mentioned that the function should not leak memory.
* One use case for this feature is when sharding a table, so the
expression should allow things like "hashint8(x) between ...". I'd
really like to see this problem solved, as well.
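
To make the sharding case concrete, here is a rough sketch of what a
hash-range filter does. It is only an illustration: stand_in_hash() is a
CRC32-based stand-in and does NOT reproduce PostgreSQL's hashint8, and
shard_ranges() and row_passes_filter() are hypothetical helper names. Each
publication would carry one "hash(x) between lo and hi" range, and every
row must match exactly one of them.

```python
import struct
import zlib
from typing import List, Tuple

HASH_SPACE = 2**32  # zlib.crc32 returns values in [0, 2**32)


def stand_in_hash(x: int) -> int:
    """Deterministic stand-in hash for a 64-bit integer key.

    Assumption: this is NOT PostgreSQL's hashint8; it only plays the
    same role (same input always gives the same output).
    """
    return zlib.crc32(struct.pack("<q", x))


def shard_ranges(n_shards: int) -> List[Tuple[int, int]]:
    """Split the hash space into n_shards contiguous inclusive ranges."""
    bounds = [HASH_SPACE * i // n_shards for i in range(n_shards + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(n_shards)]


def row_passes_filter(x: int, lo: int, hi: int) -> bool:
    """Equivalent of the predicate 'hash(x) BETWEEN lo AND hi'."""
    return lo <= stand_in_hash(x) <= hi


if __name__ == "__main__":
    ranges = shard_ranges(4)
    for key in (1, 2, 3, 42):
        # Exactly one range matches each key.
        shard = [i for i, (lo, hi) in enumerate(ranges)
                 if row_passes_filter(key, lo, hi)]
        print(key, "-> shard", shard[0])
```

The filter is only useful if the hash function gives the same answer every
time it is evaluated, which is why the "safety" of the function matters.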

> I think in the long run one idea to allow UDFs is probably by
> explicitly allowing users to specify whether the function is
> publication predicate safe and if so, then we can allow such
> functions
> in the filter clause.

This sounds like a better direction. We probably need some kind of
catalog information here to say what functions/operators are "safe" for
this kind of purpose. There are a couple of questions:

1. Should this notion of safety be specific to this feature, or should
we try to generalize it so that other areas of the system might benefit
as well?

2. Should this marking be superuser-only, or user-specified?

3. Should it be related to the IMMUTABLE/STABLE/VOLATILE designation,
or completely separate?
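
As a rough sketch of what checking against such a marking could look
like: walk the filter expression, collect every function it calls, and
reject the filter if any of them is not marked safe. Everything here is
hypothetical. The tuple-based expression nodes, the function names, and
the "safe" set (standing in for whatever catalog information is chosen)
are assumptions for illustration, not an existing PostgreSQL interface.

```python
from typing import Set

# Hypothetical expression nodes, for illustration only:
#   ("col", name)            a column reference
#   ("const", value)         a literal
#   ("func", name, [args])   a function or operator call


def functions_used(expr) -> Set[str]:
    """Return the names of all functions called anywhere in expr."""
    kind = expr[0]
    if kind in ("col", "const"):
        return set()
    if kind == "func":
        _, name, args = expr
        used = {name}
        for arg in args:
            used |= functions_used(arg)
        return used
    raise ValueError(f"unknown node kind: {kind!r}")


def unsafe_functions(expr, safe: Set[str]) -> Set[str]:
    """Functions in expr that are not in the (hypothetical) safe set."""
    return functions_used(expr) - safe


if __name__ == "__main__":
    # Stand-in for catalog information marking functions as safe.
    safe = {"between", "hashint8"}
    ok_filter = ("func", "between",
                 [("func", "hashint8", [("col", "x")]),
                  ("const", 0), ("const", 1000)])
    bad_filter = ("func", "my_udf", [("col", "x")])
    print(unsafe_functions(ok_filter, safe))   # empty set: filter accepted
    print(unsafe_functions(bad_filter, safe))  # names the unmarked function
```

This treats built-ins and UDFs the same way; who is allowed to add a
function to the safe set is exactly question 2 above.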

Regards,
Jeff Davis
