Re: Added schema level support for publication.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Subject: Re: Added schema level support for publication.
Date: 2021-08-04 10:40:07
Message-ID: CAA4eK1+rsH7JQdNk-Gn2FnXPd=SpgAFArM6+mvdb-EcMXwNdBg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 3, 2021 at 8:38 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> Thanks for reporting this, this is fixed in the v18 patch attached.
>

I have started looking into this patch and below are some initial comments.

1.
+ /* Fetch publication name and schema oid from input list */
+ schemaname = strVal(linitial(object));
+ pubname = strVal(lsecond(object));

I think the comment should be: "Fetch schema name and publication name
from input list"

2.
@@ -3902,6 +3958,46 @@ getObjectDescription(const ObjectAddress
*object, bool missing_ok)
break;
}

+ case OCLASS_PUBLICATION_SCHEMA:
+ {
+ HeapTuple tup;
+ char *pubname;
+ Form_pg_publication_schema psform;
+ char *nspname;
+
+ tup = SearchSysCache1(PUBLICATIONSCHEMA,
+ ObjectIdGetDatum(object->objectId));
+ if (!HeapTupleIsValid(tup))
+ {
+ if (!missing_ok)
+ elog(ERROR, "cache lookup failed for publication schema %u",
+ object->objectId);
+ break;
+ }
+
+ psform = (Form_pg_publication_schema) GETSTRUCT(tup);
+ pubname = get_publication_name(psform->pspubid, false);
+ nspname = get_namespace_name(psform->psnspcid);
+ if (!nspname)
+ {
+ Oid psnspcid = psform->psnspcid;
+
+ pfree(pubname);
+ ReleaseSysCache(tup);
+ if (!missing_ok)
+ elog(ERROR, "cache lookup failed for schema %u",
+ psnspcid);
+ break;
+ }

The above code in getObjectDescription looks quite similar to what you
have in getObjectIdentityParts(). Can we extract the common code into
a separate function?

3. Can we use column name pubkind (similar to relkind in pg_class)
instead of pubtype? If so, please change PUBTYPE_ALLTABLES and similar
other defines to PUBKIND_*.

4.
@@ -3632,6 +3650,7 @@ typedef struct CreatePublicationStmt
List *options; /* List of DefElem nodes */
List *tables; /* Optional list of tables to add */
bool for_all_tables; /* Special publication for all tables in db */
+ List *schemas; /* Optional list of schemas */
} CreatePublicationStmt;

Isn't it better to keep a schemas list after tables?

5.
@@ -1163,12 +1168,27 @@ get_rel_sync_entry(PGOutputData *data, Oid relid)
Publication *pub = lfirst(lc);
bool publish = false;

- if (pub->alltables)
+ if (pub->pubtype == PUBTYPE_ALLTABLES)
{
publish = true;
if (pub->pubviaroot && am_partition)
publish_as_relid = llast_oid(get_partition_ancestors(relid));
}
+ else if (pub->pubtype == PUBTYPE_SCHEMA)
+ {
+ Oid schemaId = get_rel_namespace(relid);
+ Oid psid = GetSysCacheOid2(PUBLICATIONSCHEMAMAP,
+ Anum_pg_publication_schema_oid,
+ ObjectIdGetDatum(schemaId),
+ ObjectIdGetDatum(pub->oid));
+
+ if (OidIsValid(psid))
+ {
+ publish = true;
+ if (pub->pubviaroot && am_partition)
+ publish_as_relid = llast_oid(get_partition_ancestors(relid));
+ }
+ }

Isn't it better to get schema publications once as we get relation
publications via GetRelationPublications and then decide whether to
publish or not? I think that will save repeated cache searches for
each publication requested by the subscriber?

6.
+ {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
+ PublicationSchemaPsnspcidPspubidIndexId,
+ 2,
+ {
+ Anum_pg_publication_schema_psnspcid,
+ Anum_pg_publication_schema_pspubid,
+ 0,
+ 0
+ },

Why don't we keep pubid as the first column in this index?

7.
getPublicationSchemas()
{
..
+ /* Get the publication membership for the schema */
+ printfPQExpBuffer(query,
+ "SELECT ps.psnspcid, ps.oid, p.pubname, p.oid AS pubid "
+ "FROM pg_publication_schema ps, pg_publication p "
+ "WHERE ps.psnspcid = '%u' "
+ "AND p.oid = ps.pspubid",
+ nsinfo->dobj.catId.oid);
..
}

Why do you need to use join here? Why the query and another handling
in this function be similar to what we have in getPublicationTables?
Also, there is another function GetPublicationSchemas() in the patch,
can we name one of these differently for the purpose of easy
grepability?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-08-04 10:44:17 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message Matthias van de Meent 2021-08-04 10:30:30 Re: Lowering the ever-growing heap->pd_lower