Re: Logical Replication Custom Column Expression

From: Stavros Koureas <koureasstavros(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Logical Replication Custom Column Expression
Date: 2022-11-23 07:53:54
Message-ID: CA+O1jk5VztsmtcUQAG-HLg7o6gbbJuOD7tNiyX4bguN0qOiF+w@mail.gmail.com
Lists: pgsql-hackers

This question is easy to answer.

Imagine a software company that sells an ERP product and also offers
reporting solutions. The ERP tables do not have this additional column.
Now the reporting department needs to consolidate the data from the
different databases (publishers) into one multitenant database.
With an ERP like NAV, or any other, you cannot propose changing the code
of every table, plus every function that uses it, just to add one
additional column. Even if that were possible, you could not work with
integers; you would need GUIDs, because the column value would have to be
predefined for each ERP. Joining on a GUID in the second phase, for
reporting, will definitely slow down performance.

In summary:

1. We cannot touch the underlying source (important)
2. A GUID identifier column will slow down reporting performance
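
For illustration only, here is a minimal sketch of the consolidated layout
being asked for, assuming a hypothetical "orders" table and a hypothetical
"tenant_dim" mapping table (neither name comes from this thread). It uses an
in-memory SQLite database purely to show the shape of the data; it is not
logical replication and not a proposed implementation. Each publisher's rows
are tagged with a small integer tenant id on the reporting side, and the
source tables are left unchanged.

```python
import sqlite3


def consolidate(publishers):
    """Load rows from several source systems into one multitenant table.

    publishers: dict mapping a tenant name to a list of (order_id, amount)
    rows, as they would exist in that tenant's unmodified source table.
    All table and column names here are hypothetical.
    """
    db = sqlite3.connect(":memory:")
    # Dimension table: maps the integer id to a name meaningful to the user.
    db.execute(
        "CREATE TABLE tenant_dim (tenant_id INTEGER PRIMARY KEY, tenant_name TEXT)"
    )
    # Consolidated table: the tenant id is added only on the reporting side.
    db.execute(
        "CREATE TABLE orders ("
        " tenant_id INTEGER, order_id INTEGER, amount REAL,"
        " PRIMARY KEY (tenant_id, order_id))"
    )
    for tenant_id, (name, rows) in enumerate(sorted(publishers.items()), start=1):
        db.execute("INSERT INTO tenant_dim VALUES (?, ?)", (tenant_id, name))
        db.executemany(
            "INSERT INTO orders VALUES (?, ?, ?)",
            [(tenant_id, order_id, amount) for order_id, amount in rows],
        )
    return db
```

Because the primary key is (tenant_id, order_id), two publishers can reuse
the same order_id without colliding, and reports can filter or join on the
integer tenant_id through the mapping table.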

On Wed, Nov 23, 2022 at 5:19 AM, Amit Kapila <
amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Wed, Nov 23, 2022 at 1:40 AM Stavros Koureas
> <koureasstavros(at)gmail(dot)com> wrote:
> >
> > Reading more carefully what you described, I think you are interested in
> getting something you call origin from publishers, probably some metadata
> from the publications.
> >
> > This identifier in those metadata maybe does not have business value on
> the reporting side. The idea is to use a value which has specific meaning
> to the user at the end.
> >
> > For example, assigning 1 for tenant 1, 2 for tenant 2 and so on. In the
> end, based on a dimension table which holds this mapping, the user would be
> able to filter the data. So programmatically the user can set the id value
> of the column and create the mapping table from, let’s say, an application,
> and be able to distinguish the data.
> >
>
> In your example, do different tenants represent different publisher
> nodes? If so, why can't we have a predefined column and value for the
> required tables on each publisher rather than logical replication
> generate that value while replicating data?
>
> --
> With Regards,
> Amit Kapila.
>
