Re: CDC/ETL system on top of logical replication with pgoutput, custom client

From: Andres Freund <andres(at)anarazel(dot)de>
To: José Neves <rafaneves3(at)msn(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CDC/ETL system on top of logical replication with pgoutput, custom client
Date: 2023-07-31 20:39:27
Message-ID: 20230731203927.6nurjhw5ltisfhc2@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-07-31 14:16:22 +0000, José Neves wrote:
> Hi Amit, thanks for the reply.
>
> In our worker (custom pg replication client), we care only about INSERT,
> UPDATE, and DELETE operations, which - sure - may be part of the issue.

That seems likely. Postgres streams out changes in commit order, not in order
of the changes having been made (that'd not work due to rollbacks etc). If you
just disregard transactions entirely, you'll get something bogus after
retries.

You don't need to store the details for each commit in the target system, just
up to which LSN you have processed *commit records*. E.g. if you have received
and safely stored up to commit 0/1000, you need to remember that.
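
As a rough illustration of the bookkeeping described above, here is a minimal sketch in Python. It assumes non-streaming mode, where each transaction arrives whole between a BEGIN and a COMMIT message. The names `CommitTracker`, `on_begin`, `on_change`, `on_commit` and `parse_lsn` are hypothetical and not part of any PostgreSQL client library, and the `applied` list only stands in for the target system.

```python
from dataclasses import dataclass, field


def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '0/1000' into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


@dataclass
class CommitTracker:
    """Buffers the changes of one transaction and applies them only at commit."""

    last_commit_lsn: int = 0                       # highest commit LSN safely stored
    applied: list = field(default_factory=list)    # stand-in for the target system
    _pending: list = field(default_factory=list)   # changes of the open transaction

    def on_begin(self) -> None:
        self._pending = []

    def on_change(self, change: str) -> None:
        # INSERT/UPDATE/DELETE are only buffered here, never applied directly.
        self._pending.append(change)

    def on_commit(self, commit_lsn: str) -> bool:
        lsn = parse_lsn(commit_lsn)
        if lsn <= self.last_commit_lsn:
            # Replayed after a reconnect: this transaction was already stored.
            self._pending = []
            return False
        # Assumption: a real target stores the changes and last_commit_lsn
        # atomically, so that a crash cannot separate the two.
        self.applied.extend(self._pending)
        self.last_commit_lsn = lsn
        self._pending = []
        return True
```

With this shape, a transaction that is sent again after a retry is dropped at its commit record instead of being applied twice, and only one LSN has to be remembered rather than details for each commit.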

Are you using the 'streaming' mode / option to pgoutput?
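
For reference, the option is requested in the START_REPLICATION command. The sketch below only builds the command text; the function name, slot name and publication name are hypothetical, and it assumes a server new enough to accept `streaming 'on'`, which requires `proto_version` 2 or higher.

```python
def start_replication_command(slot: str, publication: str, streaming: bool) -> str:
    """Build a START_REPLICATION command for pgoutput (sketch, no escaping)."""
    # Streaming of in-progress transactions needs protocol version 2 or higher.
    options = [
        ("proto_version", "2" if streaming else "1"),
        ("publication_names", publication),
    ]
    if streaming:
        options.append(("streaming", "on"))
    # Assumption: slot and publication are plain identifiers needing no quoting.
    rendered = ", ".join(f"{key} '{value}'" for key, value in options)
    return f"START_REPLICATION SLOT {slot} LOGICAL 0/0 ({rendered})"


print(start_replication_command("my_slot", "my_pub", streaming=False))
```

If streaming is on, changes of large in-progress transactions can arrive before their commit, so a client has to handle the additional stream messages rather than only BEGIN and COMMIT.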

> 1. We have no way to match LSN operations with the respective commit, as
> they have unordered offsets.

Not sure what you mean by "unordered offsets"?

> Assuming that all of them were received in order, we would commit all data
> with the commit message LSN4-4000 as other events would match the
> transaction start and end LSN interval of it.

Logical decoding sends out changes in a deterministic order and you won't see
out of order data when using TCP (the entire connection can obviously fail
though).

Andres
