PostgreSQL logical replication depends on WAL segments?

From: Josef Machytka <josef(dot)machytka(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: PostgreSQL logical replication depends on WAL segments?
Date: 2019-01-22 13:18:12
Message-ID: CAGvVEFvq_VM9LhYPeu+Uw__gEVvrBffGL=FO-88cZEp-35+arA@mail.gmail.com
Lists: pgsql-general

Hello, I already tried asking on Stack Overflow, but so far without success:
https://stackoverflow.com/questions/54292816/postgresql-logical-replication-depends-on-wal-segments

Could someone help me please?

****

I am successfully using logical replication between two PG 11 cloud VMs for
the latest data. But when I also tried to publish some older tables to
transfer data between the databases, I got a strange error about a missing
WAL segment.

These older partitions contain data that is 5-6 days old. I successfully
published them on the master and refreshed the subscription on the logical
replica. But now I am getting these strange error messages on the logical
replica:

2019-01-21 15:03:14.713 UTC [17203] LOG: logical replication table synchronization worker for subscription "mysubscription", table "mytable_20190115" has finished
2019-01-21 15:03:19.768 UTC [18877] LOG: logical replication apply worker for subscription "mysubscription" has started
2019-01-21 15:03:19.797 UTC [18877] ERROR: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000098E000000CB has already been removed
2019-01-21 15:03:19.799 UTC [29534] LOG: background worker "logical replication worker" (PID 18877) exited with exit code 1
2019-01-21 15:03:24.806 UTC [18910] LOG: logical replication apply worker for subscription "mysubscription" has started
2019-01-21 15:03:24.824 UTC [18911] LOG: logical replication table synchronization worker for subscription "mysubscription", table "mytable_20190116" has started
2019-01-21 15:03:24.831 UTC [18910] ERROR: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000098E000000CB has already been removed
2019-01-21 15:03:24.834 UTC [29534] LOG: background worker "logical replication worker" (PID 18910) exited with exit code 1
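
For reference, the segment name in the error can be decoded into a timeline
and a starting WAL position (LSN), which makes it easier to compare with
positions reported by the server. A minimal sketch in Python, assuming the
default 16 MB WAL segment size; the function name decode_wal_segment is
made up for illustration and is not part of PostgreSQL:

```python
def decode_wal_segment(name, segment_size=16 * 1024 * 1024):
    """Split a 24-hex-digit WAL segment file name into its parts.

    Returns (timeline, start_position, lsn_text). Assumes the default
    16 MB segment size unless another size is passed in.
    """
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name")

    # The name is three 8-digit hex fields: timeline, log id, segment number.
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    seg_no = int(name[16:24], 16)

    # Each log id covers 4 GB of WAL, split into equally sized segments.
    segments_per_log = 0x100000000 // segment_size
    if seg_no >= segments_per_log:
        raise ValueError("segment number too large for this segment size")

    # Byte position where this segment starts, and the same value as an LSN.
    start_position = log_id * 0x100000000 + seg_no * segment_size
    lsn_text = "%X/%X" % (start_position >> 32, start_position & 0xFFFFFFFF)
    return timeline, start_position, lsn_text


if __name__ == "__main__":
    # Segment name taken from the error message above.
    print(decode_wal_segment("000000010000098E000000CB"))
```

Under that assumption, the segment from the log starts at LSN 98E/CB000000
on timeline 1.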

This is confusing to me. I tried to find some information, but found
nothing about logical replication depending on WAL segments.

There is no streaming replication running on that particular master, and I
see these error messages on both the master and the replica, which are
connected only by logical replication.

Am I doing something wrong? Is there some special way to publish older
data? For newer and the latest data everything works without problems.

Of course, since I published about 20 tables, it took some time for the
replica to process all of them; currently it always processes 2 at a time.
But I still do not understand why it should depend on WAL segments... Thank
you very much.

I tried to unpublish and unsubscribe these older tables and then publish
and subscribe them again, but I still get the same error message for
exactly the same WAL segment number.

I unpublished and unsubscribed those problematic tables and the error
messages stopped, so they were definitely related to logical replication.
Could they be caused by a snapshot?

I also had another strange experience with WAL segment errors: my logical
replica had only a fairly small disk, and during all that fiddling I forgot
to check disk usage, so PostgreSQL on the logical replica crashed due to a
full disk. Since I use GCE, I just resized the root disk, and after
restarting the instance I had more space. But the missing WAL segment
errors related to logical replication also came back. My PostgreSQL log on
the replica is now full of repetitions of these 3 lines:

2019-01-22 09:47:14.408 UTC [1946] LOG: logical replication apply worker for subscription "mysubscription" has started
2019-01-22 09:47:14.429 UTC [1946] ERROR: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000099D0000007A has already been removed
2019-01-22 09:47:14.431 UTC [737] LOG: background worker "logical replication worker" (PID 1946) exited with exit code 1

Why does logical replication depend on some old WAL segments? Today's data
seems to work perfectly, although not all of today's WAL segments can still
be available on the logical master. But I am unable to publish older
data...

Thanks for your help.

Josef Machytka
