Re: Logical replication existing data copy

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Logical replication existing data copy
Date: 2017-03-01 19:16:28
Message-ID: 87b4b93e-7632-2ed5-9fa9-9acfebd68013@2ndquadrant.com
Lists: pgsql-hackers

On 28/02/17 20:36, Erik Rijkers wrote:
> This is the most frequent error that happens while doing pgbench-runs
> over logical replication: I run it continuously all day, and every few
> hours an error occurs of the kind seen below: a table (pgbench_history,
> mostly) ends up 1 row short (673466 instead of 673467). I have the
> script wait a long time before calling it an error, because in theory it
> could still 'finish' and end successfully (although that has not
> happened yet once the system got into this state).
>

Yeah, it's unlikely if it's just one row. It suggests there might still
be some snapbuild issue, but I don't immediately see one, and I ran a
couple thousand repeats of the test without getting any inconsistency.
Will continue digging.

>
> I gathered some info in this (probably deadlocked) state in the hope
> there is something suspicious in there:
>

Hmm, that didn't really reveal much :(

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
