Re: BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica

From:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To:	"federico(at)brandwatch(dot)com" <federico(at)brandwatch(dot)com>
Cc:	"pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject:	Re: BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica
Date:	2013-06-02 00:17:44
Message-ID:	CAMkU=1x2zB2c8w=+nwQqXUBE0KvpYdJJ1iPYtD5ko12aP029Yw@mail.gmail.com
Lists:	pgsql-bugs

On Thursday, May 30, 2013, wrote:

> The following bug has been logged on the website:
>
> Bug reference: 8192
> Logged by: Federico Campoli
> Email address: federico(at)brandwatch(dot)com
> PostgreSQL version: 9.2.4
> Operating system: Debian 6.0
> Description:
>
> /*
>
> Description:
>
> It seems that on very large tables a concurrent update with vacuum (or
> autovacuum), when the slave is in hot standby mode, generates long read
> loops on a single WAL segment during the recovery process.
>
> This has two nasty effects: a massive read IO peak, and increasing replay
> lag as the recovery process hangs for long periods in a pointless loop.
>

Are you observing a loop, and if so how are you observing it? What is it
that is looping?

> SET client_min_messages='debug2';
> SET trace_sort='on';
>

Are these settings useful? What are they showing you?

>
> --in a new session, start a huge table update
> UPDATE t_vacuum set ts_time=now() WHERE i_id_row<20000000;
>
> --then vacuum the table
> VACUUM VERBOSE t_vacuum;
>

Are you running the update and vacuum concurrently or serially?

>
> --at some point the startup process will get stuck recovering a single
> --WAL file, and the DISK READ column will show huge IO for a while.
>

What is huge?

I don't know if I can reproduce this or not. I certainly get spiky lag,
but I see no reason to think it is anything other than IO congestion,
occurring during stretches of WAL records where compact records describe a
larger amount of work that needs to be done in terms of poorly-cached IO.
Perhaps the kernel's read-ahead mechanism is not working as well as it
theoretically could be. Also the standby isn't using a ring-buffer
strategy, but I see no reason to think it would help were it to do so.

The DISK READ column is not what I would call huge during this, often 10-15
MB/s, because much of the IO is scattered rather than sequential. The IO
wait % on the other hand is maxed out.

It is hard to consider it as a bug that the performance is not as high as
one might wish it to be. Is this behavior a regression from some earlier
version? What if hot-standby is turned off?

Cheers,

Jeff

In response to

BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica at 2013-05-30 16:43:25 from federico

Responses

Re: BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica at 2013-06-04 12:57:58 from Federico Campoli
