Re: xid wrap / optimize frozen tables?

From: Nils Goroll <slink(at)schokola(dot)de>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: xid wrap / optimize frozen tables?
Date: 2015-05-24 11:42:57
Message-ID: 5561B941.3080102@schokola.de
Lists: pgsql-hackers

Hi Jeff and all,

On 23/05/15 22:13, Jeff Janes wrote:
> Are you sure it is the read IO that causes the problem?

Yes. The trouble here is that we are talking about a 361 GB table:

                                   List of relations
 Schema |            Name             |   Type   |  Owner   |    Size    | Description
--------+-----------------------------+----------+----------+------------+-------------
 public | *redacted*_y2015m04         | table    | postgres | 361 GB     |

and while we have

shared_buffers = 325GB
huge_pages = on

this is not the only table of this size (total db size is 1.8 TB), and more recent
data got written to *redacted*_y2015m05 (the manually-partitioned table for May),
so most of the m04 data would have been evicted from the cache by the time this
issue first surfaced.
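
As a cross-check on that eviction assumption, cache residency per relation can be
inspected with the pg_buffercache extension; a sketch, not something I ran here
(the table name is the redacted placeholder, 8 kB block size assumed):

CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- how much of the m04 table currently sits in shared_buffers vs. its size on disk
SELECT c.relname,
       pg_size_pretty(count(*) * 8192)         AS cached,
       pg_size_pretty(pg_relation_size(c.oid)) AS on_disk
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database())
  AND c.relname = '*redacted*_y2015m04'
GROUP BY c.relname, c.oid;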

There is one application pushing data (bulk inserts), and we have transaction
rates for this app in a log. The moment the vacuum started, these rates dropped.
Unfortunately I cannot present helpful log excerpts here, as the autovacuum has
so far never finished (because the admin killed the db), so we have zero logging
about past autovac events.

At the moment, the application is shut down and the machine is only running the
vacs:

query_start | 2015-05-22 19:33:52.44334+02
waiting     | f
query       | autovacuum: VACUUM public.*redacted*_y2015m04 (to prevent wraparound)

query_start | 2015-05-22 19:34:02.46004+02
waiting     | f
query       | autovacuum: VACUUM ANALYZE public.*redacted*_y2015m05 (to prevent wraparound)
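
That is just pg_stat_activity in expanded display; roughly (not the exact
statement I used):

\x
SELECT query_start, waiting, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';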

so we know that any I/O must be caused by the vacs:

shell# uptime
13:33:33 up 1 day, 18:01, 2 users, load average: 5.75, 12.71, 8.43
shell# zpool iostat
                 capacity   operations   bandwidth
pool            alloc  free  read write  read write
--------------- ----- ----- ----- ----- ----- -----
tank1            358G 6.90T   872    55 15.1M 3.08M

Again, we know the I/O capacity is insufficient; the pool is on only 2 magnetic
disks at the moment, so an average read rate of 872 IOPS sustained over 42 hours
is actually not bad.
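
Not something measured here, but possibly relevant for interpreting these numbers:
each autovacuum worker is paced by the cost-based delay, and with stock settings
that works out to roughly vacuum_cost_limit / vacuum_cost_page_miss = 20 pages per
20 ms, i.e. ~8 MB/s of cache misses per worker (back-of-envelope, assuming the
defaults shown below), which is in the same ballpark as the ~15 MB/s we see for
the two workers:

SHOW vacuum_cost_limit;             -- 200 by default
SHOW vacuum_cost_page_miss;         -- 10 by default
SHOW autovacuum_vacuum_cost_delay;  -- 20ms by default
SHOW autovacuum_vacuum_cost_limit;  -- -1 by default (falls back to vacuum_cost_limit)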

> I don't know what happened to that, but there is another patch waiting for review and
> testing:
>
> https://commitfest.postgresql.org/5/221/

This is really interesting, thank you very much for the pointer.

Cheers, Nils
