Re: Rearchitecting for storage

From: Matthew Pounsett <matt(at)conundrum(dot)com>
To: Kenneth Marshall <ktm(at)rice(dot)edu>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Rearchitecting for storage
Date: 2019-07-18 20:09:07
Message-ID: CAAiTEH_VY_hPVBFNrkdD4YHHst5WJff9=gCt2cE3B0xeFV_2Hg@mail.gmail.com
Lists: pgsql-general

On Thu, 18 Jul 2019 at 13:34, Kenneth Marshall <ktm(at)rice(dot)edu> wrote:

> Hi Matt,
>

Hi! Thanks for your reply.

> Have you considered using VDO compression for tables that are less
> update-intensive? Using just compression you can get almost 4X size
> reduction. For a database, I would forgo the deduplication function.
> You can then use a non-compressed tablespace for the heavier I/O tables
> and indexes.
>

VDO is a Red Hat-only thing, isn't it? We're not running RHEL; we're on
Debian. Anyway, the bulk of the data (nearly 80%) is in a single table and
its indices: ~6TB for the table and ~12TB for its indices. Even if we
switched over to Red Hat, there's no value in compressing only the
lesser-used tables.

>
> > My understanding of the standard
> > upgrade process is that this requires that the data directory be smaller
> > than the free storage (so that there is room to hold two copies of the
> > data directory simultaneously).
>
> The link option with pg_upgrade does not require 2X the space, since it
> uses hard links instead of copying the files to the new cluster.
>
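For reference, a minimal sketch of why a hard link adds almost no storage:
two directory entries end up pointing at the same inode, so the data blocks
exist only once. The file names here are hypothetical stand-ins, not real
PostgreSQL relation files, and this only illustrates the filesystem
mechanism, not what pg_upgrade does internally.

```python
import os
import tempfile


def link_shares_storage(directory: str) -> bool:
    """Create a file plus a hard link to it and report whether both
    names refer to the same inode (i.e. the data is stored once)."""
    # Hypothetical names standing in for a file in the old and new cluster.
    original = os.path.join(directory, "old_cluster_relfile")
    linked = os.path.join(directory, "new_cluster_relfile")

    with open(original, "wb") as f:
        f.write(b"x" * 8192)

    # A hard link adds a second directory entry, not a second copy.
    os.link(original, linked)

    a, b = os.stat(original), os.stat(linked)
    return a.st_dev == b.st_dev and a.st_ino == b.st_ino and a.st_nlink == 2


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(link_shares_storage(tmp))
```

Hard links only work within a single filesystem, so this approach assumes
the old and new data directories live on the same one.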

That would likely keep the extra storage requirements small, but still
non-zero. Presumably there would be no need for an upgrade step at all if
nothing had to be rewritten. Is there any rule of thumb for making sure one
has enough space available for the upgrade? I suppose that would come
down to exactly what needs to be rewritten, in what order, etc., but the
pg_upgrade docs don't seem to have that detail. For example, since we've
got an ~18TB table (including its indices), if that needs to be rewritten
then we're still looking at requiring significant extra storage. Recent
experience suggests Postgres won't necessarily do things in the most
storage-efficient way: we just had a reindex on that database fail (in
single-user mode) because 17TB of free storage was not enough for the
database to grow into.
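
As a rough starting point, the "two copies" condition for a copy-mode
upgrade can at least be checked mechanically. The sketch below sums the
apparent size of a data directory and compares it with the free space on
the target filesystem. The function names and the 10% headroom are my own
assumptions, not figures from the pg_upgrade docs, and it says nothing
about how much space a link-mode upgrade or a later reindex would need.

```python
import os
import shutil


def dir_size_bytes(path: str) -> int:
    """Sum apparent file sizes under path, counting each inode once
    so that hard-linked files are not double-counted."""
    total = 0
    seen = set()
    for root, _dirs, files in os.walk(path):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            # st_size is the apparent size; sparse files may use less on disk.
            total += st.st_size
    return total


def copy_mode_fits(data_dir: str, target_fs: str,
                   headroom_pct: int = 10, free_bytes=None) -> bool:
    """Return True if a full copy of data_dir, plus an assumed headroom,
    fits in the free space of the filesystem holding target_fs."""
    needed = dir_size_bytes(data_dir) * (100 + headroom_pct) // 100
    if free_bytes is None:
        free_bytes = shutil.disk_usage(target_fs).free
    return needed <= free_bytes
```

Something like copy_mode_fits("/path/to/old/data", "/path/to/new/volume")
would then give a yes/no answer for the copy case; both paths are
placeholders.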
