Re: Disable WAL logging to speed up data loading

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: osumi(dot)takamichi(at)fujitsu(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, tsunakawa(dot)takay(at)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Disable WAL logging to speed up data loading
Date: 2020-10-01 02:51:36
Message-ID: 20201001.115136.288898409051085426.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 1 Oct 2020 10:01:56 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>
>
> On 2020/09/30 12:10, osumi(dot)takamichi(at)fujitsu(dot)com wrote:
> > Hello, Ashutosh
> >
> >> Can they use a database with all unlogged tables?
> > Unfortunately, no. They want to switch a cluster condition to "without
> > WAL logging"
> > only when they execute night bulk loading for their data warehouse.
> > In other words, they would like to keep their other usual operations
> > with WAL.
> > In addition, using all tables as unlogged brings about
> > the risk to lose data warehouse's data caused by an unexpected server
> > crash or power outage.
>
> But the same issue can happen even in the proposed approach
> because Tsunakawa-san explains as follows?
>
> > The server refuses to start (pg_ctl start fails) after an abnormal
> > shutdown due to power outage, pg_ctl's immediate shutdown, etc,
> > showing a straightforward message like MySQL.

Exactly. Crash safety cannot be a rationale for this feature.

The proposed feature seems to give an advantage to a massive loading
onto a already very large table. Currently turning a large logged
table into an unlogged one runs a table copy that takes a long
time. After the bulk-loading, to put back the table huge amount of WAL
is emitted and a table-copy takes a long time. If we had the
wal_level=none feature, the preparation time is not needed and the
put-back time can be offloaded to online backing-up. The crash-unsafe
period is inevitable but would be shorten by using storage snapshots.
On the other hand, a massive loading to an empty table is efficiently
performed by wal_level=minimal.

As I mentioned in another thread, if ALTER TABLE LOGGED/UNLOGGED
doesn't need a table copy (consequently no insertion WAL are emitted),
it would work. I'm not which is better, or easier to realize.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Keisuke Kuroda 2020-10-01 02:51:54 Re: Logical replication CPU-bound with TRUNCATE/DROP/CREATE many tables
Previous Message Amit Kapila 2020-10-01 02:48:50 Re: [Patch] Optimize dropping of relation buffers using dlist