Re: Disable WAL logging to speed up data loading

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disable WAL logging to speed up data loading
Date: 2020-09-29 12:40:42
Message-ID: CAExHW5vz2pTHyoE4uetasuA6G5f03B4twmxgpK2mcPGu-L5VSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Can they use a database with all unlogged tables?

On Tue, Sep 29, 2020 at 1:58 PM tsunakawa(dot)takay(at)fujitsu(dot)com
<tsunakawa(dot)takay(at)fujitsu(dot)com> wrote:
>
> Hello,
>
>
> We'd like to propose a feature to disable WAL to speed up data loading. This was inspired by a feature added in the latest MySQL. I wish you won't fear this feature...
>
>
> BACKGROUND
> ========================================
>
> This branches off from [1] as mentioned therein. Briefly speaking, a customer wants to shorten the time for nightly loading of data into their data warehouse as much as possible to be prepared for using the data warehouse for new things.
>
> Currently, they are using Oracle's SQL*Loader with its no-logging feature. They want a similar feature to migrate to Postgres. Other than the data loading performance, they don't want to be concerned about the storage for large volumes of WAL.
>
> In [1], we thought about something like Oracle's per-table no-logging feature, but it seems difficult (or at least not easy.) Meanwhile, I found another feature added in the latest MySQL 8.0.21 [2]. This proposal follows it almost directly. That satisfies the customer request.
>
> As an aside, it's also conceivable that in the near future, users could see the WAL bottleneck (WAL buffer or disk) when they utilize the parallel COPY that is being developed in the community.
>
>
> FUNCTIONAL SPECIFICATION
> ========================================
>
> Add a new value 'none' to the server configuration parameter wal_level. With this setting:
>
> * No WAL is emitted.
>
> * The server refuses to start (pg_ctl start fails) after an abnormal shutdown due to power outage, pg_ctl's immediate shutdown, etc, showing a straightforward message like MySQL.
>
> * Features like continuous archiving, pg_basebackup, and streaming/logical replication that requires wal_level >= replica are not available.
>
> * The user can use all features again if you shut down the server successfully after data loading and reset wal_level to a value other than none. He needs to take a base backup or rebuild the replication standby after restarting the server.
>
>
> In addition to the cosmetic modifications to the manual articles that refer to wal_level, add a clause or paragraphs to the following sections to let users know the availability of this feature.
>
> 14.4. Populating a Database
> 18.6.1. Upgrading Data via pg_dumpall
>
>
> PROGRAM DESIGN (main point only)
> ========================================
>
> As in the bootstrap mode (during initdb), when wal_level = none, XLogInsert() does nothing and just returns a fixed value, which is the tail of the last shutdown checkpoint WAL record. As a result, the value is set to the relation page header's LSN field.
>
> In addition, it might be worth having XLogBeginInsert() and XLogRec...() to check wal_level and just return. I don't expect much from this, but it may be interesting to give it a try to see the squeezed performance.
>
> StartupXLOG() checks the wal_level setting in pg_control and quits the startup with ereport(FATAL) accordingly.
>
>
> [1]
> Implement UNLOGGED clause for COPY FROM
> https://www.postgresql.org/message-id/OSBPR01MB488887C0BDC5129C65DFC5E5ED640@OSBPR01MB4888.jpnprd01.prod.outlook.com
>
> [2]
> Disabling Redo Logging
> https://dev.mysql.com/doc/refman/8.0/en/innodb-redo-log.html#innodb-disable-redo-logging
>
>
> Regards
> Takayuki Tsunakawa
>
>
>

--
Best Wishes,
Ashutosh Bapat

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-09-29 13:00:34 Re: Parallel copy
Previous Message Heikki Linnakangas 2020-09-29 12:00:13 Re: Corner-case bug in pg_rewind