RE: Disable WAL logging to speed up data loading

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Simon Riggs' <simon(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "masao(dot)fujii(at)oss(dot)nttdata(dot)com" <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "laurenz(dot)albe(at)cybertec(dot)at" <laurenz(dot)albe(at)cybertec(dot)at>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Disable WAL logging to speed up data loading
Date: 2021-01-01 03:51:26
Message-ID: TYAPR01MB2990E9EDA7F5CD0F88959B37FED50@TYAPR01MB2990.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
> Agreed, it is a footgun. -1 to commit the patch as-is.
>
> The patch to avoid WAL is simple but it is dangerous for both the user
> and the PostgreSQL project.
>
> In my experience, people will use this option and when it crashes and
> they lose their data, they will claim PostgreSQL broke and that they
> were not stupid enough to use this option. Data loss has always been
> the most serious error for PostgreSQL and our reputation for
> protecting data has been hard won; it can easily be lost in a moment
> of madness. Please consider how the headlines will read, bearing in
> mind past incidents and negative press. Yes, we did think of this
> feature already and rejected it.

Could you share the negative press that blamed Postgres for a user's misuse of a feature such as fsync?

> If we ever did allow such an option, it must contain these things (IMHO):
> * the option should be called "unsafe" or "allows_data_loss", not just
> "none" (anything less than "minimal" must be insufficient or
> unsafe...)

One idea I proposed is "wal_level = unrecoverable", because it clearly states the bad consequence and Oracle also uses UNRECOVERABLE as a data loading option (and I found it intuitive). But others here commented that "none" would be OK. I don't have a strong opinion on naming, as I think what's important is to warn the user in the documentation.

> * the option must be set in the control file and be part of the same
> patch, so users cannot easily edit things to hide their unsafe usage

The wal_level setting is stored in the control file as before. If the server crashes while wal_level is set to none, it refuses to start and reports that this is because wal_level was set to none, which is similar to how MySQL behaves. So, users cannot hide their misuse.
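
For illustration only, the startup behavior described above could be sketched roughly as follows. This is not code from the patch: ControlFile, StartupRefused, check_startup, and the clean_shutdown flag are hypothetical names chosen for the sketch, and the real check would live in the server's C startup code.

```python
from dataclasses import dataclass

# Hypothetical model of the behavior described above; none of these names
# come from the patch or from PostgreSQL's source.


@dataclass
class ControlFile:
    wal_level: str        # value recorded in the control file, e.g. "none" or "minimal"
    clean_shutdown: bool  # False means the server went down in a crash


class StartupRefused(Exception):
    """Raised when the server must not start."""


def check_startup(control: ControlFile) -> None:
    """Refuse to start after a crash that happened while WAL was disabled."""
    if control.wal_level == "none" and not control.clean_shutdown:
        raise StartupRefused(
            "cannot start: the server crashed while wal_level was set to none"
        )
    # Otherwise startup proceeds as usual (not modeled here).
```

Because the setting is read from the control file rather than from the configuration file, editing the configuration after the crash would not change the outcome of this check.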

> * we must not support the option of going down to "unsafe" and then
> back up again. It must be a one-way transition from "unsafe" to a
> higher level, so if people want to use this for temp reporting servers
> or initial loading, great, but they can't use it as a quick speed-up
> for databases containing needs-to-be-safe data. Possibly the state
> change might be "unsafe" -> "needs_backup" -> "minimal"... or some
> other way to signal to backup.

I'm afraid I don't have a clear picture of this. Could you elaborate? In any case, I think that could be an overreaction, and a prominent caution (like the one in this patch) would suffice.

Regards
Takayuki Tsunakawa
