Re: Make drop database safer

From: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alexandra Wang <lewang(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Make drop database safer
Date: 2019-03-06 19:45:21
Message-ID: CALfoeit-okEabGKJV_CSwN8J1DjY2qLuniVKGPTjX=-rD2xwYA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 6, 2019 at 7:56 AM Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

>
>
> On 2/12/19 12:55 AM, Ashwin Agrawal wrote:
> >
> > Thanks for the response and inputs.
> >
> > On Sat, Feb 9, 2019 at 4:51 AM Andres Freund <andres(at)anarazel(dot)de
> > <mailto:andres(at)anarazel(dot)de>> wrote:
> >
> > Hi,
> >
> > On 2019-02-08 16:36:13 -0800, Alexandra Wang wrote:
> > > Current sequence of operations for drop database (dropdb())
> > > 1. Start Transaction
> > > 2. Make catalog changes
> > > 3. Drop database buffers
> > > 4. Forget database fsync requests
> > > 5. Checkpoint
> > > 6. Delete database directory
> > > 7. Commit Transaction
> > >
> > > Problem
> > > This sequence is unsafe from couple of fronts. Like if drop
> database,
> > > aborts (means system can crash/shutdown can happen) right after
> > buffers are
> > > dropped step 3 or step 4. The database will still exist and fully
> > > accessible but will loose the data from the dirty buffers. This
> > seems very
> > > bad.
> > >
> > > Operation can abort after step 5 as well in which can the entries
> > remain in
> > > catalog but the database is not accessible. Which is bad as well
> > but not as
> > > severe as above case mentioned, where it exists but some stuff
> goes
> > > magically missing.
> > >
> > > Repo:
> > > ```
> > > CREATE DATABASE test;
> > > \c test
> > > CREATE TABLE t1(a int); CREATE TABLE t2(a int); CREATE TABLE t3(a
> > int);
> > > \c postgres
> > > DROP DATABASE test; <<====== kill the session after
> > DropDatabaseBuffers()
> > > (make sure to issue checkpoint before killing the session)
> > > ```
> > >
> > > Proposed ways to fix
> > > 1. CommitTransactionCommand() right after step 2. This makes it
> > fully safe
> > > as the catalog will have the database dropped. Files may still
> > exist on
> > > disk in some cases which is okay. This also makes it consistent
> > with the
> > > approach used in movedb().
> >
> > To me this seems bad. The current failure mode obviously isn't good,
> but
> > the data obviously isn't valuable, and just loosing track of an
> entire
> > database worth of data seems worse.
> >
> >
> > So, based on that response seems not loosing track to the files
> > associated with the database is design choice we wish to achieve. Hence
> > catalog having entry but data directory being deleted is fine behavior
> > to have and doesn't need to be solved.
> >
>
> What about adding 'is dropped' flag to pg_database, set it to true at
> the beginning of DROP DATABASE and commit? And ensure no one can connect
> to such database, making DROP DATABASE the only allowed operation?
>
> ISTM we could then continue doing the same thing we do today, without
> any extra checkpoints etc.
>

Sure, adding 'is dropped' column flag to pg_database and committing that
update before dropping database buffers can give us the safety and also
allows to keep the link. But seems very heavy duty way to gain the desired
behavior with extra column in pg_database catalog table specifically just
to protect against this failure window. If this solution gets at least one
more vote as okay route to take, I am fine implementing it.

I am surprised though that keeping the link to database worth of data and
not losing track of it is preferred for dropdb() but not cared for in
movedb(). In movedb(), transaction commits first and then old database
directory is deleted, which was the first solution proposed for dropdb().

FWIW I don't recall why exactly we need the checkpoints, except perhaps
> to ensure the file copies see the most recent data (in CREATE DATABASE)
> and evict stuff for the to-be-dropped database from shared bufers. I
> wonder if we could do that without a checkpoint somehow ...
>

Checkpoint during CREATE DATABASE is understandable. But yes during
dropdb() seems unnecessary. Only rational seems for windows based on
comment in code "On Windows, this also ensures that background procs don't
hold any open files, which would cause rmdir() to fail." I think we can
avoid the checkpoint for all other platforms in dropdb() except windows.
Even for windows if have way to easily ensure no background procs have open
files, without checkpoint then can be avoided even for it.

> Considering the design choice we must meet, seems approach 2, moving
> > Checkpoint from step 5 before step 3 would give us the safety desired
> > and retain the desired link to the database till we actually delete the
> > files for it.
> >
>
> Ummm? That essentially means this order:
>
> 1. Start Transaction
> 2. Make catalog changes
> 5. Checkpoint
> 3. Drop database buffers
> 4. Forget database fsync requests
> 6. Delete database directory
> 7. Commit Transaction
>
> I don't see how that actually fixes any of the issues? Can you explain?
>

Since checkpoint is performed, all the dirty buffers make to the disk and
avoid loosing the data for this database if command fails after this and
database doesn't get dropped, problem we started this thread with. Drop
database buffers step after it will just be removing read-only buffers.
Forget database fsync requests step ideally can be removed as not needed
with that sequence.

> Not to mention we might end up doing quite a bit of I/O to checkpoint
> buffers from the database that is going to disappear shortly ...
>

Yes, that's the downside it comes with.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2019-03-06 19:50:57 Re: Optimization of some jsonb functions
Previous Message Tomas Vondra 2019-03-06 19:43:25 Re: performance issue in remove_from_unowned_list()