Re: Complete data erasure

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: "asaba(dot)takanori(at)fujitsu(dot)com" <asaba(dot)takanori(at)fujitsu(dot)com>, "'pgsql-hackers(at)postgresql(dot)org'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Complete data erasure
Date: 2020-02-03 17:30:38
Message-ID: 20200203173038.wawcyw2rzmihj45z@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 03, 2020 at 09:07:09AM -0500, Stephen Frost wrote:
>Greetings,
>
>* Tomas Vondra (tomas(dot)vondra(at)2ndquadrant(dot)com) wrote:
>> On Tue, Jan 28, 2020 at 02:34:07PM -0500, Stephen Frost wrote:
>> >We certainly can't run external commands during transaction COMMIT, so
>> >this can't be part of a regular DROP TABLE.
>>
>> IMO the best solution would be that the DROP TABLE does everything as
>> usual, but instead of deleting the relfilenode it moves it to some sort
>> of queue. And then a background worker would "erase" these relfilenodes
>> outside the COMMIT.
>
>That sounds interesting, though I'm a bit worried that it's going to
>lead to the same kind of complications and difficulty that we have with
>deleted columns- anything that's working with the system tables will
>need to see this new "dropped but pending delete" flag. Would we also
>rename the table when this happens? Or change the schema it's in?
>Otherwise, your typical DROP IF EXISTS / CREATE could end up breaking.
>

That's not really what I meant - let me explain. When I said DROP TABLE
should do everything as usual, that includes catalog changes. I.e. after
the commit there would not be any remaining entries in system catalogs
or anything like that.

The only thing we'd do differently is that instead of unlinking the
relfilenode segments, we'd move the relfilenode to a persistent queue
(essentially a regular table used as a queue relfilenodes). The
background worker would watch the queue, and when it gets a new
relfilenode it'd "delete" the data and then remove the relfilenode from
the queue.

So essentially others would not be able to even see the (now dropped)
object, they could create new object with the same name etc.

I imagine we might provide a way to wait for the deletion to actually
complete (can't do that as part of the DROP TABLE, though), so that
people can be sure when the data is actually gone (for scripts etc.).
A simple function waiting for the queue to get empty might be enough, I
guess, but maybe not.

>> And yes, we need to do this in a way that works with replicas, i.e. we
>> need to WAL-log it somehow. And it should to be done in a way that works
>> when the replica is on a different type of filesystem.
>
>I agree it should go through WAL somehow (ideally without needing an
>actual zero'd or whatever page for every page in the relation), but why
>do we care about the filesystem on the replica? We don't have anything
>that's really filesystem specific in WAL replay today and I don't see
>this as needing to change that..
>

I think this depends on what our requirements are.

My assumption is that when you perform this "secure data erasure" on the
primary, you probably also want to erase the data on the replica. But if
the instances use different file systems (COW vs. non-COW, ...) the
exact thing that needs to happen may be different. Or maybe the replica
does not need to do anything, making it noop?

In which case we don't need to WAL-log the exact change for each page,
it might even be fine to not even WAL-log anything except for the final
removal from the queue. I mean, the data is worthless and not used by
anyone at this point, there's no point in replicating it ...

I haven't thought about this very hard. It's not clear what should
happen if we complete the erasure on primary, remove the relfilenode
from the queue, and then restart the replica before it finishes the
local erasure. The queue (if represented by a simple table) will be
replicated, so the replica will forget it still has work to do.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2020-02-03 17:41:56 Re: Portal->commandTag as an enum
Previous Message Andres Freund 2020-02-03 15:59:11 Re: [PoC] Non-volatile WAL buffer