Re: Anonymized database dumps

From: Kiriakos Georgiou <kg(dot)postgresql(at)olympiakos(dot)com>
To: Janning Vygen <vygen(at)kicktipp(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Anonymized database dumps
Date: 2012-03-19 05:24:52
Message-ID: 17951721-E0B8-4CEA-808F-E837FF6C7443@olympiakos.com
Lists: pgsql-general

I would store sensitive data encrypted in the database. Check the pgcrypto module.

Kiriakos

On Mar 18, 2012, at 1:00 PM, Janning Vygen wrote:

> Hi,
>
> I am working on PostgreSQL 9.1 and loving it!
>
> Sometimes we need a full database dump to test some performance issues with real data.
>
> Of course we don't want sensitive data such as large numbers of e-mail addresses on our development machines; it is of no interest to developers and should be kept secure.
>
> So we need an anonymized database dump. I thought about a few ways to achieve this.
>
> 1. The best solution would be a special db user plus some rules that fire on reads of certain tables and replace private data with random data. A dump taken as this special user would then never copy the sensitive data at all. The user simply has a different view of the database, even when he calls pg_dump.
>
> But as rules are not fired on COPY it can't work, right?
>
> 2. The other solution I can think of is something like
>
> pg_dump | sed > pgdump_anon
>
> where 'sed' does a lot of magical replace operations on the content of the dump. I don't think this is going to work reliably.
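
For what it's worth, the filter idea gets less fragile if it only rewrites fields inside the COPY blocks of a plain-text dump instead of doing blind text replacement. Below is a minimal sketch in Python, not a tested tool. It assumes the default plain-text pg_dump format, where each table's data follows a "COPY ... FROM stdin;" header as tab-separated rows and ends with a "\." line. The table name "users", the column name "email", and the helper names are made up for illustration.

```python
import hashlib
import re

# Hypothetical mapping: table name -> column names to pseudonymize.
SENSITIVE = {"users": {"email"}}

# Header that precedes a table's rows in a plain-text dump, for example
# "COPY users (id, email, name) FROM stdin;" (schema prefix optional).
COPY_HEADER = re.compile(r'^COPY (?:[\w"]+\.)?"?(\w+)"? \((.*)\) FROM stdin;$')


def pseudonym(value):
    """Map a value to a stable fake address so uniqueness and joins survive."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
    return "user_" + digest + "@example.invalid"


def anonymize(lines):
    """Yield dump lines, rewriting sensitive columns inside COPY blocks."""
    targets = None  # column indexes to rewrite; None means outside a COPY block
    for line in lines:
        text = line.rstrip("\n")
        if targets is None:
            match = COPY_HEADER.match(text)
            if match:
                columns = [c.strip().strip('"') for c in match.group(2).split(",")]
                wanted = SENSITIVE.get(match.group(1), set())
                targets = [i for i, c in enumerate(columns) if c in wanted]
            yield line
        elif text == "\\.":
            # End-of-data marker for the current COPY block.
            targets = None
            yield line
        else:
            # COPY text format escapes tabs inside values, so a plain split is safe.
            fields = text.split("\t")
            for i in targets:
                if fields[i] != "\\N":  # leave NULLs untouched
                    fields[i] = pseudonym(fields[i])
            yield "\t".join(fields) + "\n"
```

To use it as a filter in place of sed, one would feed it sys.stdin and write the result with sys.stdout.writelines(anonymize(sys.stdin)). Hashing keeps the same input mapped to the same output, which preserves unique constraints, but an unsalted hash of a guessable address can be reversed by brute force, so a secret salt would be advisable for real use.
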
>
> 3. More reliable would be to dump the database, restore it on a different server, run an SQL script that randomizes some data, and dump it again. Hmm, that seems to be the only reliable way so far. But it is no fun when dumping and restoring takes an hour.
>
> Does anybody have a better idea how to achieve an anonymized database dump?
>
> regards
> Janning
>
> --
> Kicktipp GmbH
>
> Venloer Straße 8, 40477 Düsseldorf
> Sitz der Gesellschaft: Düsseldorf
> Geschäftsführung: Janning Vygen
> Handelsregister Düsseldorf: HRB 55639
>
> http://www.kicktipp.de/
>
