Re: Mark a transaction uncommittable

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Isaac Morland <isaac(dot)morland(at)gmail(dot)com>
Cc: Gurjeet Singh <gurjeet(at)singh(dot)im>, Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Mark a transaction uncommittable
Date: 2023-04-22 23:33:30
Message-ID: 20230422233330.hdzb5g44qppyc2ui@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Sat, Apr 22, 2023 at 12:53:23PM -0400, Isaac Morland wrote:
>
> I have an application for this: creating various dev/test versions of data
> from production.
>
> Start by restoring a copy of production from backup. Then successively
> create several altered versions of the data and save them to a place where
> developers can pick them up. For example, you might have one version which
> has all data old than 1 year deleted, and another where 99% of the
> students/customers/whatever are deleted. Anonymization could also be
> applied. This would give you realistic (because it ultimately originates
> from production) test data.
>
> This could be done by starting a non-committable transaction, making the
> adjustments, then doing a pg_dump in the same transaction (using --snapshot
> to allow it to see that transaction). Then rollback, and repeat for the
> other versions. This saves repeatedly restoring the (probably very large)
> production data each time.

There already are tools to handle those use cases. Looks for instance at
https://github.com/mla/pg_sample to backup a consistent subset of the data, or
https://github.com/rjuju/pg_anonymize to transparently pg_dump (or
interactively query) anonymized data.

Both tool also works when connected on a physical standby, while trying to
update data before dumping them wouldn't.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2023-04-23 01:09:14 Bufferless buffered files
Previous Message Andres Freund 2023-04-22 22:47:21 Re: Should we remove vacuum_defer_cleanup_age?