Skip site navigation (1) Skip section navigation (2)

PostgreSQL db, 30 tables with number of rows < 100 (not huge) - the fastest way to clean each non-empty table and reset unique identifier column of empty ones.

From: Stanislaw Pankevich <s(dot)pankevich(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: PostgreSQL db, 30 tables with number of rows < 100 (not huge) - the fastest way to clean each non-empty table and reset unique identifier column of empty ones.
Date: 2012-07-03 15:22:43
Message-ID: CAFXpGYbgmZYij4TgCbOF24-usoiDD0ASQeaVAkYtB7E2TYm8Wg@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hello,

My question below is almost exact copy of the on on SO:
http://stackoverflow.com/questions/11311079/postgresql-db-30-tables-with-number-of-rows-100-not-huge-the-fastest-way

The post on SO caused a few answers, all as one stating "DO ONLY TRUNCATION
- this is the fast".

Also I think I've met some amount of misunderstanding of what exactly do I
want. I would appreciate it great, if you try, as people whom I may trust
in performance question.

Here goes the SO subject, formulating exact task I want to accomplish, this
procedure is intended to be run beetween after or before each test, ensure
database is cleaned enough and has reset unique identifiers column (User.id
of the first User should be nor the number left from previous test in a
test suite but 1). Here goes the message:

==== PostgreSQL db, 30 tables with number of rows < 100 (not huge) - the
fastest way to clean each non-empty table and reset unique identifier
column of empty ones ====

I wonder, what is the fastest way to accomplish this kind of task in
PostgreSQL. I am interested in the fastest solutions ever possible.

I found myself such kind of solution for MySQL, it performs much faster
than just truncation of tables one by one. But anyway, I am interested in
the fastest solutions for MySQL too. See my result here, of course it it
for MySQL only: https://github.com/bmabey/database_cleaner/issues/126

I have following assumptions:

    I have 30-100 tables. Let them be 30.

    Half of the tables are empty.

    Each non-empty table has, say, no more than 100 rows. By this I mean,
tables are NOT large.

    I need an optional possibility to exclude 2 or 5 or N tables from this
procedure.

    I cannot! use transactions.

I need the fastest cleaning strategy for such case working on PostgreSQL
both 8 and 9.

I see the following approaches:

1) Truncate each table. It is too slow, I think, especially for empty
tables.

2) Check each table for emptiness by more faster method, and then if it is
empty reset its unique identifier column (analog of AUTO_INCREMENT in
MySQL) to initial state (1), i.e to restore its last_value from sequence
(the same AUTO_INCREMENT analog) back to 1, otherwise run truncate on it.

I use Ruby code to iterate through all tables, calling code below on each
of them, I tried to setup SQL code running against each table like:

DO $$DECLARE r record;
BEGIN
  somehow_captured = SELECT last_value from #{table}_id_seq
  IF (somehow_captured == 1) THEN
    == restore initial unique identifier column value here ==
  END

  IF (somehow_captured > 1) THEN
    TRUNCATE TABLE #{table};
  END IF;
END$$;

Manipulating this code in various aspects, I couldn't make it work, because
of I am unfamiliar with PostgreSQL functions and blocks (and variables).

Also my guess was that EXISTS(SELECT something FROM TABLE) could somehow be
used to work good as one of the "check procedure" units, cleaning procedure
should consist of, but haven't accomplished it too.

I would appreciate any hints on how this procedure could be accomplished in
PostgreSQL native way.

Thanks!

UPDATE:

I need all this to run unit and integration tests for Ruby or Ruby on Rails
projects. Each test should have a clean DB before it runs, or to do a
cleanup after itself (so called teardown). Transactions are very good, but
they become unusable when running tests against particular webdrivers, in
my case the switch to truncation strategy is needed. Once I updated that
with reference to RoR, please do not post here the answers about
"Obviously, you need DatabaseCleaner for PG" and so on and so on.

==== post ends ====

Thanks,

Stanislaw.

Responses

pgsql-performance by date

Next:From: PVDate: 2012-07-03 15:43:54
Subject: Re: static virtual columns as result?
Previous:From: Kevin GrittnerDate: 2012-07-03 13:44:50
Subject: Re: static virtual columns as result?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group