Re: PostgreSQL db, 30 tables with number of rows < 100 (not huge) - the fastest way to clean each non-empty table and reset unique identifier column of empty ones.

From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Stanislaw Pankevich *EXTERN*" <s(dot)pankevich(at)gmail(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PostgreSQL db, 30 tables with number of rows < 100 (not huge) - the fastest way to clean each non-empty table and reset unique identifier column of empty ones.
Date: 2012-07-06 13:39:03
Message-ID: D960CB61B694CF459DCFB4B0128514C2081BF034@exadv11.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Stanislaw Pankevich wrote:
> ==== PostgreSQL db, 30 tables with number of rows < 100 (not huge) - the fastest way to clean each
> non-empty table and reset unique identifier column of empty ones ====
>
> I wonder, what is the fastest way to accomplish this kind of task in PostgreSQL. I am interested in
> the fastest solutions ever possible.

> I have following assumptions:
>
> I have 30-100 tables. Let them be 30.
>
> Half of the tables are empty.
>
> Each non-empty table has, say, no more than 100 rows. By this I mean, tables are NOT large.
>
> I need an optional possibility to exclude 2 or 5 or N tables from this procedure.
>
> I cannot! use transactions.

Why? That would definitely speed up everything.

> I need the fastest cleaning strategy for such case working on PostgreSQL both 8 and 9.
>
> I see the following approaches:
>
> 1) Truncate each table. It is too slow, I think, especially for empty tables.

Did you actually try it? That's the king's way to performance questions!
Truncating a single table is done in a matter of microseconds, particularly
if it is not big.
Do you have tens of thousands of tables?

> 2) Check each table for emptiness by more faster method, and then if it is empty reset its unique
> identifier column (analog of AUTO_INCREMENT in MySQL) to initial state (1), i.e to restore its
> last_value from sequence (the same AUTO_INCREMENT analog) back to 1, otherwise run truncate on it.

That seems fragile an won't work everywhere.

What if the table has no primary key with a DEFAULT that uses a sequence?
What if it has such a key, but the DEFAULT was not used for an INSERT?
What if somebody manually reset the sequence?

Besides, how do you find out what the sequence for a table's primary key
is? With a SELECT, I guess. That SELECT is probably not faster than
a simple TRUNCATE.

> Also my guess was that EXISTS(SELECT something FROM TABLE) could somehow be used to work good as one
> of the "check procedure" units, cleaning procedure should consist of, but haven't accomplished it too.

You could of course run a SELECT 1 FROM table LIMIT 1, but again I don't
think that this will be considerably faster than just truncating the table.

> I would appreciate any hints on how this procedure could be accomplished in PostgreSQL native way.
>
> Thanks!
>
> UPDATE:
>
> I need all this to run unit and integration tests for Ruby or Ruby on Rails projects. Each test should
> have a clean DB before it runs, or to do a cleanup after itself (so called teardown). Transactions are
> very good, but they become unusable when running tests against particular webdrivers, in my case the
> switch to truncation strategy is needed. Once I updated that with reference to RoR, please do not post
> here the answers about "Obviously, you need DatabaseCleaner for PG" and so on and so on.

I completely fail to understand what you talk about here.

Yours,
Laurenz Albe

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Misa Simic 2012-07-06 13:43:58 Re: Paged Query
Previous Message Craig Ringer 2012-07-06 13:38:44 Re: PostgreSQL db, 30 tables with number of rows < 100 (not huge) - the fastest way to clean each non-empty table and reset unique identifier column of empty ones.