Re: Number of tables

From: Craig James <craig_james(at)emolecules(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: fabio(dot)lafarcioli(at)molinoalimonti(dot)com, pgsql-performance(at)postgresql(dot)org
Subject: Re: Number of tables
Date: 2009-08-20 22:18:59
Message-ID: 4A8DCBD3.6060503@emolecules.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Greg Stark wrote:
> What you want is a multi-column primary key where userid is part of
> the key. You don't want to have a separate table for each user unless
> each user has their own unique set of columns.

Not always true.

>> When the user logs back in, a hidden part of the login process gets a table
>> from the pool of available tables, assigns it to this user, and copies the
>> user's data from the archive into this personal table. They are now ready
>> to work. This whole process takes just a fraction of a second for most
>> users.
>
> And what does all this accomplish?

The primary difference is between

delete from big_table where userid = xx

vesus

truncate user_table

There are also significant differences in performance for large inserts, because a single-user table almost never needs indexes at all, whereas a big table for everyone has to have at least one user-id column that's indexed.

In our application, the per-user tables are "hitlists" -- scratch lists that are populated something like this. The hitlist is something like this:

create table hitlist_xxx (
row_id integer,
sortorder integer default nextval('hitlist_seq_xxx')
)

truncate table hitlist_xxx;
select setval(hitlist_seq_xxx, 1, false);
insert into hitlist_xxx (row_id) (select some_id from ... where ... order by ...);

Once the hitlist is populated, the user can page through it quickly with no further searching, e.g. using a web app.

We tested the performance using a single large table in Postgres, and it was not nearly what we needed. These hitlists tend to be transitory, and the typical operation is to discard the entire list and create a new one. Sometimes the user will sort the entire list based on some criterion, which also requires a copy/delete/re-insert using a new order-by.

With both Oracle and Postgres, truncate is MUCH faster than delete, and the added index needed for a single large table only makes it worse. With Postgres, the repeated large delete/insert makes for tables that need a lot of vacuuming and index bloat, further hurting performance.

Craig

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Slava Moudry 2009-08-20 22:59:43 Re: number of rows estimation for bit-AND operation
Previous Message Greg Stark 2009-08-20 21:41:57 Re: Number of tables