Re: tables with 300+ partitions

From: Pablo Alcaraz <pabloa(at)laotraesquina(dot)com(dot)ar>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: Steven Flatt <steven(dot)flatt(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: tables with 300+ partitions
Date: 2007-10-31 20:15:34
Message-ID: 4728E266.6030709@laotraesquina.com.ar
Lists: pgsql-performance

Scott Marlowe wrote:
> On 10/31/07, Pablo Alcaraz <pabloa(at)laotraesquina(dot)com(dot)ar> wrote:
>
>> Steven Flatt wrote:
>>
>> On 10/30/07, Pablo Alcaraz <pabloa(at)laotraesquina(dot)com(dot)ar> wrote:
>>
>>> I did some testing. I created an empty table with 300 partitions. Then I
>>> inserted some rows into it and the performance was SLOW too.
>>>
>> Is the problem with inserting to the partitioned table or selecting from it?
>> It sounds like inserting is the problem in which case I ask: how are you
>> redirecting inserts to the appropriate partition? If you're using rules,
>> then insert performance will quickly degrade with number of partitions as
>> *every* rule needs to be evaluated for *every* row inserted to the base
>> table. Using a trigger which you can modify according to some schedule is
>> much faster, or better yet, use some application-level logic to insert
>> directly to the desired partition.
>>
>> Steve, it was a program inserting into the base table. The program ran 200+
>> threads, and every thread inserted data into it: one row roughly every
>> 3 seconds (or it used to). But when I added more partitions, the insert
>> speed dropped to 1 insert every 2 minutes.
>>
>> The selects that need to evaluate all partitions were slow too, but I think
>> I can wait for them. :D
>>
>> I wonder if UPDATEs are slow too; I do not know.
>>
>> Do I need a trigger for inserts only, or do I need triggers for UPDATE and
>> DELETE too?
>>
>
> You need a trigger for any update / delete / insert you don't want to
> be really slow. Basically, if a rule is doing it now, you need a
> trigger to do it to speed it up.
>
> My experience has been that at 200 to 1000 partitions, the speed of
> the smaller tables still makes selects faster than with one big table
> for certain kinds of access. At some point, the cost of planning a
> lookup against thousands of tables will be more than the savings of
> looking in a really small table.
>
> The nice thing about triggers is that you can use maths to figure out
> the name of the table you'll be writing to instead of a static table
> like most rules use. So, all you have to do is make sure the new
> tables get added under the parent and poof, you're ready to go, no
> need for a new trigger.
>
>

Currently I have insert rules only, and the UPDATEs are handled correctly. I
think the UPDATEs use the CHECK constraints, because the program uses the base
table everywhere.
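As an aside, for SELECTs (and UPDATEs) against the base table to skip
irrelevant partitions, the planner's constraint exclusion has to be enabled;
a minimal check, assuming the t_00003_idt_check constraint shown below:

```sql
-- Let the planner prune child tables whose CHECK constraints
-- contradict the WHERE clause.
SET constraint_exclusion = on;

-- With the per-partition CHECK constraints in place, this should
-- scan only t_00003 instead of all 300 child tables.
SELECT * FROM t WHERE idt = 12345;
```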

This is the base table structure:

-- Table: t

-- DROP TABLE t;

CREATE TABLE t
(
    idt bigint NOT NULL,
    idtpadre bigint NOT NULL,
    e integer NOT NULL,
    dmodi timestamp without time zone NOT NULL DEFAULT now(),
    p integer NOT NULL DEFAULT 0,
    m text NOT NULL
)
WITHOUT OIDS;
ALTER TABLE t OWNER TO e;

-- Rule: "t_update_00003 ON t"

-- DROP RULE t_update_00003 ON t;

CREATE OR REPLACE RULE t_update_00003 AS
ON INSERT TO t
WHERE new.idt >= 1::bigint AND new.idt <= 30000000::bigint DO
INSTEAD INSERT INTO t_00003 (idt, idtpadre, e, dmodi, p, m)
VALUES (new.idt, new.idtpadre, new.e, new.dmodi, new.p, new.m);

-- Rule: "t_update_00006 ON t"

-- DROP RULE t_update_00006 ON t;

CREATE OR REPLACE RULE t_update_00006 AS
ON INSERT TO t
WHERE new.idt >= 30000001::bigint AND new.idt <= 60000000::bigint DO
INSTEAD INSERT INTO t_00006 (idt, idtpadre, e, dmodi, p, m)
VALUES (new.idt, new.idtpadre, new.e, new.dmodi, new.p, new.m);

-- Rule: "t_update_00009 ON t"

-- DROP RULE t_update_00009 ON t;

CREATE OR REPLACE RULE t_update_00009 AS
ON INSERT TO t
WHERE new.idt >= 60000001::bigint AND new.idt <= 90000000::bigint DO
INSTEAD INSERT INTO t_00009 (idt, idtpadre, e, dmodi, p, m)
VALUES (new.idt, new.idtpadre, new.e, new.dmodi, new.p, new.m);

-- Rule: "t_update_00012 ON t"

-- DROP RULE t_update_00012 ON t;

CREATE OR REPLACE RULE t_update_00012 AS
ON INSERT TO t
WHERE new.idt >= 90000001::bigint AND new.idt <= 120000000::bigint DO
INSTEAD INSERT INTO t_00012 (idt, idtpadre, e, dmodi, p, m)
VALUES (new.idt, new.idtpadre, new.e, new.dmodi, new.p, new.m);

etc ... up to 300 partitions

The partitions are created like:

CREATE TABLE t_00003
(
    CONSTRAINT t_00003_pkey PRIMARY KEY (idt),
    CONSTRAINT t_00003_idt_check
        CHECK (idt >= 1::bigint AND idt <= 30000000::bigint)
) INHERITS (t)
WITHOUT OIDS;
ALTER TABLE t_00003 OWNER TO e;

CREATE INDEX t_00003_e
ON t_00003
USING btree
(e);

etc
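The whole rule set above could be replaced by a single trigger that computes
the partition name from idt, along the lines Scott suggests. A sketch, not
tested (the function and trigger names are mine, and EXECUTE ... USING needs
PostgreSQL 8.4+; on older releases the values would have to be quoted into
the query string instead):

```sql
-- Route each insert to t_000NN: partitions hold 30,000,000 ids each
-- and are named in steps of 3 (t_00003, t_00006, ...), matching the
-- rule ranges above.
CREATE OR REPLACE FUNCTION t_insert_trigger() RETURNS trigger AS $$
DECLARE
    part text;
BEGIN
    -- idt 1..30000000 -> t_00003, 30000001..60000000 -> t_00006, ...
    part := 't_' || to_char(((NEW.idt - 1) / 30000000 + 1) * 3, 'FM00000');
    EXECUTE 'INSERT INTO ' || part
         || ' (idt, idtpadre, e, dmodi, p, m) VALUES ($1,$2,$3,$4,$5,$6)'
    USING NEW.idt, NEW.idtpadre, NEW.e, NEW.dmodi, NEW.p, NEW.m;
    RETURN NULL;  -- suppress the insert into the base table itself
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_insert_redirect
    BEFORE INSERT ON t
    FOR EACH ROW EXECUTE PROCEDURE t_insert_trigger();
```

Unlike the 300 rules, only this one function runs per row, and adding a new
partition needs no new trigger as long as the naming formula still holds.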
