Re: Partitioning on Table with Foreign Key

From: Yan Cheng Cheok <yccheok(at)yahoo(dot)com>
To: Vick Khera <vivek(at)khera(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Partitioning on Table with Foreign Key
Date: 2010-01-27 03:16:47
Message-ID: 695492.82155.qm@web65710.mail.ac4.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello Vick,

Can I get some advice from your side?

Currently, I have table :

1 lot is pointing to many units
1 unit is pointing to many measurements

Say, let say, I want "Give me all the measurements result within this lot".

For a single lot, with 100,000 units, with each unit having 48 different measurements (total 4,800,000 measurements). It will takes ~30 seconds to return result.

I use the following stored procedure to help me do so.

======================================================
CREATE OR REPLACE FUNCTION get_measurements(IN bigint)
RETURNS TABLE(_type text, _value double precision, _unit text) AS
$BODY$DECLARE
_lotID ALIAS FOR $1;
BEGIN
RETURN QUERY EXECUTE 'SELECT measurement_type.value, measurement.value, measurement_unit.value
FROM
measurement INNER JOIN unit ON (fk_unit_id = unit_id)
WHERE fk_lot_id = $1'
USING _lotID;
END;$BODY$
LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION get_measurements(bigint) OWNER TO postgres;
======================================================

However, when time goes on and more lots being added, the performance for the above stored procedure getting worst.

I plan to use table partition, to solve the above problem.

(1) Store unit, according to their unit_id (primary key). For example,

unit_id from 0 till 999999 will store in table unit_0.
unit_id from 1000000 till 1999999 will store in table unit_1.

(2) Store measurement, according to their measurment_id (primary key). For example,

measurement_id from 0 till 999999 will store in table measurement_0.
measurement_id from 1000000 till 1999999 will store in table measurement_1.

However, does the above stored procedure, will gain advantage from my (1) and (2) strategy?

As from my point of view, PostgreSQL seems doesn't get enough hint, where he need to go to unit_? and measurement_? table to perform read. In my query, I didn't specific rule like

"WHERE unit_id / 1000000 = 0"

Advice and comment are very much appreciated.

The below are my table partition rule :
========================================
CREATE TABLE measurement_0 (
CONSTRAINT pk_measurement_id PRIMARY KEY (measurement_id),
CONSTRAINT fk_unit_id FOREIGN KEY (fk_unit_id)
REFERENCES unit (unit_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,

CHECK ( measurement_id / 1000000 = 0 )
) INHERITS (measurement);

CREATE TABLE measurement_1 (
CONSTRAINT pk_measurement_id PRIMARY KEY (measurement_id),
CONSTRAINT fk_unit_id FOREIGN KEY (fk_unit_id)
REFERENCES unit (unit_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,

CHECK ( measurement_id / 1000000 = 1 )
) INHERITS (measurement);

......

CREATE TABLE unit_0 (
CONSTRAINT pk_unit_id PRIMARY KEY (unit_id),
CONSTRAINT fk_lot_id FOREIGN KEY (fk_lot_id)
REFERENCES lot (lot_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,

CHECK ( unit_id / 1000000 = 0 )
) INHERITS (measurement);

CREATE TABLE unit_1 (
CONSTRAINT pk_unit_id PRIMARY KEY (unit_id),
CONSTRAINT fk_lot_id FOREIGN KEY (fk_lot_id)
REFERENCES lot (lot_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,

CHECK ( unit_id / 1000000 = 1 )
) INHERITS (measurement);

...

Here are my table defination :
==============================
IF NOT EXISTS(SELECT * FROM information_schema.tables WHERE table_name = 'unit') THEN
CREATE TABLE unit
(
unit_id bigserial NOT NULL,
fk_lot_id bigint NOT NULL,
"cycle" bigint NOT NULL,
CONSTRAINT pk_unit_id PRIMARY KEY (unit_id),
CONSTRAINT fk_lot_id FOREIGN KEY (fk_lot_id)
REFERENCES lot (lot_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE
);

CREATE INDEX idx_fk_lot_id
ON unit
USING btree
(fk_lot_id);
END IF;

IF NOT EXISTS(SELECT * FROM information_schema.tables WHERE table_name = 'measurement') THEN
CREATE TABLE measurement
(
measurement_id bigserial NOT NULL,
fk_unit_id bigint NOT NULL,
"value" double precision,
CONSTRAINT pk_measurement_id PRIMARY KEY (measurement_id),
CONSTRAINT fk_unit_id FOREIGN KEY (fk_unit_id)
REFERENCES unit (unit_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE
);

CREATE INDEX idx_fk_unit_id
ON measurement
USING btree
(fk_unit_id);

CREATE INDEX idx_measurement_value
ON measurement
USING btree
(value) WHERE value IS NULL;
END IF;

Thanks and Regards
Yan Cheng CHEOK

--- On Thu, 1/21/10, Yan Cheng Cheok <yccheok(at)yahoo(dot)com> wrote:

> From: Yan Cheng Cheok <yccheok(at)yahoo(dot)com>
> Subject: Re: [GENERAL] Partitioning on Table with Foreign Key
> To: "Vick Khera" <vivek(at)khera(dot)org>
> Cc: pgsql-general(at)postgresql(dot)org
> Date: Thursday, January 21, 2010, 11:31 PM
> Make sense to me. Thanks for the
> advice. I will try that out.
>
> Thanks and Regards
> Yan Cheng CHEOK
>
>
> --- On Thu, 1/21/10, Vick Khera <vivek(at)khera(dot)org>
> wrote:
>
> > From: Vick Khera <vivek(at)khera(dot)org>
> > Subject: Re: [GENERAL] Partitioning on Table with
> Foreign Key
> > To: "Yan Cheng Cheok" <yccheok(at)yahoo(dot)com>
> > Cc: pgsql-general(at)postgresql(dot)org
> > Date: Thursday, January 21, 2010, 11:30 PM
> > On Thu, Jan 21, 2010 at 3:46 AM, Yan
> > Cheng Cheok <yccheok(at)yahoo(dot)com>
> > wrote:
> > > table measurement will have a *lot* of row
> (millions).
> > I want to speed up write and read access. Hence, I
> use
> > partition technique.
> > >
> > > CREATE TABLE measurement_y2006m02 (
> > >    CHECK ( date >= DATE '2006-02-01' AND
> date
> > < DATE '2006-03-01' )
> > > ) INHERITS (measurement);
> > >
> > > Opps! But measurement do not have date column.
> How I
> > can refer to measurement's lot's date?
> > >
> >
> > Split on your lot ID number instead of date, using a
> modulo
> > operation.
> >  I've done this by splitting a table 100 ways
> using the
> > constraing (id
> > % 100 == N) for N = 0 .. 99.
> >
> > We don't know what your data retention or
> distribution
> > needs are, so
> > it is hard to offer solid advice.
> >
>
>
>
>

Browse pgsql-general by date

  From Date Subject
Next Message Yan Cheng Cheok 2010-01-27 03:22:42 Re: Primary Key Increment Doesn't Seem Correct Under Table Partition
Previous Message Mike Bresnahan 2010-01-27 02:18:59 Re: SMP Read-only Performance