Re: Why do I need more time with partition table?

From: AI Rumman <rummandba(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why do I need more time with partition table?
Date: 2012-07-24 11:35:54
Message-ID: CAGoODpdd9j+78DhAY7JC0+DacopoQn1t8JC+-88TTuKJNnu=HQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

In addition to the previous mail, I am adding here that -
My Postgresql version is 9.1.2.

And one more thing, executing the following query I got two query plan
where the second one looked strange to me.
If showed to take 20950.579 ms, but investigating both the plan I found
that it took less time in every step of second plan.

explain analyze
> select *
> from table1 as c
> inner join table2 as a on c.crmid = a.activityid and deleted = 0
> where module ='Leads';
>
> QUERY PLAN
>
>
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Hash Join (cost=108101.50..175252.57 rows=313256 width=506) (actual
> time=5194.683..5194.683 rows=0 loops=1)
> Hash Cond: (a.activityid = c.crmid)
> -> Seq Scan on table2 a (cost=0.00..18337.34 rows=681434 width=139)
> (actual time=0.062..823.380 rows=681434 loops=1)
> -> Hash (cost=89195.80..89195.80 rows=313256 width=367) (actual
> time=2813.000..2813.000 rows=287365 loops=1)
> Buckets: 1024 Batches: 128 Memory Usage: 226kB
> -> Append (cost=0.00..89195.80 rows=313256 width=367) (actual
> time=0.062..2352.646 rows=287365 loops=1)
> -> Seq Scan on table1 c (cost=0.00..89187.53 rows=313255
> width=367) (actual time=0.060..1820.331 rows=287365 loops=1)
> Filter: ((deleted = 0) AND ((module)::text =
> 'Leads'::text))
> -> Index Scan using crmentity_leads_deleted_idx on
> table1_leads c (cost=0.00..8.27 rows=1 width=280) (actual
> time=11.076..11.076 rows=0 loops=1)
> Index Cond: (deleted = 0)
> Filter: ((module)::text = 'Leads'::text)
> Total runtime: 5195.117 ms
> (12 rows)
>

Executing the query again -

*\g*
>
> QUERY PLAN
>
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Hash Join (cost=108101.50..175252.57 rows=313256 width=506) (actual
> time=20950.161..20950.161 rows=0 loops=1)
> Hash Cond: (a.activityid = c.crmid)
> -> Seq Scan on table2 a (cost=0.00..18337.34 rows=681434 width=139)
> (actual time=0.092..835.241 rows=681434 loops=1)
> -> Hash (cost=89195.80..89195.80 rows=313256 width=367) (actual
> time=2774.250..2774.250 rows=287365 loops=1)
> Buckets: 1024 Batches: 128 Memory Usage: 226kB
> -> Append (cost=0.00..89195.80 rows=313256 width=367) (actual
> time=0.061..2318.759 rows=287365 loops=1)
> -> Seq Scan on table1 c (cost=0.00..89187.53 rows=313255
> width=367) (actual time=0.059..1799.937 rows=287365 loops=1)
> Filter: ((deleted = 0) AND ((module)::text =
> 'Leads'::text))
> -> Index Scan using crmentity_leads_deleted_idx on
> table1_leads c (cost=0.00..8.27 rows=1 width=280) (actual
> time=0.011..0.011 rows=0 loops=1)
> Index Cond: (deleted = 0)
> Filter: ((module)::text = 'Leads'::text)
> Total runtime: 20950.579 ms
> (12 rows)
>

On Tue, Jul 24, 2012 at 4:42 PM, AI Rumman <rummandba(at)gmail(dot)com> wrote:

> I partitioned a table, but didn't find any improvement in query timing.
>
> The basic table was like as follows :-
>
> \d table1
>> Table "public.table1_old"
>> Column | Type | Modifiers
>> --------------+-----------------------------+--------------------
>> crmid | integer | not null
>> smcreatorid | integer | not null default 0
>> smownerid | integer | not null default 0
>> modifiedby | integer | not null default 0
>> module | character varying(30) | not null
>> description | text |
>> createdtime | timestamp without time zone | not null
>> modifiedtime | timestamp without time zone | not null
>> viewedtime | timestamp without time zone |
>> status | character varying(50) |
>> version | integer | not null default 0
>> presence | integer | default 1
>> deleted | integer | not null default 0
>> Indexes:
>> "table1_pkey" PRIMARY KEY, btree (crmid)
>> "table1_createdtime_idx" btree (createdtime)
>> "table1_modifiedby_idx" btree (modifiedby)
>> "table1_modifiedtime_idx" btree (modifiedtime)
>> "table1_module_idx" btree (module) WHERE deleted = 0
>> "table1_smcreatorid_idx" btree (smcreatorid)
>> "table1_smownerid_idx" btree (smownerid)
>> "ftx_en_table1_description" gin (to_tsvector('vcrm_en'::regconfig,
>> for_fts(description)))
>> "table1_deleted_idx" btree (deleted)
>
>
>
> \d table2
>> Table "public.table2"
>> Column | Type |
>> Modifiers
>>
>> -------------------------+------------------------+-------------------------------------------
>> table2id | integer | not null default 0
>> subject | character varying(250) | not null
>> semodule | character varying(20) |
>> table2type | character varying(200) | not null
>> date_start | date | not null
>> due_date | date |
>> time_start | character varying(50) |
>> time_end | character varying(50) |
>> sendnotification | character varying(3) | not null default
>> '0'::character varying
>> duration_hours | character varying(2) |
>> duration_minutes | character varying(200) |
>> status | character varying(200) |
>> eventstatus | character varying(200) |
>> priority | character varying(200) |
>> location | character varying(150) |
>> notime | character varying(3) | not null default
>> '0'::character varying
>> visibility | character varying(50) | not null default
>> 'all'::character varying
>> recurringtype | character varying(200) |
>> end_date | date |
>> end_time | character varying(50) |
>> duration_seconds | integer | not null default 0
>> phone | character varying(100) |
>> vip_name | character varying(200) |
>> is_offline_call | smallint | default 0
>> campaign_id | bigint |
>> table2_classification | character varying(255) |
>> Indexes:
>> "table2_pkey" PRIMARY KEY, btree (table2id)
>> "table2_table2type_idx" btree (table2type)
>> "table2_date_start_idx" btree (date_start)
>> "table2_due_date_idx" btree (due_date)
>> "table2_eventstatus_idx" btree (eventstatus)
>> "table2_status_idx" btree (status)
>> "table2_subject_idx" btree (subject)
>> "table2_time_start_idx" btree (time_start)
>> "ftx_en_table2_subject" gin (to_tsvector('vcrm_en'::regconfig,
>> for_fts(subject::text)))
>
>
>
> As most of the queries were executed based on module.
>
> select module,count(*) from table1 group by module;
>> module | count
>> -----------------------+--------
>> Leads | 463237
>> Calendar | 431041
>> Accounts | 304225
>> Contacts | 299211
>> Emails | 199876
>> HelpDesk | 135977
>> Potentials | 30826
>> Emails Attachment | 28249
>> Notes | 1029
>> Accounts Attachment | 1015
>
>
>
> I paritioned the table based on module. And created index on each separate
> tables.
> After parition the table structure as follows :-
>
> \d+ table1
>> Table "public.table1"
>> Column | Type | Modifiers |
>> Storage | Description
>>
>> --------------+-----------------------------+--------------------+----------+-------------
>> crmid | integer | not null | plain
>> |
>> smcreatorid | integer | not null default 0 | plain
>> |
>> smownerid | integer | not null default 0 | plain
>> |
>> modifiedby | integer | not null default 0 | plain
>> |
>> module | character varying(30) | not null |
>> extended |
>> description | text | |
>> extended |
>> createdtime | timestamp without time zone | not null | plain
>> |
>> modifiedtime | timestamp without time zone | not null | plain
>> |
>> viewedtime | timestamp without time zone | | plain
>> |
>> status | character varying(50) | |
>> extended |
>> version | integer | not null default 0 | plain
>> |
>> presence | integer | default 1 | plain
>> |
>> deleted | integer | not null default 0 | plain
>> |
>> Indexes:
>> "table1_pkey1" PRIMARY KEY, btree (crmid)
>> Child tables: table1_accounts,
>> table1_calendar,
>> table1_emails,
>> table1_helpdesk,
>> table1_leads,
>> table1_others
>> Has OIDs: no
>
>
>
>
> *Without parition :-*
>
> explain analyze
>> select *
>> from table1 as c
>> inner join table2 as a on c.crmid = a.table2id and deleted = 0
>> where module ='Leads'
>> ;
>>
>> QUERY PLAN
>>
>>
>> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>> Hash Join (cost=93557.89..160291.06 rows=112087 width=506) (actual
>> time=4013.152..4013.152 rows=0 loops=1)
>> Hash Cond: (a.table2id = c.crmid)
>> -> Seq Scan on table2 a (cost=0.00..18337.34 rows=681434 width=139)
>> (actual time=0.028..834.189 rows=681434 loops=1)
>> -> Hash (cost=73716.32..73716.32 rows=328765 width=367) (actual
>> time=1620.810..1620.810 rows=287365 loops=1)
>> Buckets: 1024 Batches: 128 Memory Usage: 226kB
>> -> Bitmap Heap Scan on table1 c (cost=9489.85..73716.32
>> rows=328765 width=367) (actual time=83.092..1144.159 rows=287365 loops=1)
>> Recheck Cond: (((module)::text = 'Leads'::text) AND
>> (deleted = 0))
>> -> Bitmap Index Scan on table1_module_idx
>> (cost=0.00..9407.66 rows=328765 width=0) (actual time=79.232..79.232
>> rows=287365 loops=1)
>> Index Cond: ((module)::text = 'Leads'::text)
>> Total runtime: 4013.932 ms
>> (10 rows)
>
>
>
> *With Parition :- *
>
>
>
>>
>> explain analyze
>>> select *
>>> from table1 as c
>>> inner join table2 as a on c.crmid = a.table2id and deleted = 0
>>> where module ='Leads';
>>>
>>> QUERY PLAN
>>>
>>>
>>> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> Hash Join (cost=108101.50..175252.57 rows=313256 width=506) (actual
>>> time=8430.588..8430.588 rows=0 loops=1)
>>> Hash Cond: (a.table2id = c.crmid)
>>> -> Seq Scan on table2 a (cost=0.00..18337.34 rows=681434 width=139)
>>> (actual time=0.054..870.554 rows=681434 loops=1)
>>> -> Hash (cost=89195.80..89195.80 rows=313256 width=367) (actual
>>> time=2751.950..2751.950 rows=287365 loops=1)
>>> Buckets: 1024 Batches: 128 Memory Usage: 226kB
>>> -> Append (cost=0.00..89195.80 rows=313256 width=367) (actual
>>> time=0.034..2304.191 rows=287365 loops=1)
>>> -> Seq Scan on table1 c (cost=0.00..89187.53
>>> rows=313255 width=367) (actual time=0.032..1783.075 rows=287365 loops=1)
>>> Filter: ((deleted = 0) AND ((module)::text =
>>> 'Leads'::text))
>>> -> Index Scan using table1_leads_deleted_idx on
>>> table1_leads c (cost=0.00..8.27 rows=1 width=280) (actual
>>> time=0.010..0.010 rows=0 loops=1)
>>> Index Cond: (deleted = 0)
>>> Filter: ((module)::text = 'Leads'::text)
>>> Total runtime: 8432.024 ms
>>> (12 rows)
>>
>>
> I set constraint_exclusion to partition.
>
> Why do I need more time with parition?
> Any experts please let me know.
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jan Otto 2012-07-24 11:46:28 Re: Why do I need more time with partition table?
Previous Message Aleksei Arefjev 2012-07-24 11:14:35 transactions start time