Re: Index creation takes more time?

From: "Herouth Maoz" <herouth(at)unicell(dot)co(dot)il>
To: "Craig Ringer" <ringerc(at)ringerc(dot)id(dot)au>
Cc: <pgsql-general(at)postgresql(dot)org>, <tv(at)fuzzy(dot)cz>
Subject: Re: Index creation takes more time?
Date: 2012-09-17 11:07:49
Message-ID: FB6D59EFE2A13D4B9A3E85F0C3363EED04E4E970@mail.UniCell.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Yes, thank you, I did notice it, but I decided to wait a week to the next archive schedule, to see if the problem persists, especially since the previous time ran with relatively low disk space because we kept the old database files around. We have removed them during the week.

Unfortunately, the problem persists.

So here is the information I could glean.

First, the variable "maintenance_work_mem" has not been changed between the old and new postgresql. In fact, it is commented out, so I assume it's the default 16MB for both installations.

The server that runs the 9.1 is generally better and faster than the PC that runs the 8.3 (it does every other operation - inserts, updates, selects - much faster than the PC). More specifically:

Server running 9.1:
3373252k of memory
Two hard disks, separate for system and database. The database disk is 15000RPM, 825G.
CPU: Xeon, 2.0GHz, 4 cores (or two CPUs with 2 cores, I'm not sure)

PC running 8.3:
3073344k of memory
One SATA hard disk (used for both system and database), 7200RPM, 915G.
CPU: Pentium dual-core 2.80GHz

In both machines postgreSQL is set up with shared_buffers of 1800M.

Now, the table itself:

Column | Type | Modifiers
-----------------------------+-----------------------------+-----------
service | smallint |
billing_priority | smallint |
account_number | integer |
msisdn | character varying(16) |
sme_reference | integer |
smsc_reference | numeric(21,0) |
gateway_id | smallint |
user_reference | numeric(21,0) |
user_time | timestamp without time zone |
time_arrived | timestamp without time zone |
time_submitted | timestamp without time zone |
time_final_state | timestamp without time zone |
status | integer |
time_notified | timestamp without time zone |
user_id | character varying(45) |
price | double precision |
sms_engine_id | character varying(15) |
smsc_session_id | character varying(64) |
external_billing_reference | character varying(128) |
multipart_reference | numeric(21,0) |
multipart_nr_segments | integer |
multipart_segment_nr | integer |
requested_target_network_id | character(1) |
actual_target_network_id | character(1) |
sm_type | character(2) |

There are no triggers, no foreign keys etc.

The index definitions:

CREATE INDEX billinga_user_id ON sms.billing__archive(user_id) ;
CREATE INDEX billinga_status ON sms.billing__archive(status) ;
CREATE INDEX billinga_time_arrived ON sms.billing__archive(time_arrived) ;
CREATE INDEX billinga_msisdn_sme_reference ON sms.billing__archive(msisdn,sme_reference) ;
CREATE INDEX billinga_account ON sms.billing__archive(account_number) ;
CREATE INDEX billinga_user_ref ON sms.billing__archive(user_reference) ;
CREATE INDEX billinga_smsc_ref ON sms.billing__archive (smsc_reference) ;
CREATE INDEX billinga_time_submitted ON sms.billing__archive(time_submitted) ;

Statistics collection:

For the sake of experimentation, I dropped and created the billinga_msisdn_sme_reference in both machines, timed it, and ran vmstat, iostat and sar in the background at intervals of 1 minute.

On the PC, the creation of the index took 40 minutes 35 seconds.

The server (9.1) has not finished yet. I set up stats to run for an hour, and I'm sending this hour's worth of stats.

I'm attaching the stats files in tarballs. I'm not sure what I'm supposed to look at.

Thanks for your time,
Herouth

-----הודעה מקורית-----
מאת: Craig Ringer [mailto:ringerc(at)ringerc(dot)id(dot)au]
נשלח: ב 17/09/2012 06:56
אל: Herouth Maoz
עותק לידיעה: pgsql-general(at)postgresql(dot)org; tv(at)fuzzy(dot)cz
נושא: Re: [GENERAL] Index creation takes more time?

Herouth,

I don't know if you saw Tomas Vondra's follow-up, as it was only to the
list and not CC'd to you. Here's the archive link:

http://archives.postgresql.org/message-id/e87a2f7a91ce1fca7143bcadc4553a0b@fuzzy.cz

The short version: "More information required".

On 09/09/2012 05:25 PM, Herouth Maoz wrote:
> We have tables which we archive and shorten every day. That is - the main table that has daily inserts and updates is kept small, and there is a parallel table with all the old data up to a year ago.
>
> In the past we noticed that the bulk transfer from the main table to the archive table takes a very long time, so we decided to do this in three steps: (1) drop indexes on the archive table, (2) insert a week's worth of data into the archive table. (3) recreate the indexes. This proved to take much less time than having each row update the index.
>
> However, this week we finally upgraded from PG 8.3 to 9.1, and suddenly, the archiving process takes a lot more time than it used to - 14:30 hours for the most important table, to be exact, spent only on index creation.
>
> The same work running on the same data in 8.3 on a much weaker PC took merely 4:30 hours.
>
> There are 8 indexes on the archive table.
>
> The size of the main table is currently (after archive) 7,805,009 records.
> The size of the archive table is currently 177,328,412 records.
>
> Has there been a major change in index creation that would cause 9.1 to do it this much slower? Should I go back to simply copying over the data or is the whole concept breaking down?
>
>
> TIA,
> Herouth
>

Attachment Content-Type Size
pc.tgz application/x-gzip 4.3 KB
server.tgz application/x-gzip 5.1 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Yvon Thoraval 2012-09-17 11:26:30 Re: SQLSTATE[08006] [7] server closed the connection unexpectedly
Previous Message Chris Travers 2012-09-17 09:13:50 Re: Slow counting still true?