Re: 121+ million record table perf problems

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: cyber-postgres(at)midnightfantasy(dot)com
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 121+ million record table perf problems
Date: 2007-05-18 18:51:04
Message-ID: 464DF598.20706@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

cyber-postgres(at)midnightfantasy(dot)com wrote:
> I need some help on recommendations to solve a perf problem.
>
> I've got a table with ~121 million records in it. Select count on it
> currently takes ~45 minutes, and an update to the table to set a value
> on one of the columns I finally killed after it ran 17 hours and had
> still not completed. Queries into the table are butt slow, and

Scanning 121 million rows is going to be slow even on 16 disks.

>
> System: SUSE LINUX 10.0 (X86-64)
> Postgresql: PostgreSQL 8.2.1
> Index type: btree

You really should be running 8.2.4.

>
> A select count took ~48 minutes before I made some changes to the
> postgresql.conf, going from default values to these:
> shared_buffers = 24MB

This could be increased.

> work_mem = 256MB
> maintenance_work_mem = 512MB
> random_page_cost = 100
> stats_start_collector = off
> stats_row_level = off
>
> As a test I am trying to do an update on state using the following queries:
> update res set state=5001;

You are updating 121 million rows, that takes a lot of time considering
you are actually (at a very low level) marking 121 million rows dead and
inserting 121 million more.

> The update query that started this all I had to kill after 17hours. It
> should have updated all 121+ million records. That brought my select
> count down to 19 minutes, but still a far cry from acceptable.

Not quite sure what you would considerable acceptable based on what you
are trying to do.

Sincerely,

Joshua D. Drake

>
> Here is the schema for the table giving me problems:
>
> CREATE TABLE res
> (
> res_id integer NOT NULL DEFAULT nextval('result_id_seq'::regclass),
> res_client_id integer NOT NULL,
> "time" real DEFAULT 0,
> error integer DEFAULT 0,
> md5 character(32) DEFAULT 0,
> res_tc_id integer NOT NULL,
> state smallint DEFAULT 0,
> priority smallint,
> rval integer,
> speed real,
> audit real,
> date timestamp with time zone,
> gold_result_id integer,
> CONSTRAINT result_pkey PRIMARY KEY (res_id),
> CONSTRAINT unique_res UNIQUE (res_client_id, res_tc_id)
> )
> WITHOUT OIDS;
> ALTER TABLE res OWNER TO postgres;
>
> CREATE INDEX index_audit
> ON res
> USING btree
> (audit);
>
> CREATE INDEX index_event
> ON res
> USING btree
> (error);
>
> CREATE INDEX index_priority
> ON res
> USING btree
> (priority);
>
> CREATE INDEX index_rval
> ON res
> USING btree
> (rval);
>
> CREATE INDEX index_speed
> ON res
> USING btree
> (speed);
>
> CREATE INDEX index_state
> ON res
> USING btree
> (state);
>
> CREATE INDEX index_tc_id
> ON res
> USING btree
> (res_tc_id);
>
> CREATE INDEX index_time
> ON res
> USING btree
> ("time");
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate
>

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Harris 2007-05-18 18:51:58 Re: reading large BYTEA type is slower than expected
Previous Message Andrew Sullivan 2007-05-18 18:30:18 Re: 121+ million record table perf problems