Re: Postgres becoming slow, only full vacuum fixes it

From: Julien Cigar <jcigar(at)ulb(dot)ac(dot)be>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres becoming slow, only full vacuum fixes it
Date: 2012-09-24 12:21:09
Message-ID: 50605035.8020107@ulb.ac.be
Lists: pgsql-performance

Hello,

1) upgrade your PostgreSQL installation; there have been numerous
bugfix releases since 8.4.2
2) you'll have to show us an EXPLAIN ANALYZE of the slow queries. If I
take a look at those you provided, everything runs in less than 1 ms.
3) with 200 records you'll always have a seqscan
4) how much memory do you have? shared_buffers = 256MB and
effective_cache_size = 512MB look OK only if you have between 1 and 2GB
of RAM
5) synchronous_commit = off should only be used if you have a
battery-backed write cache.
6) autovacuum_naptime should be changed only if autovacuum is constantly
running (i.e. if you have dozens of databases in your cluster)
7) are you sure the problem isn't related to Bucardo?
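
Regarding point 6, a quick way to see whether autovacuum is actually
keeping up is to look at the statistics collector views. This is a
generic diagnostic query against the standard pg_stat_user_tables view
(not something from your report), a sketch of what I would check first:

```sql
-- Dead-tuple buildup and last (auto)vacuum time per table. A heavily
-- churned table such as "AWAITINGSTATUSSMPP" should show a recent
-- last_autovacuum and a small n_dead_tup; if n_dead_tup keeps growing,
-- autovacuum is not keeping up and the table will bloat until a
-- manual VACUUM reclaims the space.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

-- The physical size tells the same story: with ~200 live rows the
-- table should stay at a few pages, so steady growth between your
-- full vacuums would confirm bloat.
SELECT pg_size_pretty(pg_relation_size('"AWAITINGSTATUSSMPP"'));
```

If n_dead_tup stays low and the table size is stable, the slowdown is
probably not bloat on this table, and looking at Bucardo (point 7)
becomes more interesting.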

Julien

On 09/24/2012 13:55, Kiriakos Tsourapas wrote:
> Sorry, forgot to mention the most obvious and important information:
> My postgres is 8.4.2
>
> On Sep 24, 2012, at 13:33, Kiriakos Tsourapas wrote:
>
>> Hi,
>>
>> The problem: *Postgres is becoming slow, day after day, and only a
>> full vacuum fixes the problem*.
>>
>> Information you may need to evaluate :
>>
>> The problem affects all tables and queries, as far as I can tell, but
>> we can focus on a single table for better comprehension.
>>
>> The queries I am running to test the speed are:
>> INSERT INTO "AWAITINGSTATUSSMPP" VALUES('143428', '1111', 1,
>> '2012-06-16 13:39:19', '111');
>> DELETE FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111' AND
>> "CONNECTIONID" = 1;
>> SELECT * FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111' AND
>> "CONNECTIONID" = 1;
>>
>> After a full vacuum, they run in about 100ms.
>> Today, before the full vacuum, they were taking around 500ms.
>>
>> Below is an explain analyze of the commands AFTER a full vacuum. I
>> did not run it before, so I cannot post the corresponding info from
>> before the vacuum. So, after the full vacuum:
>>
>> explain analyze INSERT INTO "AWAITINGSTATUSSMPP" VALUES('143428',
>> '1111', 1, '2012-06-16 13:39:19', '111');
>> "Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002
>> rows=1 loops=1)"
>> "Trigger for constraint FK_AWAITINGSTATUSSMPP_MESSAGES: time=0.131
>> calls=1"
>> "Trigger bucardo_add_delta: time=0.454 calls=1"
>> "Trigger bucardo_triggerkick_MassSMs: time=0.032 calls=1"
>> "Total runtime: 0.818 ms"
>>
>> explain analyze DELETE FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" =
>> '1111' AND "CONNECTIONID" = 1;
>> "Seq Scan on "AWAITINGSTATUSSMPP" (cost=0.00..2.29 rows=1 width=6)
>> (actual time=0.035..0.035 rows=0 loops=1)"
>> " Filter: ((("SMSCMSGID")::text = '1111'::text) AND ("CONNECTIONID"
>> = 1))"
>> "Trigger bucardo_triggerkick_MassSMs: time=0.066 calls=1"
>> "Total runtime: 0.146 ms"
>>
>> explain analyze SELECT * FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID"
>> = '1111' AND "CONNECTIONID" = 1;
>> "Seq Scan on "AWAITINGSTATUSSMPP" (cost=0.00..2.29 rows=1 width=557)
>> (actual time=0.028..0.028 rows=0 loops=1)"
>> " Filter: ((("SMSCMSGID")::text = '1111'::text) AND ("CONNECTIONID"
>> = 1))"
>> "Total runtime: 0.053 ms"
>>
>> Below are the metadata of the table:
>> =====================================
>> CREATE TABLE "AWAITINGSTATUSSMPP"
>> (
>> "MESSAGEID" bigint NOT NULL,
>> "SMSCMSGID" character varying(50) NOT NULL,
>> "CONNECTIONID" smallint NOT NULL,
>> "EXPIRE_TIME" timestamp without time zone NOT NULL,
>> "RECIPIENT" character varying(20) NOT NULL,
>> "CLIENT_MSG_ID" character varying(255),
>> CONSTRAINT "PK_AWAITINGSTATUSSMPP" PRIMARY KEY ("SMSCMSGID",
>> "CONNECTIONID"),
>> CONSTRAINT "FK_AWAITINGSTATUSSMPP_MESSAGES" FOREIGN KEY ("MESSAGEID")
>> REFERENCES "MESSAGES" ("ID") MATCH SIMPLE
>> ON UPDATE NO ACTION ON DELETE CASCADE
>> )
>> WITH (
>> OIDS=FALSE
>> );
>> ALTER TABLE "AWAITINGSTATUSSMPP" OWNER TO postgres;
>> GRANT ALL ON TABLE "AWAITINGSTATUSSMPP" TO "MassSMsUsers";
>>
>> CREATE INDEX "IX_AWAITINGSTATUSSMPP_MSGID_RCP"
>> ON "AWAITINGSTATUSSMPP"
>> USING btree
>> ("MESSAGEID", "RECIPIENT");
>>
>> CREATE TRIGGER bucardo_add_delta
>> AFTER INSERT OR UPDATE OR DELETE
>> ON "AWAITINGSTATUSSMPP"
>> FOR EACH ROW
>> EXECUTE PROCEDURE bucardo."bucardo_add_delta_SMSCMSGID|CONNECTIONID"();
>>
>> CREATE TRIGGER "bucardo_triggerkick_MassSMs"
>> AFTER INSERT OR UPDATE OR DELETE OR TRUNCATE
>> ON "AWAITINGSTATUSSMPP"
>> FOR EACH STATEMENT
>> EXECUTE PROCEDURE bucardo."bucardo_triggerkick_MassSMs"();
>> =====================================
>>
>> The table only has about 200 records because it is being used as
>> temporary storage: records are constantly inserted and deleted.
>> BUT please don't get hung up on this fact because, as I already
>> said, the speed problem is not restricted to this table. The same
>> problem appears with the following query
>> UPDATE "MESSAGES" SET "SENT" = "SENT" + 1 WHERE "ID" = 143447;
>> and the MESSAGES table has mainly inserts and few deletes...
>>
>> My postgresql.conf file:
>> ======================
>> port = 5433 # (change requires restart)
>> max_connections = 100 # (change requires restart)
>> shared_buffers = 256MB # min 128kB. DoubleIP -
>> Default was 32MB
>> synchronous_commit = off # immediate fsync at commit.
>> DoubleIP - Default was on
>> effective_cache_size = 512MB # DoubleIP - Default was 128MB
>> log_destination = 'stderr' # Valid values are
>> combinations of
>> logging_collector = on # Enable capturing of stderr
>> and csvlog
>> silent_mode = on # Run server silently.
>> log_line_prefix = '%t %d %u ' # special values:
>> log_autovacuum_min_duration = 0 # -1 disables, 0 logs all
>> actions and
>> autovacuum_naptime = 28800 # time between autovacuum
>> runs. DoubleIP - default was 1min
>> autovacuum_vacuum_threshold = 100 # min number of row updates
>> before
>> autovacuum_vacuum_scale_factor = 0.0 # fraction of table size
>> before vacuum. DoubleIP - default was 0.2
>> datestyle = 'iso, mdy'
>> lc_messages = 'en_US.UTF-8' # locale for system
>> error message
>> lc_monetary = 'en_US.UTF-8' # locale for monetary
>> formatting
>> lc_numeric = 'en_US.UTF-8' # locale for number
>> formatting
>> lc_time = 'en_US.UTF-8' # locale for time
>> formatting
>> default_text_search_config = 'pg_catalog.english'
>> =======================
>>
>> As you will see, I have altered the shared_buffers
>> and synchronous_commit values.
>> The shared_buffers had the default value 32MB. When I changed it to
>> 256MB the problem still appears, but it takes more time to appear
>> (3-4 days). With 32MB, it appeared faster, probably after 24 hours.
>> Also, I have set the autovacuum daemon to run every 8 hours, and
>> changed its thresholds to make sure it vacuums pretty much all
>> tables (the ones for which at least 100 rows have changed).
>> Please note, though, that my problem existed even before playing
>> around with the autovacuum. This is why I tried to change its values
>> in the first place.
>>
>> The server is synchronized with another server using Bucardo. The
>> Bucardo process is running on the other server.
>> The same problem appears on the second server too... after 3-4 days,
>> postgres runs slower and slower.
>>
>> Our server configuration:
>> DELL PowerEdge T610 Tower Chassis for Up to 8x 3.5" HDDs
>> 2x Intel Xeon E5520 Processor (2.26GHz, 8M Cache, 5.86 GT/s QPI,
>> Turbo, HT), 1066MHz Max Memory
>> 8GB Memory,1333MHz
>> 2 x 146GB SAS 15k 3.5" HD Hot Plug
>> 6 x 1TB SATA 7.2k 3.5" Additional HD Hot Plug
>> PERC 6/i RAID Controller Card 256MB PCIe, 2x4 Connectors
>> SUSE Linux Enterprise Server 10, SP2
>>
>> The 2 HDs are set up with RAID-1
>> The 6 HDs are set up with RAID-5
>>
>> Linux is running on the RAID-1 configuration
>> Postgres is running on the RAID-5 configuration
>>
>>
>> Finally, a top before and after the full vacuum:
>> top - 11:27:44 up 72 days, 13:27, 37 users, load average: 1.05,
>> 1.31, 1.45
>> Tasks: 279 total, 3 running, 276 sleeping, 0 stopped, 0 zombie
>> Cpu(s): 3.6%us, 0.8%sy, 0.0%ni, 95.5%id, 0.0%wa, 0.0%hi,
>> 0.1%si, 0.0%st
>> Mem: 8166432k total, 7963116k used, 203316k free, 115344k buffers
>> Swap: 2097144k total, 2097020k used, 124k free, 2337636k cached
>>
>> top - 11:30:58 up 72 days, 13:31, 38 users, load average: 1.53,
>> 1.59, 1.53
>> Tasks: 267 total, 2 running, 265 sleeping, 0 stopped, 0 zombie
>> Cpu(s): 1.3%us, 0.4%sy, 0.0%ni, 98.0%id, 0.3%wa, 0.0%hi,
>> 0.1%si, 0.0%st
>> Mem: 8166432k total, 6016268k used, 2150164k free, 61092k buffers
>> Swap: 2097144k total, 2010204k used, 86940k free, 2262896k cached
>>
>>
>> I hope I have provided enough info and hope that someone can point
>> me in the right direction.
>>
>>
>> Thank you very much even for reading up to here!
>>
>> Best regards,
>> Kiriakos
>

--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.

