a heavy duty operation on an "unused" table kills my server

From: Eduardo Piombino <drakorg(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: a heavy duty operation on an "unused" table kills my server
Date: 2010-01-13 04:59:11
Message-ID: e24c1d9d1001122059ic5d4721o98ccf85033b1fead@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi list, I'm having a problem when dealing with operations that asks too
much CPU from the server.
The scenario is this:

I have a multithreaded server, each thread with its own connection to the
database. Everything is working fine, actually great, actually
outstandingly, in normal operation.

I've a table named "a" with 1.8 million records, and growing, but I'm ok
with it, at least for the moment. Maybe in the near future we will cut it
down, backup old data, and free it up. But this is not the issue, as I said,
everything is working great. I have a cpl of indexes to help some queries,
and that's it.

Now my problem started when I tried to do some model refactoring on this
production table.

First I tried a dumb approach.
I connected from pgadmin, opened a new session.
I tried an ALTER TABLE on this table just to turn a char(255) field into
char(250), and it locked up my system.

No surprise, since I had many threads waiting for this alter table to
finish. What I did not foresee was that this alter table would take up so
much time. Ok, my fault, for not having calculated the time that it would
take the ALTER TABLE to complete.

Now, with this experience, I tried a simple workaround.
Created an empty version of "a" named "a_empty", identical in every sense.
renamed "a" to "a_full", and "a_empty" to "a". This procedure costed me like
0 seconds of downtime, and everything kept working smoothly. Maybe a cpl of
operations could have failed if they tried to write in the very second that
there was actually no table named "a", but since the operation was
transactional, the worst scenario was that if the operation should have
failed, the client application would just inform of the error and ask the
user for a retry. No big deal.

Now, this table, that is totally unattached to the system in every way (no
one references this table, its like a dumpster for old records), is not
begin accessed by no other thread in the system, so an ALTER table on it, to
turn a char(255) to char(250), should have no effect on the system.

So, with this in mind, I tried the ALTER TABLE this time on the "a_full"
(totally unrelated) table.
The system went non-responsive again, and this time it had nothing to do
with threads waiting for the alter table to complete. The pgAdmin GUI went
non-responsive, as well as the application's server GUI, whose threads kept
working on the background, but starting to take more and more time for every
clients request (up to 25 seconds, which are just ridiculous and completely
unacceptable in normal conditions).

This resulted in my client applications to start disconnecting after their
operations failed due to timeout, and the system basically went down again,
from a users point of view.

This time, since I saw no relation between my operation on a totally
unrelated table, and the server BIG slowdown, I blamed the servers memory.

After some tests, I came up to the conclusion that any heavy duty operation
on any thread (ALTER TABLE on 1.8 million records tables, updates on this
table, or an infinite loop, just to make my point), would affect the whole
server.

Bottom line is, I can't seem to do any heavy processing on the database (or
any operation that would require the server to enter into high CPU usage),
and still expect the server to behave normally. Whatever heavy duty
operation, DDL, DML, on whatever table (related, or unrelated), on whatever
thread, would tear down my servers integrity.

My question then is: is there a way to limit the CPU assigned to a specific
connection?
I mean, I don't care if my ALTER TABLE takes 4 days instead of 4 hours.

Something like:
pg_set_max_cpu_usage(2/100);

and rest assured that no matter what that thread is asking the database to
do, it just wont affect the other running threads. Obviosly, assuring that
the process itself does not involve any locking of the other threads.

Is something like that possible?

Thanks in advance,
Eduardo.

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Craig Ringer 2010-01-13 05:01:51 Re: performance config help
Previous Message Matthew Wakeling 2010-01-12 18:29:44 Re: performance config help