7.1.3 w/ Perl/DBI application hangs

From: "Doran L(dot) Barton" <fozz(at)iodynamics(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: 7.1.3 w/ Perl/DBI application hangs
Date: 2001-11-12 19:06:56
Message-ID: 20011112120656.A2531@iodynamics.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I've got a fairly mission-critical Perl/mod_perl application built on top of
PostgreSQL that is causing me grief. I've been battling it for a week or so
and have determined I think the problem is a PostgreSQL bug.

The application is a web-based event-scheduling system for an office of
accountants. The server is an Athlon 1.0Ghz CPU running Red Hat 7.2 and Red
Hat's PostgreSQL 7.1.3 build. The hard disks are UW SCSI disks connected
via software RAID level 1 (complements of Red Hat Linux). There is 256M of
RAM available. I've tweaked the max_connections, shared_buffers, and
sort_mem variables with no change in behavior resulting.

I recently revised the application to make better use of complex queries
in PostgreSQL instead of doing most of the work in Perl - which was making
things quite slow.

The problem manifests itself when these more-complex queries seem to "hang" -
never returning any kind of result whatsoever. When this happens, a 'ps' on
the server shows SELECT queries spinning - taking up CPU and memory. When
frustrated users refresh their browsers, this creates more of these
processes - until someone pages me and I log in and the load average is
something like 12.2.

I do VACUUM ANALYZE daily - even though the data in the database doesn't
change that much. There are approximately 16,000 records in the events
database which comprises about three years of events data (this application
has been running for a couple years). Doing a VACUUM ANALYZE when the
database is "hanging" doesn't change anything. The VACUUM ANALYZE does
return a successful result.

Stopping the postgresql service and restarting also does not fix the
proble. It does kill off the "hanging" processes, but once the database
server is running again, new queries hang just as before.

The ONLY (temporary) solution I've found is:

(1) shut down Apache
(2) restart PostgreSQL
(3) pg_dump the database
(4) stop Postgresql
(5) rename the data/ directory and create a new data/ directory
(6) Initalize the database in the new data/ directory
(7) recreate users and my database
(8) restore the data from the previous dump
(9) restart Apache

After I do the above steps, the application works fine for about 24 hours.

Then the problem begins manifesting itself again. Another thing I've
noticed is that this problem devlops independently of actual application
usage. For example, I will go through the above steps on a Saturday evening
and users will complain of not being able to use the application on Monday
morning. The application is not used at all on Sundays.

Now, I'm not familiar with some more the more down-and-dirty tools of
debugging PostgreSQL (although I've been using PG for various web-based
application projects since 1995), so please bear with me. I'll run
whatever commands I am instructed. If a developer needs ssh access to
the machine to witness first-hand what is going on, that can be arranged.

This is a serious bug and it is affecting a potentially cool application.
Please help me. :) I'm a PostgreSQL poster child.

BTW: The queries I'm submitting look like this:

-- Find weekly/biweekly recurring events for user
SELECT event.* FROM
event, u_participant WHERE
event.event_id = u_participant.event_id AND
u_participant.username = 'rld' AND
(
event.recur = 'Bi-Weekly' OR
event.recur = 'Weekly' ) AND
event.t_begin < '2001-08-05' AND
event.recur_end >= '2001-07-29'
UNION
-- Find weekly/biweekly recurring events for groups user is in
SELECT event.* FROM
event, g_participant, group_user WHERE
event.event_id = g_participant.event_id AND
group_user.group_id = g_participant.group_id AND
group_user.username = 'rld' AND
(
event.recur = 'Bi-Weekly' OR
event.recur = 'Weekly' ) AND
event.t_begin < '2001-08-05' AND
event.recur_end >= '2001-07-29'
UNION
-- Find monthly recurring events for user
SELECT event.* FROM
event, u_participant WHERE
event.event_id = u_participant.event_id AND
u_participant.username = 'rld' AND
event.recur = 'Monthly' AND
(
(
date_part('month', DATE '2001-07-29') =
date_part('month', DATE '2001-08-05') AND
(
date_part('day', event.t_begin) >=
date_part('day', DATE '2001-07-29') AND
date_part('day', event.t_begin) <
date_part('day', DATE '2001-08-05'))) OR
(
(
date_part('month', DATE '2001-07-29') =
date_part('month', event.t_begin) AND
date_part('day', event.t_begin) >=
date_part('day', DATE '2001-07-29')) OR
(
date_part('month', DATE '2001-08-05') =
date_part('month', event.t_begin) AND
date_part('day', event.t_begin) <
date_part('day', DATE '2001-08-05')))) AND

event.t_begin < '2001-08-05' AND
event.recur_end >= '2001-07-29'
UNION ...

... You get the picture.

-=Fozz

--
-------------------------------------------------------------------------
Doran L. Barton <fozz(at)iodynamics(dot)com> - Chief Super Hero - Iodynamics LLC
< http://www.iodynamics.com/ > - Linux solutions and dynamic websites
"Cars will not have intercourse on this bridge."
-- Seen in a Tokyo traffic handbook

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Hiroshi Inoue 2001-11-13 00:23:55 Re: Bug #514: Backend crashes periodically
Previous Message Tom Lane 2001-11-12 17:23:30 Re: Bug #514: Backend crashes periodically