Re: Speeding up a query.

From: "Hartman, Matthew" <Matthew(dot)Hartman(at)krcc(dot)on(dot)ca>
To: <pgsql-performance(at)postgresql(dot)org>
Cc: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "Grzegorz Jaskiewicz" <gryzman(at)gmail(dot)com>, "Alberto Dalmaso" <dalmaso(at)clesius(dot)it>, "Merlin Moncure" <mmoncure(at)gmail(dot)com>
Subject: Re: Speeding up a query.
Date: 2009-06-17 14:58:52
Message-ID: 366642367C5B354197A1E0D27BC175BD0225971B@KGHMAIL.KGH.ON.CA
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I promised to provide more details of the query (or the function as it is). Here goes.

Scenario:

A chemotherapy regimen requires chair time and nursing time. A patient might sit in the chair for three hours but the nurse only has to be with them for the first hour. Therefore, nurses can manage multiple chairs at a time. Each regimen has a different time requirement.

To efficiently manage our chair and nursing resources, we want to schedule against these constraints. Our room currently has 17 chairs and around 8 nurses per day. We administer several hundred different regimens and the time for each regimen varies based on the day of the regimen as well as the course. All of these variables are entered and maintained through a web application I wrote.

Scheduling algorithm:

Written in PostgreSQL (naturally), the algorithm is a single function call. It gathers the data for a day into a temporary table and cycles through each appointment. Appointments are scheduled in the following order: locked appointments (previously scheduled and assigned to a nurse and chair), reserved appointments (a desired time slot has been selected), open appointments (ordered by the required chair time descending and the required nurse time descending). Here's the busy part that loops through each appointment. The table definition follows. Anything beginning with an underscore is a declared variable.

-- Reserved and unscheduled appointments.

FOR _APPOINTMENT IN SELECT * FROM MATRIX_UNSCHEDULED WHERE APPT_STATUS <> 'L' ORDER BY ROW_NUM

LOOP

-- Initialize the variables for this record.

RAISE NOTICE 'Status ''%'' - %', _APPOINTMENT.APPT_STATUS, _APPOINTMENT;

_AVAILABLE := null;

select into _UNIT_INTERVALS, _NURSE_INTERVALS, _UNIT_REQUIRED, _NURSE_REQUIRED

_APPOINTMENT.total_unit_time / 5,

_APPOINTMENT.total_nurse_time / 5,

(_APPOINTMENT.total_unit_time || ' minutes')::INTERVAL,

(_APPOINTMENT.total_nurse_time || ' minutes')::INTERVAL;

-- Find the first available row for the required unit and nurse time.

select into _AVAILABLE unit.row_num

from (

select m1.row_num

from matrix m1,

matrix m2

where m1.unit_id = m2.unit_id

and m1.nurse_id = m2.nurse_id

and m1.unit_scheduled = false

and m2.unit_scheduled = false

and (_APPOINTMENT.reserved_time is null or m1.timeslot = _APPOINTMENT.reserved_time)

and m2.timeslot between m1.timeslot and (m1.timeslot + _UNIT_REQUIRED)

group by m1.row_num

having count(m2.row_num) = _UNIT_INTERVALS + 1

) unit,

(

select m1.row_num

from matrix m1,

matrix m2

where m1.unit_id = m2.unit_id

and m1.nurse_id = m2.nurse_id

and m1.nurse_scheduled = false

and m2.nurse_scheduled = false

and (_APPOINTMENT.reserved_time is null or m1.timeslot = _APPOINTMENT.reserved_time)

and m2.timeslot between m1.timeslot and (m1.timeslot + _NURSE_REQUIRED)

group by m1.row_num

having count(m1.row_num) = _NURSE_INTERVALS + 1

) nurse

where nurse.row_num = unit.row_num

order by unit.row_num

limit 1;

-- Assign the time, unit, and nurse to the unscheduled appointment.

update matrix_unscheduled set

appt_time = matrix.timeslot,

unit_id = matrix.unit_id,

nurse_id = matrix.nurse_id,

appt_status = 'S'

from matrix

where schedule_appt_id = _APPOINTMENT.schedule_appt_id

and matrix.row_num = _AVAILABLE;

-- Mark the unit as scheduled for that time.

update matrix set

unit_scheduled = true

from (select timeslot, unit_id from matrix where row_num = _AVAILABLE) m2

where matrix.unit_id = m2.unit_id

and matrix.timeslot between m2.timeslot and (m2.timeslot + _UNIT_REQUIRED);

-- Mark the nurse as scheduled for that time.

update matrix set

nurse_scheduled = true

from (select timeslot, nurse_id from matrix where row_num = _AVAILABLE) m2

where matrix.nurse_id = m2.nurse_id

and matrix.timeslot between m2.timeslot and (m2.timeslot + _NURSE_REQUIRED);

END LOOP;

CREATE TABLE matrix_unscheduled

(

row_num serial NOT NULL,

schedule_appt_id integer NOT NULL,

appt_time timestamp without time zone,

reserved_time timestamp without time zone,

appt_status character(1) NOT NULL,

unit_id integer,

nurse_id integer,

total_unit_time integer NOT NULL,

total_nurse_time integer NOT NULL,

CONSTRAINT pk_matrix_unscheduled PRIMARY KEY (row_num)

)

WITH (OIDS=FALSE);

CREATE TABLE matrix

(

row_num serial NOT NULL,

timeslot timestamp without time zone NOT NULL,

unit_id integer NOT NULL,

nurse_id integer NOT NULL,

unit_scheduled boolean NOT NULL,

nurse_scheduled boolean NOT NULL,

CONSTRAINT pk_matrix PRIMARY KEY (row_num)

)

WITH (OIDS=FALSE);

There are indexes on "matrix" for "timeslot,unit_id", "timeslot,nurse_id", and "unit_id,nurse_id".

Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital
(613) 549-6666 x4294

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2009-06-17 15:07:18 Re: Speeding up a query.
Previous Message Hartman, Matthew 2009-06-17 14:58:50 Re: Speeding up a query.