proposal: contrib module - generic command scheduler

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: proposal: contrib module - generic command scheduler
Date: 2015-05-12 07:25:50
Message-ID: CAFj8pRAKzyF6JwARMqycVztEpp8mk0DgwdcWivbeqsw3HNRzfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Generic simple scheduler to contrib
===================================
Job schedulers are important and sometimes very complex part of any
software. PostgreSQL miss it. I propose new contrib module, that can be
used simply for some tasks, and that can be used as base for other more
richer schedulers. I prefer minimalist design - but strong enough for
enhancing when it is necessary. Some complex logic can be implemented in PL
better than in C. Motto: Simply to learn, simply to use, simply to
customize.

Motivation
----------
Possibility to simplify administration of repeated tasks. Possibility to
write complex schedulers in PL/pgSQL or other PL.

Design
------
Any scheduled command will be executed in independent worker. The number
workers for one command can be limited. Total number of workers will be
limited. Any command will be executed under specified user with known
timeout in current database. Next step can be implementation global
scheduler - but we have not a environment for running server side global
scripts, so I don't think about it in this moment.

This scheduler does not guarantee number of executions. Without available
workers the execution will be suspended, after crash the execution can be
repeated. But it can be solved in upper layer if it is necessary. It is not
designed as realtime system. Scheduled task will be executed immediately
when related worker will be free, but the execution window is limited to
next start.

This design don't try to solve mechanism for repeating tasks when tasks
hash a crash. This can be solved better in PL on custom layer when it is
necessary.

Scheduled time is stored to type scheduled_time:

create type scheduled_time as (second int[], minute int[], hour int[], dow
int[], month int[]);

(,"{1,10,20,30,40,50}",,,) .. run every 10 minutes.
(,"{5}",,,) .. run once per hour

The core is table pg_scheduled_commands

Oid: 1
name: xxxx
user: pavel
stime: (,"{5}",,,)
max_workers: 1
timeout: 10s
command: SELECT plpgsql_entry(scheduled_time(), scheduled_command_oid())

set timeout to 0 ~ unlimited, -1 default statement_timeout
set max_workers to 0 ~ disable tasks

API
---
pg_create_scheduled_command(name,
stime,
command,
user default current_user,
max_workers default 1,
timeout default -1);

pg_drop_scheduled_command(oid)
pg_drop_scheduled_command(name);

pg_update_scheduled_command(oid | name, ...

Usage:
------
pg_create_scheduled_command('delete obsolete data', '(,,"{1}",,)', $$DELETE
FROM data WHERE inserted < current_timestamp - interval '1month'$$);
pg_update_scheduled_command('delete obsolete data', max_workers => 2,
timeout :=> '1h');
pg_drop_scheduled_command('delete obsolete data');

select * from pg_scheduled_commands;

Comments, notices?

Regards

Pavel

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2015-05-12 07:37:44 Re: Multixid hindsight design
Previous Message Peter Geoghegan 2015-05-12 07:12:08 Re: Multi-xacts and our process problem