Postgres "failures" dataset for machine learning

From: Ben Simmons <simmons(dot)a(dot)ben(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Postgres "failures" dataset for machine learning
Date: 2019-04-10 18:41:46
Message-ID: CACHBLfjpKe6hLRMwzbHw7yFm5Nm3t4n5AEuZusjxN=5+a73ehA@mail.gmail.com
Lists: pgsql-hackers

Hi all,

I was wondering whether there exists either a test suite of pathological
failure cases for Postgres, or a dataset of failure scenarios. I'm not
entirely sure what such a dataset would look like; perhaps snapshots of test
databases taken while they undergo various failure scenarios?

I'm experimenting with machine learning and I had an idea to build a
classifier to determine if a running postgres database is having issues.
Right now "issues" is very ambiguously defined, but I'm thinking of
problems I've encountered at work, such as resource saturation, long-running
transactions, lock contention, etc. I know a lot of this is already
covered by existing monitoring solutions, but I'm specifically interested
to see if a ML model can learn monitoring rules on its own.

If the classifier turns out to be feasible, my hope would be to expand the ML
model to have some diagnostic capabilities -- I've had difficulty in the past
figuring out exactly what was going wrong with Postgres when my workplace's
production environment was having database issues.

Thanks,

Ben Simmons
