Re: Predicting query runtime

From: Vinicius Segalin <vinisegalin(at)gmail(dot)com>
To: Istvan Soos <istvan(dot)soos(at)gmail(dot)com>
Cc: pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Predicting query runtime
Date: 2016-09-13 00:06:35
Message-ID: CAAeH1nCNaPFfUQm+w51xJqqXFVf0-WjC-bK9JeFahHhVMBwS6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

2016-09-12 18:22 GMT-03:00 Istvan Soos <istvan(dot)soos(at)gmail(dot)com>:

> Hi Vinicius,
>
> At Heap we have non-trivial complexity in our analytical queries, and
> some of them can take a long time to complete. We did analyze features
> like the query planner's output, our query properties (type,
> parameters, complexity) and tried to automatically identify factors
> that contribute the most into the total query time. It turns out that
> you don't need to use machine learning for the basics, but at this
> point we were not aiming for predictions yet.
>

And how did you do that? Manually analyzing some queries?
But I think, as you said, it wouldn't apply for predictions, but instead
for making long queries run faster, right?

>
> As a spoiler: queries take long time because they do a lot of IO.
> Features like reachback depth and duration (e.g. what period is the
> analytical query about) can contribute a lot to the amount of IO,
> thus, the query time. I have a blog post in my queue about our
> analysis, would gladly bump its priority if there is interest in such
> details.
>

If it's not too much work, I would like to have some details on your
process. It looks it's not exactly what I'm trying to do, but would
certainly help me with my work.

> I'm also curious: if you had a great way to predict the time/cost of
> the queries, how would you use it?

I'm working on something for my master degree (it's the idea, and I really
hope I can make it possible) where I'll help the user choosing the
resources for the database that will give him the best performance (or at
least the performance he thinks it's good enough). So the idea would be to
train each machine (with different resources) and then be able to predict
for an specific query what the performance would be.

Thank you all for the answers so far. I hope we can clear my mind about
this issue.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rich Shepard 2016-09-13 00:07:02 Re: Server crashed, now cannot start postgres [FIXED]
Previous Message Tom Lane 2016-09-12 23:57:10 Re: Server crashed, now cannot start postgres