Re: LIMITing number of results in a VIEW with global variables

From: Thomas Girault <toma(dot)girault(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: LIMITing number of results in a VIEW with global variables
Date: 2011-10-15 12:52:19
Message-ID: CAMVHftTXnZy5s0r-H4ELa5MaL=6_NF7o-yCWq=z2EduJMgpZUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Florian,

It seems dangerous for a cast to modify global state such a MU. The
> evaluation
> order of functions during query execute isn't always easy to guess, and may
> change depending on the execution plan.

I supposed that fuzzy2bool is called just before the terminal evaluation of
the WHERE clause which needs a Boolean. My first tests showed that this
hypothesis is right but it might be wrong in the case of alternative
execution plans.

> > With this implicit cast, the query
> > SELECT age, young(age) FROM set_alpha(0.1), employees WHERE young(age);
> > is equivalent to
> > SELECT age, young(age) FROM set_alpha(0.1), employees WHERE
> > fuzzy2bool(young(age));
>
> Those set_alpha() calls seem equally dangerous. If this alpha is supposed
> to be a global parameter, why not set it *before* issuing the query?

Alternatively, we could also set the alpha value before the query :

SELECT set_alpha(0.1); SELECT age, young(age) FROM employees WHERE
young(age);

I would be very interested to know if there is smarter way to set global
variables.

> I can sort the results in the view 'sorted_employees' according to
> > value MU of a fuzzy predicate thanks to fuzzy2bool cast function.
> >
> > CREATE OR REPLACE VIEW sorted_employees AS
> > SELECT *, get_mu() as mu
> > FROM employees
> > ORDER BY mu DESC;
>
> Are you aware that an ORDER BY clause in a VIEW is only going to work
> if you do "SELECT .. .FROM view".

I It really the first time I am using views, I didn't know that !

> If the outer query is more complex
> than that, I wouldn't bet on the results coming back in the expected order.
>

I don't mind if the ordering is wrong : it is just a way to process
filtering according to K and ALPHA.

Usually, you'd attach ORDER BY to the outermost query (or to subqueries
> containing a LIMIT clause).
>
> > The following query
> > SELECT age, mu FROM set_k(5), set_alpha(0.1), filtered_employees
>
> Again, you seem to rely on these set_k(), set_alpha() calls being
> done before the query is executed. But postgres doesn't guarantee
> that.
>
> > WHERE young(age);
> > gives the results :
> > age mu
> > 24 1
> > 16 1
> > instead of :
> > age mu
> > 16 1
> > 21 0.899999976158142
> > 24 0.600000023841858
> > 26 0.400000005960464
> > 26 0.400000005960464
> >
> > It seems that the 'LIMIT K' instruction have side effects on the MU
> value.
>
> The execution plan may very well vary depending on the limit. Off-hand,
> I'd guess that with a small K, one of these set_whatever() calls in one
> of your FROM clauses gets executed after the computation it'd supposed
> to affect has already happened.
>
> > Why is it not working ? How to fix this issue ?
>
> Don't rely on the execution order of function calls in a SELECT statement.
> Divide your functions into two classes.
>
> The ones which have side-effects (i.e. change global state). These should
> *never* be called from SQL statements, except in the trivial case of
> "SELECT my_func(...);". Also, they should be marked with VOLATILE
>
> And the ones without side effects. Those should be marked with IMMUTABLE,
> STABLE of VOLATILE, depending on how they're influenced by global state
> changes. Read the documentation on these flags.
>
> Finally, don't assume that ORDER BY inside a view influences the output
> order of queries using the view. (Except in the trivial case of
> "SELECT * FROM view"). As a rule of thumb, ORDER BY in a view only makes
> sense if there's also a LIMIT clause. You are then guaranteed that the
> view returns the first <limit> rows according to the specified order.
> Don't assume they're necessarily returned in ascending order, though.
>
> best regards,
> Florian Pflug
>

Thank you very much ! I will try to follow your advices.

Thomas

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2011-10-15 13:37:00 Re: about EDITOR_LINENUMBER_SWITCH
Previous Message Thomas Girault 2011-10-15 11:55:35 Re: LIMITing number of results in a VIEW with global variables