Re: analyzing query results

From: "Lonni J Friedman" <netllama(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: analyzing query results
Date: 2007-08-13 21:51:03
Message-ID: 7c1574a90708131451o195aeb67o9cc417c87f5509bd@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

On 8/13/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Lonni J Friedman" <netllama(at)gmail(dot)com> writes:
> > I tried to call DISTINCT ON, with the following query:
>
> > SELECT DISTINCT ON(date_created, cudacode) id,current_status FROM
> > cudaapps WHERE (SELECT now() - interval '24 hours' <
> > date_created::timestamp)='t' ORDER BY date_created;
>
> No, you're not grokking how to use DISTINCT ON. The DISTINCT part
> specifies what rows you want to group together --- here, all the ones
> with the same cudacode --- and then ORDER BY has to list all those same
> columns *followed by* the ordering condition that determines which row
> you want out of the group. What I think you want is
>
> SELECT DISTINCT ON (cudacode) id,current_status
> FROM cudaapps
> WHERE (SELECT now() - interval '24 hours' < date_created::timestamp)='t'
> ORDER BY cudacode, date_created DESC;
>
> which gives you the latest current_status for each cudacode, and then
> you filter the uninteresting rows in an outer select:
>
> SELECT id
> FROM
> (SELECT DISTINCT ON (cudacode) id,current_status
> FROM cudaapps
> WHERE (SELECT now() - interval '24 hours' < date_created::timestamp)='t'
> ORDER BY cudacode, date_created DESC) ss
> WHERE current_status = 'FAILED';

Thanks, this is exactly what I need. Only one question. What is the
'ss' that you have in that query? I tried googling, but getting any
useful hits on just "postgres ss" returns a ton of useless results.

>
> >> BTW, why are you writing the timestamp filter condition in such a
> >> bizarre way?
>
> > I'm doing this because date_created is of type char(20) rather than a
> > normal/expected date datatype. I know, its dumb.
>
> Well, so you have to have the cast, but I'd still think that
> WHERE (now() - interval '24 hours') < date_created::timestamptz
> would be the best way to express it. The sub-SELECT and the comparison
> to 't' are just obscurantism.

Thanks, that change is definitely equivalent.

>
> Another issue here is whether the date format was chosen to ensure that
> textual sorting of the values would give the same result as datewise
> sorting. You might get some fairly bizarre misbehavior if the data is
> sloppy about spaces instead of zero-fill, for instance. Sometimes it's
> worth biting the bullet and fixing the column type ...

Thankfully the date is always inserted via an automated script, so its
not going to get mangled or changed over time. The issue that I was
trying to address was a 3rd party web app which does a horrible job of
formatting date fields. I'm basically working around this apps bad
behavior by feeding it the date in the format that I need, rather than
letting the app mangle it.

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
L. Friedman netllama(at)gmail(dot)com
LlamaLand http://netllama.linux-sxs.org

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Tom Lane 2007-08-13 22:28:35 Re: analyzing query results
Previous Message Tom Lane 2007-08-13 20:42:22 Re: analyzing query results