Re: SQL Count Magic Required....

From: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
To: <operationsengineer1(at)yahoo(dot)com>, "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org>
Subject: Re: SQL Count Magic Required....
Date: 2006-06-01 16:03:00
Message-ID: C0A487F4.C4E1%sdavis2@mail.nih.gov
Lists: pgsql-novice

On 6/1/06 11:47 AM, "operationsengineer1(at)yahoo(dot)com"
<operationsengineer1(at)yahoo(dot)com> wrote:

>>> i have the following query that yields a series of
>>> true or false results:
>>>
>>> -- distinct on is pgsql extension --
>>> SELECT DISTINCT ON (t_inspect_result.inspect_id)
>>>        t_inspect_result.inspect_result_id,
>>>        t_inspect_result.inspect_result_pass,
>>>        t_inspect_area.inspect_area,
>>>        t_inspect_result.inspect_result_timestamp
>>>        --, t_inspect.serial_number_id, t_inspect.inspect_id
>>> FROM t_inspect_result, t_inspect, t_inspect_area, t_serial_number,
>>>      t_link_contract_number_job_number, t_job_number, t_product
>>> WHERE t_inspect.inspect_area_id = t_inspect_area.inspect_area_id
>>>   AND t_inspect.serial_number_id = t_serial_number.serial_number_id
>>>   AND t_serial_number.link_contract_number_job_number_id =
>>>       t_link_contract_number_job_number.link_contract_number_job_number_id
>>>   AND t_link_contract_number_job_number.job_number_id = t_job_number.job_number_id
>>>   AND t_product.product_id = t_job_number.product_id
>>>   AND t_inspect.inspect_id = t_inspect_result.inspect_id
>>>   AND t_inspect.serial_number_id = '200'
>>> ORDER BY t_inspect_result.inspect_id DESC,
>>>          t_inspect_result.inspect_result_timestamp ASC
>>> -- used to get first pass yield pass / fail (true / false) data.
>>> -- inspect_id desc impacts end result. time desc impacts the groups
>>> -- prior to being distinctly listed
>>>
>>> the simplified output may look like
>>>
>>> f,t,t,f,f,f,t,f,t,f
>>>
>>> the COUNT magic comes into play b/c i want to count the result set's
>>> "t"s and total, but i have no clue how to get this done.
>>>
>>> #ts: 4
>>> #total: 10
>>>
>>> when i have this data, i can apply some math and come up with a 40% yield.
>>
>> You can do a query like (untested, and needs to be translated into your
>> monster query):
>>
>> select a.id, a.total, b.failed, (a.total::numeric)/b.total as yield
>> from (select count(test_result) as total from table) as a,
>>      (select count(test_result) as failed from table
>>       where test_result = 'f') as b
>> where a.id = b.id;
>>
>> The point is to do the queries separately as subqueries and join them on
>> some primary key so that you get the count "total" and the count "failed".
>> Then you can do the math as above. Note that you have to cast at least one
>> of the integers to numeric if you want a numeric result.
>>
>> Sean
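
One correction to my untested sketch above: the two subqueries there don't
actually expose an "id" column to join on. Another way to get both numbers
is a single pass with a conditional sum wrapped around your DISTINCT ON
query. The join list below is trimmed to the two tables that matter for the
counts, the aliases are just illustrative, and it assumes
inspect_result_pass is a boolean:

select count(*) as total,
       sum(case when fp.inspect_result_pass then 1 else 0 end) as passes,
       sum(case when fp.inspect_result_pass then 1 else 0 end)::numeric
           / count(*) as yield
from (select distinct on (r.inspect_id)
             r.inspect_id, r.inspect_result_pass
      from t_inspect_result r, t_inspect i
      where i.inspect_id = r.inspect_id
        and i.serial_number_id = '200'
      order by r.inspect_id, r.inspect_result_timestamp) as fp;

That trades the two joined subqueries for one scan with a conditional
aggregate; the end result is the same pair of counts plus the yield.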
>
> Sean, t_test_result can get quite large quite quickly.
> iow, this query could be running on a million or more
> rows in a year or two.
>
> is this a performance killer query?
>
> the reason i ask is i could add a column to t_inspect
> called first_pass and store the pass/fail data there
> and then just complete a much simpler query counting
> the t_inspect.first_pass true values and then
> getting the total.
>
> i'd imagine that would be much faster and much simpler
> to conceptualize. the downside is that i would have
> repeating information.
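
For what it's worth, if first_pass does end up as a boolean column on
t_inspect, the report query really is that simple (the column name is
assumed here):

select sum(case when first_pass then 1 else 0 end) as passes,
       count(*) as total
from t_inspect
where serial_number_id = '200';

and the yield is just passes divided by total, with one of them cast to
numeric as before.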

I don't know how fast that query will be once you are up to a million rows.
You will have to benchmark some. One alternative is a "historic" table that
holds the infrequently accessed historical data (for serial numbers of
products no longer produced, for example). Or you could create a
materialized view of the summarized data. Or put in a trigger that updates
counts in another table whenever you insert into the main table. But I would
first prove to yourself that the query performance really is unacceptable.
If this is going to be used to generate a "report" once a month, then even a
15-minute run is no big deal. Also, if you are only looking up one item at a
time (by serial number, for example, for a web interface), so that an index
can be used, things will be very fast, I would imagine. You will have to
benchmark.
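
To make the trigger idea concrete, here is a minimal sketch. It is untested,
it assumes the first_pass boolean you mentioned ends up on t_inspect and is
known at insert time, and all of the names (and the integer type for
serial_number_id) are placeholders to adjust to your schema:

create table t_first_pass_summary (
    serial_number_id integer primary key,  -- match the real type of serial_number_id
    n_pass           integer not null default 0,
    n_total          integer not null default 0
);

create or replace function update_first_pass_summary() returns trigger as $$
begin
    -- bump the running counts for this serial number
    -- (ignores concurrent-insert races for brevity)
    update t_first_pass_summary
       set n_total = n_total + 1,
           n_pass  = n_pass + case when new.first_pass then 1 else 0 end
     where serial_number_id = new.serial_number_id;
    if not found then
        -- first inspection for this serial number: create its summary row
        insert into t_first_pass_summary (serial_number_id, n_pass, n_total)
        values (new.serial_number_id,
                case when new.first_pass then 1 else 0 end,
                1);
    end if;
    return new;
end;
$$ language plpgsql;

create trigger trg_first_pass_summary
    after insert on t_inspect
    for each row execute procedure update_first_pass_summary();

With that in place the first-pass yield is a single-row lookup:

select n_pass::numeric / n_total as yield
from t_first_pass_summary
where serial_number_id = '200';

And for the one-item-at-a-time lookups against the raw tables, a plain index
such as

create index t_inspect_serial_number_idx on t_inspect (serial_number_id);

is what lets the planner avoid scanning the whole table.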

Sean
