Re: Refactored queries needing validation of syntactic equivalence

From: Mike Adams <madams55075(at)comcast(dot)net>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-sql <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Refactored queries needing validation of syntactic equivalence
Date: 2007-10-20 05:45:11
Message-ID: 471995E7.2050902@comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Richard Huxton wrote:
> Mike Adams wrote:
>> So.....
>> The first query should pull all 'MOM' records that have one or more
>> corresponding, and possibly orphaned, unassigned receiving records
>> belonging to the same po_cd and item_cd.
>>
>> The second query should pull all unassigned, and possibly orphaned
>> receiving records that have one or more corresponding 'MOM' records once
>> again matching on po_cd and item_cd.
>>
>> Using the results of both queries to double check each other, I can
>> figure out which (if any) open records are, in fact, orphans and do an
>> "after the fact" assignment to the "SPLIT IN MOM" invoice to reduce our
>> accrual.
>>
>> Of course, our ERMS should take care of this automagically; but,
>> tragically, it seems "real" inventory cost flow was attached to the
>> system using duct tape, hot glue, and a couple of thumb tacks.
>>
>> So, given all the administriva above, have I actually refactored them
>> correctly?
>
> Well, clearly you could have multiple possible matches, because apart
> from anything else you could in theory have multiple entries with the
> same item-code on the same purchase-order-code. In practice it will be
> rare, but it could happen.
>

Yep! and it's not rare: if we receive 20 serialized items, we *will* get
20 entries of same "itm_cd,po_cd" as serialized items are individually
posted in inventory (in accounting speak, they have a "specific item"
costing basis, whereas "non serialized" items (parts etc) are (by us)
considered to have a "FIFO" costing basis and can be lumped into "lots").

Yesterday I ran both the "legacy" and "refactor" versions of each query
after the AP clerk (for once) let me know that her assistant had
"joined" a number of receivings (did the reverse of a split for some
reason).

The "orphans" query (select o.co_cd, ...) came back with the same result
set for both the legacy and refactor versions. The "moms" query (select
m.co_cd, ...) did not!

What I had for the "moms" result sets were (fake products replacing the
real ones in the results below):

legacy | refactor
--------------------+--------------------------------
2 hotplate | 2 hotplate
6 scooper | 2 hotplate
| 6 scooper
| 6 scooper
| 6 scooper
| 6 scooper
| 6 scooper
| 6 scooper

The "orphans" result sets were the same (faked products in results below):

result set
-----------------
1 hotplate
1 hotplate
1 scooper
1 scooper
1 scooper
1 scooper
1 scooper
1 scooper

In truth those eight records returned by both "orphans" versions *were*
actually orphaned by the *2* "moms" records that /do/ exist and were
correctly reported by the legacy version... Oops! the refactored "moms"
query is an unintentional (by me) cross product!

> However, since the purpose is to provide you with a list so you can make
> manual changes there's no problem with that.
>

Except for the unwanted cross productions! Well, there isn't an
available "natural" way to prevent that as the table /doesn't/ have a
pkey or even a good candidate key.

What I did, and it did fix the result set to reflect reality, was change the
select o.co_cd, ... from ...
to
select distinct o.co_cd, ..., o.rowid from ...

rowid being Oracle's version of ctid and is the only "unique" item "in"
the table ( oh the shame ).

> What I might be tempted to do is restrict the dates more - you had <=
> '31 Oct 2007' I'd also apply >= '1 Aug 2007' (or whatever period is
> reasonable). You can always run an unconstrained match once a month to
> catch any that slip through the net, but presumably most will fall
> within a 90-day period.
>
> HTH

I may restrict the dates more, however the refactored queries both run
in under 1000 ms, and given the rcv_mo table currently has >5 && <7
years worth of historical data for them to plow through, and the plan is
to only keep the data in the table for 7 yrs max...

Thank you for the help! I've appreciated it greatly!

Mike.

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Scott Marlowe 2007-10-20 09:01:20 Re: backup database tablespace with rsync?
Previous Message chester c young 2007-10-20 02:11:29 backup database tablespace with rsync?