From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | John Nix <maximum(at)shreve(dot)net>, pgsql-novice(at)postgresql(dot)org |
Subject: | Re: Number Sequence Query |
Date: | 2002-10-08 21:41:50 |
Message-ID: | 200210081441.50307.josh@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-novice |
John,
> case_num
> ----------
> 020018
> 020020
> 020021
> 02002201
> 020024
>
> I'm only really looking at the numbers that start with 02 and the number
> sequence is the 4 numbers right after the 02. As you can see by my
> example, 020023 is missing. I would like to be able to run a query to
> see all the missing numbers in sequence. I'm not even sure where to start
> looking for information. I did searches from Google and in the groups
> (which normally answers my questions), but I can't find anything. Can
> someone help me with this? Thanks...
A classic SQL problem. Buy Celko's "SQL for Smarties"; you won't regret it.
Setp 1: Copy the first 6 digits of each number into a temp table and convert
them to INT. For the rest of the example, we will call that table
"case_numbers" and the colum "cnum".
Step2: Index "cnum" and VACUUM ANALYZE it.
Step 3: Run this:
SELECT (cbefore.cnum + 1) as begin_gap, (cafter.cnum - 1) as end_gap
FROM (SELECT cnum FROM case_numbers c1
WHERE NOT EXISTS (select cnum FROM case_numbers c2
WHERE c2.cnum = c1.cnum +1)
AND c1.cnum < (SELECT max(cnum) FROM case_numbers)) cbefore,
(SELECT cnum FROM case_numbers c3
WHERE NOT EXISTS (select cnum FROM case_numbers c4
WHERE c3.cnum = c4.cnum -1)
AND c1.cnum > (SELECT min(cnum) FROM case_numbers)) cafter
WHERE NOT EXISTS (SELECT cnum FROM case_numbers c5
WHERE c5.cnum BETWEEN cbefore.cnum AND cafter.cnum);
(check above for typos! This is off-the-cuff)
This should give you a list of all gaps in the sequence, in the form of:
begin_gap end_gap
20023 20023
20037 20041
20079 20079
etc.
What the query does is search for all sequence numbers that do not have a
number immediately following, and then all sequence numbers that do not have
a number immediately preceeding, and matches them up by testing if the gap is
continuous.
As you can imagine with all the sub-selects, it is a RAM-intensive query on
any large data set.
--
-Josh Berkus
Aglio Database Solutions
San Francisco
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2002-10-08 22:58:23 | Re: What causes a table's oid to change? |
Previous Message | Richard Ray | 2002-10-08 21:34:01 | What causes a table's oid to change? |