Re: Question on a select

From: Madison Kelly <linux(at)alteeve(dot)com>
To: PgSQL General List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Question on a select
Date: 2005-01-02 06:51:00
Message-ID: 41D799D4.4020103@alteeve.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Vincent Hikida wrote:
> There are several ways. I am making the simplifying assumption that
> name, type and dir cannot be NULL in either table. If they are the query
> is a little more complicated.
>
> The following are a couple of many techniques.
>
> SELECT a.a_name
> , a.a_type
> , a.a_dir
> FROM a_table a
> WHERE NOT EXISTS
> ( SELECT NULL
> FROM b_table b
> WHERE b.b_name = a.a_name
> AND b.b_type = a.a_type
> AND b.b_dir = a.a_dir
> )
>
> SELECT a.a_name
> , a.a_type
> , a.a_dir
> FROM a_table a
> LEFT JOIN b_table b
> ON a.a_table = b.b_table
> AND a.a_type = b.b_type
> AND a.a_dir = b.b_type
> WHERE b.b_table IS NULL // assumes that
> b.b_table is a not null column.
>
> Let's say that dir could be null and dir is a string, then (assuming
> that dir can never be 'xyz') you could say something like
>
> COALESCE(a.a_dir,'xyz') = COALESCE(b.b_dir,'xyz')
>
> Since NULL never equal NULL, if you want NULL in one table to match a
> NULL in another table, you need to change it to something not NULL.
> However this depends on what you want in your application.
>
> Queries like this are used often to check the integrity of your data.
> Examples of this are 1) What orders don't have order items? 2) What
> books have no authors? etc.

This is wonderful! Thank you for responding so quickly! :) I should
mention that I am still very much learning so I apologise in advance if
I miss the obvious. ^.^;

They are all 'not null' and I am trying to do exactly the kind of
task you described. I tried the first example on my DB and got a syntax
error:

tle-bu=> SELECT a.file_name, a.file_parent_dir, a.file_type FROM
file_info_1 a WHERE NOT EXIST (SELECT NULL FROM file_set_1 b WHERE
b.fs_name=a.file_name, b.fs_parent_dir=a.file_parent_dir,
b.fs_type=a.file_type);
ERROR: syntax error at or near "SELECT" at character 88

The second example you gave seems to work perfectly (as I will show
near the bottom of this email). What are the benefits and down sides of
each method? Is there a simple reason why the first method failed
(probably a typo I imagine...)?

A little more specifics about my DB:

'file_info_1' and 'file_set_1' are two tables I use to store
information of files and directories (this is a backup program).

'file_info_#' stores dynamic info like file size, owner and such.
This table is dropped and recreated before a new scan of the partition
creates a mass 'COPY' load (the '_1' indicates the first partition).

'file_set_#' stores static information such as "has the file been
selected for backup" which is why I keep it in a separate table. I want
to run this select first to write entries for newly added files and
directories (the values will match the file's parent) and then again in
reverse to remove from 'file_set_#' entries that no longer exist on the
partition.

If it helps, here is the structure of the tables:

CREATE TABLE file_info_ID (
file_acc_time bigint not null,
file_group_name varchar(255) not null,
file_group_uid int not null,
file_mod_time bigint not null,
file_name varchar(255) not null,
file_parent_dir varchar(255) not null,
file_perm varchar(10) not null,
file_size bigint not null,
file_type varchar(2) not null default 'f',
file_user_name varchar(255) not null,
file_user_uid int not null
);

CREATE TABLE file_set_# (
fs_backup boolean not null default 't',
fs_display boolean not null default 'f',
fs_name varchar(255) not null,
fs_parent_dir varchar(255) not null,
fs_restore boolean not null default 'f',
fs_type varchar(2) not null default 'f'
);

And here is some sample data that I have to work with (yes, it's a
win2k partition... I use it to test other aspects of my program and, if
I blow it away, I won't be upset. ^.^; All of this is being done on a
Fedora Core 3 install in case it makes a difference):

tle-bu=> SELECT file_type, file_parent_dir, file_name FROM file_info_1
WHERE file_parent_dir='/' LIMIT 30;
file_type | file_parent_dir | file_name
-----------+-----------------+------------------------
d | / | .
d | / | downloads
d | / | Documents and Settings
d | / | Program Files
f | / | io.sys
f | / | msdos.sys
f | / | _NavCClt.Log
d | / | WUTemp
d | / | Recycled
f | / | pagefile.sys
d | / | winnt
f | / | ntldr
f | / | ntdetect.com
f | / | boot.ini
f | / | config.sys
f | / | autoexec.bat
f | / | t5r4e3w2q1.exe
f | / | 1q2w3e4r5t.exe
f | / | logon.exe
f | / | arcldr.exe
f | / | arcsetup.exe
(21 rows)

tle-bu=> SELECT fs_type, fs_parent_dir, fs_name FROM file_set_1 WHERE
fs_parent_dir='/' LIMIT 30;
fs_type | fs_parent_dir | fs_name
---------+---------------+------------------------
d | / | .
d | / | downloads
d | / | Documents and Settings
d | / | Program Files
d | / | WUTemp
d | / | Recycled
d | / | winnt
(7 rows)

In this example I deleted manually all the 'f' entries so that when I
do the select I should get:

file_type | file_parent_dir | file_name
-----------+-----------------+------------------------
f | / | io.sys
f | / | msdos.sys
f | / | _NavCClt.Log
f | / | pagefile.sys
f | / | ntldr
f | / | ntdetect.com
f | / | boot.ini
f | / | config.sys
f | / | autoexec.bat
f | / | t5r4e3w2q1.exe
f | / | 1q2w3e4r5t.exe
f | / | logon.exe
f | / | arcldr.exe
f | / | arcsetup.exe

Which is exactly what your second example provides:

tle-bu=> SELECT a.file_name, a.file_parent_dir, a.file_type FROM
file_info_1 a LEFT JOIN file_set_1 b ON a.file_name=b.fs_name AND
a.file_parent_dir=b.fs_parent_dir AND a.file_type=b.fs_type WHERE
b.fs_name IS NULL;
file_name | file_parent_dir | file_type
----------------+-----------------+-----------
1q2w3e4r5t.exe | / | f
arcldr.exe | / | f
arcsetup.exe | / | f
autoexec.bat | / | f
boot.ini | / | f
config.sys | / | f
io.sys | / | f
logon.exe | / | f
msdos.sys | / | f
_NavCClt.Log | / | f
ntdetect.com | / | f
ntldr | / | f
pagefile.sys | / | f
t5r4e3w2q1.exe | / | f
(14 rows)

Thank you very much for your help!

Madison

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Madison Kelly 2005-01-02 06:58:20 Re: Question on a select
Previous Message Sim Zacks 2005-01-02 06:21:31 Re: ISO_8859_8 encoding