
From: Howard Eglowstein <howard(at)yankeescientific(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: More of a SQL question, I guess.
Date: 2007-08-13 20:41:59
Message-ID: 46C0C217.4080908@yankeescientific.com
Lists: pgsql-novice
I have a database with 150 fields per row. For historical reasons, it's 
split across three tables, data_a, data_b and data_c. One of the 
fields is a timestamp and one of them is a serial number that came in 
from a remote machine that reports its status every day.

If I use a command like SELECT MAX(timestamp) FROM data_a WHERE field1 
= '0001', I can get back the time of the latest report. I can then issue 
the command SELECT * FROM data_a, data_b, data_c WHERE 
data_a.id = data_b.id AND data_a.id = data_c.id AND field1 = '0001' to get the 
150 fields for that report. It works fine, but it takes a while to 
respond when the table is the better part of a million records.

If it were just one serial number, the two queries would be okay because 
they really only take a minute. The problem I have is that we're hoping 
to have thousands of machines in the field (this is just a test 
database), and clearly this approach won't work for thousands of serial 
numbers. The second method I tried was to simply use SELECT * FROM 
data_a ... to get all million records, and have my C code look for 
serial numbers in each row and keep the latest by timestamp. That takes 
about as long as doing the first procedure three times, but it gives me the 
latest data for all of the serial numbers in the system. That's 
perfectly cool, except that it won't scale nicely. If the web code that 
does the search isn't on the same machine that holds the data, we'll 
have to ship gigabytes of data over the network for each search.
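For reference, the client-side scan described above boils down to a single pass that keeps the newest row seen so far for each serial number. A minimal sketch (in Python rather than C, with hypothetical row tuples standing in for the real 150-field records) might look like:

```python
# Keep only the newest row per serial number while scanning all rows.
# Each row is assumed to be a tuple whose first element is the serial
# number (field1) and whose second element is a sortable timestamp;
# the remaining elements stand in for the other fields.
def latest_per_serial(rows):
    latest = {}
    for row in rows:
        serial, ts = row[0], row[1]
        if serial not in latest or ts > latest[serial][1]:
            latest[serial] = row
    return latest

rows = [
    ("0001", "2007-08-12 09:00", "old"),
    ("0002", "2007-08-13 08:30", "only"),
    ("0001", "2007-08-13 10:15", "new"),  # newer report for 0001
]
result = latest_per_serial(rows)
# result["0001"] is the 10:15 row; result["0002"] is the 08:30 row
```

This is O(n) in the number of rows and small in memory (one retained row per serial), but it still requires shipping every row to the client, which is exactly the scaling problem described above.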

What I'd *like* is something that uses groups and MAX() to do this in 
one SQL command. It would group the data by the serial number (field1), 
find the record in each group with the maximum timestamp value, and 
return all 150 fields. Is this possible? I thought of using unions, but 
I think I'd have to issue a pretty long command for each group, and the PG 
buffers would probably max out after some relatively small number, no?
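This is the classic "group-wise maximum" problem, and it can be done in one statement by joining each group's MAX(timestamp) back to the full row. The sketch below runs the pattern against a tiny hypothetical data_a in SQLite (column names field1, ts, and payload are assumptions; the real table would have the full 150 fields and live in PostgreSQL):

```python
import sqlite3

# Hypothetical miniature version of data_a: a serial number (field1),
# a timestamp, and one stand-in data column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data_a (id INTEGER PRIMARY KEY, field1 TEXT, ts TEXT, payload TEXT);
INSERT INTO data_a (field1, ts, payload) VALUES
  ('0001', '2007-08-12 09:00', 'old'),
  ('0001', '2007-08-13 10:15', 'new'),
  ('0002', '2007-08-13 08:30', 'only');
""")

# Group-wise maximum: compute MAX(ts) per serial number in a subquery,
# then join it back to the table so every column of the newest row per
# serial comes back in a single statement.
rows = conn.execute("""
    SELECT a.*
    FROM data_a a
    JOIN (SELECT field1, MAX(ts) AS max_ts
          FROM data_a
          GROUP BY field1) m
      ON a.field1 = m.field1 AND a.ts = m.max_ts
    ORDER BY a.field1
""").fetchall()
# rows -> [(2, '0001', '2007-08-13 10:15', 'new'),
#          (3, '0002', '2007-08-13 08:30', 'only')]
```

In PostgreSQL the same result can be written more directly with its DISTINCT ON extension: SELECT DISTINCT ON (field1) * FROM data_a ORDER BY field1, ts DESC. Either form should benefit considerably from an index on (field1, ts).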

Any thoughts would be appreciated.

Howard
