Re: How many views...

From: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-general(at)postgresql(dot)org
Subject: Re: How many views...
Date: 2004-11-29 07:27:57
Message-ID: 200411282327.57565.uwe@oss4u.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 28 November 2004 10:49 pm, Greg Stark wrote:
> "Uwe C. Schroeder" <uwe(at)oss4u(dot)com> writes:
> > I could create a view for every node in the system which calculates the
> > distance in the result set, making it easy to handle for the application:
> > select * from <view> where distance <= 50
> > The problem is, that the data will possibly contain thousands of nodes.
> > I'd also need 2 or 3 views per node - which could lead to 50.000 or even
> > 100.000 views.
>
> Normalize.
>
> Make your view construct the distance for _every_ node. So your query looks
> like:
>
> select * from <view> where node_id = 99 and distance <= 50

Well, in my case a "node_id" would refer to a zipcode (for the basic version,
later on maybe even pushing it down to street level using more detailed gps
data). The problem I see is that the records the view sits on top of may and
will contain several similar records. Think of UPS: they would want to know
the distance to any recipient of a parcel, however a lot of those recipients
have the same zipcode. I just can't think of a view that retrieves a specific
person in that zipcode based on the zipcode. So there would have to be a
second parameter to it - or a view for each record.
To put it better: The application records customers.
table customer (
customer_id int4 primary key,
zipcode varchar(10),
other stuff about the customer
)

A normalized view just wouldn't return that specific customer plainly based on
the zipcode, because there could/will be a second or third customer in the
same zipcode. For the distance that wouldn't matter, but for the customer
info it would.
I'm just playing with options right now. Maybe/likely I have to revise the
database model. The stored proc works fine, it just could become slow with a
lot of customer records. I think Michael's prior post is the better answer -
limit the possible coordinates to a subset before starting to calculate the
actual distance.

> The danger here is you have to be absolutely sure that Postgres is pushing
> that clause down into your view or the performance will be bad. But if your
> view is at all sane then I think it will. And you'll certainly know if it
> isn't.

That's what I'm afraid of. The database will potentially contain 100.000
customer records once productive - in the US alone, leaving aside what has to
be done about the rest of the world. So, yeah - one will notice the drag on
an expensive calculation, particularly because the application has lists that
show ALL customers sorted by distance to one (changing) location.

UC

- --
Open Source Solutions 4U, LLC 2570 Fleetwood Drive
Phone: +1 650 872 2425 San Bruno, CA 94066
Cell: +1 650 302 2405 United States
Fax: +1 650 872 2417
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFBqs99jqGXBvRToM4RAsv+AKCsM05f9JR0yMIXfbELrArJ6z9WKACeKfYa
nAsM0NRh09R+Zl7eu+FDS/g=
=DiJv
-----END PGP SIGNATURE-----

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2004-11-29 07:36:10 Re: [HACKERS] Adding Reply-To: <listname> to Lists configuration ...
Previous Message Andrew Dunstan 2004-11-29 07:17:30 Re: [HACKERS] Adding Reply-To: <listname> to Lists