How best to represent relationships in a database generically?

From: Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
To: pgsql-general(at)postgresql(dot)org
Subject: How best to represent relationships in a database generically?
Date: 2007-07-27 18:23:57
Message-ID: 200707271827.l6RIR166050712@smtp3.jaring.my
Lists: pgsql-general

Hi,

Sorry, this really isn't postgresql specific, but I figure there are
lots of smarter people around here.

Say I have lots of different objects (thousands or even millions?).
Example: cow, grass, tiger, goat, fish, penguin.

BUT I'm not so interested in defining things by linking them to
categories or giving them names, I'm trying to figure out a way to
define things by their relationships with other things, and more
importantly do searches and other processing by those relationships.

So, what would be the best way to store them so that a search for the
relationship like "grass is to cow", will also turn up cow is to
tiger, and goat is to tiger, and fish is to penguin (and penguin is
to bigger fish ;) ), and electricity is to computer. And a search for
cow is to goat, could turn up tiger is to lion, and goat is to cow.

Is storing all the links explicitly the only way? E.g. have a huge
link table storing stuff like obj => cow, subj => grass, type =>
consumes, probability => 90% ("=>" means points/links to). Or even
just have one table (links are objects too).
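
A minimal sketch of the explicit link-table idea, using Python's built-in
sqlite3 module so it runs anywhere. The table name `links`, the column
names, the `competes_with` type and every probability other than the 90%
mentioned above are assumptions made up for illustration. A search for
"subj is to obj" simply returns every other pair that shares a link type
with the queried pair.

```python
import sqlite3


def build_db():
    """Create an in-memory link table with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links ("
        "subj TEXT, obj TEXT, rel_type TEXT, probability REAL)"
    )
    # Hypothetical sample data: only the grass/cow row and its 90% come
    # from the post; the other rows and numbers are invented.
    rows = [
        ("grass", "cow", "consumes", 0.90),
        ("cow", "tiger", "consumes", 0.80),
        ("goat", "tiger", "consumes", 0.80),
        ("fish", "penguin", "consumes", 0.90),
        ("electricity", "computer", "consumes", 0.99),
        ("cow", "goat", "competes_with", 0.50),
        ("tiger", "lion", "competes_with", 0.50),
        ("goat", "cow", "competes_with", 0.50),
    ]
    conn.executemany("INSERT INTO links VALUES (?, ?, ?, ?)", rows)
    return conn


def analogues(conn, subj, obj):
    """Return other (subj, obj) pairs linked by the same type(s) as the query."""
    sql = """
        SELECT other.subj, other.obj
        FROM links AS q
        JOIN links AS other ON other.rel_type = q.rel_type
        WHERE q.subj = ? AND q.obj = ?
          AND NOT (other.subj = q.subj AND other.obj = q.obj)
        ORDER BY other.probability DESC, other.subj, other.obj
    """
    return conn.execute(sql, (subj, obj)).fetchall()
```

Calling `analogues(build_db(), "grass", "cow")` returns the other
"consumes" pairs, and `analogues(build_db(), "cow", "goat")` returns the
other "competes_with" pairs. With an index on `rel_type` the lookup stays
cheap, but every link still has to be stored by hand or derived elsewhere.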

Or is it possible to somehow put the objects in a multidimensional
space (1000 dimensions?) and keep trying to arrange the objects so
that their relationships/vectors with/from each other are fairly
consistent/reasonable based on "current knowledge"? Trouble is in
some cases the grass eventually eats the cow, so maybe that doesn't
work at all ;).
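
A rough sketch of the coordinate idea, in two dimensions rather than 1000
so the numbers can be read by eye. The coordinates are hand-placed
assumptions, not learned from anything; a real system would have to fit
them to the known relationships. The relationship between two objects is
taken to be the vector from one to the other, and two relationships count
as similar when those vectors point the same way (cosine similarity). The
0.9 threshold is also an arbitrary choice.

```python
import math

# Hypothetical hand-placed coordinates; the first axis loosely means
# "position in the food chain", the second separates land from sea.
coords = {
    "grass": (0.0, 0.0),
    "cow": (1.0, 0.1),
    "goat": (1.0, -0.4),
    "tiger": (2.0, 0.0),
    "fish": (0.1, 3.0),
    "penguin": (1.0, 3.1),
}


def offset(a, b):
    """Vector pointing from object a to object b."""
    return tuple(y - x for x, y in zip(coords[a], coords[b]))


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def similar_pairs(a, b, threshold=0.9):
    """Find ordered pairs whose offset points the same way as a -> b."""
    target = offset(a, b)
    hits = []
    for x in coords:
        for y in coords:
            if x == y or (x, y) == (a, b):
                continue
            score = cosine(target, offset(x, y))
            if score >= threshold:
                hits.append((x, y, score))
    # Best matches first.
    return sorted(hits, key=lambda h: -h[2])
```

`similar_pairs("grass", "cow")` picks up cow -> tiger, goat -> tiger and
fish -> penguin without any stored link, but it compares every pair, so it
is quadratic in the number of objects. It also shows the problem noted
above: the reversed pair cow -> grass scores exactly opposite to
grass -> cow, so one fixed arrangement cannot say both "cow eats grass"
and "grass eats cow".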

Or even do both? Maybe use the first as a cache, and the second for
"deeper" stuff ("flash of insight" or "got the punchline" = figure
out better arrangement/ joining of disparate maps).

My worry about the first approach is that the number of links might
go up very much faster as you add more objects. But perhaps this
won't be true in practice. The worry about the second approach is
that it might get "stuck", or run out of dimensions.
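
To put a rough number on the first worry: if every ordered pair of
distinct objects could carry one link per relationship type, the upper
bound is n * (n - 1) * types, which grows quadratically. The function
below is only that arithmetic; the name `max_links` is made up, and real
data would normally sit far below the bound because most pairs are
unrelated.

```python
def max_links(n_objects, n_types=1):
    """Upper bound on directed links between distinct objects."""
    return n_objects * (n_objects - 1) * n_types


# Worst-case growth for a single relationship type.
for n in (6, 1_000, 1_000_000):
    print(n, max_links(n))
```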

Is there a better way to do this? There must be, right?

Wait for huge quantum computers and use qubits for each
multidimensional coordinate? ;).

Regards,
Link.
