efficient storing of urls

From: Shane Wegner <shane-keyword-pgsql(dot)a1e0d9(at)cm(dot)nu>
To: pgsql-general(at)postgresql(dot)org
Subject: efficient storing of urls
Date: 2004-02-27 22:53:54
Message-ID: 20040227225354.GB1054@cm.nu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello list,

I have a database where one of the tables stores urls and
it's getting to the point where queries are getting quite
slow. My urls table looks something like:

create table urls(
id serial,
url text,
unique(url),
primary key(id)
);

What I am thinking of doing is storing urls in a tree-like
structure

create table urls(
id serial,
url_part text,
parent_id int, -- references back to urls table
unique(parent_id,url_part)
);

So:
insert into urls (id,parent_id,url_part) (1, NULL,
'http://www.mydomain.com');
insert into url (id,parent_id,url_part) values(2, 1, '/images');

url id 2 would represent www.mydomain.com/images without
actually storing the full hostname and path for each url.

Is this a recommended way of storing urls or is there a
better way? Is it likely to result in faster joins as each
row will be smaller?

One final question, how would one get the full url back out
of the sql table referencing the parent back to the root
(null parent) for use by an sql like query and would that
procedure negate any performance benefits by this storage
method?

Thanks,
Shane

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sean Shanny 2004-02-27 23:00:36 Re: efficient storing of urls
Previous Message Tom Lane 2004-02-27 22:14:48 Re: field must appear in the GROUP BY clause or be used