Extracting hostname from URI column

From: ogjunk-pgjedan(at)yahoo(dot)com
To: pgsql-sql(at)postgresql(dot)org
Subject: Extracting hostname from URI column
Date: 2007-09-12 00:06:12
Message-ID: 335284.57764.qm@web50301.mail.re2.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi,

I'm trying to use substr() and position() functions to extract the full host name (and later a domain) from a column that holds URLs.
This is what I'm trying, but it clearly doesn't do the job.

=> select substr(href, position('://' in href)+3, position('://' in href)+3+position('/' in href)), href from url where id <10;
substr | href
----------------+----------------------------------------------------------
texturizer.net | http://texturizer.net/firebird/extensions/
texturizer.net | http://texturizer.net/firebird/themes/
forums.mozilla | http://forums.mozillazine.org/index.php?c=4
www.mozillazin | http://www.mozillazine.org/
devedge.netsca | http://devedge.netscape.com/viewsource/2002/bookmarks/
www.google.com | http://www.google.com/search?&q=%s
groups.google. | http://groups.google.com/groups?scoring=d&q=%s
www.google.com | http://www.google.com/search?q=%s&btnI=I'm+Feeling+Lucky
dictionary.ref | http://dictionary.reference.com/search?q=%s

The 3rd param to the substr function is clearly wrong. Is it even doable without writing a procedure?

Finally, is this the fastest way to get this data, or is there regex-based function that might be faster?

Thanks,
Otis

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Paul Lambert 2007-09-12 00:22:52 Re: Extracting hostname from URI column
Previous Message Richard Broersma Jr 2007-09-11 21:59:37 Re: Simple Query?