Re: Extracting hostname from URI column

From: Paul Lambert <paul(dot)lambert(at)autoledgers(dot)com(dot)au>
To: ogjunk-pgjedan(at)yahoo(dot)com, pgsql-sql(at)postgresql(dot)org
Subject: Re: Extracting hostname from URI column
Date: 2007-09-12 00:22:52
Message-ID: 46E7315C.3060006@autoledgers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

ogjunk-pgjedan(at)yahoo(dot)com wrote:
> Hi,
>
> I'm trying to use substr() and position() functions to extract the full host name (and later a domain) from a column that holds URLs.
> This is what I'm trying, but it clearly doesn't do the job.
>
> => select substr(href, position('://' in href)+3, position('://' in href)+3+position('/' in href)), href from url where id <10;
> substr | href
> ----------------+----------------------------------------------------------
> texturizer.net | http://texturizer.net/firebird/extensions/
> texturizer.net | http://texturizer.net/firebird/themes/
> forums.mozilla | http://forums.mozillazine.org/index.php?c=4
> www.mozillazin | http://www.mozillazine.org/
> devedge.netsca | http://devedge.netscape.com/viewsource/2002/bookmarks/
> www.google.com | http://www.google.com/search?&q=%s
> groups.google. | http://groups.google.com/groups?scoring=d&q=%s
> www.google.com | http://www.google.com/search?q=%s&btnI=I'm+Feeling+Lucky
> dictionary.ref | http://dictionary.reference.com/search?q=%s
>
> The 3rd param to the substr function is clearly wrong. Is it even doable without writing a procedure?
>
> Finally, is this the fastest way to get this data, or is there regex-based function that might be faster?
>
> Thanks,
> Otis
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>
>

Try this:

test=# select substr(href, position('://' in href)+3, position('/' in
substr(
href,position('://' in href)+3))-1), href from url;
substr | href

--------------------------+-----------------------------------------------------
-----
devedge.netscape.com |
http://devedge.netscape.com/viewsource/2002/bookmark
s/
texturizer.net | http://texturizer.net/firebird/extensions/
texturizer.net | http://texturizer.net/firebird/themes/
forums.mozillazine.org | http://forums.mozillazine.org/index.php?c=4
www.mozillazine.org | http://www.mozillazine.org/
devedge.netscape.com |
http://devedge.netscape.com/viewsource/2002/bookmark
s/
www.google.com | http://www.google.com/search?&q=%s
groups.google.com | http://groups.google.com/groups?scoring=d&q=%s
www.google.com |
http://www.google.com/search?q=%s&btnI=I'm+Feeling+L
ucky
dictionary.reference.com | http://dictionary.reference.com/search?q=%s
(10 rows)

--
Paul Lambert
Database Administrator
AutoLedgers

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message chester c young 2007-09-12 00:42:46 Re: Extracting hostname from URI column
Previous Message ogjunk-pgjedan 2007-09-12 00:06:12 Extracting hostname from URI column