From: | Dave Cramer <pg(at)fastcrypt(dot)com> |
---|---|
To: | "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org> |
Cc: | "D(dot) Dante Lorenso" <dante(at)lorenso(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: website doc search is extremely SLOW |
Date: | 2003-12-31 11:57:14 |
Message-ID: | 1072871834.2937.221.camel@localhost.localdomain |
Lists: | pgsql-general |
Marc,
No, it doesn't spider; it is a specialized tool for searching documents.
I'm curious: what value is there in being able to count the number of
URLs?
It does do things like query all documents where CREATE and TABLE are n
words apart, and it does so just as fast. I would think queries like these
are more valuable for document searching?
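To make the "n words apart" kind of query concrete, here is a minimal sketch of a positional index in Java. It is not the tool discussed in this thread: the class name ProximitySearch, the methods add and near, and the reading of "n words apart" as "word positions differ by at most n" are all assumptions made for illustration, and it is written against a current JDK rather than the jdk1.3/1.4 mentioned below.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a positional index; not the tool from this thread.
final class ProximitySearch {
    // term -> (document id -> ascending word positions of that term)
    private final Map<String, Map<Integer, List<Integer>>> index = new HashMap<>();

    // Tokenize on non-alphanumerics and record each word's position.
    void add(int docId, String text) {
        int pos = 0;
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (w.isEmpty()) continue; // split() can yield a leading empty token
            index.computeIfAbsent(w, k -> new HashMap<>())
                 .computeIfAbsent(docId, k -> new ArrayList<>())
                 .add(pos++);
        }
    }

    // Ids of documents where a and b occur at most n word positions apart
    // (assumed meaning of "n words apart"), in ascending order.
    List<Integer> near(String a, String b, int n) {
        Map<Integer, List<Integer>> pa = postings(a);
        Map<Integer, List<Integer>> pb = postings(b);
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, List<Integer>> e : pa.entrySet()) {
            List<Integer> other = pb.get(e.getKey());
            if (other != null && closeEnough(e.getValue(), other, n)) {
                hits.add(e.getKey());
            }
        }
        Collections.sort(hits);
        return hits;
    }

    private Map<Integer, List<Integer>> postings(String term) {
        Map<Integer, List<Integer>> p = index.get(term.toLowerCase(Locale.ROOT));
        return p != null ? p : Collections.<Integer, List<Integer>>emptyMap();
    }

    // Two-pointer walk over two ascending position lists.
    private static boolean closeEnough(List<Integer> x, List<Integer> y, int n) {
        int i = 0, j = 0;
        while (i < x.size() && j < y.size()) {
            int d = x.get(i) - y.get(j);
            if (Math.abs(d) <= n) return true;
            if (d < 0) i++; else j++;
        }
        return false;
    }

    public static void main(String[] args) {
        ProximitySearch s = new ProximitySearch();
        s.add(1, "CREATE TABLE foo (id int)");
        s.add(2, "CREATE a trigger on the orders TABLE");
        s.add(3, "DROP TABLE foo");
        System.out.println(s.near("create", "table", 1)); // [1]
        System.out.println(s.near("create", "table", 6)); // [1, 2]
    }
}
```

Because the positions are stored per document at index time, answering a proximity query only walks two short position lists per candidate document, which is one plausible reason such queries need not be slower than a plain AND.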
I think the challenge here is deciding what we want to search. I am betting
that folks use this page as they would man, i.e. to ask "what is the
command for create trigger?"
As I said, my offer to help out stands, but if the goal is to search the
entire website, then this particular tool is not useful.
At this point I am working on indexing the SGML directly, as it has less
cruft in it. For instance, all the links that appear in every summary are
just noise.
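As a rough illustration of what indexing the SGML source could involve, the sketch below reduces a fragment of markup to plain text before tokenizing. It is an assumption-laden simplification, not the actual indexer: the class name SgmlText and method toPlainText are hypothetical, the regex only handles simple tags, and only three entities are decoded.

```java
// Hypothetical helper; a crude regex-based approach, not a real SGML parser.
final class SgmlText {
    // Strip tags, decode a few common entities, and collapse whitespace.
    static String toPlainText(String sgml) {
        String text = sgml.replaceAll("<[^>]*>", " "); // drop simple tags
        text = text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&amp;", "&");             // decode &amp; last
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String para = "<para>Use <command>CREATE TRIGGER</command> to define a trigger.</para>";
        System.out.println(toPlainText(para)); // Use CREATE TRIGGER to define a trigger.
    }
}
```

Working from the source markup this way means navigation links added when the HTML pages are generated never reach the index in the first place.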
Dave
On Wed, 2003-12-31 at 00:44, Marc G. Fournier wrote:
> On Wed, 31 Dec 2003, Dave Cramer wrote:
>
> > I can modify mine to be client server if you want?
> >
> > It is a java app, so we need to be able to run jdk1.3 at least?
>
> jdk1.4 is available on the VMs ... does your spider? for instance, you
> mention that you have the docs indexed right now, but we are currently
> indexing:
>
> Server http://archives.postgresql.org/
> Server http://advocacy.postgresql.org/
> Server http://developer.postgresql.org/
> Server http://gborg.postgresql.org/
> Server http://pgadmin.postgresql.org/
> Server http://techdocs.postgresql.org/
> Server http://www.postgresql.org/
>
> will it be able to handle:
>
> 186_archives=# select count(*) from url;
> count
> --------
> 393551
> (1 row)
>
> as fast as you are finding with just the docs?
>
> ----
> Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664
>
--
Dave Cramer
519 939 0336
ICQ # 1467551