Re: website doc search is extremely SLOW

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc: "D(dot) Dante Lorenso" <dante(at)lorenso(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: website doc search is extremely SLOW
Date: 2003-12-31 11:57:14
Message-ID: 1072871834.2937.221.camel@localhost.localdomain
Lists: pgsql-general

Marc,

No, it doesn't spider; it is a specialized tool for searching documents.

I'm curious: what value is there in being able to count the number of
URLs?

It does handle queries such as finding all documents where CREATE and
TABLE are within n words of each other, and it does so just as fast.
Wouldn't queries like these be more valuable for document searching?
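
To make "n words apart" concrete, here is a minimal sketch of a proximity
query over a positional index. It is written in Python purely for
illustration and is not the Java tool discussed in this thread; the function
names (build_positional_index, within_n_words) and the two sample documents
are hypothetical.

```python
import re
from collections import defaultdict


def build_positional_index(docs):
    """Map each lowercased word to {doc_id: [word positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word][doc_id].append(pos)
    return index


def within_n_words(index, a, b, n):
    """Return sorted ids of documents where a and b occur at most n words apart."""
    a_hits = index.get(a.lower(), {})
    b_hits = index.get(b.lower(), {})
    matches = []
    # Only documents containing both words can match.
    for doc_id in a_hits.keys() & b_hits.keys():
        if any(abs(pa - pb) <= n
               for pa in a_hits[doc_id]
               for pb in b_hits[doc_id]):
            matches.append(doc_id)
    return sorted(matches)


# Hypothetical sample documents, standing in for indexed doc pages.
docs = {
    "create_table": "CREATE TABLE defines a new table",
    "create_trigger": "CREATE TRIGGER defines a new trigger on a table",
}
index = build_positional_index(docs)
```

With these samples, within_n_words(index, "CREATE", "TABLE", 1) matches only
the "create_table" page, while widening n to 8 also picks up the
"create_trigger" page. The pairwise position comparison is the simplest
possible approach; a real engine would merge sorted position lists instead.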

I think the challenge here is deciding what we want to search. I am betting
that folks use this page as they would man, i.e. to ask: what is the
command for create trigger?

As I said, my offer to help out stands, but if the goal is to search the
entire website, then this particular tool is not useful.

At this point I am working on indexing the SGML directly, as it has less
cruft in it. For instance, all the links that appear in every summary are
just noise.

Dave

On Wed, 2003-12-31 at 00:44, Marc G. Fournier wrote:
> On Wed, 31 Dec 2003, Dave Cramer wrote:
>
> > I can modify mine to be client server if you want?
> >
> > It is a java app, so we need to be able to run jdk1.3 at least?
>
> jdk1.4 is available on the VMs ... does yours spider? For instance, you
> mention that you have the docs indexed right now, but we are currently
> indexing:
>
> Server http://archives.postgresql.org/
> Server http://advocacy.postgresql.org/
> Server http://developer.postgresql.org/
> Server http://gborg.postgresql.org/
> Server http://pgadmin.postgresql.org/
> Server http://techdocs.postgresql.org/
> Server http://www.postgresql.org/
>
> will it be able to handle:
>
> 186_archives=# select count(*) from url;
>  count
> --------
>  393551
> (1 row)
>
> as fast as you are finding with just the docs?
>
> ----
> Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664
>
--
Dave Cramer
519 939 0336
ICQ # 1467551
