Re: Mnogosearch (Was: Re: website doc search is ... )

From: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Mnogosearch (Was: Re: website doc search is ... )
Date: 2004-01-01 22:30:33
Message-ID: 20040101182954.D913@ganymede.hub.org
Lists: pgsql-general

On Thu, 1 Jan 2004, Tom Lane wrote:

> "Marc G. Fournier" <scrappy(at)postgresql(dot)org> writes:
> > On Thu, 1 Jan 2004, Tom Lane wrote:
> >> "Marc G. Fournier" <scrappy(at)postgresql(dot)org> writes:
> >>> what sort of impact does CLUSTER have on the system? For instance, an
> >>> index happens nightly, so I'm guessing that I'll have to CLUSTER each
> >>> right after?
> >>
> >> Depends; what does the "index" process do --- are ndict8 and friends
> >> rebuilt from scratch?
>
> > nope, but heavily updated ... basically, the indexer looks at url for what
> > urls need to be 're-indexed' ... if it does, it removes all words from the
> > ndict# tables that belong to that url, and re-adds accordingly ...
>
> Hmm, but in practice only a small fraction of the pages on the site
> change in any given day, no? I'd think the typical nightly run changes
> only a small fraction of the entries in the tables, if it is smart
> enough not to re-index pages that did not change.

that is correct, and I further restrict it to 10000 URLs a night ...

> My guess is that it'd be enough to re-cluster once a week or so.
>
> But this is pointless speculation until we find out whether clustering
> helps enough to make it worth maintaining clustered-ness at all. Did
> you get any results yet?

It's doing the CLUSTERing right now ... will post results once finished ...
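
For anyone wanting to script the "re-cluster once a week or so" idea, here is a rough sketch, not anything the indexer itself does. The index name n8_word is made up (the thread does not name the indexes on ndict8 and friends), the choice of Sunday is arbitrary, and the helper names are hypothetical. It only builds the SQL strings; running them through psql or a driver is left out.

```python
from datetime import date

# Hypothetical (table, index) pairs. Only the table name ndict8 comes from
# the thread; the index name is an assumption for illustration.
NDICT_TABLES = [("ndict8", "n8_word")]


def cluster_statements(tables, legacy_syntax=True):
    """Build one CLUSTER statement per (table, index) pair.

    legacy_syntax=True emits the older form `CLUSTER index ON table`;
    False emits the later form `CLUSTER table USING index`.
    """
    stmts = []
    for table, index in tables:
        if legacy_syntax:
            stmts.append(f"CLUSTER {index} ON {table};")
        else:
            stmts.append(f"CLUSTER {table} USING {index};")
    return stmts


def should_recluster(today, weekday=6):
    """Return True only on the chosen weekday (0=Monday ... 6=Sunday)."""
    return today.weekday() == weekday


if __name__ == "__main__":
    # Print the statements only on the weekly maintenance day.
    if should_recluster(date.today()):
        for stmt in cluster_statements(NDICT_TABLES):
            print(stmt)
```

Run from a nightly cron job after the indexer finishes, this prints nothing six days out of seven and the CLUSTER statements on the seventh. Whether weekly is often enough still depends on the timing results.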

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664
