Meeting wrapup notes and files posted on website

From: "Selena Deckelmann" <selenamarie(at)gmail(dot)com>
To: "Postgresql PDX_Users" <pdxpug(at)postgresql(dot)org>
Subject: Meeting wrapup notes and files posted on website
Date: 2008-08-22 17:46:21
Message-ID: 2b5e566d0808221046n2c89ab62yd2afe84574089b51@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pdxpug

http://pugs.postgresql.org/node/460

This month's PDXPUG meeting featured Lloyd Albin from SCHARP, the
Statistical Center for HIV/AIDS Research & Prevention located near
Seattle, WA. Lloyd was nice enough to drive down for the evening and
tell us about some of the work he's done over the last six months.

We had a couple announcements - we need talk ideas for future
meetings! Please get in touch with Selena if you have ideas for a
talk. Tom Raney offered to give a talk on his Visual Planner tool next
month, and Len Shapiro offered to give a talk in January or February
on teaching database theory using PostgreSQL.

Also, PostgreSQL West is coming up. Get your talks submitted now! See:
http://www.postgresqlconference.org/west08/

Lloyd started off with a description of some of the problems his group
encountered that led them to use Tsearch. The researchers and
financial group both needed to be able to search through most of the
textual data, but didn't necessarily know ahead of time which tables
or even the columns that they'd be most interested in. So, Lloyd
helped construct search tables from the original tables that contained
all of the text and references back to the original tables.

SCHARP currently has about 1.5 TB of data spread across 6 PostgreSQL
instances, BerkeleyDB files and spreadsheets.

Using a series of triggers and materialized views (generated nightly
to improve search speed), the search tables improved the search speed
from 8-minutes per query to about 1/2 a second. Lloyd chose to use
GIST indexes instead of GIN for speed, and is using the built-in
dictionaries for now. He thinks in the future that they will have to
generate custom dictionaries as their data set grows.

Lloyd got permission to share his PL/Perl code and table structures.
All that information is attached to this post.

--
Selena Deckelmann
PDXPUG - http://pugs.postgresql.org/pdx
Me - http://www.chesnok.com/daily

Browse pdxpug by date

  From Date Subject
Next Message Susan Schmidt 2008-08-27 23:32:45 Please post this position to your group
Previous Message jterwill 2008-08-21 00:13:22 Re: August meeting tomorrow!