Re: Large table search question

From: Richard Huxton <dev(at)archonet(dot)com>
To: John Wells <jb(at)sourceillustrated(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Large table search question
Date: 2004-06-01 08:44:12
Message-ID: 40BC41DC.9050408@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

John Wells wrote:
> Guys,
>
> I have a general question about designing databases for large data sets.
>
> I was speaking with a colleague about an application we're preparing to
> build. One of the application's tables will potentially contain 2 million
> or more names, containing (at least) the fields first_name, last_name,
> middle_name and prefix.
>
> A common lookup the application will require is the full name, so prefix +
> first_name + middle_name + last_name.
>
> My friend's suggestion was to create a "lookup field" in the table itself,
> which would contain a concatenation of these fields created during insert.
> So, for each record, we'd having each individual field and then a
> full_name field that would contain the combination of the ind. fields.
> His argument is that this will make lookups in this manner extremely fast
> and efficient.

Might, might not. No figures to back up his argument. It'll certainly
make updates slower and less efficient. In fact, since each row will
store the data twice you'll get less rows per disk-page which means
(potentially) more disk reads when you need to get several rows.

> I agree with his assertion, but get the feeling that this is sort of an
> ugly design. Would a compound index on these fields really be less
> efficient?

Doubtful, I'd certainly not try his solution until I'd tried the simple
way first.

If you really want to try your friend's approach on PG you can build a
functional index. As of 7.4, these can be expressions rather than just
indexes so you can do something like:

CREATE INDEX my_idx_1 ON table1 ( prefix || ' ' || first_name ...);

If you're using 7.3.x you'll need to wrap that expression in a function
and index the function instead.

In your case though, I'd just build a compound index and leave it at that.

--
Richard Huxton
Archonet Ltd

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Richard Huxton 2004-06-01 08:54:07 Re: Connecting to an External DB (repost)
Previous Message Richard Huxton 2004-06-01 08:29:25 Re: confused by superuser-definition