Re: indexes on varchar fields

From: Peter Nixon <listuser(at)peternixon(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: indexes on varchar fields
Date: 2002-11-08 12:27:38
Message-ID: 20021108142738.1d2622bc.listuser@peternixon.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, 8 Nov 2002 12:03:15 +0000 (UTC)
Squire "Nigel J. Andrews" uttered the following:

> On Fri, 8 Nov 2002, Peter Nixon wrote:
>
> > Hi guys
> >
> > I have a large database (actually a bunch of large databases) that a
> > running as the backends of radius servers running voip accounting.
> >
> > The databases are getting quite large now and some queries are taking >
> > 5min to return. Now I read somewhere that indexing varchar fields is
> > not particularly worth it especially if you are only doing queries
> > occasionally. I am doing 200-500 inserts per min but only a query run
> > on this particular field when I want to compare my flat file logs
> > against whats in the DB to see if they match for billing purposes.
> >
> > Can someone inform me as to whether an index on the following field:
> >
> > h323ConfID varchar(64) DEFAULT '' NOT NULL
> >
> > Which contains data resembling the following:
> >
> > 41A4DCE3 8CF2D611 85170004 75AE73D4
> >
> > which is most likely unique, but not guaranteed so, is going to be
> > worth it?
>
> Well I would say that on the face of it it's a good index to keep.
> Without it you are going to have to do a seqscan over the entire table to
> select on that field. Even if there is only one select per 1000's of
> inserts you're going to find that select will be painfully long on a
> large table, and presumably just get longer as time goes on.

Yes. That is the case.

> However, do you know that the index is actually being used in the query?
> Try EXPLAINing the query in psql. If not may be that is why the query is
> taking so long. If it is then dropping the index may well make it worse.
>
> The idea of dropping an index for many insert/update per select is to
> avoid the work of managing the index for all those changes of data
> speeding up that side of things.
>
> One thing that does jump out at me is that those examples you give look
> like hexadecimal representation. Is this the case? I see you've imposed a
> limit of 64 characters on the field so it could be a very large number if
> so. However, as given, those data items look ideal to be stored as
> integers which I think may improve your searching speed a little.

Well, they are generated by cisco routers. They are the conference id for
each voip call and "should" be unique (at least to each router).
This (41A4DCE3 8CF2D611 85170004 75AE73D4) is the format they come out of
the router in, and I am just inserting the data directly to a text field.
You are right, they do seem to be hex numbers. Can postgres accept hex as a
numberic? if so that would speed things up ALOT...

--

Peter Nixon
http://www.peternixon.net/
PGP Key: http://www.peternixon.net/public.asc

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Vince Vielhaber 2002-11-08 12:40:32 Re: inet/cidr data types
Previous Message Russell Aspinwall 2002-11-08 12:18:09 inet/cidr data types