Re: Postgresql capabilities question

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: John Wells <jb(at)sourceillustrated(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgresql capabilities question
Date: 2003-04-03 17:15:42
Message-ID: Pine.LNX.4.33.0304031009550.19813-100000@css120.ihs.com
Lists: pgsql-general

On Wed, 2 Apr 2003, John Wells wrote:

> I have a M$ Sql Server db that I'm porting to postgresql. Approx. 24
> tables from this old db can be combined in the new database into one
> table, and it would be a bit more elegant to do this.
>
> However, the combined table would be around 95000 rows in size.
>
> Having never really used Postgresql in the past, and unable to find a
> datapoint on the web, I would really like to get input from current users.
> Is this an unreasonable table size to expect good performance when the
> PHP app driving it gets a reasonable amount of traffic? I know
> performance is also heavily dependent on indexes and query structure, but
> disregarding either of those for the sake of argument, would I be better
> off keeping the tables separate, or is 95000 not something to worry about?
> btw, most tables in this database are quite small (<2000). My redesign
> would create two tables in the +90000 range, but less than 100000.

We use postgresql for lots of stuff that's more than 90,000 rows. We have
a table we use to merge log files from multiple servers and then sort them
by time.

Here's an example of it:

\d logs
                            Table "logs"
  Column  |  Type   |                     Modifiers
----------+---------+----------------------------------------------------
 logid    | integer | not null default nextval('"logs_logid_seq"'::text)
 linedate | integer |
 line     | text    |
Primary key: logs_pkey
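
The create statement for that table isn't shown above, but something along
these lines would produce it (a rough sketch; the serial shorthand and the
time-sorted merge query are assumptions, not the exact DDL we ran):

create table logs (
    logid    serial,    -- expands to integer not null default nextval('logs_logid_seq')
    linedate integer,   -- log line time, stored as an integer here
    line     text,
    primary key (logid)
);

-- the merge step is then just a sort across everything loaded in:
select line from logs order by linedate;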

select count(*) from logs;
 count
--------
 602346
(1 row)

explain analyze select count(*) from logs;
NOTICE: QUERY PLAN:

Aggregate (cost=334868.46..334868.46 rows=1 width=0) (actual
time=2463.31..2463.31 rows=1 loops=1)
-> Seq Scan on logs (cost=0.00..33695.46 rows=602346 width=0) (actual
time=0.02..1592.28 rows=602346 loops=1)
Total runtime: 2463.43 msec

(2.5 seconds to seq scan the whole table)

-- select one record

explain analyze select logid from logs limit 1 offset 100000;
NOTICE: QUERY PLAN:

Limit (cost=5594.04..5594.09 rows=1 width=4) (actual time=394.75..394.76
rows=1 loops=1)
-> Seq Scan on logs (cost=0.00..33695.46 rows=602346 width=4) (actual
time=0.02..307.74 rows=100002 loops=1)
Total runtime: 394.86 msec
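
Note that's still a seq scan; the offset just reads and throws away the first
100,000 rows. A lookup through the primary key index, something like the
following, doesn't have to walk those rows at all (the logid value is made up):

explain analyze select * from logs where logid = 12345;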

-- grab every row in the table

explain analyze select * from logs;
NOTICE: QUERY PLAN:

Seq Scan on logs (cost=0.00..33695.46 rows=602346 width=341) (actual
time=0.03..3284.99 rows=602346 loops=1)
Total runtime: 3881.38 msec

--

The size of this table on the hard drive is 226,689,024 bytes. It's one
of the smaller tables we play with on this server. Please note that this
is our well-tuned production server. A workstation with default settings
and less memory might not be quite as fast as that, but you're still
looking at sub-second response times on well-indexed datasets with fewer
than, say, 5,000,000 rows or so.
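
If you know which columns you'll be filtering or sorting on, that's where the
indexes come in. A sketch of what that looks like for date-range pulls on a
table like the one above (the index name and the epoch values are made up for
illustration):

create index logs_linedate_idx on logs (linedate);
analyze logs;

-- a range query can then use the index instead of seq scanning 600k rows:
explain analyze select line from logs
  where linedate between 1049328000 and 1049414400;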
