Re: Large databases, performance

From: "Shridhar Daithankar" <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: pgsql-hackers(at)postgresql(dot)org, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Large databases, performance
Date: 2002-10-03 16:07:55
Message-ID: 3D9CB8B3.25411.A882D21@localhost
Lists: pgsql-general pgsql-hackers pgsql-performance pgsql-sql

On 3 Oct 2002 at 8:54, Charles H. Woloszynski wrote:

> Can you comment on the tools you are using to do the insertions (Perl,
> Java?) and the distribution of data (all random, all static), and the
> transaction scope (all inserts in one transaction, each insert as a
> single transaction, some group of inserts as a transaction).

Most probably it's all inserts in one transaction, spread almost uniformly over
around 15-20 tables. Of course there will be a bunch of transactions..
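The transaction grouping Charles asks about can be sketched in pure Python; this is a minimal illustration (the helper name is my own, not from the thread), where each yielded batch stands in for one BEGIN/COMMIT block:

```python
def batches(rows, batch_size):
    """Yield successive batches of rows; each batch maps to one transaction."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch

# With a driver such as psycopg2, each batch would roughly become:
#   cur.executemany("INSERT INTO t VALUES (%s, %s)", batch); conn.commit()
```

Fewer, larger transactions amortize the per-commit fsync cost, which matters at the insert rates discussed below.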

> I'd be curious what happens when you submit more queries than you have
> processors (you had four concurrent queries and four CPUs), if you care
> to run any additional tests. Also, I'd report the query time in
> absolute (like you did) and also in 'Time/number of concurrent queries".
> This will give you a sense of how the system is scaling as the workload
> increases. Personally I am more concerned about this aspect than the
> load time, since I am going to guess that this is where all the time is
> spent.

I don't think so, because we plan to configure enough shared buffers to hold
almost all of the indexes in RAM, if not the data. Besides, the number of tuples
expected per query is small, so more concurrent queries are not going to hog
anything other than CPU power at most.
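Charles's suggested scaling metric (query time divided by the number of concurrent queries) is simple to compute; a sketch with purely illustrative numbers:

```python
def time_per_query(total_time_sec, n_concurrent):
    """Wall-clock time divided by concurrency; a flat value as concurrency
    rises means the system is scaling well."""
    return total_time_sec / n_concurrent

# If 4 concurrent queries take 40s total, the scaled figure is 10s per query;
# comparing that against the single-query time shows how the 4 CPUs are shared.
```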

Our major concern remains load time, as data is generated in real time and is
expected in the database within a specified time period. We need indexes for
querying, and inserting into an indexed table is one hell of a job. We did attempt
inserting 8GB of data into an indexed table. It took almost 20 hours at 1K tuples
per second on average.. Though impressive, that's not acceptable for this load..
>
> Was the original posting on GENERAL or HACKERS. Is this moving the
> PERFORMANCE for follow-up? I'd like to follow this discussion and want
> to know if I should join another group?

Shall I subscribe to performance? What's the exact list name? Benchmarks? I
don't see any performance mailing list on this page..
http://developer.postgresql.org/mailsub.php?devlp

> P.S. Anyone want to comment on their expectation for 'commercial'
> databases handling this load? I know that we cannot speak about
> specific performance metrics on some products (licensing restrictions)
> but I'd be curious if folks have seen some of the databases out there
> handle these dataset sizes and respond reasonably.

Well, if something handles this kind of data on a single machine and costs
under USD 20K for the entire setup, I would be willing to recommend it to the client..

BTW, we are trying the same test on HP-UX. I hope we get some better figures on
64-bit machines..

Bye
Shridhar

--
Clarke's Conclusion: Never let your sense of morals interfere with doing the
right thing.
