Re: COPY v. java performance comparison

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Rob Sargent <robjsargent(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: COPY v. java performance comparison
Date: 2014-04-03 19:28:23
Message-ID: CAMkU=1wU9KMexzc-6Gi3LtQBejGfzbAC8z8aBoaHPU8J_-6BiQ@mail.gmail.com
Lists: pgsql-general

On Thu, Apr 3, 2014 at 9:04 AM, Rob Sargent <robjsargent(at)gmail(dot)com> wrote:

> I have to straighten out my environment, which I admit I was hoping to
> avoid. I reset checkpoint_segments to 12 and restarted my server.
> I kicked off the COPY at 19:00. That generated a couple of the "too
> frequent" statements but 52 "WARNING: pgstat wait timeout" lines during
> the next 8 hours starting at 00:37 (5 hours in) 'til finally keeling over
> at 03:04 on line 37363768.
>

Those things are not necessarily problems. If there is a problem, those
tell you places to look, nothing more. In particular, "pgstat wait
timeout" just means "Someone is beating the snot out of your hard drives,
and the stat collector just happened to notice that fact". This is
uninformative, because you already know you are beating the snot out of
your hard drives. That, after all, is the point of the exercise, right?
If you saw this message when you weren't doing anything particularly
strenuous, then that would be interesting.

> That's the last line of the input so obviously I didn't flush my last
> println properly. I'm beyond getting embarrassed at this point.
>
> Is turning auto-vacuum off a reasonable way through this?
>

No, no, no, no! First of all, what is the "this" you are trying to get
through? Previously you said you were not trying to get the data in as
fast as possible, but only to see what you can expect. Well, now you see
what you can expect. You can expect to load at a certain speed given a
certain table size, and you can expect to see some log messages about
unusual activity. Is it fast enough, or is it not fast enough?
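
To put a number on "a certain speed", here is a rough sketch that turns the figures quoted above into an average rate. It assumes the load ran from 19:00 until 03:04 the next morning, that each of the 37363768 input lines became one row, and that the calendar dates (which the thread does not give) are placeholders; the function name is made up for illustration.

```python
from datetime import datetime

def rows_per_second(rows, start, end):
    """Average load rate over the whole run (hypothetical helper)."""
    elapsed = (end - start).total_seconds()
    if elapsed <= 0:
        raise ValueError("end must be after start")
    return rows / elapsed

# Assumed timestamps: only the clock times come from the thread,
# the dates are placeholders chosen so the run crosses midnight.
start = datetime(2014, 4, 2, 19, 0)
end = datetime(2014, 4, 3, 3, 4)

rate = rows_per_second(37_363_768, start, end)
print(f"{rate:.0f} rows/s averaged over {(end - start).total_seconds() / 3600:.2f} hours")
```

This is only an average over the whole run. The rate probably changes as the table and its indexes grow, so it says nothing about how fast the first or the last million rows went in.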

If it is fast enough, and if you can ignore a few dozen messages in the log
file, then you are done. (Although you will still want to assess how
queries against your tables are affected by the load process, assuming your
database is used for interactive queries.)

If it is not fast enough, then randomly disabling important parts of the
system which have nothing to do with the bulk load is probably not the way
to improve things, but is an excellent way to shoot yourself in the foot.

Cheers,

Jeff
