Re: "Mysterious" issues with newly installed 8.3

From: "Scott Carey" <scott(at)richrelevance(dot)com>
To: "Carlos Moreno" <morenopg(at)mochima(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: "Mysterious" issues with newly installed 8.3
Date: 2008-10-10 00:34:55
Message-ID: a1ec7d000810091734k60cee7d7qb95470936ad461c1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Oct 9, 2008 at 4:51 PM, Carlos Moreno <morenopg(at)mochima(dot)com> wrote:

>
> Ok, I know that such an open and vague question like this one
> is... well, open and vague... But still.
>
> The short story:
>
> Just finished an 8.3.4 installation on a new machine, to replace
> an existing one; the new machine is superior (i.e., higher
> performance) in virtually every way --- twice as much memory,
> faster processor, faster drives, etc.
>
> I made an exact copy of the existing database on the new
> machine, and the exact same queries run on both reveal that
> the old machine beats the new one by a factor of close to 2 !!!!
> (i.e., the same queries run close to twice as fast on the old
> machine!!!)
>
> To make things worse: the old machine is in operation, under
> normal workload (and right now the system may be around
> peak time), and the new machine is there sitting doing nothing;
> just one user logged in using psql to run the queries --- *no-one
> and nothing* is connecting to the new server.
>
> So... What's going on???
>
>
> The details:
>
> CPU:
> New: Opteron DC 1218HE (1MB cache per core) @2.6GHz
> Old: Athlon64 X2 (512K cache per core) @2.2GHz
>
> RAM:
> New: 4GB
> Old: 2GB
>
> HD:
> Doesn't matter the capacity, but I have every reason to believe
> the new one is faster --- hdparm reports 105MB/sec transfer
> rate; the measurement for the old server is meaningless, since
> it is in operation (i.e., there is actual database activity), so it
> measures between 50MB/sec and 70MB/sec. Given its age, I
> would estimate 70 to 80 MB/sec
>
> OS:
> New: CentOS 5.2 (gcc 4.1.2)
> Old: FC6 (gcc 4.1.2)
>
> PG:
> New: 8.3.4 installed from source
> Old: 8.2.4 installed from source
>
> Presumably relevant configuration parameters --- shared_buffers
> was set to 250MB on the old one; I set it to 500MB on the new
> one (kinda makes sense, no? 1/8 of the physical memory in both
> cases).
>
> I set max_fsm_pages a little bit higher on the new one (409600
> instead of 307200 on the old one). The rest is pretty much
> identical (except for the autovacuum --- I left the defaults in the
> new one)
>
>
> The old machine is vacuum-analyzed once a day (around 4AM);
> on the new one, I ran a vacuumdb -z -f after populating it.
>
>
> Some interesting outputs:
>
> explain analyze select count(*) from users;
> New:
> Aggregate (cost=8507.11..8507.12 rows=1 width=0) (actual
> time=867.582..867.584 rows=1 loops=1)
> -> Seq Scan on users (cost=0.00..7964.49 rows=217049 width=0)
> (actual time=0.016..450.560 rows=217049 loops=1)
> Total runtime: 867.744 ms
>
> Old:
> Aggregate (cost=17171.22..17171.22 rows=1 width=0) (actual
> time=559.475..559.476 rows=1 loops=1)
> -> Seq Scan on users (cost=0.00..16628.57 rows=217057 width=0)
> (actual time=0.009..303.026 rows=217107 loops=1)
> Total runtime: 559.536 ms
>
> Running the same command again several times practically
> does not change anything.
>
>
> explain analyze select count(*) from users where username like 'A%';
> New:
> Aggregate (cost=6361.28..6361.29 rows=1 width=0) (actual
> time=87.528..87.530 rows=1 loops=1)
> -> Bitmap Heap Scan on users (cost=351.63..6325.33 rows=14376
> width=0) (actual time=6.444..53.426 rows=17739 loops=1)
> Filter: ((username)::text ~~ 'a%'::text)
> -> Bitmap Index Scan on c_username_unique (cost=0.00..348.04
> rows=14376 width=0) (actual time=5.383..5.383 rows=17739 loops=1)
> Index Cond: (((username)::text >= 'a'::text) AND
> ((username)::text < 'b'::text))
> Total runtime: 87.638 ms
>
> Old:
> Aggregate (cost=13188.91..13188.92 rows=1 width=0) (actual
> time=61.743..61.745 rows=1 loops=1)
> -> Bitmap Heap Scan on users (cost=392.07..13157.75 rows=12466
> width=0) (actual time=7.433..40.847 rows=17747 loops=1)
> Filter: ((username)::text ~~ 'a%'::text)
> -> Bitmap Index Scan on c_username_unique (cost=0.00..388.96
> rows=12466 width=0) (actual time=5.652..5.652 rows=17861 loops=1)
> Index Cond: (((username)::text >= 'a'::character varying)
> AND ((username)::text < 'b'::character varying))
> Total runtime: 61.824 ms
>
>
> Any ideas?
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

First, use iostat or another tool to view the disk usage on the new machine
during these queries and validate that it is not using the disk at all.
This is most likely the case.

Then, to be sure, set its config parameters to be equal to the old one, and
turn off auto-vacuum. This will also most likely have no effect.

Once this is confirmed, we can be pretty sure that the issue is restricted
to:
CPU / RAM / Motherboard on the hardware side. There may still be some
software effects in the OS or drivers, or PostgreSQL to account for, but
lets drill into the hardware and try and eliminate that first.

Sure, the processor should be faster, but Athlon64s / Opterons are very
sensitive to the RAM used and its performance and tuning.
So, you should find some basic CPU benchmarks and RAM benchmarks -- you'll
want to measure latency as well as bandwidth.
Athlon64 and Opteron both typically have two memory busses per processor,
and it is possible to populate the memory banks in such a way that the
system has half the bandwidth.
In any event, you'll first want to identify if simple benchmark software is
able to prove a disparity between the systems independant of postgres. This
may be a bit difficult to do on the live system however.

But it is my suspicion that Postgres performance is often more dependant on
the memory subsystem performance than the CPU Mhz (as are most databases)
and poor components, configuration, or tuning on that side would show up in
queries like the examples here.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Craig James 2008-10-10 02:45:25 Re: "Mysterious" issues with newly installed 8.3
Previous Message Carlos Moreno 2008-10-09 23:51:26 "Mysterious" issues with newly installed 8.3