Re: Really really slow select count(*)

From: felix <crucialfelix(at)gmail(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: sthomas(at)peak6(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Really really slow select count(*)
Date: 2011-02-07 01:52:01
Message-ID: AANLkTimnFc-hKG+PFEfwBPLAoK6q9kx=nc_4Fr++C_QB@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sun, Feb 6, 2011 at 4:23 PM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>wrote:

> Let's review:
>

1: No test or staging system used before production
>

no, I do not have a full ubuntu machine replicating the exact memory and
application load of the production server.

this was changing one configuration parameter. something I was advised to
do, read about quite a bit, tested on my development server (mac) and then
proceeded to do at 6 am on Sunday morning, our slowest time.

2: DB left in an unknown state (trying to shut down, not able)
>

what ?

I checked the site, everything was normal. I went in via psql and tried
some queries for about half an hour and continued to monitor the site. then
I went to bed at 7am (EU time).

Why did it shutdown so much later ?

I have never restarted postgres before, so this was all new to me. I
apologize that I wasn't born innately with such knowledge.

So is it normal for postgres to report that it failed to shut down, operate
for an hour and then go ahead and restart itself ?

3: No monitoring software to tell you when the site is down
>

of course I have monitoring software. both external and internal. but it
doesn't come and kick me out of bed. yes, I need an automated cel phone
call. that was the first thing I saw to afterwards.

4: I'm gonna just go ahead and guess no backups were taken either, or
> are regularly taken.
>

WTF ? of course I have backups. I just went through a very harsh down
period event. I fail to see why it is now necessary for you to launch such
an attack on me.

Perhaps the tone of my post sounded like I was blaming you, or at least you
felt that way. Why do you feel that way ?

Why not respond with: "ouch ! did you check this ... that...." say
something nice and helpful. correct my mistakes

> This website can't be very important, if that's the way you treat it.
>

just to let you know, that is straight up offensive

This is high traffic real estate site. Downtime is unacceptable. I had
less downtime than this when I migrated to the new platform.

I spent rather a large amount of time reading and questioning here. I asked
many questions for clarification and didn't do ANYTHING until I was sure it
was the correct solution. I didn't just pull some shit off a blog and start
changing settings at random.

I double checked opinions against different people and I searched for more
docs on that param. Amazingly none of the ones I found commented on the
shared memory issue and I didn't even understand the docs discussing shared
memory because it didn't seem to apply to what I was doing. that's my
misunderstanding. I come her to share my misunderstanding.

> And my phone starts complaining a minute after the site stops
> responding if something does go wrong the rest of the time. Do not
> lay this at anyone else's feet.
>

I didn't. There is not even the slightest hint of that in my post.

I came here and posted the details of where I went wrong and what confused
me about the documentation that I followed. That's so other people can
follow it and so somebody here can comment on it.

> changing that default
> > is brutally difficult and can only really be done by adjusting something
> in
> > the kernel.
>
> Please, that's a gross exaggeration. The sum totoal to changing them is:
>
> run sysctl -a|grep shm
> copy out proper lines to cahnge
> edit sysctl.conf
> put new lines in there with changes
> sudo sysctl -p # applies changes
> edit the appropriate postgresql.conf, make changes
> sudo /etc/init.d/postgresql-8.3 stop
> sudo /etc/init.d/postgresql-8.3 start
>

Considering how splendidly the experiment with changing fsm_max_pages went,
I think you can understand that I have no desire to experiment with kernel
settings.

It is easy for you because you ALREADY KNOW everything involved. I am not a
sysadmin and we don't have one. My apologies for that.

so does the above mean that I don't have to restart the entire server, just
postgres ? I assumed that changing kernel settings means rebooting the
server.

> I have clustered that table, its still unbelievably slow.
>
> Did you actually delete the old entries before clustering it? if it's
> still got 4G of old sessions or whatever in it, clustering ain't gonna
> help.
>

its a different table. the problem one has only 300k rows

the problem is not the size, the problem is the speed is catastrophic

> I still don't know if this bloat due to the small free space map has
> > anything to do with why the table is performing like this.
>
> Since you haven't show us what changes, if any, have happened to the
> table, neither do we :)
>

sorry, it didn't seem to be the most important topic when I got out of bed

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message felix 2011-02-07 01:55:57 Re: Really really slow select count(*)
Previous Message Greg Smith 2011-02-07 01:36:55 Re: Need some help analyzing some benchmarks