From: | felix <crucialfelix(at)gmail(dot)com> |
---|---|
To: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> |
Cc: | sthomas(at)peak6(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Really really slow select count(*) |
Date: | 2011-02-07 01:52:01 |
Message-ID: | AANLkTimnFc-hKG+PFEfwBPLAoK6q9kx=nc_4Fr++C_QB@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Sun, Feb 6, 2011 at 4:23 PM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>wrote:
> Let's review:
>
1: No test or staging system used before production
>
no, I do not have a full ubuntu machine replicating the exact memory and
application load of the production server.
this was changing one configuration parameter. something I was advised to
do, read about quite a bit, tested on my development server (mac) and then
proceeded to do at 6 am on Sunday morning, our slowest time.
2: DB left in an unknown state (trying to shut down, not able)
>
what ?
I checked the site, everything was normal. I went in via psql and tried
some queries for about half an hour and continued to monitor the site. then
I went to bed at 7am (EU time).
Why did it shutdown so much later ?
I have never restarted postgres before, so this was all new to me. I
apologize that I wasn't born innately with such knowledge.
So is it normal for postgres to report that it failed to shut down, operate
for an hour and then go ahead and restart itself ?
3: No monitoring software to tell you when the site is down
>
of course I have monitoring software. both external and internal. but it
doesn't come and kick me out of bed. yes, I need an automated cel phone
call. that was the first thing I saw to afterwards.
4: I'm gonna just go ahead and guess no backups were taken either, or
> are regularly taken.
>
WTF ? of course I have backups. I just went through a very harsh down
period event. I fail to see why it is now necessary for you to launch such
an attack on me.
Perhaps the tone of my post sounded like I was blaming you, or at least you
felt that way. Why do you feel that way ?
Why not respond with: "ouch ! did you check this ... that...." say
something nice and helpful. correct my mistakes
> This website can't be very important, if that's the way you treat it.
>
just to let you know, that is straight up offensive
This is high traffic real estate site. Downtime is unacceptable. I had
less downtime than this when I migrated to the new platform.
I spent rather a large amount of time reading and questioning here. I asked
many questions for clarification and didn't do ANYTHING until I was sure it
was the correct solution. I didn't just pull some shit off a blog and start
changing settings at random.
I double checked opinions against different people and I searched for more
docs on that param. Amazingly none of the ones I found commented on the
shared memory issue and I didn't even understand the docs discussing shared
memory because it didn't seem to apply to what I was doing. that's my
misunderstanding. I come her to share my misunderstanding.
> And my phone starts complaining a minute after the site stops
> responding if something does go wrong the rest of the time. Do not
> lay this at anyone else's feet.
>
I didn't. There is not even the slightest hint of that in my post.
I came here and posted the details of where I went wrong and what confused
me about the documentation that I followed. That's so other people can
follow it and so somebody here can comment on it.
> changing that default
> > is brutally difficult and can only really be done by adjusting something
> in
> > the kernel.
>
> Please, that's a gross exaggeration. The sum totoal to changing them is:
>
> run sysctl -a|grep shm
> copy out proper lines to cahnge
> edit sysctl.conf
> put new lines in there with changes
> sudo sysctl -p # applies changes
> edit the appropriate postgresql.conf, make changes
> sudo /etc/init.d/postgresql-8.3 stop
> sudo /etc/init.d/postgresql-8.3 start
>
Considering how splendidly the experiment with changing fsm_max_pages went,
I think you can understand that I have no desire to experiment with kernel
settings.
It is easy for you because you ALREADY KNOW everything involved. I am not a
sysadmin and we don't have one. My apologies for that.
so does the above mean that I don't have to restart the entire server, just
postgres ? I assumed that changing kernel settings means rebooting the
server.
> I have clustered that table, its still unbelievably slow.
>
> Did you actually delete the old entries before clustering it? if it's
> still got 4G of old sessions or whatever in it, clustering ain't gonna
> help.
>
its a different table. the problem one has only 300k rows
the problem is not the size, the problem is the speed is catastrophic
> I still don't know if this bloat due to the small free space map has
> > anything to do with why the table is performing like this.
>
> Since you haven't show us what changes, if any, have happened to the
> table, neither do we :)
>
sorry, it didn't seem to be the most important topic when I got out of bed
From | Date | Subject | |
---|---|---|---|
Next Message | felix | 2011-02-07 01:55:57 | Re: Really really slow select count(*) |
Previous Message | Greg Smith | 2011-02-07 01:36:55 | Re: Need some help analyzing some benchmarks |