
Last night's meeting, next month's announcement

From: Selena Deckelmann <sdeckelmann(at)chrisking(dot)com>
To: Postgresql PDX_Users <pdxpug(at)postgresql(dot)org>
Subject: Last night's meeting, next month's announcement
Date: 2007-08-17 16:02:58
Lists: pdxpug

Our meeting was awesome. Jeff is smart. I learned something about  
Linux I/O schedulers and rules that I DID NOT EXPECT. Three new  
people attended!

Next month's meeting: Relational Algebra, with 3 PhD candidates  
(James, Vassilis, Rafael).  We may need to serve alcohol at this  
meeting.  Or perhaps we will see a return of the cupcakes, since  
baking season will have started.

The August 16th meeting began with a short discussion of Rules vs.  
Triggers. I forgot to come up with an EXPLAIN operator.

When do you choose to create a rule or a trigger? Jeff explained that  
for table partitioning, the recommendation was to use triggers. In
some cases, a rule can alter a query's predicate so that (in a way
that is difficult to think about) the result for that “window” of
data comes back NULL.

We all talked about that for a while, tried to come up with an  
example case - which was hard. Then we tried to frame MySQL users for  
something. And then we moved on.

I would like to revisit rules vs. triggers, and come up with the  
example case to explain what we were talking about!

We had a few new faces - including the leader of the PHP group - Sam!  
Also, Jerry was looking for someone to help him out with some SQL  
questions. We hope he posts some questions to the list.

Jeff’s talk was largely about his patch, with a few bits about his  
development environment, a patch from Simon Riggs that was related  
but not dependent, and a little database theory thrown in.

The inspiration for Jeff’s Synchronized Scan patch was the idea that
sequential scans can really start at any place between 0 and N, with
N being the number of records in a table. Before his patch to 8.3,
the choice to start every sequential scan at 0 was truly arbitrary.
In the past, DBAs simply had to plan for poor or unpredictable
performance when multiple sequential scans occurred at once.

The patch implements a system where each process keeps track of how
far its sequential scan has progressed, in a tiny piece of shared
memory. Then, when a new sequential scan starts up on the same table,
it is given a hint as to where to start. The effect is that the
second sequential scan now asks for data that is already in the
cache. For any tables that are larger than cache size, and whose
queries are I/O-bound, this is a big performance benefit, with no
performance penalties. So awesome!
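
To make the hint idea concrete, here is a toy Python sketch. It is
not Jeff's code: the dictionary standing in for shared memory, the
function name, and the page-based bookkeeping are all assumptions
made up for illustration. It only shows a second scan starting at the
first scan's current position and wrapping around.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-in for the tiny piece of shared memory:
# table name -> page number most recently read by any scan.
scan_positions: Dict[str, int] = {}


def synchronized_scan(table: str, num_pages: int) -> Iterator[int]:
    """Yield every page of `table` exactly once, starting at the shared hint."""
    start = scan_positions.get(table, 0) % num_pages
    for offset in range(num_pages):
        page = (start + offset) % num_pages  # wrap around at the end
        scan_positions[table] = page  # report progress for later scans
        yield page


# Scan A finds no hint, so it begins at page 0 and reads five pages.
scan_a = synchronized_scan("big_table", 8)
pages_a: List[int] = [next(scan_a) for _ in range(5)]

# Scan B starts while A is mid-table and picks up A's position as its hint.
pages_b: List[int] = list(synchronized_scan("big_table", 8))

print(pages_a)  # [0, 1, 2, 3, 4]
print(pages_b)  # [4, 5, 6, 7, 0, 1, 2, 3]
```

Scan B still reads every page exactly once. It just reads them in a
different order than a scan starting at 0 would.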

(There's lots more detail that you should check out in Jeff's slides,  
as well as a nice diagram that really explains it)

Now in 8.3, the order of results from queries without an ORDER BY is
truly non-deterministic. The documentation for PostgreSQL has always
said this, but now, it is certain. Jeff's patch only kicks in when
tables are of a certain size, but still: use ORDER BY if you want
data returned in a certain order.

Jeff also discussed Simon Riggs’ patch which implements a small ring
cache to service sequential scans. This is also a big performance
improvement because it prevents cache pollution by confining
sequential scan data to a small space that can’t push other cached
data out.
Also, it is supposedly sized to fit in L2 cache, improving  
performance even more for certain hardware architectures. Jeff  
mentioned that PostgreSQL already does a pretty good job with cache  
management, but this patch makes the caching even more efficient.
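
For anyone who wants to see cache pollution in miniature, here is a
rough Python sketch. It is a toy model, not Simon's implementation:
the LRU cache, the separate ring list, the sizes, and all names are
assumptions chosen for illustration. It compares a big scan going
through a shared LRU cache with the same scan confined to a tiny ring.

```python
from collections import OrderedDict
from typing import Iterable, List, Optional


class LRUCache:
    """A tiny least-recently-used page cache holding `capacity` pages."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pages: "OrderedDict[str, None]" = OrderedDict()

    def touch(self, page: str) -> None:
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            return
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict least recently used
        self.pages[page] = None


def scan_through_ring(ring_size: int, table: Iterable[str]) -> List[Optional[str]]:
    """Read a table while reusing the same few slots over and over."""
    ring: List[Optional[str]] = [None] * ring_size
    for i, page in enumerate(table):
        ring[i % ring_size] = page  # overwrite the oldest ring slot
    return ring


hot_pages = ["hot0", "hot1", "hot2", "hot3"]
big_table = ["big%d" % i for i in range(100)]

# Case 1: the big scan goes through the shared cache and evicts hot pages.
polluted = LRUCache(capacity=4)
for page in hot_pages + big_table:
    polluted.touch(page)
hot_left_without_ring = sum(p in polluted.pages for p in hot_pages)

# Case 2: the big scan is confined to a ring; the shared cache is untouched.
protected = LRUCache(capacity=4)
for page in hot_pages:
    protected.touch(page)
ring = scan_through_ring(2, big_table)
hot_left_with_ring = sum(p in protected.pages for p in hot_pages)

print(hot_left_without_ring)  # 0
print(hot_left_with_ring)     # 4
```

In this toy the ring is a separate list, which is a simplification.
The point is only that the scan's pages never compete with the hot
pages for space.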

Another topic that came up was the Linux I/O scheduling algorithms.  
Jeff had originally tested his patch using the Deadline, NOOP, and  
Anticipatory schedulers. When he tried it with CFQ more recently, it
didn’t work so well. (A quick Google search tells me that RHEL uses
CFQ! AURGH.) He’d also tested ZFS, which seemed to work well but
needed more testing.
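
If you want to check which scheduler your own Linux box is using, the
kernel exposes it in sysfs at /sys/block/<device>/queue/scheduler,
with the active one shown in square brackets. Here is a small Python
sketch that reads it. The function name is made up for illustration,
and the sample line is just an example of the format, not output from
any machine at the meeting.

```python
import glob
from typing import List, Tuple


def parse_scheduler_line(line: str) -> Tuple[str, List[str]]:
    """Split a sysfs scheduler line into (active, all_available)."""
    active = ""
    available: List[str] = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # strip the brackets marking the active one
            active = name
        available.append(name)
    return active, available


# Example of the format the kernel uses.
print(parse_scheduler_line("noop anticipatory deadline [cfq]"))
# ('cfq', ['noop', 'anticipatory', 'deadline', 'cfq'])

# On a Linux machine, list the scheduler for every block device.
# On other systems the glob matches nothing and the loop is skipped.
for path in sorted(glob.glob("/sys/block/*/queue/scheduler")):
    try:
        with open(path) as f:
            print(path, parse_scheduler_line(f.read()))
    except OSError as exc:
        print(path, "unreadable:", exc)
```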

Someone (sorry I can't remember your name!) brought up that it would  
be nice if the scheduling algorithm picked by a distribution and/or  
operating system by default was documented in one place. I agree!   
That would be useful.

Mark spoke up and mentioned that Deadline worked very well in  
general, non-deterministic cases with PostgreSQL.

There were tons of great questions, and even a few esoteric,  
theoretical arguments. Very good meeting, everyone!


Afterward, the Lucky Lab was crazy busy! We drank a couple pitchers,  
talked about the Linux kernel, and I think there was a long argument  
about BSNF.

We did decide that someone was going to have to give a talk on  
“Hypercubes and Dungeons and Dragons: what you never thought they had  
in common”.

Next month’s meeting will be about relational algebra, with James,  
Vassilis and Rafael tag-teaming. Rafael has been teaching the intro  
to databases class this summer at PSU, so he is ready for some real  
heckling. I can only hope that Randal will be able to make it.
