Tom Lane wrote: > Well, the bottom line that's concerning me here is whether throwing > errors is going to push anyone's application into an unfixable > corner. I'm somewhat encouraged that your Circuit Courts software > can adapt to it, since that's certainly one of the larger and more > complex applications out there. Or at least I would be if you had > actually verified that the CC code was okay with the recently- > proposed patch versions. Do you have any thorough tests you can run > against whatever we end up with? In spite of several attempts over the years to come up with automated tests of our applications at a level that would show these issues, we're still dependent on business analysts to run through a standard list of tests for each release, plus tests designed to exercise code modified for the release under test. For the release where we went to PostgreSQL triggers, that included running lists against the statistics tables to see which trigger functions had not yet been exercised in testing, until we had everything covered. To test the new version of this patch, we would need to pick an application release, and use the patch through the development, testing, and staging cycles, We would need to look for all triggers needing adjustment, and make the necessary changes. We would need to figure out which triggers were important to cover, and ensure that testing covered all of them. Given the discussions with my new manager this past week, I'm pretty sure we can work this into a release that would complete testing and hit pilot deployment in something like three months, give or take a little. I can't actually make any promises on that until I talk to her next week. >From working through all the BEFORE triggers with UPDATE or DELETE statements this summer, I'm pretty confident that the ones which remain can be handled by reissuing the DELETE (we didn't keep any of the UPDATEs with this pattern) and returning NULL. The most complicated and troublesome trigger code has to do with purging old data. I suspect that if we include testing of all purge processes in that release cycle, we'll shake out just about any issues we have. -Kevin
"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote: > Tom Lane wrote: >> Well, the bottom line that's concerning me here is whether >> throwing errors is going to push anyone's application into an >> unfixable corner. I'm somewhat encouraged that your Circuit >> Courts software can adapt to it, since that's certainly one of >> the larger and more complex applications out there. Or at least I >> would be if you had actually verified that the CC code was okay >> with the recently-proposed patch versions. Do you have any >> thorough tests you can run against whatever we end up with? > To test the new version of this patch, we would need to pick an > application release, and use the patch through the development, > testing, and staging cycles, We would need to look for all > triggers needing adjustment, and make the necessary changes. We > would need to figure out which triggers were important to cover, > and ensure that testing covered all of them. > > Given the discussions with my new manager this past week, I'm > pretty sure we can work this into a release that would complete > testing and hit pilot deployment in something like three months, > give or take a little. I can't actually make any promises on that > until I talk to her next week. After a couple meetings, I have approval to get this into an application release currently in development. Assuming that your patch from the 13th is good for doing the testing, I think I can post test results in about three weeks. I'll also work on a follow-on patch to add couple paragraphs and an example of the issue to the docs by then. -Kevin
"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes: > After a couple meetings, I have approval to get this into an > application release currently in development. Assuming that your > patch from the 13th is good for doing the testing, I think I can > post test results in about three weeks. I'll also work on a > follow-on patch to add couple paragraphs and an example of the issue > to the docs by then. Well, the issues about wording of the error message are obviously just cosmetic, but I think we'd better do whatever we intend to do to the callers of heap_lock_tuple before putting the patch through testing. I'm inclined to go ahead and make them throw errors, just to see if that has any effects we don't like. I'm up to my elbows in planner guts at the moment, but will try to fix up the patch this weekend if you want. regards, tom lane
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote: > I'm up to my elbows in planner guts at the moment, but will try to > fix up the patch this weekend if you want. They have scheduled testers to check on this issue next week, so it would be great to get as close as we can on the stuff that matters. -Kevin
OK, here's an updated version of the patch. I changed the error message texts as per discussion (except I decided to use one message string for all the cases rather than saddle translators with several variants). Also, I put in an error in GetTupleForTrigger, which fixes the self-reference case I illustrated before (now added to the regression test). However, I found out that changing the other two callers of heap_lock_tuple would open an entirely new can of worms, so for now they still have the historical behavior of ignoring self-updated tuples. The problem with changing ExecLockRows or EvalPlanQualFetch can be illustrated by the regression test case it breaks, which basically is BEGIN; DECLARE c1 CURSOR FOR SELECT * FROM table FOR UPDATE; UPDATE table SET ...; FETCH ALL FROM c1; COMMIT; When the FETCH comes to a row that's been updated by the UPDATE, it sees that row as HeapTupleSelfUpdated with a cmax greater than es_output_cid (which is the CID assigned to the DECLARE). So if we make these callers throw an error for the case, coding like the above will fail, which seems to me to be pretty darn hard to justify. It is not a corner case that could be caused only by questionable use of trigger side effects. So that seems to leave us with two choices: (1) ignore the row, or (2) attempt to lock the latest version instead of the visible version. (1) is our historical behavior but seems arguably wrong. I tried to make the patch do (2) but it crashed and burned because heap_lock_tuple spits up if asked to lock an "invisible" row. We could possibly finesse that by having EvalPlanQualFetch sometimes pass a CID later than es_output_cid to heap_lock_tuple, but it seems ticklish. More, I think it would also take some adjustments to the behavior of HeapTupleSatisfiesDirty, else we'll not see such tuples in the first place. So this looks messy, and also rather orthogonal to the current goals of the patch. Also, I'm not sure that your testing would exercise such cases at all, as you have to be using SELECT FOR UPDATE and/or READ COMMITTED mode to get to any of the relevant code. I gather your software mostly relies on SERIALIZABLE mode to avoid such issues. So I stopped with this. regards, tom lane
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote: http://archives.postgresql.org/pgsql-hackers/2012-01/msg01241.php > OK, here's an updated version of the patch. I was on vacation after PGCon and just got back to the office yesterday, so I have just now been able to check on the status of our testing of this after being asked about it by Tom at PGCon. He has this listed as an open bug, with testing of his fix by my organization as the hold-up. There was some testing of this in January while I was on another vacation, some triggers were found which worked as intended with the patch I had hacked together, but which got errors with the stricter (and safer) patch created by Tom. I pointed out to the developers some triggers which needed to be changed, and what should be considered safe coding techniques, but I was then assigned to several urgent issues and lost track of this one. I have arranged for testing over the next few days. I will post again with results when I have them. -Kevin