Re: WIP: Access method extendability

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: WIP: Access method extendability
Date: 2014-11-10 20:30:44
Message-ID: CAPpHfduWuAoi=d36LBC_KihuA73mRjudciVNUGhX77YBnpxR+g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

Thanks everybody for discussion. Sorry for delay. I'll try to address most
of questions arised in this discussion.

In general, I'm agree with concept of marking index as invalid in certain
cases.
I see following possible consequences of buggy WAL-logged custom AM:
1) Server crash during insert/update.
2) Server crash during select.
3) Invalid query answers.
4) Server crash during vacuum.
5) Server crash in recovery.

#5 is totally unacceptable. I've tried to design generic WAL record so that
it should be always possible to purely mechanically apply the record. It's
always possible to move piece of memory inside the page. It's always
possible to copy piece of memory from WAL-record to the page. Buggy WAL for
sure could cause an index corruption as well as any other bug in AM. WAL
support doesn't bring nothing special to this expect the fact that WAL is
quite hard to debug.

However, in current implementation I missed some hidden asserts about page
structure. Page could be so corrupted that we can't, for instance, safely
call XLogReadBufferForRedo(). All this cases must be worked out. If we
can't update page during recovery, index must be marked as invalid. It's
some amount of work, but it seems to be very feasible.

#4 seems problematic too. If autovacuum crashes on some index, then
postgres can enter a crash loop. So, if autovacuum crashes on extension
provided AM, that index should be marked as invalid.

Consequences #1, #3 and #3 could be easily caused by buggy opclass as well.
The reason why we're not knee-deep in extension provied bugs in GiST/GIN
opclasses is not easyness of writing correct GiST/GIN opclasses. Actual
reason is that we have only few GiST/GIN opclasses which weren't written by
GiST/GIN authors.

So, I don't expect PostgreSQL to be flooded with buggy AMs once we add AM
extendability. Our motivation behind this proposal is that we want to be
able to develop custom extensions with AMs. We want to be able to provide
our AMs to our customers whithout having to push that into PostgreSQL core
or fork PostgreSQL. Bugs in that AMs in our responsibility to out
customers. Some commercial companies could implement patented AMs (for
instance, fractal tree), and that is their responsibility to their
customers.

Also, I think it's OK to put adopting custom AMs to changes of AM interface
to authors of those custom AMs. AM extensions anyway should be adopted to
each new major release. AFAIR, interface of RelationOpen() function has
been changed not too long ago. Custom AM would use many functions which we
use to access relations. Their interface can be changed in the next release.

PostGIS GiST opclass has bugs which are reproducable, known and not fixed.
This is responsibility of PostGIS to their customers. We can feel sorry for
PostGIS that they are so slow on fixing this. But we shouldn't feel sorry
for GiST extendability, that woulde be redicilous.

Some recearches could write their own extensions. We can hope that they are
smart enough to not recommend it for production use. We can back our hope
with warning during installing extension provided AM. That warning could
say that all corruption caused by extension provided AM is up to AM
developer. This warning could make users to beware of using extension
provided AMs in production
unless they are fully trust extension developer (have support subscription
if it's commercial).

PostgreSQL doesn't have any kind of safe extensions. Every extension must
be trusted. Every extension can break (almost) everything.When one types
CREATE EXTENSION he must completely trust extension author. This applies to
every extension.

I would be very careful with discouraging commercial AM extensions. We
should always keen in the mind how many of PostgreSQL developers are
working for companies which own their commercial PostgreSQL forks and how
big their contribution is. Patented extensions looks scary for sure. But
it's up to software patents not up to PostgreSQL extendability...

Particular steps I'm going to do on these patches:
1) Make generic_redo never crash on broken pages.
2) Make autovacuum launcher mark index as invalid if vacuum process crashed
on custom AM index. Since, we can't write something into postgres cluster
when one process has crushed, ITSM autovacuum should have some separate
place to put this information. Thus, after restart postgres can read it and
mark index as invalid.

Don't allowing CREATE ACCESS METHOD command seems problematic for me. How
could it work with pg_upgrade? pg_dump wouldn't dump extra pg_am records.
So, pg_upgrade would break at creating operator classes on new cluster. So,
I agree with dropping create am command only if we let pg_dump to dump
extra pg_am records...

------
With best regards,
Alexander Korotkov.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2014-11-10 20:39:11 Re: BRIN indexes - TRAP: BadArgument
Previous Message Tom Lane 2014-11-10 20:28:02 Re: row_to_json bug with index only scans: empty keys!