Re: 8.2beta1 (w32): server process crash (tsvector)

From: "Magnus Hagander" <mha(at)sollentuna(dot)net>
To: "Thomas H(dot)" <me(at)alternize(dot)com>, <pgsql-bugs(at)postgresql(dot)org>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: 8.2beta1 (w32): server process crash (tsvector)
Date: 2006-11-13 09:36:19
Message-ID: 6BCB9D8A16AC4241919521715F4D8BCEA358E4@algol.sollentuna.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Did yet some more stuff, mainly in line of shotgun-debugging :-)

If I add a "return NULL" right before:
res = (TSLexeme *)
DatumGetPointer(FunctionCall4(

&(dict->lexize_info),

PointerGetDatum(dict->dictionary),

PointerGetDatum(curVal->lemm),

Int32GetDatum(curVal->lenlemm),

PointerGetDatum(&ld->dictState)

));
in LexizeExec(), I get no crash and NULL values for all tsvectors.
If I add it right *after* that line, I get a crash. But I don't get the
warnings about overwriting.

Not entirely sure how much that helps. Don't have time to dig in more
right htis minute, but I'd definitly appreciate some pointers into what
to look at next.

//Magnus

> -----Original Message-----
> From: pgsql-bugs-owner(at)postgresql(dot)org
> [mailto:pgsql-bugs-owner(at)postgresql(dot)org] On Behalf Of Magnus Hagander
> Sent: den 13 november 2006 10:01
> To: Thomas H.; pgsql-bugs(at)postgresql(dot)org
> Cc: Tom Lane
> Subject: Re: [BUGS] 8.2beta1 (w32): server process crash (tsvector)
>
> Did some more tests, and it's even easier to repro this. No
> need to update, just SELECT to_tsvector('default',sometext) from test;
>
> It appears to be data dependent still, though, because just
> running SELECT to_tsvector('default','foo bar foo bar foo bar
> foo bar')
> fromgenerate_series(1,1000000) works withuot a hitch.
>
>
> Also, the "detected write past chunk end" comes from the
> pfree() in tsvector.c, function unqiueWORD() line 656. At
> this point, "ptr->word"
> points at the text "temp" followed by a bunch of junk. The
> callstack leading up to this is:
> postgres.exe!AllocSetFree(MemoryContextData *
> context=0x02c37bd8, void * pointer=0x02c65ed0) Line 775 C
> postgres.exe!pfree(void * pointer=0x02c65ed0) Line 585 + 0x17
> bytes C
> tsearch2.dll!uniqueWORD(TSWORD * a=0x02c66e60, int l=517) Line
> 656 + 0xc bytes C
> tsearch2.dll!makevalue(PRSTEXT * prs=0x00d4f44c) Line 692 +
> 0x12 bytes C
> tsearch2.dll!to_tsvector(FunctionCallInfoData *
> fcinfo=0x00d4f46c) Line 764 + 0x9 bytes C
> postgres.exe!DirectFunctionCall3(unsigned long
> (FunctionCallInfoData *)* func=0x02cd1131, unsigned long
> arg1=16473, unsigned long arg2=49941724, unsigned long
> arg3=0) Line 905 + 0xa bytes C
> tsearch2.dll!to_tsvector_name(FunctionCallInfoData *
> fcinfo=0x00d4f6b0) Line 787 + 0x20 bytes C
>
>
> Now, the parametsrs to to_tsvector() looks weird - the "text
> *in = PG_GETARG_TEXT_P(1)" appears to return a struct with
> vl_len set t 2139062143, which seems. Eh. Wrong? :) But this
> does not happen on the
> *first* call to to_tsquery() which shows a proper text, but
> subsequent calls show what lokos like garbage a couple of
> times, and then the WARNING about the write past end fires.
>
> //Magnus
>
> > -----Original Message-----
> > From: pgsql-bugs-owner(at)postgresql(dot)org
> > [mailto:pgsql-bugs-owner(at)postgresql(dot)org] On Behalf Of
> Magnus Hagander
> > Sent: den 12 november 2006 17:06
> > To: Thomas H.; pgsql-bugs(at)postgresql(dot)org
> > Cc: Tom Lane
> > Subject: Re: [BUGS] 8.2beta1 (w32): server process crash (tsvector)
> >
> > Ok, I've run this test on an assert enabled build (my msvc build,
> > actually, so I could get a debugger on it if needed).
> > It then outputs:
> >
> > WARNING: detected write past chunk end in ExprContext 02C6D0F8
> > WARNING: detected write past chunk end in ExprContext 02C6AEA0
> > WARNING: detected write past chunk end in ExprContext 02C6B200
> > WARNING: detected write past chunk end in ExprContext 02C44630
> > WARNING: detected write past chunk end in ExprContext 02C4C118
> > WARNING: problem in alloc set ExprContext: bogus aset link
> in block
> > 02C435A8, c hunk 02C44520
> > WARNING: detected write past chunk end in ExprContext 02C66440
> > WARNING: detected write past chunk end in ExprContext 02C3B9D0
> > WARNING: detected write past chunk end in ExprContext 02C3BDE8
> > WARNING: detected write past chunk end in ExprContext 02C4E7E0
> > WARNING: detected write past chunk end in ExprContext 02C47508
> > WARNING: problem in alloc set ExprContext: bogus aset link
> in block
> > 02C435A8, c hunk 02C47528
> > WARNING: detected write past chunk end in ExprContext 02C43800
> > WARNING: detected write past chunk end in ExprContext 02C66C90
> > WARNING: detected write past chunk end in ExprContext 02C68270
> > WARNING: detected write past chunk end in ExprContext 02C4F5D8
> > WARNING: problem in alloc set ExprContext: bogus aset link
> in block
> > 02C4E6F8, c hunk 02C4F5F8
> > WARNING: detected write past chunk end in ExprContext 02C7B680
> > WARNING: detected write past chunk end in ExprContext 02C45190
> > WARNING: detected write past chunk end in ExprContext 02C46AC8
> > WARNING: detected write past chunk end in ExprContext 02C3C538
> > WARNING: detected write past chunk end in ExprContext 02C67B90
> > WARNING: detected write past chunk end in ExprContext 02C438F0
> > WARNING: problem in alloc set ExprContext: bad single-chunk
> > 02C43FB8 in block 0
> > 2C435A8
> > WARNING: problem in alloc set ExprContext: bogus aset link
> in block
> > 02C435A8, c hunk 02C43FB8 server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> >
> >
> >
> >
> > The actual crash happens on line 1142 of AllocSetCheck.
> Full callstack
> > is:
> > > postgres.exe!AllocSetCheck(MemoryContextData *
> > context=0x02c455b8) Line 1142 + 0x11 bytes C
> > postgres.exe!AllocSetReset(MemoryContextData *
> > context=0x02c455b8) Line 409 + 0x9 bytes C
> > postgres.exe!MemoryContextReset(MemoryContextData *
> > context=0x02c455b8) Line 129 + 0xf bytes C
> > postgres.exe!ExecScan(ScanState * node=0x02c227d0,
> TupleTableSlot *
> > (ScanState *)* accessMtd=0x00530b70) Line 91 + 0xc
> > bytes C
> > postgres.exe!ExecSeqScan(ScanState * node=0x02c227d0) Line 130
> > + 0xe bytes C
> > postgres.exe!ExecProcNode(PlanState * node=0x02c227d0) Line 349
> > + 0x9 bytes C
> > postgres.exe!ExecutePlan(EState * estate=0x02c223e8,
> PlanState *
> > planstate=0x02c227d0, CmdType operation=CMD_UPDATE, long
> > numberTuples=0, ScanDirection direction=ForwardScanDirection,
> > _DestReceiver *
> > dest=0x02bf8d30) Line 1081 + 0x9 bytes C
> > postgres.exe!ExecutorRun(QueryDesc * queryDesc=0x02c44078,
> > ScanDirection direction=ForwardScanDirection, long count=0)
> Line 246
> > +
> > 0x20 bytes C
> > postgres.exe!ProcessQuery(Query * parsetree=0x02bddab0, Plan *
> > plan=0x02bf7e28, ParamListInfoData * params=0x00000000,
> _DestReceiver
> > * dest=0x02bf8d30, char *
> > completionTag=0x00d4fb24) Line 157 + 0xd bytes C
> > postgres.exe!PortalRunMulti(PortalData * portal=0x02c48138,
> > _DestReceiver * dest=0x02bf8d30, _DestReceiver * altdest=0x02bf8d30,
> > char * completionTag=0x00d4fb24) Line 1148 + 0x1c bytes C
> > postgres.exe!PortalRun(PortalData * portal=0x02c48138, long
> > count=2147483647, _DestReceiver * dest=0x02bf8d30, _DestReceiver *
> > altdest=0x02bf8d30, char *
> > completionTag=0x00d4fb24) Line 700 + 0x15
> > bytes C
> > postgres.exe!exec_simple_query(const char *
> > query_string=0x02bdd4c8) Line 943 + 0x23 bytes C
> > postgres.exe!PostgresMain(int argc=4, char * * argv=0x02ba2478,
> > const char * username=0x025e7af8) Line 3414 + 0xc bytes C
> > postgres.exe!BackendRun(Port * port=0x00d4fd2c) Line 2853 +
> > 0x17 bytes C
> > postgres.exe!SubPostmasterMain(int argc=3, char * *
> > argv=0x025e5b30) Line 3288 + 0xc bytes C
> > postgres.exe!main(int argc=3, char * * argv=0x025e5b30) Line
> > 165 + 0xd bytes C
> >
> >
> >
> >
> > Can someone point me towards what next to check? :-)
> >
> > Oh, and per Thomas, this does NOT appear to happen in C locale, but
> > happens in both de_CH and en_US. ANd I've tested in Swedish_Sweden,
> > which also crashes.
> >
> > Note that this appears not to be index related at all,
> because during
> > this test there is only a standard btree index on "someint", and
> > nothing on the tsvector column.
> >
> > //Magnus
> >
> >
> > > -----Original Message-----
> > > From: pgsql-bugs-owner(at)postgresql(dot)org
> > > [mailto:pgsql-bugs-owner(at)postgresql(dot)org] On Behalf Of Thomas H.
> > > Sent: den 12 november 2006 15:06
> > > To: pgsql-bugs(at)postgresql(dot)org
> > > Subject: Re: [BUGS] 8.2beta1 (w32): server process crash
> (tsvector)
> > >
> > > this bug (please see below) is unfortunately still present in
> > > beta3 (win32 build). test case still crashes the child
> process and
> > > lets postmaster kill & reload everything.
> > >
> > > it is not GiST-related, i've just validated the same
> problem using
> > > GIN.
> > >
> > > this breaks tsearch2 functionality on our win32 system as no
> > > tsvector-indexing of new/existing rows is possible (crash
> after ~10
> > > processed rows). searching already indexed rows works fine.
> > >
> > > best regards,
> > > thomas
> > >
> > > ----- Original Message -----
> > > From: <me(at)alternize(dot)com>
> > > To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> > > Cc: <pgsql-bugs(at)postgresql(dot)org>
> > > Sent: Tuesday, October 17, 2006 9:19 PM
> > > Subject: Re: [BUGS] 8.2beta1 (w32): server process crash
> (tsvector)
> > >
> > >
> > > >>> the following query will crash the server process:
> > > >>> INSERT INTO news.news
> > > >>> SELECT * FROM news.news2;
> > > >>
> > > >> This is undoubtedly data-dependent. Can you supply some
> > > sample data
> > > >> that makes it happen?
> > > >
> > > > it's not only happening with INSERTS, but also updates. as
> > > thats easier to
> > > > test, here's how i can reproduce the error:
> > > >
> > > > 1. create new database (encoding: UTF8) with tsearch2 on
> > > 8.2b1 win32
> > > > (system
> > > > locale: de_CH.1252)
> > > > 2. insert the data from the zip file
> > > > [http://alternize.com/pgsql/tsearch2test.zip] (be sure to
> > > also update
> > > > pg_ts_cf /
> > > > pg_ts_cfgmap as we have WIN1252 locale) 3. execute UPDATE
> > test SET
> > > > idxFTI = to_tsvector('default',
> > > "sometext"); or
> > > > similar queries
> > > > 4. hopefully see the process crashing as i do ;-)
> > > >
> > > >
> > > > 2006-10-17 17:23:44 LOG: server process (PID 4584)
> > exited with exit
> > > > code -1073741819
> > > > 2006-10-17 17:23:44 LOG: terminating any other active
> > > server processes
> > > > 2006-10-17 17:23:44 WARNING: terminating connection
> > > because of crash of
> > > > another server process
> > > > 2006-10-17 17:23:44 DETAIL: The postmaster has commanded
> > > this server
> > > > process to roll back the current transaction and exit,
> > > because another
> > > > server process exited abnormally and possibly corrupted
> > > shared memory.
> > > > {snipp}
> > > > 2006-10-17 17:23:44 LOG: all server processes terminated;
> > > reinitializing
> > > > 2006-10-17 17:23:44 LOG: database system was interrupted
> > > at 2006-10-17
> > > > 17:23:41 W. Europe Daylight Time
> > > > 2006-10-17 17:23:44 LOG: Windows
> > > fopen("recovery.conf","r") failed: code
> > > > 2,
> > > > errno 2
> > > > 2006-10-17 17:23:44 LOG: Windows
> > > fopen("pg_xlog/00000001.history","r")
> > > > failed: code 2, errno 2
> > > > 2006-10-17 17:23:44 LOG: Windows fopen("backup_label","r")
> > > failed: code
> > > > 2,
> > > > errno 2
> > > > 2006-10-17 17:23:44 LOG: checkpoint record is at 0/E2ECA728
> > > > 2006-10-17 17:23:44 LOG: redo record is at 0/E2ECA728;
> > > undo record is at
> > > > 0/0; shutdown FALSE
> > > > 2006-10-17 17:23:44 LOG: next transaction ID: 0/514299;
> > > next OID: 6276957
> > > > 2006-10-17 17:23:44 LOG: next MultiXactId: 1; next
> > > MultiXactOffset: 0
> > > > 2006-10-17 17:23:44 LOG: database system was not properly
> > > shut down;
> > > > automatic recovery in progress
> > > > 2006-10-17 17:23:44 LOG: redo starts at 0/E2ECA778
> > > > 2006-10-17 17:23:44 LOG: unexpected pageaddr 0/DB0CC000 in
> > > log file 0,
> > > > segment 227, offset 835584
> > > > 2006-10-17 17:23:44 LOG: redo done at 0/E30CBE78
> > > > 2006-10-17 17:23:45 LOG: database system is ready
> > > > 2006-10-17 17:23:45 LOG: Windows
> > fopen("global/pg_fsm.cache","rb")
> > > > failed:
> > > > code 2, errno 2
> > > > 2006-10-17 17:23:45 LOG: transaction ID wrap limit is
> > > 2147484172, limited
> > > > by database "postgres"
> > > > 2006-10-17 17:23:45 LOG: Windows
> > > fopen("global/pgstat.stat","rb") failed:
> > > > code 2, errno 2
> > > >
> > > >
> > > > i've also tried to update each record on its own in a
> > > for-loop. here the
> > > > crash happens as well, sometimes after 10 updates,
> > > sometimes after 100
> > > > updates, sometimes even after 1 update. but eventually
> > > every record can be
> > > > updated. so i do not think its entierly content-related...
> > > >
> > > > for what its worth, here's the output of pg_controldata:
> > > >
> > > > pg_control version number: 822
> > > > Catalog version number: 200609181
> > > > Database system identifier: 4986650172201464825
> > > > Database cluster state: in production
> > > > pg_control last modified: 17.10.2006 17:44:29
> > > > Current log file ID: 0
> > > > Next log file segment: 230
> > > > Latest checkpoint location: 0/E4E0F978
> > > > Prior checkpoint location: 0/E46BF420
> > > > Latest checkpoint's REDO location: 0/E4E03098
> > > > Latest checkpoint's UNDO location: 0/0
> > > > Latest checkpoint's TimeLineID: 1
> > > > Latest checkpoint's NextXID: 0/531333
> > > > Latest checkpoint's NextOID: 6285149
> > > > Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's
> > > > NextMultiOffset: 0
> > > > Time of latest checkpoint: 17.10.2006 17:43:45
> > > > Minimum recovery ending location: 0/0
> > > > Maximum data alignment: 8
> > > > Database block size: 8192
> > > > Blocks per segment of large relation: 131072
> > > > WAL block size: 8192
> > > > Bytes per WAL segment: 16777216
> > > > Maximum length of identifiers: 64
> > > > Maximum columns in an index: 32
> > > > Date/time type storage: floating-point numbers
> > > > Maximum length of locale name: 128
> > > > LC_COLLATE: German_Switzerland.1252
> > > > LC_CTYPE: German_Switzerland.1252
> > > >
> > > > let me know if more information / data is needed.
> > > >
> > > > on a sidenote: are those fopen() errors
> > > debug-code-leftovers or something
> > > > one should worry about? i can't find those files on the
> > file system.
> > > >
> > > > - thomas
> > > >
> > > >
> > > > ---------------------------(end of
> > > broadcast)---------------------------
> > > > TIP 4: Have you searched our list archives?
> > > >
> > > > http://archives.postgresql.org
> > > >
> > >
> > >
> > >
> > > ---------------------------(end of
> > > broadcast)---------------------------
> > > TIP 6: explain analyze is your friend
> > >
> >
> > ---------------------------(end of
> > broadcast)---------------------------
> > TIP 3: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/docs/faq
> >
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andreas Lange 2006-11-14 13:41:04 Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio
Previous Message Magnus Hagander 2006-11-13 09:00:59 Re: 8.2beta1 (w32): server process crash (tsvector)