From: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
---|---|
To: | Ioana Danes <ioanadanes(at)gmail(dot)com> |
Cc: | Francisco Olarte <folarte(at)peoplecall(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Corrupted Data ? |
Date: | 2016-08-12 15:34:01 |
Message-ID: | 6499cfc7-2c89-4d3b-905d-18ceac71440d@aklaver.com |
Lists: | pgsql-general |
On 08/12/2016 08:30 AM, Ioana Danes wrote:
>
>
> On Fri, Aug 12, 2016 at 11:26 AM, Adrian Klaver
> <adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
>
> On 08/12/2016 08:10 AM, Ioana Danes wrote:
>
>
>
> On Fri, Aug 12, 2016 at 10:47 AM, Francisco Olarte
> <folarte(at)peoplecall(dot)com <mailto:folarte(at)peoplecall(dot)com>
> <mailto:folarte(at)peoplecall(dot)com <mailto:folarte(at)peoplecall(dot)com>>>
> wrote:
>
> CCing to the list...
>
> Thanks
>
>
> On Fri, Aug 12, 2016 at 4:10 PM, Ioana Danes
> <ioanadanes(at)gmail(dot)com <mailto:ioanadanes(at)gmail(dot)com>
> <mailto:ioanadanes(at)gmail(dot)com <mailto:ioanadanes(at)gmail(dot)com>>>
> wrote:
> >> given 318220 and 318216 are just a bit away ( 4db08/4db0c
> ), and it
> >> repeats sporadically, have you ruled out ( by having page
> checksums or
> >> other mechanism ) a potential disk read/write error ?
> >>
> >>
> >> > Also the index is correct on db3 as the record in case
> (with
> drawid =
> >> > 318216) is retrieved if I filter by drawid = 318220
> >>
> >> Especially if this happens, you may have some slightly bad
> disks/ram/
> >> leading to this kind of problem.
> >>
> >
> > Could be. I also had some issues with an rsync between db3 and
> drdb a week
> > ago that did not complete for bigger files (> 200MB) and
> gave me some
> > corruption messages. Then the system was rebooted and
> everything
> seemed
> > fine but apparently it is not.
> > I am planning to drop & create the table from a good
> backup and if
> that does
> > not fix the issue then I will rebuild the server.
>
> I would check whatever logs you can ( syslog or eventlog,
> smart log,
> etc.. ) hunting for disk errors ( sometimes they are
> reported ). This
> kind of problems, with programs as tested as postgres and
> rsync, tend
> to indicate controller/RAM/disk going bad ( in your case it
> could be
> caused by a single bit getting flipped in a sector for the data
> portion of the table, and not being propagated either because it
> happened after your sync of drdb or because it was synced
> from the WAL
> and not the table, or because it was read from the disk cache ).
>
> I agree, unfortunately I did not find any clues about corruption
> or any
> anomalies in the logs.
> I will work tonight to rebuild that table and see where I go
> from there.
>
>
> The db3 database is on a different machine from all the other
> databases you set up, correct?
>
> Yes, they are all different VMs; the first 3 dbs are on the same cluster, but
> drdb is a remote machine.
Aah, another player in the mix.
What virtualization technology are you using?
>
> Thank you
>
>
>
> Thanks,
> ioana
>
> Francisco Olarte.
>
>
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
>
>
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
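
A note on the hex observation quoted above (318220 vs. 318216, i.e. 4db0c vs. 4db08): the two drawid values differ in exactly one bit, which is what makes a flipped bit on disk or in RAM a plausible explanation. The following is only a minimal sketch to check that arithmetic; the helper name differing_bits is made up for illustration and is not part of PostgreSQL or any tool mentioned in this thread.

```python
def differing_bits(a: int, b: int) -> list[int]:
    """Return the positions (0 = least significant) of the bits where a and b differ."""
    diff = a ^ b  # XOR leaves a 1 only where the two values disagree
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# The two drawid values from the thread.
expected, found = 318220, 318216

print(hex(expected), hex(found))        # 0x4db0c 0x4db08
print(differing_bits(expected, found))  # [2]
```

A single differing position is consistent with a one-bit flip, but it does not prove one; the logs and hardware checks suggested above are still needed.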