[Admin-discuss] Fwd: [rb-admins] Murphy Crash

Andrew Harford receive at redbrick.dcu.ie
Wed Dec 10 17:21:54 GMT 2008


As planned, we began the migration to Murphy at 9pm last night. Apache
itself seemed fine, but after a few minutes the kernel seemed to die.

We rebooted and went back to working on Apache, but the kernel seemed
to die again after a while, so www was moved back to deathray.

This may be related to the volume of NFS traffic, which is something we
wouldn't have seen in testing.

Either way, unless we can resolve the bug, this pretty much screws our
plans to run OpenBSD on this for Apache :(

andrew.

----- Forwarded message from Eoghan Cotter <johan at redbrick.dcu.ie> -----

From: Eoghan Cotter <johan at redbrick.dcu.ie>
Date: Wed, 10 Dec 2008 02:16:25 +0000
To: admins at redbrick.dcu.ie

Hey,

Here's the trace and ps output from the debugger:


ps output:

ddb{0}> ps
   PID   PPID   PGRP    UID  S       FLAGS  WAIT          COMMAND
 29881  28999  21463   6677  2   0x2004000                php-fastcgi
  6120  21463  21463    576  3   0x2000180  poll          httpd
* 4835   3297  21463  32436  7   0x2004000                php-fastcgi
 15130  17738  21463      0  3   0x2004080  piperd        cronolog
 17738  21463  21463      0  3   0x2004080  pause         sh
  3297  21463  21463    576  3   0x2000180  poll          httpd
 11115  21463  21463    576  3   0x2000180  poll          httpd
 21909  21463  21463    576  3   0x2000180  semwait       httpd
 23564  21463  21463    576  3   0x2000180  semwait       httpd
 10681  21463  21463    576  3   0x2000180  semwait       httpd
 18388  21463  21463    576  3   0x2000180  poll          httpd
  9691  25003  21463  32436  3   0x2004080  poll          php-fastcgi
 12091  21463  21463    576  3   0x2000180  poll          httpd
 19947  21463  21463    576  3   0x2000180  semwait       httpd
 12921  21463  21463    576  3   0x2000180  semwait       httpd
 28999  21463  21463    576  3   0x2000180  poll          httpd
 25003  21463  21463    576  3   0x2000181  poll          httpd
   652  22204  21463      0  3   0x2004080  piperd        cronolog
 22204  21463  21463      0  3   0x2004080  pause         sh
 10483  16299  10483   1001  3   0x2004082  ttyin         bash
 16299  17814  17814   1001  3   0x2000180  select        sshd
 17814  12650  17814      0  3   0x2004180  netio         sshd
 21988  26966  21988      0  3   0x2004082  ttyin         bash
 26966  29262  26966  101031  3   0x2004082  pause         zsh
 29262  20975  20975  101031  3   0x2000180  select        sshd
 20975  12650  20975      0  3   0x2004080  netio         sshd
 21626  31748  21626  101352  3   0x2004082  ttyin         zsh
 31748  25761  25761  101352  3   0x2000180  select        sshd
 25761  12650  25761      0  3   0x2004180  netio         sshd
  3022  26485   3022      0  3   0x2004082  ttyin         bash
 26485   6403  26485  101089  3   0x2004082  wait          bash
  6403   6172   6172  101089  3   0x2000180  select        sshd
  6172  12650   6172      0  3   0x2004080  netio         sshd
 22166      1  22166      0  3   0x2004082  ttyin         getty
 16543      1  16543      0  3   0x2000080  select        cron
 21463      1  21463      0  3   0x2000080  poll          httpd
 20491      1  20491      0  3   0x2000080  select        ypbind
  8743      1   8743    556  3   0x2000180  select        nrpe
  2046      1   2046      0  3   0x2000080  poll          syslog-ng
  6116      1   6116    521  3   0x2040180  select        exim
 23556  27092  27092     93  3   0x2000180  kqread        ypldap
 27092      1  27092     93  3   0x2000180  kqread        ypldap
 12650      1  12650      0  3   0x2000080  select        sshd
  5165      1   5165      0  3   0x2000180  select        inetd
  6186      0      0      0  3   0x2100280  nfsidl        nfsio
  4829      0      0      0  3   0x2100280  nfsidl        nfsio
 23194      0      0      0  3   0x2100280  nfsidl        nfsio
 15120      0      0      0  3   0x2100280  nfsidl        nfsio
 18433      1  18433      0  3   0x2000080  poll          ntpd
  4909      1   4909     83  3   0x2000180  poll          ntpd
  4048   5129   5129      0  3   0x2000080  nfsd          nfsd
 22795   5129   5129      0  3   0x2000080  nfsd          nfsd
  4388   5129   5129      0  3   0x2000080  nfsd          nfsd
 11035   5129   5129      0  3   0x2000080  nfsd          nfsd
  5129      1   5129      0  3   0x2000080  netcon        nfsd
 14928      1  14928      0  3   0x2000080  select        mountd
    81      1     81     28  3   0x2000180  poll          portmap
    46      0      0      0  3   0x2100200  bored         crypto
    45      0      0      0  3   0x2100200  aiodoned      aiodoned
    44      0      0      0  3   0x2100200  syncer        update
    43      0      0      0  3   0x2100200  cleaner       cleaner
    42      0      0      0  3    0x100200  reaper        reaper
    41      0      0      0  3   0x2100200  pgdaemon      pagedaemon
    40      0      0      0  3   0x2100200  rfwcond       raid0
    39      0      0      0  3   0x2100200  pftm          pfpurge
    38      0      0      0  3   0x2100200  usbevt        usb1
    37      0      0      0  3   0x2100200  usbtsk        usbtask
    36      0      0      0  3   0x2100200  usbevt        usb0
    35      0      0      0  7    0x100200                idle31
    34      0      0      0  7    0x100200                idle30
    33      0      0      0  7    0x100200                idle29
    32      0      0      0  7    0x100200                idle28
    31      0      0      0  7    0x100200                idle27
    30      0      0      0  7    0x100200                idle26
    29      0      0      0  7    0x100200                idle25
    28      0      0      0  7    0x100200                idle24
    27      0      0      0  7    0x100200                idle23
    26      0      0      0  7    0x100200                idle22
    25      0      0      0  7    0x100200                idle21
    24      0      0      0  7    0x100200                idle20
    23      0      0      0  7    0x100200                idle19
    22      0      0      0  7    0x100200                idle18
    21      0      0      0  7    0x100200                idle17
    20      0      0      0  7    0x100200                idle16
    19      0      0      0  7    0x100200                idle15
    18      0      0      0  7    0x100200                idle14
    17      0      0      0  7    0x100200                idle13
    16      0      0      0  7    0x100200                idle12
    15      0      0      0  7    0x100200                idle11
    14      0      0      0  7    0x100200                idle10
    13      0      0      0  7    0x100200                idle9
    12      0      0      0  7    0x100200                idle8
    11      0      0      0  7    0x100200                idle7
    10      0      0      0  7    0x100200                idle6
     9      0      0      0  7    0x100200                idle5
     8      0      0      0  7    0x100200                idle4
     7      0      0      0  7    0x100200                idle3
     6      0      0      0  7    0x100200                idle2
     5      0      0      0  7    0x100200                idle1
     4      0      0      0  3   0x2100200  bored         syswq
     3      0      0      0  3    0x100200                idle0
     2      0      0      0  3   0x2100200  kmalloc       kmthread
     1      0      1      0  3   0x2004080  wait          init
     0     -1      0      0  3   0x2080200  scheduler     swapper
ddb{0}>
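
For anyone who would rather sift this listing than eyeball it, here is a rough Python sketch that tallies rows by wait channel and picks out the row marked `*`. It is only a sketch: the function name `parse_ddb_ps` and the `SAMPLE` excerpt are made up for illustration, it assumes COMMAND never contains spaces (true of the rows here), and it assumes `*` marks the process that was current when ddb was entered.

```python
from collections import Counter

# A few rows copied from the listing above (hypothetical excerpt for the demo).
SAMPLE = """\
   PID   PPID   PGRP    UID  S       FLAGS  WAIT          COMMAND
 29881  28999  21463   6677  2   0x2004000                php-fastcgi
  6120  21463  21463    576  3   0x2000180  poll          httpd
* 4835   3297  21463  32436  7   0x2004000                php-fastcgi
 21909  21463  21463    576  3   0x2000180  semwait       httpd
"""


def parse_ddb_ps(text):
    """Parse ddb `ps` rows into dicts. Assumes COMMAND has no spaces."""
    rows = []
    for line in text.splitlines():
        # Assumption: a leading '*' flags the process that was on the CPU.
        current = line.lstrip().startswith("*")
        fields = line.replace("*", " ", 1).split()
        # Data rows start with a numeric PID and have 7 columns (empty WAIT)
        # or 8 columns (WAIT present); this skips the header and prompts.
        if len(fields) not in (7, 8) or not fields[0].isdigit():
            continue
        wait = fields[6] if len(fields) == 8 else ""
        rows.append({
            "pid": int(fields[0]),
            "ppid": int(fields[1]),
            "uid": int(fields[3]),
            "state": int(fields[4]),
            "wait": wait,
            "command": fields[-1],
            "current": current,
        })
    return rows


rows = parse_ddb_ps(SAMPLE)
by_wait = Counter(r["wait"] or "(none)" for r in rows)
current = [r for r in rows if r["current"]]
print(by_wait)
print(current)
```

Fed the whole listing, this should single out PID 4835 (php-fastcgi, UID 32436) as the marked process, and the tally makes it easy to see how many httpd children were sitting in poll versus semwait.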

Trace Output:

ddb{0}> trace
text_access_fault(40083e87940, 9, 1c14008, 0, 0, 1810350) at text_access_fault+0x1d4
trapbase_sun4v(0, 0, 40083e87c00, 4efe3970, 4a, 4a) at trapbase_sun4v+0x979c
end(0, 4001cdd45f8, 40083e87c00, 4001cdc7690, 0, 1810350) at 0x1c14000
dofilewrite(16, 1, 4001cdd45c0, 4efe3970, 4a, 4a) at dofilewrite+0x6c
sys_write(9, 40083e87dd0, 40083e87dc0, 0, 0, 1810350) at sys_write+0x58
syscall(40083e87ed0, 4, 4d92db08, 4d92db0c, 0, 41ec6600) at syscall+0x13c
softtrap(1, 4efe3970, 4a, 0, 0, 0) at softtrap+0x18c
end(ffffffffffffffff, ffffffffffffffff, ffffffffffffffff, ffffffffffffffff, ffffffffffffffff, ffffffffffffffff) at 0x444947bc
ddb{0}>
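
If it helps when comparing this against a later crash, the trace can be reduced to a list of (function, location) pairs with something like the Python sketch below. The names `parse_ddb_trace` and `SAMPLE` are hypothetical, the excerpt is copied from the trace above, and the rejoining logic assumes the console may wrap a long frame onto the next line.

```python
import re

# Excerpt of the trace above, including one frame wrapped by the console.
SAMPLE = """\
ddb{0}> trace
text_access_fault(40083e87940, 9, 1c14008, 0, 0, 1810350) at text_access_fault+
0x1d4
trapbase_sun4v(0, 0, 40083e87c00, 4efe3970, 4a, 4a) at trapbase_sun4v+0x979c
end(0, 4001cdd45f8, 40083e87c00, 4001cdc7690, 0, 1810350) at 0x1c14000
dofilewrite(16, 1, 4001cdd45c0, 4efe3970, 4a, 4a) at dofilewrite+0x6c
ddb{0}>
"""

FRAME = re.compile(r"^(?P<func>\w+)\((?P<args>.*)\) at (?P<where>\S+)$")
FRAME_START = re.compile(r"^\w+\(")


def parse_ddb_trace(text):
    """Return [(function, location), ...] from ddb `trace` output."""
    joined = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the ddb prompt itself.
        if not line or line.startswith("ddb{"):
            continue
        if FRAME_START.match(line) or not joined:
            joined.append(line)
        else:
            # Anything not starting with "name(" is treated as a wrapped tail.
            joined[-1] += line
    frames = []
    for line in joined:
        m = FRAME.match(line)
        if m:
            frames.append((m["func"], m["where"]))
    return frames


frames = parse_ddb_trace(SAMPLE)
for func, where in frames:
    print(f"{func:20s} {where}")
```

One thing this makes easy to spot: the frame that ddb labels `end` sits at 0x1c14000, the same address as `va` in the panic line, and it is entered from dofilewrite.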

Initial output:

OpenBSD/sparc64 (murphy.redbrick.dcu.ie) (console)
login: Dec  9 23:23:38 su: werdz to root on /dev/ttyp0
Dec  9 23:49:43 su: receive to root on /dev/ttyp2
panic: kernel text_access_fault: pc=1c14008 va=1c14000
kdb breakpoint at 142e1e0
Stopped at      Debugger+0x4:   nop
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING
THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT
               INFORMATION!



----- End forwarded message -----

-- 
Andrew Harford
System Administrator, DCU Networking Society
Equipment Officer, Societies & Publications Committee

Your own father said that artists use lies to tell the truth. Yes, I 
created a lie. But because you believed it, you found something true about 
yourself.				--V


