Thread: Random System Crash - what to look for in logs?

  1. #1
    drogers is offline Senior Member
    Join Date
    Oct 2005
    Posts
    54
    Rep Power
    9

    Random System Crash - what to look for in logs?

    Hi. This morning I came in and the system was completely hosed.
    I tried rebooting and it wouldn't come up. I found an error mounting the filesystem (root mounted fine).

    This is Zimbra M1 on Fedora Core 3.

    I did an interactive startup and didn't start zimbra, and the machine booted up.

    What should I look for in the logs as a possible cause? What logs are there other than /var/log/zimbra.log?

    thanks.

  2. #2
    marcmac is offline Expert Member
    Join Date
    Sep 2005
    Posts
    2,103
    Rep Power
    13

    this could be anything

    When you say the system was down, does that mean you couldn't log into the box at all? Or just that zimbra was down?

    Check /var/log/messages for errors - especially whatever was logged right before the reboot.
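
    One way to pull out "whatever was logged right before the reboot" is a small Python script like the sketch below. It assumes the first kernel line written at boot contains `kernel: Linux version`; check your own log and change the marker if it differs. The function name `lines_before_boot` is only illustrative.

```python
from collections import deque

# Assumption: this text appears in the first kernel line logged at each boot.
BOOT_MARKER = "kernel: Linux version"


def lines_before_boot(lines, context=20, marker=BOOT_MARKER):
    """For every boot marker found, return the `context` lines logged just before it."""
    recent = deque(maxlen=context)  # rolling window of the most recent lines
    chunks = []
    for line in lines:
        if marker in line:
            # A boot starts here, so whatever is in the window preceded the crash/reboot.
            chunks.append(list(recent))
            recent.clear()
        else:
            recent.append(line.rstrip("\n"))
    return chunks
```

    To try it, pass an open file, for example `lines_before_boot(open("/var/log/messages"), context=40)` run as root, and read the last chunk first.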

  3. #3
    drogers is offline Senior Member

    The box was basically frozen. I couldn't even get the screen to show up after I switched to it on the KVM.

  4. #4
    marcmac is offline Expert Member

    /var/log/messages

    Anything in /var/log/messages? That's the system log file.

  5. #5
    drogers is offline Senior Member

    Machine went down again.

    The machine crashed again. This time, before it went down, I noticed that zimbra was using 99.9% of the CPU.

    Also, when I reboot, the machine hangs while checking the swap space. If I do an interactive startup and don't start the zimbra service, the machine boots right up.

    I am looking at /var/log/messages, but I don't know what to look for.

    A couple of lines that seem sketchy:
    Code:
    Nov 22 18:33:28 curley clamd[21235]: Database correctly reloaded (41268 viruses)
    I am assuming that means there are 41,000+ virus signatures that it checks for, right? Not viruses found on the system.


    Code:
    Nov 23 04:03:45 curley kernel: CPU:    0
    Nov 23 04:03:45 curley kernel: EIP:    0060:[<c01509b5>]    Not tainted VLI
    Nov 23 04:03:45 curley kernel: EFLAGS: 00010097   (2.6.12-1.1381_FC3)
    Nov 23 04:03:45 curley kernel: EIP is at find_get_pages+0x22/0x41
    Nov 23 04:03:45 curley kernel: eax: 20020028   ebx: 00000002   ecx: 00000001   edx: 00002000
    Nov 23 04:03:45 curley kernel: esi: f7e98e58   edi: 00000080   ebp: ffffffff   esp: f7e98e24
    Nov 23 04:03:45 curley kernel: ds: 007b   es: 007b   ss: 0068
    Nov 23 04:03:45 curley kernel: Process kswapd0 (pid: 124, threadinfo=f7e98000 task=f7f40000)
    Nov 23 04:03:45 curley kernel: Stack: 0000000e 00000000 f7e98e50 c015bfaa f7e98e58 cdc5d168 00000000 c015c339
    Nov 23 04:03:45 curley kernel:        0000000e 00000000 cdc5d250 00000000 00000000 c143b300 00002000 c14dd340
    Nov 23 04:03:45 curley kernel:        c14dd360 c145c820 f7e98f48 c017a7b0 c03bcd08 c015d359 f7e98ea8 00000246
    Nov 23 04:03:45 curley kernel: Call Trace:
    Nov 23 04:03:45 curley kernel:  [<c015bfaa>] pagevec_lookup+0x1c/0x24
    Nov 23 04:03:45 curley kernel:  [<c015c339>] invalidate_mapping_pages+0x45/0xd3
    Nov 23 04:03:45 curley kernel:  [<c017a7b0>] remove_inode_buffers+0x12/0x16c
    Nov 23 04:03:45 curley kernel:  [<c015d359>] shrink_cache+0x41d/0x63d
    Nov 23 04:03:45 curley kernel:  [<c0192d5f>] dput+0xca/0x5dc
    Nov 23 04:03:45 curley kernel:  [<c0198ac3>] prune_icache+0x30a/0x454
    Nov 23 04:03:45 curley kernel:  [<c0198c21>] shrink_icache_memory+0x14/0x37
    Nov 23 04:03:45 curley kernel:  [<c015c840>] shrink_slab+0xf3/0x14f
    Nov 23 04:03:45 curley kernel:  [<c015e460>] balance_pgdat+0x25f/0x3a6
    Nov 23 04:03:45 curley kernel:  [<c015e674>] kswapd+0xcd/0x110
    Nov 23 04:03:45 curley kernel:  [<c013e9b9>] autoremove_wake_function+0x0/0x37
    Nov 23 04:03:45 curley kernel:  [<c013e9b9>] autoremove_wake_function+0x0/0x37
    Nov 23 04:03:45 curley kernel:  [<c015e5a7>] kswapd+0x0/0x110
    Nov 23 04:03:45 curley kernel:  [<c01012c1>] kernel_thread_helper+0x5/0xb
    Nov 23 04:03:45 curley kernel: Code: 5b 5e 5f 5d c3 89 de eb f2 56 53 83 ec 04 8b 74 24 10 fa 89 0c 24 83 c0 04 89 d1 89 f2 e8 a7 64 0b 00 31 c9 89 c3 eb 10 8b 14 8e <8b> 02 f6 c4 80 75 13 ff 42 04 83 c1 01 39 d9 72 ec fb 83 c4 04
    not sure what this means:
    Code:
    Nov 23 04:03:45 curley kernel: Oops: 0000 [#1]
    or this:
    Code:
    Nov 23 06:33:28 curley clamd[21235]: SelfCheck: Database modification detected. Forcing reload.
    Nov 23 06:33:28 curley clamd[21235]: Reading databases from /opt/zimbra/clamav/db
    Any help I could get on this would be great, thanks.
    I am running Fedora Core 3 with M1 (I haven't upgraded yet; would that fix this?). This seems to happen only overnight. Is there some scheduled maintenance job running then that could be hanging the machine?

  6. #6
    marcmac is offline Expert Member

    Nov 23 04:03:45 curley kernel: Oops: 0000 [#1]

    A kernel Oops means the kernel hit a serious internal error, and the system usually crashes or becomes unstable afterward.

    You can punch some of the kernel messages into google, and see what you find.
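
    To get search terms out of a trace like the one posted above, a rough Python sketch such as the following can list the function names in the order they appear. The regular expression is an assumption based only on the line format shown in this thread (`[<address>] name+0xoff/0xlen` and `EIP is at name+0xoff/0xlen`), and `oops_symbols` is a made-up helper name.

```python
import re

# Assumption: symbol lines look like the ones posted in this thread.
SYMBOL = re.compile(
    r"(?:\[<[0-9a-f]+>\]|EIP is at)\s+([A-Za-z_]\w*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def oops_symbols(lines):
    """Return the kernel function names from an Oops, in order, without repeats."""
    seen = []
    for line in lines:
        match = SYMBOL.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# A few lines copied from the trace above.
sample = [
    "Nov 23 04:03:45 curley kernel: EIP is at find_get_pages+0x22/0x41",
    "Nov 23 04:03:45 curley kernel:  [<c015bfaa>] pagevec_lookup+0x1c/0x24",
    "Nov 23 04:03:45 curley kernel:  [<c015e674>] kswapd+0xcd/0x110",
    "Nov 23 04:03:45 curley kernel:  [<c015e5a7>] kswapd+0x0/0x110",
]
print(" ".join(oops_symbols(sample)))  # prints: find_get_pages pagevec_lookup kswapd
```

    Searching for the first few names together with the kernel version from the EFLAGS line (2.6.12-1.1381_FC3) tends to narrow things down faster than pasting raw addresses.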

    The clamd reloading is normal - we update (by default) the virus definitions every 2 hours, that's what you're seeing.

    Are there any disk errors being logged?

    I can't tell you with 100% certainty that zimbra isn't contributing to this, but I've been running it for months and it's never crashed the machine. It's possible that it's contributing, especially if the disk activity is hitting some problem in your controller or drive.

  7. #7
    drogers is offline Senior Member

    thanks for the quick reply btw...


    I am pretty certain that zimbra is causing this. There isn't much else going on with the machine; it's pretty much just a mail server. I figure so because the last thing I could check before I rebooted was that zimbra was using 99.9% of the CPU.

    I didn't see any disk errors. I am not sure what to look for, but I didn't notice anything obviously related to disk errors.
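
    For anyone unsure what a disk error looks like in /var/log/messages, here is a minimal Python sketch that flags kernel lines containing a few common phrases. The pattern list is an assumption and far from exhaustive (it covers generic I/O errors, ext3 filesystem errors, and IDE/SCSI drive complaints), and `disk_error_lines` is a hypothetical helper name.

```python
import re

# Assumption: a short, non-exhaustive list of phrases the kernel logs on disk trouble.
DISK_PATTERNS = re.compile(
    r"I/O error|EXT3-fs error|SeekComplete Error|medium error",
    re.IGNORECASE,
)


def disk_error_lines(lines):
    """Return kernel log lines that match one of the disk-related phrases."""
    return [
        line.rstrip("\n")
        for line in lines
        if "kernel:" in line and DISK_PATTERNS.search(line)
    ]
```

    An empty result from `disk_error_lines(open("/var/log/messages"))` does not prove the disk is healthy; it only means none of these phrases were logged.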

    I'm in the process of googling some of the kernel errors.
