Sun says not ECC error. What do you think?

(Posted in comp.os.linux.misc, since we're running rhel4u3 AS smp x86_64, and there's some debate from Sun on the merits of /var/log/mcelog. So I thought I'd ask in a Linux newsgroup.)

In an attempt to stay open-minded, I thought I'd post this and see how others feel. Sun says these are not memory errors (v40z running rhel4u3 AS smp x86_64).

From /var/log/mcelog:

host A:

  MCE 0
  CPU 2 4 northbridge TSC 455250e9229bc
  ADDR 37fb72a80
    Northbridge Chipkill ECC error
    Chipkill ECC syndrome = 5b9b
         bit46 = corrected ecc error
         bit62 = error overflow (multiple errors)
    bus error 'local node response, request didn't time out
        generic read mem transaction
        memory access, level generic'
  STATUS d44dc0005b080a13 MCGSTATUS 0

  MCE 1
  CPU 2 4 northbridge TSC 45ebce980ebfd
  ADDR 272b40000
    Northbridge Chipkill ECC error
    Chipkill ECC syndrome = 9f65
         bit40 = error found by scrub
         bit46 = corrected ecc error
         bit62 = error overflow (multiple errors)
    bus error 'local node response, request didn't time out
        generic read mem transaction
        memory access, level generic'
  STATUS d432c1009f080a13 MCGSTATUS 0

  MCE 2
  CPU 2 4 northbridge TSC 464fcbe158423
  ADDR 310740140
    Northbridge Chipkill ECC error
    Chipkill ECC syndrome = 13cc
         bit40 = error found by scrub
         bit46 = corrected ecc error
         bit62 = error overflow (multiple errors)
    bus error 'local node response, request didn't time out
        generic read mem transaction
        memory access, level generic'
  STATUS d466410013080a13 MCGSTATUS 0

  root #

host B:

  MCE 0
  CPU 0 4 northbridge TSC 65a455b0bb0ba
  ADDR 887b2850
    Northbridge Chipkill ECC error
    Chipkill ECC syndrome = 6212
         bit40 = error found by scrub
         bit46 = corrected ecc error
    bus error 'local node response, request didn't time out
        generic read mem transaction
        memory access, level generic'
  STATUS 9409410062080a13 MCGSTATUS 0

We have about a dozen hosts exhibiting these types of entries in the mcelog; some of these hosts have dozens of entries.

Sun says it's likely not a memory problem, nor a CPU problem. We've asked that the memory be replaced, and they refuse, stating that we haven't proven it's a memory problem. They also claim that the ECC messages in /var/log/mcelog may not be accurate, since ECC alerts go in /var/log/messages (which conflicts with "man mcelog", but hey, I'm only a SysAdmin with 15+ years of experience. What do I know?).

Oh, and many of these servers are locking up.

Honestly, am I wrong here? Is there some other realistic explanation for these errors, other than memory and/or CPU hardware errors? I'm about to go medieval on them, but before I do so, I thought I'd ask a third party.

So, what do YOU think? Is it memory, CPU, or are there no hardware problems to be found on these V40z servers?

Oh, and if anyone from Sun is interested, I'm willing to provide the Sun case number. Hey, maybe I'll just post it.

Thanks in advance for any advice.
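
For anyone who wants to cross-check the decoded text against the raw STATUS words in the mcelog entries above, here is a rough Python sketch. The meanings of bits 40, 46 and 62 are taken straight from the mcelog output quoted in this post. The syndrome layout (high byte in bits 31:24, low byte in bits 54:47) is an assumption inferred from the four entries, not something looked up in AMD's documentation. The function name `decode_status` and its field names are made up for illustration.

```python
def decode_status(status_hex):
    """Pull the flags mcelog annotated out of a northbridge STATUS word."""
    status = int(status_hex, 16)

    def bit(n):
        return bool((status >> n) & 1)

    # Assumed layout (inferred from the logged entries, not from a spec):
    # syndrome high byte = bits 31:24, low byte = bits 54:47.
    syndrome = (((status >> 24) & 0xFF) << 8) | ((status >> 47) & 0xFF)

    return {
        "scrub": bit(40),          # bit40 = error found by scrub
        "corrected_ecc": bit(46),  # bit46 = corrected ecc error
        "overflow": bit(62),       # bit62 = error overflow (multiple errors)
        "syndrome": format(syndrome, "04x"),
    }


# STATUS words copied from the mcelog entries in this post.
STATUS_WORDS = {
    "host A, MCE 0": "d44dc0005b080a13",
    "host A, MCE 1": "d432c1009f080a13",
    "host A, MCE 2": "d466410013080a13",
    "host B, MCE 0": "9409410062080a13",
}

if __name__ == "__main__":
    for label, word in STATUS_WORDS.items():
        print(label, decode_status(word))
```

Under those assumptions, the flags and syndromes it extracts agree with what mcelog printed for all four entries: every one has the corrected-ECC bit set, and all three host A entries carry the overflow bit. That only shows the decoded text is consistent with the raw register values. It does not by itself say which DIMM or CPU is at fault.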