|Processors concerned||The Cyrix/IBM 6x86, 6x86L, and 6x86MX. All revisions up to the current 6x86MX Rev. 1.4 are affected. Cyrix has stated that future 6x86MX revisions won't suffer from this bug.|
|What does the bug do?||A simple, short, legal instruction sequence will lock the processor in an infinite loop and will prevent servicing of interrupts. This will obviously crash the machine.|
|Is this bug specific to GNU/Linux?||No. Since it's a hardware bug, it will affect all OS's equally.|
|Workaround 1||Enable the NO_LOCK bit (0x10) in configuration register CCR1 (0xc1).|
|Workaround 2||Use the undocumented Cyrix workaround (see the News page, November 12 & 13).|
|How do I implement workaround 1?||Insert a simple command in rc.local or rc.cyrix: set6x86 -p 0xc1 -s 0x10. Alternatively, use any other utility that can set the NO_LOCK flag in CCR1.|
|How do I test?||See the gcc-compatible program below.|
|How serious is it?||Exactly as serious as the new Pentium F0 bug: if you run a multiuser OS on a 6x86 machine, any user with permissions to run a small innocent program may bring down the machine. And as with the Pentium F0 bug, it's almost impossible to trace the cause of the crash. So, it's a very serious bug.|
|Does workaround 1 imply a performance hit?||No. It might even make your CPU slightly faster.|
|Won't setting the NO_LOCK bit cause other problems?||It shouldn't cause any problem with most OS's: Linux, Windows95, NT, Solaris, BSD Unix variants, NextStep, etc...|
|What about DOS users?||Cyrix has reported having had problems with the DOS4GW extensions when NO_LOCK is set (no details provided). The DOS4GW extensions are used by DOOM, QUAKE and other DOS packages to enable access to extended memory. However, this incompatibility is irrelevant to the serious problem caused by the Cyrix Coma bug: a security breach in multiuser, multitasking protected-mode operating systems. Conclusion: if you run DOS games or other DOS programs that use the DOS4GW extensions, ignore the 6x86 Coma bug issue altogether.|
|Who found the bug?||Serguei Shtyliov (Moscow) found the bug, and Alexandr Konosevich (Omsk, West Siberia) investigated it further. Alexandr then contacted editor Uwe Post of c't magazine. First a short note appeared mentioning the bug, and a few days later a full article went online on the magazine's Web site, co-authored by Alexandr Konosevich and Uwe Post.|
|Why did the c't article use the name "hidden CLI bug"?||Alexandr Konosevich and Uwe Post called it the "hidden CLI bug" based on the fact that the 6x86 processor seems to run in an infinite loop with interrupts disabled (the CLI instruction disables interrupts). However, this explanation may be misleading (read the technical explanation below).|
|Why did you call it the "Coma" bug?||I am calling it the Cyrix 6x86 Coma bug because the CPU goes into an infinite loop, executing instructions but not responding to any external "stimulus"; hence some kind of CPU "coma". This contrasts with the Pentium F0 bug, where the Pentium CPU simply stops ("dies").|
I don't understand German, so that made it a little harder to check this information.
A few hours later, here is what I posted (some words changed for the sake of clarity):
You can test for this bug on any 6x86 machine. It will crash the machine, so be extra careful that you don't lose any data during your tests. Just copy the above source to your machine and compile it. Careful now: sync your hard disks and unmount all filesystems. Now execute the program as a simple user. Your 6x86-based computer should lock up; only a hardware reset will bring it out of this state.
The program above basically consists of the following loop:
This, in itself, is an infinite loop. The only way for the processor to get out of this loop is by getting interrupted. So far so good, since any multitasking OS and even DOS will interrupt the processor. However, as you will see for yourself if you try to run this loop or a variant of it, the processor does not respond to any interrupt once in the loop. This is the symptom that led the c't magazine crew to call it the hidden CLI bug.
However, if one could look at the flags inside the processor while it's in the loop, one would see that interrupts are enabled. It does not respond to interrupts because it is executing back-to-back locked bus cycles.
Locked cycles are a special type of bus cycle that cannot be split: you cannot have an interrupt routine called in the middle of a locked cycle (among other peculiarities). And typically, xchg instructions with a memory operand are implicitly executed in locked cycles: the xchg instruction is the classic example of an "atomic" instruction, used to implement all sorts of semaphores and other software constructs that depend on this "atomicity". (Coincidentally, the Pentium F0 bug is caused by an explicitly locked compare-and-exchange instruction.)
But what has become of both the mov and the jmp instructions? Shouldn't the processor recognize the interrupts while executing these two instructions? It should, but it doesn't!
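To make the pattern concrete, here is a minimal C sketch of the same three steps (exchange with memory, copy the result, jump back), deliberately bounded so that it always terminates. It is not the test program referred to above and is not meant to trigger the bug. The function name bounded_xchg_loop, the iteration limit and the use of a variable called edx as the destination of the copy are assumptions made for illustration only. On x86, compilers commonly implement atomic_exchange as an xchg against memory, but that depends on the compiler and target.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: a bounded version of the xchg / mov / jmp pattern.
 * "eax" and "edx" are ordinary C variables named after the registers
 * discussed in the text; the compiler is free to place them elsewhere.
 */
uint32_t bounded_xchg_loop(_Atomic uint32_t *cell, uint32_t start,
                           unsigned iterations)
{
    uint32_t eax = start;
    uint32_t edx = 0;

    for (unsigned i = 0; i < iterations; i++) {
        /* xchg: atomically swap the value in memory with eax */
        eax = atomic_exchange(cell, eax);
        /* mov: copy the exchanged value (destination is an assumption) */
        edx = eax;
        /* the loop back-edge plays the role of the jmp, but it is bounded */
    }
    return edx;
}
```

Each pass swaps the previous memory contents into eax, so the value returned is whatever was in memory just before the last exchange. With an unbounded loop, nothing in these three steps ever yields control voluntarily, which is why the processor depends on an interrupt to get out.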
The 6x86 CPU family has some advanced architectural features, among others: dual pipelines, register renaming, data dependency removal, speculative execution and branch prediction.
What is probably happening here is that the jmp instruction is having its execution overlapped with the other two instructions through branch prediction. That should still give us the idle cycle where the mov instruction is performed to service the interrupt, but then I assume it too is being overlapped with the xchg instruction; my intuition is that the mov will execute in one pipeline whereas the xchg executes in the other pipeline. How is that possible if the eax register is modified by the xchg instruction? Well, that's where register renaming and data dependency removal will come into play. The xchg instruction is probably acting on one copy of eax while the mov instruction uses a second copy.
Let's put it this way: the original M1 engineers went one (unlocked) cycle too far in their drive to make this processor as efficient as possible. It's simply incredible, but this bug has propagated to all the 6x86 family members! It would probably have gone unnoticed if Serguei Shtyliov had not detected it while writing an assembly language routine.
Why does setting NO_LOCK solve this? Because it effectively disables locked bus cycles for the xchg instruction. Normal bus cycles can be interrupted, so we can always regain control of the CPU and kill the loop.
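For readers who would rather set the flag from their own code than with set6x86, the read-modify-write involved can be sketched in C roughly as follows. The register index 0xc1 and the 0x10 mask are the values given in the workaround above. The index/data port pair 0x22/0x23 is the usual way Cyrix configuration registers are reached, but it is not stated on this page, so treat it as an assumption. The names struct port_io, cyrix_set_no_lock, struct fake_cyrix, fake_out8 and fake_in8 are hypothetical; the fake_* helpers simulate the chip in memory so the logic can be tried without touching hardware.

```c
#include <stdint.h>

#define CYRIX_INDEX_PORT 0x22u /* assumption: configuration index port */
#define CYRIX_DATA_PORT  0x23u /* assumption: configuration data port  */
#define CCR1_INDEX       0xC1u /* CCR1, as given in the workaround     */
#define CCR1_NO_LOCK     0x10u /* NO_LOCK bit, as given in the workaround */

/* Hypothetical port-I/O interface, so real or simulated ports can be used. */
struct port_io {
    void    (*out8)(void *ctx, uint16_t port, uint8_t value);
    uint8_t (*in8)(void *ctx, uint16_t port);
    void    *ctx;
};

/* Read CCR1, set NO_LOCK, write it back; returns the value written. */
uint8_t cyrix_set_no_lock(const struct port_io *io)
{
    io->out8(io->ctx, CYRIX_INDEX_PORT, CCR1_INDEX);   /* select CCR1 */
    uint8_t old = io->in8(io->ctx, CYRIX_DATA_PORT);   /* read it     */
    uint8_t updated = (uint8_t)(old | CCR1_NO_LOCK);   /* keep other bits */
    io->out8(io->ctx, CYRIX_INDEX_PORT, CCR1_INDEX);   /* select again */
    io->out8(io->ctx, CYRIX_DATA_PORT, updated);       /* write back  */
    return updated;
}

/* In-memory stand-in for the configuration registers (illustration only). */
struct fake_cyrix {
    uint8_t index;
    uint8_t regs[256];
};

void fake_out8(void *ctx, uint16_t port, uint8_t value)
{
    struct fake_cyrix *chip = (struct fake_cyrix *)ctx;
    if (port == CYRIX_INDEX_PORT)
        chip->index = value;
    else if (port == CYRIX_DATA_PORT)
        chip->regs[chip->index] = value;
}

uint8_t fake_in8(void *ctx, uint16_t port)
{
    struct fake_cyrix *chip = (struct fake_cyrix *)ctx;
    return port == CYRIX_DATA_PORT ? chip->regs[chip->index] : 0;
}
```

On a real machine the two callbacks would have to wrap privileged port I/O, which requires root. The index is selected again before the write because each access to the data port is expected to be preceded by a write to the index port. Using OR rather than a plain write preserves whatever other CCR1 bits the BIOS has already set.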
All this explanation is just a hypothesis: one can only tell what's going on inside a CPU with 100% certainty if one has advanced ICE equipment, which of course is far beyond the means of an individual.
However, setting the NO_LOCK bit could cause other lesser but still annoying problems. Here is another Email from Bryan James Philippe I received about three hours after I had posted my first solution:
Well, I first suggested to Bryan that setting the Weak Locking bit in RCR7 instead of the NO_LOCK bit in CCR1 might do the job, but it doesn't.
So far the only way to avoid the deadlock is to set the NO_LOCK bit. Bryan, like many GNU/Linux 6x86 users, has patched his kernel instead of installing set6x86. He has two PCI memory regions on his machine that are non-prefetchable, one of which is used by a bus-mastering SCSI controller. However, he has not set the ARRs properly for these regions. His problems could come from a combination of NO_LOCK and not having the ARRs set up correctly on his GNU/Linux box.
More testing is needed right now, and I would welcome any results and comments, especially on other OS's besides GNU/Linux.
Right now my two 6x86 GNU/Linux boxes are doing fine with my simple NO_LOCK workaround, and I have reports of at least 3 other users that NO_LOCK is doing fine on their systems.
Anyway, if you are going to use my NO_LOCK workaround, don't forget to properly set up the ARRs on your 6x86 machine. This is relatively simple: read the set6x86 README, and also take a look at the relevant pages in IBM's Application Note 40205. You will also find some hints in my FAQ page. Here is what the 6x86_arr ARR reporting utility included with set6x86 displays on my system:
As you can see, I have set up ARR6 to handle my PCI video card. Although I am running X and Netscape as I write these lines, with a video board similar to Bryan's, and with the NO_LOCK bit set as described above, I haven't had a single problem on my 6x86L box. My 6x86MX also has the NO_LOCK bit set, the ARRs are correctly set up and it is also working without any problems.
And both systems are safe from the Cyrix Coma bug :-)
Last updated on November 21, 1997.
Copyright 1997 Andrew D. Balsa