How are you executing __asm__ __volatile__("cli; hlt");? These are privileged instructions. Are you executing them in a kernel module? Did you compile with gcc?
How are you setting the NMI? EAS 8.3 says to set and reset the appropriate bit. I think it should work, but have not tried it personally. Can you give more detail?
The NMI handler code ... not immediately obvious to me where it is, but I think one could start by looking at the sccKit code that writes GLCFGx through the sccKit flit widget.
The default SCC Linux does not support the NMI, as it uses a customized configuration for its local APIC.
There are two interrupt pins on the core, LINT0 and LINT1, which are connected directly to the bits in the configuration register. If the APIC is disabled (a configuration that SCC Linux does not use), one of these pins (LINT0 if I remember correctly) acts as the legacy IRQ pin and the other as NMI. When IRQ is asserted, an interrupt acknowledgement cycle is performed to read the vector number from an external 8259A-type PIC. When NMI is asserted, an NMI is sent. This does not work as expected on the SCC, as there is no component that could answer the acknowledgement cycle.
If the APIC is enabled (which SCC Linux does), the operation of the LINT pins is configurable via the corresponding Local Vector Table registers (LVT0 and LVT1). A default Linux kernel (but not SCC Linux) then configures the pins to mimic their operation as if the APIC was disabled; i.e., LINT0 is configured as IRQ, and LINT1 as NMI.
However, SCC Linux uses a different configuration: LINT0 is hard-coded to deliver interrupt vector 0x34, and LINT1 to deliver vector 0x33. That means, whenever LINT0 is asserted, the core acts as if a regular interrupt had been received, and a (possibly external) interrupt controller had sent a vector number of 0x34. Then, entry 0x34 of the IDT is read and control is passed to the Linux kernel, which can then dispatch to the appropriate device driver.
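To make the two configurations concrete, here is a minimal sketch of how the LVT LINT0/LINT1 register values would differ. The bit layout (vector in bits 7:0, delivery mode in bits 10:8) is the one from the Intel SDM; the function and struct names are made up for illustration and are not taken from the SCC Linux sources.

```c
#include <assert.h>
#include <stdint.h>

/* Delivery-mode field (bits 10:8) of a local APIC LVT LINT entry. */
enum lvt_delivery_mode {
    LVT_DM_FIXED  = 0x0, /* deliver the vector stored in bits 7:0 */
    LVT_DM_NMI    = 0x4, /* deliver an NMI; the vector field is ignored */
    LVT_DM_EXTINT = 0x7  /* legacy INTR; vector comes from an external PIC */
};

/* Hypothetical helper: build an unmasked LVT LINT entry. */
uint32_t lvt_lint_entry(enum lvt_delivery_mode mode, uint8_t vector)
{
    assert(((unsigned)mode & ~0x7u) == 0); /* mode must fit in 3 bits */
    return ((uint32_t)mode << 8) | vector;
}

unsigned lvt_mode(uint32_t entry)   { return (entry >> 8) & 0x7u; }
unsigned lvt_vector(uint32_t entry) { return entry & 0xffu; }

/* Illustrative container for the two LINT entries of one core. */
struct lint_config {
    uint32_t lint0;
    uint32_t lint1;
};

/* Stock Linux, as described above: LINT0 = ExtINT, LINT1 = NMI. */
struct lint_config stock_linux_config(void)
{
    struct lint_config c;
    c.lint0 = lvt_lint_entry(LVT_DM_EXTINT, 0);
    c.lint1 = lvt_lint_entry(LVT_DM_NMI, 0);
    return c;
}

/* SCC Linux, as described above: both pins are fixed-vector interrupts. */
struct lint_config scc_linux_config(void)
{
    struct lint_config c;
    c.lint0 = lvt_lint_entry(LVT_DM_FIXED, 0x34);
    c.lint1 = lvt_lint_entry(LVT_DM_FIXED, 0x33);
    return c;
}
```

With this encoding, the SCC Linux entry for LINT1 has delivery mode Fixed rather than NMI, which is why asserting the "NMI" bit only raises an ordinary maskable interrupt.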
Internally, the Linux kernel uses IRQ numbers, not interrupt vector numbers, to distinguish interrupt lines. SCC Linux maps interrupt vector 0x34 to IRQ4, and 0x33 to IRQ3. Afterwards, drivers can use request_irq(4, ...) or request_irq(3, ...), respectively, to register their handlers. This is what you have found in rckmb.c: it uses IRQ4, which is mapped to LINT0, which is controlled by the bit named "INTR" in the SCCEAS. IRQ3 is used by rckpc, which is controlled by the "NMI" bit in the EAS, although it is not an NMI.
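The vector-to-IRQ mapping and handler registration described above can be modelled in user space roughly as follows. This is only a sketch: model_request_irq() and model_dispatch() are hypothetical stand-ins for the kernel's request_irq() and its interrupt entry path, and the handler table is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NR_MODEL_IRQS 16

typedef void (*irq_handler_fn)(int irq);

/* One handler slot per IRQ number in this toy model. */
static irq_handler_fn handlers[NR_MODEL_IRQS];

/* Records the last IRQ seen by the sample handler below. */
int last_irq_seen = -1;

/* Mapping stated in the thread: vector 0x34 -> IRQ4, vector 0x33 -> IRQ3. */
int scc_vector_to_irq(unsigned vector)
{
    switch (vector) {
    case 0x34: return 4;  /* LINT0, EAS bit "INTR", used by rckmb */
    case 0x33: return 3;  /* LINT1, EAS bit "NMI", used by rckpc */
    default:   return -1; /* not one of the two LINT vectors */
    }
}

/* Stand-in for request_irq(): remember the handler for an IRQ line. */
void model_request_irq(int irq, irq_handler_fn fn)
{
    assert(irq >= 0 && irq < NR_MODEL_IRQS);
    assert(fn != NULL);
    handlers[irq] = fn;
}

/* Stand-in for the IDT entry path: translate the vector, call the handler.
 * Returns the IRQ number that was handled, or -1 if nothing was registered. */
int model_dispatch(unsigned vector)
{
    int irq = scc_vector_to_irq(vector);
    if (irq < 0 || handlers[irq] == NULL)
        return -1;
    handlers[irq](irq);
    return irq;
}

/* Sample handler for the model: just records which IRQ fired. */
void record_irq(int irq)
{
    last_irq_seen = irq;
}
```

For example, after model_request_irq(4, record_irq), a call to model_dispatch(0x34) runs the handler for IRQ4, mirroring what happens when the "INTR" bit asserts LINT0 under SCC Linux.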
That's an excellent description of how things work. Two maskable interrupts were configured, rather than one maskable interrupt and an NMI, because of the very limited number of interrupt lines available and the limited utility of NMI.
Atomic interaction with the interrupt bits of another core is very tricky. So, the new FPGA bitstream for 1.40 adds a global interrupt controller more suitable to the architecture of SCC. It greatly facilitates IPIs and eMAC interrupt handling. Check it out (see http://communities.intel.com/docs/DOC-6241).
Does this mean that our EAS needs to be updated? In EAS Table 6 we describe the INTR and NMI fields. We call the NMI the non-maskable interrupt. And that is what it is, but I'm interpreting this discussion as meaning that for SCC Linux we configured both INTR and NMI to be maskable ... that is through the local APIC. Is it possible that another OS could have configured them differently? In EAS Section 8.3, we say that software can generate a non-maskable and a maskable interrupt, but that "Alternatively, the NMI can be configured as a second maskable (LINT1) interrupt as well." And actually with SCC Linux that is exactly what happens.
So then, does this mean that when John Lee did his cli;hlt and then toggled the NMI field in GLCFGx, this was not going to bring the core back if he was running SCC Linux ... which is in fact what he observed? But that if he had a different OS that configured the local APIC differently, he would get the core back?
The EAS as an EAS does not know about SCC Linux. As far as the EAS is concerned, SCC Linux is just one instance of software making hardware configuration choices.
Am I interpreting this discussion correctly?
Ted, your interpretation is correct.
I'd prefer if the bit names in the EAS were just LINT0 and LINT1; or alternatively, matched the pin names from the processor datasheet: "INTR/LINT0" and "NMI/LINT1". I find it misleading to name the pins "INTR" and "NMI" only, because this implies a configuration of the processor that does not make sense on the SCC.
The current names "INTR"+"NMI" apply to exactly two system configurations: either the local APIC is disabled (default configuration after bootstrap), or the local APIC is enabled and the LINTs are configured as LINT0=ExtINTR and LINT1=NMI (thus mimicking the configuration of the APIC being turned off, as far as external hardware is concerned). In both cases, the pins operate exactly like on a good old 8086. When using any other configuration, the pins' functions are in my opinion better described by the names "LINT0" and "LINT1", because asserting them just happens to invoke whatever action is configured in the local APIC.
When John Lee did cli followed by hlt, the core behaved exactly as expected: it disabled delivery of all maskable interrupts, then halted waiting for any interrupt to come in. The only problem was that there was no means to send such an interrupt to the core. Because SCC Linux configured both LINT0 and LINT1 as vectored (i.e., maskable) interrupts, asserting either of these signals was simply ignored. Setting the "NMI" bit in the GLCFG register just asserted LINT1 (the "NMI/LINT1" pin of the core), which happens to be a regular maskable interrupt, so it could not wake up the core. Indeed, I think there is no way to wake up a core from such a state on the SCC; all you could do is assert its reset signal, but you would obviously lose your register state and start executing in real mode again.
If one of the LINTs was configured as an NMI, I assume the core would have woken up.
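The reasoning in the last two posts boils down to a small decision rule, sketched here as a toy model (not hardware-accurate; the enum and function names are invented for this example). A halted core resumes only when an interrupt is actually delivered, and with interrupts disabled by cli only an NMI-type event is delivered.

```c
#include <assert.h>
#include <stdbool.h>

/* How the local APIC is programmed to treat a LINT pin (illustrative). */
enum lint_delivery {
    LINT_FIXED,  /* vectored interrupt, gated by the IF flag */
    LINT_EXTINT, /* legacy INTR, also gated by the IF flag */
    LINT_NMI     /* non-maskable, ignores the IF flag */
};

/* Returns true if asserting a LINT pin programmed as `mode` would wake a
 * core that executed hlt, given the state of the IF flag at that time
 * (false after cli, true after sti). */
bool lint_wakes_halted_core(bool interrupts_enabled, enum lint_delivery mode)
{
    switch (mode) {
    case LINT_NMI:
        return true;               /* delivered regardless of IF */
    case LINT_FIXED:
    case LINT_EXTINT:
        return interrupts_enabled; /* blocked while IF is clear */
    }
    assert(!"unknown LINT delivery mode");
    return false;
}
```

Under SCC Linux both pins are LINT_FIXED, so lint_wakes_halted_core(false, LINT_FIXED) is false for either pin after "cli; hlt"; with one pin programmed as LINT_NMI the same call would return true.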
Speaking of the APIC, something else just crossed my mind. Do we have any means to send a message over the three-wire APIC bus? Like on a regular multi-processor system, where all CPUs but one are placed in the "cli+halt" sleep state upon boot. The one remaining processor can then perform a STARTUP IPI by instructing its own local APIC to send a message to a remote APIC, and this IPI is also not maskable.
> There is no APIC bus on SCC, and nothing equivalent.
I still don't have a good grasp on how the APIC system works (I'm learning!), but this raises a question. I've read that, in a traditional SMP system, an inter-processor interrupt (IPI) can be generated by writing to the Interrupt Control Register (ICR) of the local APIC. Is this also how IPIs are done on the SCC? If so, is that interrupt sent by some other means, and not by the (nonexistent) APIC bus? What is the APIC bus used for, if not IPIs?
In a regular SMP system, the APIC bus connects the LAPICs and IOAPICs. To send an IPI, a processor instructs its LAPIC by writing to the ICR. Typically, these are vectored IRQs (i.e., an interrupt vector number between 0 and 255 is sent to the target), but other types (INIT, STARTUP, NMI, SMI) are possible as well.
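For reference, this is roughly what "writing to the ICR" looks like on a regular SMP machine, shown as a sketch that only computes the two 32-bit register values. The field layout (vector in bits 7:0, delivery mode in bits 10:8, level-assert in bit 14, destination APIC ID in the top byte of the high word) follows the xAPIC description in the Intel SDM; the struct and function names are made up for this example, and none of this applies to the SCC, which has no APIC bus.

```c
#include <assert.h>
#include <stdint.h>

/* Delivery-mode field (bits 10:8) of the ICR low word. */
enum icr_delivery_mode {
    ICR_DM_FIXED   = 0x0, /* ordinary vectored interrupt */
    ICR_DM_SMI     = 0x2,
    ICR_DM_NMI     = 0x4,
    ICR_DM_INIT    = 0x5,
    ICR_DM_STARTUP = 0x6
};

#define ICR_LEVEL_ASSERT (1u << 14)

/* The ICR is split across two 32-bit registers (offsets 0x310 and 0x300
 * in the memory-mapped local APIC). Software writes the high word first;
 * the write to the low word is what sends the IPI. */
struct icr_value {
    uint32_t high; /* destination APIC ID in bits 31:24 */
    uint32_t low;  /* vector, delivery mode, level */
};

/* Hypothetical helper: encode an IPI addressed to one specific APIC ID. */
struct icr_value make_ipi(uint8_t dest_apic_id,
                          enum icr_delivery_mode mode,
                          uint8_t vector)
{
    struct icr_value v;
    assert(((unsigned)mode & ~0x7u) == 0); /* mode must fit in 3 bits */
    v.high = (uint32_t)dest_apic_id << 24;
    v.low  = ICR_LEVEL_ASSERT | ((uint32_t)mode << 8) | vector;
    return v;
}
```

For instance, make_ipi(1, ICR_DM_FIXED, 0x34) describes a vectored interrupt 0x34 for the core with APIC ID 1, while make_ipi(1, ICR_DM_STARTUP, 0x08) describes a STARTUP IPI, the non-maskable type mentioned earlier in the thread.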
On the SCC, the LAPICs are still present, but not connected to an APIC bus. The state of the corresponding pins is controlled directly via the tile's configuration registers, but as I understand it, you cannot use it to send "messages" to the LAPIC (that would appear as if they had been sent over a real APIC bus). So in short, no IPIs on SCC. Each core still has its two LINT pins (which are configured by SCC Linux as interrupt lines 4 and 3), though, and you can assert them to send interrupts. However, you won't be able to send an interrupt vector number (or, for that matter, any of the "special" IPI types) this way.