I haven't seen anyone doing DMA. What would you be doing DMA from? There are really no devices on the SCC. There's just the PCIe interface and eth1 to the BMC.
I want to transfer large amounts of data from the SCC's DRAM to the SCC's DRAM.
I have no plans to communicate with anything outside the SCC board.
Do you mean DMA from core to core? I don't see how that can be done. I think you might simulate this DMA through shared memory ... I mean, have one core write to shared memory, which makes the data accessible to the other core. RCCE currently provides about 512 MB of shared memory, and I think one could squeeze a GB out.
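To make the shared-memory suggestion concrete, here is a minimal sketch of the handoff: one side copies its private data into a shared region and raises a flag, the other side waits for the flag and copies the data out. It does not use the RCCE API or run on the SCC as written. Two POSIX processes stand in for two cores, an anonymous `mmap` region stands in for the shared memory, and the names `shared_buf` and `transfer_via_shared` are hypothetical.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Layout of the shared region: a ready flag followed by the payload.
 * Assumption: this layout is invented for the sketch. */
typedef struct {
    atomic_int ready;            /* 0 = empty, 1 = payload is valid */
    int        data[];           /* payload (flexible array member) */
} shared_buf;

/* Hypothetical helper: moves n ints from src to dst through a shared
 * region. Returns 0 on success, -1 if the region or process cannot be
 * created. */
int transfer_via_shared(const int *src, int *dst, size_t n)
{
    size_t bytes = sizeof(shared_buf) + n * sizeof(int);
    shared_buf *sh = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (sh == MAP_FAILED)
        return -1;
    atomic_init(&sh->ready, 0);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(sh, bytes);
        return -1;
    }
    if (pid == 0) {
        /* "Sending core": copy private data into the shared region,
         * then publish it with a release store. */
        memcpy(sh->data, src, n * sizeof(int));
        atomic_store_explicit(&sh->ready, 1, memory_order_release);
        _exit(0);
    }

    /* "Receiving core": busy-wait on the flag, then copy the payload
     * from the shared region into private memory. */
    while (atomic_load_explicit(&sh->ready, memory_order_acquire) == 0)
        ;
    memcpy(dst, sh->data, n * sizeof(int));

    waitpid(pid, NULL, 0);       /* reap the sender process */
    munmap(sh, bytes);
    return 0;
}
```

Note that the data is copied twice (private to shared, then shared to private), and both copies are done by the cores themselves, which is exactly why this only simulates DMA rather than replacing it.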
This is uncached shared memory ... yes, we are still fussing with how to reliably deal with cached shared memory. The cacheable shared memory driver works fine; we're still trying to understand how to deal with the non-coherence in L2.
I think DMA might be possible between the cores and the MCPC.
Does anyone have something to add to this DMA discussion? I haven't seen people doing DMA with the SCC, but that doesn't mean it's not happening.
I'm interpreting your issue as meaning you want to transfer large amounts of data from one core's private memory to another core's private memory. Have you looked at the ETI software? They have good performance with large message sizes, although I don't know how large is large.
I think DMA is useful when computation and communication need to proceed simultaneously.
So, if I want to transfer a big array A during some computation on a big array B, DMA would be a good candidate for the data transfer.
That's why I am asking whether a DMA controller exists.
I think that, for this to work, each core would need a DMA controller. However, I have never seen a DMA controller in the block diagrams in Intel's SCC documentation.
So, I just want you to confirm that there is no DMA controller on the SCC board.
Or, if there is one, I would like to look at it and use it.
I don't think you can use DMA between the cores. But to find out for sure, I escalated this question to our hardware engineers. Stay tuned.
No. There isn't. There is no equivalent of the 8237 DMA controller of a PC/AT platform or the DMA engines of an I/O device. The cores are your data movement engines. Overlap computation with data movement by having one or more cores compute while one or more others move the data. Given the capabilities of your engine, it would be more efficient to do streaming operations (part of the processing) while moving the data, but that is up to you. You can also share memory beyond the "shared memory area" set up by default in the Linux configuration.
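As a rough illustration of using a core as the data movement engine, the sketch below has one process move array A into a shared region while another sums array B, and only then waits for the move to finish. This is an assumption-laden stand-in, not SCC or RCCE code: POSIX processes play the role of cores, an anonymous `mmap` region plays the role of shared memory, and `move_while_computing` is a hypothetical name.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a "mover" process copies array a (na > 0 ints)
 * into shared memory while the caller sums array b. On success, a_out
 * holds the moved copy of a, *b_sum holds the sum of b, and the
 * function returns 0. It returns -1 on failure. */
int move_while_computing(const int *a, size_t na, int *a_out,
                         const int *b, size_t nb, long *b_sum)
{
    size_t bytes = na * sizeof(int);
    int *shared = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;

    pid_t mover = fork();
    if (mover < 0) {
        munmap(shared, bytes);
        return -1;
    }
    if (mover == 0) {
        /* "Mover core": its only job is to move A. */
        memcpy(shared, a, bytes);
        _exit(0);
    }

    /* "Compute core": works on B while A is being moved. */
    long sum = 0;
    for (size_t i = 0; i < nb; i++)
        sum += b[i];

    /* Wait for the mover, the analogue of a DMA-complete signal. */
    int status = 0;
    int ok = waitpid(mover, &status, 0) == mover &&
             WIFEXITED(status) && WEXITSTATUS(status) == 0;
    if (ok) {
        memcpy(a_out, shared, bytes);
        *b_sum = sum;
    }
    munmap(shared, bytes);
    return ok ? 0 : -1;
}
```

The cost of this approach is that the mover does no useful computation of its own, so it only pays off when the transfer is large enough to be worth dedicating a core to it.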