Marvell Kirkwood 88F6281
Marvell Kirkwood 88F6281 (SheevaPlug) (Sheeva 88SV131 CPU core) (Feroceon 88FR131 rev 1 (v5l)) 1200 MHz
- L1 Data cache = 16 KB, 4-way, 32 B/line.
- L1 Instruction cache = 16 KB, 4-way, 32 B/line.
- L2 cache = 256 KB, 32 (?) B/line, 4-way (400 MHz?).
- L1 TLB = 8 items.
- L2 TLB = 64 items.
- Mbus-Light (Mbus-L): separate interfaces to the DDR controller and to the Mbus bridge.
- 200 MHz 64-bit Mbus
- DDR SDRAM Controller
- DDR2 400 MHz, Dual channel, 16-bit, 3.2 GB/s.
- supports up to four DRAM banks (four DRAM chip selects).
- supports all DDR2 devices with densities up to 2 Gb.
- supports up to 32 open pages (one page per bank, up to eight pages per chip select)
and DRAM bank interleaving.
- Up to 2 GB total address space. Supports DDR:CPU clock ratios of 1:N and 2:N.
- support for 2T mode.
- supports up to a 128-byte burst per single transaction from the Mbus port.
- supports up to a 32-byte burst per single transaction from the Mbus-L port.
- contains a transaction queue and read and write buffers. It can absorb up to 4 transactions
of 128 bytes each in its buffers. Transactions from the Mbus are pushed into the
transaction queue. The SDRAM controller arbitrates between the transaction from the
top of the queue and transactions received from the CPU Mbus-L path, always giving
priority to the CPU.
- For a CPU read from the DRAM, read data is not pushed to the read buffer.
It goes directly to the CPU bus interface unit via a 64-bit wide Mbus-L path.
This minimizes read latency.
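
The DRAM figures in the list above can be cross-checked with a little arithmetic. The sketch below is not taken from Marvell documentation: the function name and parameters are hypothetical, and it assumes that "Dual channel" simply doubles the per-channel rate and that each chip select has eight banks, which is the reading that reproduces the listed 3.2 GB/s and 32 open pages.

```python
def peak_bandwidth_bytes_per_s(clock_hz, bus_bits, channels=1, transfers_per_clock=2):
    """Theoretical peak rate of a DDR interface, in bytes per second.

    transfers_per_clock=2 models double data rate (one transfer per clock edge).
    """
    bytes_per_transfer = bus_bits // 8
    return clock_hz * transfers_per_clock * bytes_per_transfer * channels


# Figures from the list: DDR2 at 400 MHz, 16-bit bus, dual channel (assumed to mean x2).
peak = peak_bandwidth_bytes_per_s(400_000_000, 16, channels=2)
print(peak / 1e9, "GB/s")

# Open pages: one per bank, assuming 8 banks on each of the 4 chip selects.
chip_selects = 4
banks_per_chip_select = 8
open_pages = chip_selects * banks_per_chip_select
print(open_pages, "open pages")
```

This is a theoretical ceiling only; the measured RAM read and write rates further down the page are far lower.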
4 KB pages mode
|Size  ||Latency (cycles)   ||Description |
|16 K  ||3                  ||L1 TLB + L1 |
|32 K  ||24                 ||+ 21 (L1 miss -> L2 hit) |
|256 K ||31                 ||+ 7 (L1 TLB miss -> L2 TLB hit) |
|...   ||31 cycles + 180 ns ||+ 120 ns (L2 TLB miss) + 60 ns (L2 miss -> RAM) |
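
The size boundaries in the table line up with the cache sizes and the TLB reach for 4 KB pages: 8 L1 TLB entries cover 32 KB, and 64 L2 TLB entries cover 256 KB. The following is a rough step model of the table, assuming a 1200 MHz core clock; the constant and function names are hypothetical, and a real latency curve ramps between the steps rather than jumping.

```python
CPU_HZ = 1_200_000_000          # core clock from the page header
PAGE_BYTES = 4 * 1024           # 4 KB pages mode

L1_BYTES = 16 * 1024
L1_TLB_REACH = 8 * PAGE_BYTES   # 8 entries x 4 KB = 32 KB
L2_BYTES = 256 * 1024
L2_TLB_REACH = 64 * PAGE_BYTES  # 64 entries x 4 KB = 256 KB


def cycles_to_ns(cycles):
    """Convert core clock cycles to nanoseconds."""
    return cycles * 1e9 / CPU_HZ


def expected_latency_ns(working_set_bytes):
    """Step approximation of load latency for a given working-set size."""
    if working_set_bytes <= L1_BYTES:
        return cycles_to_ns(3)            # L1 TLB hit + L1 hit
    if working_set_bytes <= L1_TLB_REACH:
        return cycles_to_ns(24)           # L1 miss -> L2 hit
    if working_set_bytes <= min(L2_BYTES, L2_TLB_REACH):
        return cycles_to_ns(31)           # L1 TLB miss -> L2 TLB hit
    return cycles_to_ns(31) + 180.0       # L2 TLB miss + L2 miss -> RAM


for size_kb in (8, 24, 128, 1024):
    print(size_kb, "KB:", round(expected_latency_ns(size_kb * 1024), 2), "ns")
```

Under this model the 3, 24 and 31 cycle rows correspond to 2.5 ns, 20 ns and about 25.8 ns, so the full miss path costs roughly 206 ns.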
- ldr r1, [r2, +r3, lsl #2] stalls the pipeline for one additional cycle, so latency = 4 cycles and throughput is 1 instruction per 2 cycles.
- L2 doesn't support parallel read accesses.
- L2 Read B/W (32 Bytes stride) = 24 cycles per 32-byte cache line
- RAM Read B/W (4-32 Bytes stride) = 600 MB/s
- Write B/W (any stride) = 380 MB/s (10 ns per 4-byte write). Write-Allocate was disabled.
Branch misprediction penalty = 4 cycles.
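
The per-line and per-write timings above can be converted into bandwidth figures for comparison. This is a small sketch assuming the 1200 MHz core clock; the helper names are hypothetical. Whether the page's "MB" means 10^6 or 2^20 bytes is not stated, so both are printed for the write case.

```python
CPU_HZ = 1_200_000_000   # core clock from the page header
MIB = 1024 * 1024


def bytes_per_s_from_cycles(bytes_moved, cycles, cpu_hz=CPU_HZ):
    """Bandwidth when bytes_moved are transferred every `cycles` core cycles."""
    return bytes_moved * cpu_hz / cycles


def bytes_per_s_from_ns(bytes_moved, ns):
    """Bandwidth when bytes_moved are transferred every `ns` nanoseconds."""
    return bytes_moved * 1e9 / ns


# L2 read: 24 cycles per 32-byte cache line.
l2_read = bytes_per_s_from_cycles(32, 24)
print("L2 read:", round(l2_read / 1e6), "MB/s (10^6 bytes)")

# Write: 10 ns per 4-byte write.
write = bytes_per_s_from_ns(4, 10)
print("Write:", round(write / 1e6), "MB/s (10^6 bytes),",
      round(write / MIB), "MiB/s (2^20 bytes)")
```

The write case comes out at 400 MB/s in decimal units or about 381 MiB/s, so the listed 380 MB/s is consistent with a rounded 10 ns per write either way.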
SheevaPlug at Wikipedia
Kirkwood Processors at Marvell