Marvell Kirkwood 88F6281

Configuration

Marvell Kirkwood 88F6281 (SheevaPlug) (Sheeva 88SV131 CPU core) (Feroceon 88FR131 rev 1 (v5l)) 1200 MHz

L1 Data cache = 16 KB. 4-Way, 32 B/line.
L1 Instruction cache = 16 KB, 4-Way, 32 B/line..
L2 cache = 256 KB. 32 ? B/line, 4-Way, (400 MHz?).
L1 TLB = 8 items.
L2 TLB = 64 items.
Mbus-Light (Mbus-L) separate interfaces to DDR controller and to Mbus bridge.
200 MHz 64-bit Mbus
DDR SDRAM Controller
- DDR2 400 MHz, Dual channel, 16-bit, 3.2 GB/s.
- supports up to four DRAM banks (four DRAM chip selects).
- supports all DDR2 devices with densities up to 2 Gb.
- supports up to 32 open pages (page per bank). It supports DRAM bank interleaving, as well as open pages (up to eight pages per chip select).
- Up to 2 GB total address space. DDR:CPU Clock ratio of 1:N and 2:N support.
- support for 2T mode.
- supports up to a 128-byte burst per single transaction from the Mbus port.
- supports up to a 32-byte burst per single transaction from the Mbus-L port.
- contains a transaction queue, read and write buffers. It can absorb up to 4 transactions of 128 byte each, in its buffers. Transactions from the Mbus are pushed into the transaction queue. The SDRAM controller arbitrates between the transaction from the top of the queue and transactions received from the CPU Mbus-L path, always giving priority to the CPU.
- For a CPU read from the DRAM, read data is not pushed to the read buffer. It goes directly to the CPU bus interface unit via a 64-bit wide Mbus-L path. This minimizes read latency.

4 KB pages mode

Size	Latency	Description
16 K	3	L1 TLB + L1
32 K	24	+ 21 (L1 miss -> L2 hit)
256 K	31	+ 7 (L1 TLB miss -> L2 TLB hit)
...	31 + 180 ns	+ 120 ns (L2 TLB miss) + 60 ns (L2 miss -> RAM)

ldr r1, [r2, +r3, lsl #2] locks pipeline for one additional cycle, so latency = 4 and throughput is 1 command / 2 cycles.
L2 doesn't support parallel read accesses.
L2 Read B/W (32 Bytes stride) = 24 cycles per 32-byte cache line
RAM Read B/W (4-32 Bytes stride) = 600 MB/s
Write B/W (any stride) = 380 MB/s (10 ns per 4-byte write). Write-Allocate was disabled.

Branch misprediction penalty = 4 cycles.

Links

SheevaPlug at Wikipedia

Kirkwood Processors at Marvell