Intel i5
Configuration
Intel i5-650, Clarkdale, Westmere, 32 nm, 81 mm2, 382 M Transistors + GPU / RAM controller (45 nm, 177 M Transistors, 114 mm2).
GIGABYTE H55M-S2, Intel H55 (IbexPeak DH), Dual-Channel, 2 * 2048 MB PC3-10600 666.7 MHz DDR3 Kingston, 9-9-9-24, external graphics card.
- L1 Data cache = 32 KB. 64 B/line, 8-WAY. (Write-Allocate?)
- L1 Instruction cache = 32 KB. 4-WAY. ? B/line
- L2 cache = 256 KB. 64 B/line, 8-WAY
- L3 cache = 4 MB. 64 B/line, ?-WAY
- Load Buffers = 48 items
- Store Buffers = 32 items
- RS = 36 items
- ROB = 128 items
4 KB pages mode (64-bit Windows, 64-bit software)
- Data TLB L1 size = 64 items. Miss penalty = 7 cycles.
- Instruction TLB L1 size = 64 items per thread.
- TLB L2 size = 512 items.
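The TLB sizes translate into an address range ("reach") that can be accessed without a TLB miss. A minimal sketch of that arithmetic, assuming every TLB entry maps a distinct page; the helper name `tlb_reach` is hypothetical, and the entry counts are the ones listed in this document (the 2 MB figure uses the Data TLB L1 size given for 2 MB pages mode):

```python
KIB = 1024
MIB = 1024 * KIB

def tlb_reach(entries: int, page_bytes: int) -> int:
    """Bytes of address space covered when each TLB entry maps a distinct page."""
    return entries * page_bytes

# Entry counts come from the lists in this document.
l1_dtlb_4k = tlb_reach(64, 4 * KIB)    # Data TLB L1, 4 KB pages
l2_tlb_4k = tlb_reach(512, 4 * KIB)    # TLB L2, 4 KB pages
l1_dtlb_2m = tlb_reach(32, 2 * MIB)    # Data TLB L1, 2 MB pages

print(l1_dtlb_4k // KIB, "KB")   # 256 KB
print(l2_tlb_4k // KIB, "KB")    # 2048 KB
print(l1_dtlb_2m // MIB, "MB")   # 64 MB
```

With 4 KB pages the L1 data TLB covers 256 KB and the L2 TLB covers 2 MB, which matches the working-set sizes at which the TLB miss penalties show up in the latency table.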
| Size | Latency | Description |
| 32 K | 4 | TLB + L1 |
| 256 K | 10 | +6 (L2) |
| 2 M | 47 | +30 (L3) +7 (L1 TLB miss) |
| 4 M | 57 | +10 (L2 TLB miss) |
| ... | 57 + 93 ns | + 93 ns (RAM) |
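The table is cumulative: each row adds a penalty to the row above it. The sketch below reproduces that sum. It assumes each Size value is an inclusive upper bound on the working set, which the table does not state explicitly; `STEPS` and `load_latency` are hypothetical names used only for illustration.

```python
# (working-set upper bound in bytes, added cycles, reason), copied from the table.
STEPS = [
    (32 * 1024, 4, "TLB + L1"),
    (256 * 1024, 6, "L2"),
    (2 * 1024 * 1024, 30 + 7, "L3 + L1 TLB miss"),
    (4 * 1024 * 1024, 10, "L2 TLB miss"),
]
RAM_EXTRA_NS = 93  # added on top of the cycle count once the working set exceeds 4 MB

def load_latency(working_set_bytes: int) -> tuple[int, int]:
    """Return (cycles, extra_ns) for a random load within the given working set."""
    cycles = 0
    for limit, added, _reason in STEPS:
        cycles += added
        if working_set_bytes <= limit:
            return cycles, 0
    return cycles, RAM_EXTRA_NS

for size in (16 * 1024, 256 * 1024, 1024 * 1024, 4 * 1024 * 1024, 64 * 1024 * 1024):
    print(size, load_latency(size))
```

The RAM row stays in mixed units (cycles plus nanoseconds) as in the table, because converting 93 ns to cycles would require assuming a core clock.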
2 MB pages mode (64-bit Windows, 64-bit software)
- Data TLB L1 size = 32 items.
- Instruction TLB L1 = 7 items per thread.
MISC
- 64-byte boundary (cache line) crossing penalty = 4 cycles.
- 4096-byte boundary (page) crossing penalty = 24 cycles.
- L1 Read with (L1 TLB miss -> L2 TLB hit) = 2 cycles per read (throughput)
- L2 Read with (L2 TLB miss) doesn't allow similar parallel accesses.
- L2->L1 B/W (Parallel Random Read) = 4 cycles per cache line
- L2->L1 B/W (Read, 64 bytes step) = 3.71 cycles per cache line
- L2 Write (Write, 64 bytes step) = 6.70 cycles per write (cache line)
- L3->L1 B/W (Parallel Random Read) = 5.90 cycles per cache line
- L3->L1 B/W (Read, 64 bytes step) = 5.75 cycles per cache line
- L3 Write (Write, 64 bytes step) = 10.40 cycles per write (cache line)
- RAM Read B/W (Parallel Random Read) = 12 ns / cache line = 5300 MB/s
- RAM Read B/W (Read, 8 Bytes step) = 8400 MB/s
- RAM Read B/W (Read, 64 Bytes step) = 9860 MB/s
- RAM Read B/W (Read, 64 Bytes step - pointer chasing) = 7100 MB/s
- RAM Write B/W (Write, 4-64 Bytes step) = 6600 MB/s
Branch misprediction penalty = 15-16 cycles.
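The bandwidth entries in the MISC list mix two units: cycles per cache line for the caches and ns per cache line or MB/s for RAM. A small sketch for converting between them, assuming a 64-byte cache line (as listed above) and decimal megabytes. The 3.2 GHz clock is an assumption (the nominal i5-650 frequency); the document does not state the clock used during measurement, so the GB/s figures are only indicative. All function names are hypothetical.

```python
CACHE_LINE_BYTES = 64
ASSUMED_CLOCK_HZ = 3.2e9  # assumption: nominal i5-650 clock, not stated in the document

def mb_per_s_from_ns_per_line(ns_per_line: float) -> float:
    """Convert 'ns per cache line' to MB/s (1 MB = 10**6 bytes)."""
    return CACHE_LINE_BYTES / (ns_per_line * 1e-9) / 1e6

def bytes_per_cycle(cycles_per_line: float) -> float:
    """Convert 'cycles per cache line' to bytes per cycle."""
    return CACHE_LINE_BYTES / cycles_per_line

def gb_per_s(cycles_per_line: float, clock_hz: float = ASSUMED_CLOCK_HZ) -> float:
    """Convert 'cycles per cache line' to GB/s at the assumed clock."""
    return bytes_per_cycle(cycles_per_line) * clock_hz / 1e9

print(round(mb_per_s_from_ns_per_line(12)))  # RAM parallel random read, listed as 5300 MB/s
print(bytes_per_cycle(4))                    # L2->L1 parallel random read
print(round(gb_per_s(3.71), 1))              # L2->L1 sequential read at the assumed clock
```

The first conversion gives about 5333 MB/s, consistent with the rounded 5300 MB/s in the list.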