[Chip diagram: RISC-V config · host I/F · SRAM · spike in / spike out · mesh I/F. Catalyst N3, coming soon.]

N3

128 cores. 524,288 physical neurons at 24-bit precision (1,048,576 at 8-bit), up to 8.4M virtual with TDM. Hybrid ANN/SNN. Hardware virtualization. On-chip continual learning.

Read the research Get in touch

Hybrid Computing

ANN and SNN
on one chip.

Each core can switch between spiking-neural-network mode and INT8 multiply-accumulate mode, so classical deep learning layers run alongside spiking layers on the same silicon. No other neuromorphic chip offers this per-core switch.

4-Level Memory

From 96 KB
to unlimited.

L1 SRAM per core (96 KB), L2 shared tile cache (1 MB), L3 external DRAM (500M+ synapses), L4 CXL fabric. Loihi 2 has two levels. N3 has four, with hardware-managed caching and LRU eviction.
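
A toy model of the hardware-managed caching described above, assuming a simple fully-associative LRU policy; the real L2 geometry, block sizes, and replacement details are not modelled here.

```python
from collections import OrderedDict

class SynapseCache:
    """Toy fully-associative cache with LRU eviction, standing in for the
    hardware-managed L2 tile cache in front of L3 DRAM."""

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing          # stands in for L3-resident synapse blocks
        self.lines = OrderedDict()
        self.misses = 0

    def fetch(self, block):
        if block in self.lines:
            self.lines.move_to_end(block)       # LRU touch on hit
            return self.lines[block]
        self.misses += 1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)      # evict least recently used
        self.lines[block] = self.backing[block]
        return self.lines[block]

backing = {block: block * 10 for block in range(4)}   # pretend L3 contents
l2 = SynapseCache(capacity=2, backing=backing)
for block in (0, 1, 0, 2, 1):                         # mix of hits and misses
    l2.fetch(block)
```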

NeurOS Virtualization

680+ networks
on one chip.

Hardware-scheduled time-division multiplexing with dirty-page tracking and compressed DMA context switching. Each physical neuron handles 8 virtual time slots. No other neuromorphic chip offers hardware virtualization.
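
The TDM scheme can be sketched as a minimal model: one physical LIF update loop serving 8 virtual time slots, each slot carrying its own membrane state. The decay and threshold constants are illustrative, not N3 defaults.

```python
import numpy as np

SLOTS = 8  # virtual time slots per physical neuron (from the N3 spec)

def step_tdm(v, inputs, decay=0.9, threshold=1.0):
    """Advance all virtual neurons one timestep on one physical unit.

    v, inputs: arrays of shape (SLOTS,). Returns (new_v, spikes).
    """
    spikes = np.zeros(SLOTS, dtype=bool)
    for slot in range(SLOTS):        # the hardware scheduler walks the slots
        v[slot] = decay * v[slot] + inputs[slot]
        if v[slot] >= threshold:     # fire and reset this virtual neuron
            spikes[slot] = True
            v[slot] = 0.0
    return v, spikes

v = np.zeros(SLOTS)
v, s = step_tdm(v, np.full(SLOTS, 1.2))  # strong input: every slot fires
```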

Learning Accelerators

16 independent
learning engines.

One per tile, each with a 28-opcode ISA and 80 registers. Loihi 2 has a single global learning engine; N3 runs 16 in parallel. No chip-wide bottleneck. Learning at wire speed.

Precision-Adaptive

24, 16, or 8 bits.
Your call.

Configure neuron precision per core. At 8-bit, neuron density doubles to over 1 million total. FACTOR low-rank synapse compression saves 2–8× memory. Same network, multiple precision targets.

Continual Learning

Networks that
stabilise themselves.

Hardware metaplasticity (3-bit synaptic consolidation), homeostatic plasticity (firing rate tracking with synaptic scaling), and synaptic fatigue. On-chip mechanisms for networks that learn continuously without catastrophic forgetting.
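
A minimal sketch of the homeostatic loop, assuming a simple EWMA rate estimate and multiplicative scaling of each neuron's incoming weights; the constants are illustrative, not hardware values.

```python
import numpy as np

def homeostatic_step(rates, spikes, weights, target=0.05, alpha=0.01, eta=0.1):
    """One timestep of firing-rate tracking plus synaptic scaling.

    rates:   per-neuron EWMA firing-rate estimates
    spikes:  0/1 activity this timestep
    weights: incoming weights, one row per neuron
    """
    rates = (1 - alpha) * rates + alpha * spikes   # EWMA rate tracking
    scale = 1 + eta * (target - rates)             # above target -> shrink inputs
    weights = weights * scale[:, None]             # scale incoming synapses
    return rates, weights

rates = np.array([1.0, 0.0])     # one saturated neuron, one silent neuron
spikes = np.array([1.0, 0.0])
weights = np.ones((2, 3))
rates, weights = homeostatic_step(rates, spikes, weights)
```

Over many timesteps the scaling pushes every neuron's rate toward the target, which is the stabilising effect the hardware mechanism provides.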

Full Die

128 cores.
36 MB SRAM.

128 neuromorphic cores across 16 tiles, 4 RISC-V management CPUs, async hybrid network-on-chip, and 36 MB on-chip SRAM. Every architectural feature implemented in RTL and validated on FPGA.


Scale

4.2M
Virtual neurons (24-bit)
500M+
Addressable synapses
36 MB
On-chip SRAM
256
GMAC/s (ASIC projected)

Architecture

128
Cores (16 tiles)
8
Neuron models (7 + custom)
16
Learning accelerators
4-level
Memory hierarchy

Performance

91.0%
SHD accuracy
76.4%
SSC accuracy
4.04
nJ / neuron-op (FPGA)
3.7×
Energy efficiency vs N2

What N3 adds over N2.

Unique

ANN INT8 MAC Mode

Per-core toggle between spiking and classical multiply-accumulate. Deploy hybrid ANN/SNN networks natively.

Unique

NeurOS Virtualization

Hardware-scheduled TDM for 680+ virtual networks. Context switching with dirty tracking and compressed DMA.

Unique

Per-Tile Learning

16 independent learning accelerators. 28-opcode ISA, 80 registers each. 16× the throughput of a single global engine.

N3

4-Level Memory

L1 per-core, L2 tile cache, L3 DRAM-backed, L4 CXL fabric. 500M+ addressable synapses with hardware LRU management.

N3

Parameter Groups

32 shared parameter sets per core. 4,096 neurons in 96 KB L1 instead of 1,024. 4× density increase.
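
The indirection can be sketched in a few lines: each neuron stores a small index into a 32-entry shared table instead of a private parameter block. The four-field parameter layout here is an assumption for illustration, not the RTL format.

```python
import numpy as np

N_GROUPS = 32   # shared parameter sets per core (from the spec)

# Hypothetical per-group parameter block: decay, threshold, refractory, bias.
param_table = np.zeros((N_GROUPS, 4), dtype=np.int32)
param_table[3] = [230, 1000, 2, 10]            # one shared configuration

# 5 bits index 32 groups; one byte per neuron is the storage upper bound here.
group_idx = np.full(4096, 3, dtype=np.uint8)

def neuron_params(i):
    """Look up neuron i's parameters through its group index."""
    return param_table[group_idx[i]]

private_bytes = 4096 * param_table[0].nbytes   # every neuron with its own block
shared_bytes = group_idx.nbytes + param_table.nbytes
```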

N3

Hardware Metaplasticity

3-bit consolidation state per synapse. Automatic meta-learning in hardware. No microcode overhead.
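
A minimal model of consolidation-gated plasticity: the 3-bit state scales down the effective learning rate, and repeated large updates push the state up. The thresholds and scaling law are illustrative assumptions, not the N3 microarchitecture.

```python
def plastic_update(w, consolidation, dw, max_state=7):
    """Apply a weight update gated by a 3-bit consolidation counter (0..7).

    Higher consolidation means a smaller effective learning rate, so
    well-established synapses resist being overwritten.
    """
    lr = 1.0 / (1 + consolidation)
    w = w + lr * dw
    if abs(dw) > 0.5 and consolidation < max_state:
        consolidation += 1          # large updates consolidate the synapse
    return w, consolidation

w, c = plastic_update(0.0, 0, 1.0)     # fresh synapse: full-rate update
w2, c2 = plastic_update(1.0, 7, 1.0)   # consolidated synapse: 1/8-rate update
```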

N3

FACTOR Compression

Low-rank synapse format using SVD decomposition. Hardware STORE_A/STORE_B opcodes. 2–8× memory savings.
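
The trade-off behind FACTOR can be sketched with a truncated SVD in NumPy: store two thin factors A and B in place of the full weight block, the same split the STORE_A/STORE_B opcodes implement in hardware. The block sizes below are illustrative.

```python
import numpy as np

def factor_compress(W, rank):
    """Split a weight block W (m x n) into thin factors A (m x r), B (r x n)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb singular values into A
    B = Vt[:rank]
    return A, B

m, n, r = 256, 256, 16
rng = np.random.default_rng(0)
W = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))  # a genuinely low-rank block
A, B = factor_compress(W, r)
ratio = (m * n) / (m * r + r * n)   # stored entries: full vs factored
```

For this rank-16 block the factored form stores 8× fewer entries with exact reconstruction; real weight matrices are only approximately low-rank, which is where the 2–8× range comes from.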

N3

Winner-Take-All

Two-pass hardware WTA with configurable groups and k-winners. Competitive coding for sparse representations.
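
A software sketch of the two-pass scheme: pass one finds each group's k-th-largest potential, pass two marks neurons at or above it as winners. The grouping and tie-breaking details are assumptions, not the hardware behaviour.

```python
import numpy as np

def k_wta(potentials, groups, k=1):
    """Two-pass k-winner-take-all over competition groups."""
    winners = np.zeros(len(potentials), dtype=bool)
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        thr = np.sort(potentials[idx])[-k]         # pass 1: group threshold
        win = idx[potentials[idx] >= thr][:k]      # pass 2: first k at/above it
        winners[win] = True
    return winners

w = k_wta(np.array([0.1, 0.9, 0.5, 0.2]), np.array([0, 0, 1, 1]), k=1)
```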

N3

Spike Compression

DELTA, BURST, and ADAPTIVE encoding on inter-chip links. 2–8× effective bandwidth for multi-chip systems.
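
DELTA encoding can be illustrated in a few lines: transmit the first spike address in full, then only the gaps between sorted addresses, which fit in far fewer bits when events are clustered. BURST and ADAPTIVE modes are not modelled here.

```python
def delta_encode(addrs):
    """First neuron address in full, then gaps between sorted addresses."""
    addrs = sorted(addrs)
    return [addrs[0]] + [b - a for a, b in zip(addrs, addrs[1:])]

def delta_decode(deltas):
    """Rebuild absolute addresses by accumulating the gaps."""
    addrs, cur = [], 0
    for d in deltas:
        cur += d
        addrs.append(cur)
    return addrs

packet = delta_encode([100, 101, 103, 200])
```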

Same SDK. New hardware.

The neurocore SDK targets N3 with the same Python API you already know. Hardware-accurate simulation on CPU, GPU, and FPGA.

$ pip install catalyst-cloud

import catalyst_cloud as cc

# Target N3 hardware
net = cc.Network(chip="n3")
inp = net.add_population("input", 784)

# N3: hybrid ANN layer (INT8 MAC)
ann = net.add_population("encoder", 256,
    neuron_model="ann_int8")

# N3: spiking recurrent layer
exc = net.add_population("hidden", 512,
    neuron_model="adaptive_lif")
out = net.add_population("output", 10)

net.connect(inp, ann, weight=0.3)
net.connect(ann, exc, learning="stdp")
net.connect(exc, out)

result = cc.run(net, timesteps=1000)
print(result.spike_trains)

How N3 compares.

                        Catalyst N3         Intel Loihi 2    BrainChip Akida 2   SpiNNaker 2
Cores                   128 (16 tiles)      128              8 NPUs              152 (ARM)
Neurons (24-bit)*       524,288             ~1,000,000       2,048               Up to 16M
Neurons (8-bit)         1,048,576           -                -                   -
Virtual neurons (TDM)   4.2M (24-bit)       -                -                   -
Neuron models           8 (7 + custom ISA)  3+               1 (LIF)             Software
Weight precision        1–16-bit            1–8-bit          1/2/4/8-bit         Software
ANN mode                INT8 MAC            -                Yes                 -
On-chip learning        16 accelerators     1 global         Limited             ARM cores
Learning ISA            28 opcodes, 80 reg  Microcode        Fixed               Software
Memory levels           4 (L1–L4)           2                2                   External
Virtualization          NeurOS (680+)       -                -                   -
Spike compression       DELTA/BURST         -                -                   -
Metaplasticity          Hardware (3-bit)    -                -                   -
Hardware validated      FPGA validated      ASIC (Intel 4)   ASIC (28nm)         ASIC (22nm)
Open design             Yes (BSL 1.1)       No               No                  Partial

* Neuron counts are not directly comparable across platforms. Loihi 2 neurons are 1-bit compartments; N3 neurons are 24-bit with full state (potential, current, traces). N3 at 8-bit precision supports 1,048,576 physical neurons, or up to 8.4M virtual with TDM.

Technical Specifications

Compute
Cores: 128 (16 tiles × 8 cores)
Neurons per core (24-bit): 4,096
Neurons per core (16-bit): 5,120
Neurons per core (8-bit): 8,192
Total physical neurons (24-bit): 524,288
Total physical neurons (16-bit): 655,360
Total physical neurons (8-bit): 1,048,576
Virtual neurons (24-bit, TDM ×8): 4,194,304
Virtual neurons (8-bit, TDM ×8): 8,388,608
Neuron models: CUBA LIF, ALIF, Sigma-Delta, Gated, Graded, WTA, ANN INT8, Custom ISA
Compartments per neuron: 8 addressable (4 implemented)
Parameter groups per core: 32
Management CPUs: 4 × RISC-V RV32IMC with neuro ISA extensions
Memory
L1 SRAM per core: 96 KB
L2 shared tile cache: 1 MB per tile
Shadow SRAM (TDM): 8 MB total
Total on-chip SRAM: ~36 MB
L3 external memory: LPDDR5X / HBM
Addressable synapses: 500M+ (L3-backed)
L4 fabric: CXL interconnect
Synapse formats: 4 modes (Full, Inference, Compact, FACTOR)
Learning
Learning accelerators: 16 (1 per tile, independent)
ISA opcodes: 28
Registers per accelerator: 80
Parallel threads: 4 per accelerator
Eligibility traces: 6 types (24-bit mode)
Metaplasticity: 3-bit consolidation per synapse
Fatigue: 4-bit per synapse with recovery
Homeostatic plasticity: hardware EWMA rate tracking + synaptic scaling
Neuromodulation: 16 channels per tile + per-core override
Communication
NoC type: async hybrid (event-driven routers + synchronous cores)
Intra-tile routing: 4-phase handshake mesh
Inter-tile routing: fat tree + 8 express links
Spike compression: DELTA / BURST / ADAPTIVE
Multicast groups: 256 per tile × 32 destinations
Spike payloads: binary + 8-bit graded (16-bit extended mode)
Axonal delays: 1–255 timesteps (24-bit mode)
Power & Reliability
Clock gating: per-core ICG
Power gating: per-tile (async router stays powered)
DVFS: 4 clock domains, adaptive frequency
Thermal management: 4 diodes + throttle FSM
ECC: per-SRAM, with scrubber
Deterministic mode: full-chip reproducibility (LFSR seeding, barrier sync)
Performance counters: 32 per tile (spikes, FIFO, cache, DMA, NoC)
FPGA Performance
Platform: AWS F2 (Xilinx UltraScale+ VU47P)
Neuromorphic clock: 62.5 MHz (tile), 250 MHz (AXI bus)
Peak throughput: 14,512 timesteps/sec
Energy per neuron-op: 4.04 nJ
Efficiency vs N2: 3.7× (nJ/neuron-op)
SHD accuracy: 91.0%
SSC accuracy: 76.4%
N-MNIST accuracy: 99.1%
GSC-12 accuracy: 88.0%
DVS Gesture accuracy: 89.0%
Total power (FPGA): 1.91 W (full tile active)
Kria K26 (2-core): 51,381 LUTs (43.9%), 0.867 W, ~58.5 MHz
ASIC Projections (28nm)
Target process: TSMC 28nm HPC+
Die area per tile (8 cores): ~58 mm²
Full 128-core die: ~1,070 mm² (not feasible at 28nm; multi-chip or advanced node required)
Initial ASIC target: single tile (8 cores) or 4-tile (32 cores) configuration
Energy per synaptic event: 12–18 pJ
Projected efficiency: ~57 GSOPs/J
Power (8-core tile, projected): ~0.2 W (typical activity)
Scaling path: multi-chip AER (8 links/chip) or advanced node (≤7nm)
Intellectual Property
RTL: proprietary (BSL 1.1)
Architecture spec: 78 pages, v3.0
RTL modules: 46 modules, ~17,700 lines

N3 Paper

Full architecture specification, benchmark results, and FPGA validation.

Read on Zenodo Download PDF

Interested in N3?

Research partnerships, early access, and integration enquiries.