[Chip diagram: RISC-V config · host I/F · SRAM · spike in / spike out · mesh I/F. Catalyst N3, coming soon.]

N3

128 cores. 524,288 physical neurons at 24-bit precision (1,048,576 at 8-bit), up to 8.4M virtual with TDM. Hybrid ANN/SNN. Hardware virtualization. On-chip continual learning.

Read the research Get in touch

Hybrid Computing

ANN and SNN
on one chip.

Each core can switch between spiking-neural-network mode and INT8 multiply-accumulate mode, so classical deep learning layers run alongside spiking layers on the same silicon. No other neuromorphic chip offers this per-core switch.

4-Level Memory

From 96 KB
to unlimited.

L1 SRAM per core (96 KB), L2 shared tile cache (1 MB), L3 external DRAM (500M+ synapses), L4 CXL fabric. Loihi 2 has two levels. N3 has four, with hardware-managed caching and LRU eviction.
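
A toy model of the hardware-managed caching described above, assuming a simple fully-associative LRU policy; the real L2 geometry, block sizes, and replacement details are not modelled here.

```python
from collections import OrderedDict

class SynapseCache:
    """Toy fully-associative cache with LRU eviction, standing in for the
    hardware-managed L2 tile cache in front of L3 DRAM."""

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing          # stands in for L3-resident synapse blocks
        self.lines = OrderedDict()
        self.misses = 0

    def fetch(self, block):
        if block in self.lines:
            self.lines.move_to_end(block)       # LRU touch on hit
            return self.lines[block]
        self.misses += 1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)      # evict least recently used
        self.lines[block] = self.backing[block]
        return self.lines[block]

backing = {block: block * 10 for block in range(4)}   # pretend L3 contents
l2 = SynapseCache(capacity=2, backing=backing)
for block in (0, 1, 0, 2, 1):                         # mix of hits and misses
    l2.fetch(block)
```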

NeurOS Virtualization

680+ networks
on one chip.

Hardware-scheduled time-division multiplexing with dirty-page tracking and compressed DMA context switching. Each physical neuron handles 8 virtual time slots. No other neuromorphic chip offers hardware virtualization.
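
The TDM scheme can be sketched as a minimal model: one physical LIF update loop serving 8 virtual time slots, each slot carrying its own membrane state. The decay and threshold constants are illustrative, not N3 defaults.

```python
import numpy as np

SLOTS = 8  # virtual time slots per physical neuron (from the N3 spec)

def step_tdm(v, inputs, decay=0.9, threshold=1.0):
    """Advance all virtual neurons one timestep on one physical unit.

    v, inputs: arrays of shape (SLOTS,). Returns (new_v, spikes).
    """
    spikes = np.zeros(SLOTS, dtype=bool)
    for slot in range(SLOTS):        # the hardware scheduler walks the slots
        v[slot] = decay * v[slot] + inputs[slot]
        if v[slot] >= threshold:     # fire and reset this virtual neuron
            spikes[slot] = True
            v[slot] = 0.0
    return v, spikes

v = np.zeros(SLOTS)
v, s = step_tdm(v, np.full(SLOTS, 1.2))  # strong input: every slot fires
```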

Learning Accelerators

16 independent
learning engines.

One per tile, each with a 28-opcode ISA and 80 registers. Loihi 2 has a single global learning engine; N3 runs 16 in parallel. No chip-wide bottleneck. Learning at wire speed.

Precision-Adaptive

24, 16, or 8 bits.
Your call.

Configure neuron precision per core. At 8-bit, neuron density doubles to over 1 million total. FACTOR low-rank synapse compression saves 2–8× memory. Same network, multiple precision targets.

Continual Learning

Networks that
stabilise themselves.

Hardware metaplasticity (3-bit synaptic consolidation), homeostatic plasticity (firing rate tracking with synaptic scaling), and synaptic fatigue. On-chip mechanisms for networks that learn continuously without catastrophic forgetting.
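
A minimal sketch of the homeostatic loop, assuming a simple EWMA rate estimate and multiplicative scaling of each neuron's incoming weights; the constants are illustrative, not hardware values.

```python
import numpy as np

def homeostatic_step(rates, spikes, weights, target=0.05, alpha=0.01, eta=0.1):
    """One timestep of firing-rate tracking plus synaptic scaling.

    rates:   per-neuron EWMA firing-rate estimates
    spikes:  0/1 activity this timestep
    weights: incoming weights, one row per neuron
    """
    rates = (1 - alpha) * rates + alpha * spikes   # EWMA rate tracking
    scale = 1 + eta * (target - rates)             # above target -> shrink inputs
    weights = weights * scale[:, None]             # scale incoming synapses
    return rates, weights

rates = np.array([1.0, 0.0])     # one saturated neuron, one silent neuron
spikes = np.array([1.0, 0.0])
weights = np.ones((2, 3))
rates, weights = homeostatic_step(rates, spikes, weights)
```

Over many timesteps the scaling pushes every neuron's rate toward the target, which is the stabilising effect the hardware mechanism provides.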

Full Die

128 cores.
36 MB SRAM.

128 neuromorphic cores across 16 tiles, 4 RISC-V management CPUs, async hybrid network-on-chip, and 36 MB on-chip SRAM. Every architectural feature implemented in RTL and validated on FPGA.


Scale

4.2M
Virtual neurons (24-bit)
500M+
Addressable synapses
36 MB
On-chip SRAM
256
GMAC/s (ASIC projected)

Architecture

128
Cores (16 tiles)
8
Neuron models (7 + custom)
16
Learning accelerators
4-level
Memory hierarchy

Performance

91.0%
SHD accuracy
76.4%
SSC accuracy
4.04
nJ / neuron-op (FPGA)
3.7×
Energy efficiency vs N2

What N3 adds over N2.

Unique

ANN INT8 MAC Mode

Per-core toggle between spiking and classical multiply-accumulate. Deploy hybrid ANN/SNN networks natively.

Unique

NeurOS Virtualization

Hardware-scheduled TDM for 680+ virtual networks. Context switching with dirty tracking and compressed DMA.

Unique

Per-Tile Learning

16 independent learning accelerators. 28-opcode ISA, 80 registers each. 16× the throughput of a single global engine.

N3

4-Level Memory

L1 per-core, L2 tile cache, L3 DRAM-backed, L4 CXL fabric. 500M+ addressable synapses with hardware LRU management.

N3

Parameter Groups

32 shared parameter sets per core. 4,096 neurons in 96 KB L1 instead of 1,024. 4× density increase.
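
The indirection can be sketched in a few lines: each neuron stores a small index into a 32-entry shared table instead of a private parameter block. The four-field parameter layout here is an assumption for illustration, not the RTL format.

```python
import numpy as np

N_GROUPS = 32   # shared parameter sets per core (from the spec)

# Hypothetical per-group parameter block: decay, threshold, refractory, bias.
param_table = np.zeros((N_GROUPS, 4), dtype=np.int32)
param_table[3] = [230, 1000, 2, 10]            # one shared configuration

# 5 bits index 32 groups; one byte per neuron is the storage upper bound here.
group_idx = np.full(4096, 3, dtype=np.uint8)

def neuron_params(i):
    """Look up neuron i's parameters through its group index."""
    return param_table[group_idx[i]]

private_bytes = 4096 * param_table[0].nbytes   # every neuron with its own block
shared_bytes = group_idx.nbytes + param_table.nbytes
```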

N3

Hardware Metaplasticity

3-bit consolidation state per synapse. Automatic meta-learning in hardware. No microcode overhead.
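
A minimal model of consolidation-gated plasticity: the 3-bit state scales down the effective learning rate, and repeated large updates push the state up. The thresholds and scaling law are illustrative assumptions, not the N3 microarchitecture.

```python
def plastic_update(w, consolidation, dw, max_state=7):
    """Apply a weight update gated by a 3-bit consolidation counter (0..7).

    Higher consolidation means a smaller effective learning rate, so
    well-established synapses resist being overwritten.
    """
    lr = 1.0 / (1 + consolidation)
    w = w + lr * dw
    if abs(dw) > 0.5 and consolidation < max_state:
        consolidation += 1          # large updates consolidate the synapse
    return w, consolidation

w, c = plastic_update(0.0, 0, 1.0)     # fresh synapse: full-rate update
w2, c2 = plastic_update(1.0, 7, 1.0)   # consolidated synapse: 1/8-rate update
```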

N3

FACTOR Compression

Low-rank synapse format using SVD decomposition. Hardware STORE_A/STORE_B opcodes. 2–8× memory savings.
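
The trade-off behind FACTOR can be sketched with a truncated SVD in NumPy: store two thin factors A and B in place of the full weight block, the same split the STORE_A/STORE_B opcodes implement in hardware. The block sizes below are illustrative.

```python
import numpy as np

def factor_compress(W, rank):
    """Split a weight block W (m x n) into thin factors A (m x r), B (r x n)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb singular values into A
    B = Vt[:rank]
    return A, B

m, n, r = 256, 256, 16
rng = np.random.default_rng(0)
W = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))  # a genuinely low-rank block
A, B = factor_compress(W, r)
ratio = (m * n) / (m * r + r * n)   # stored entries: full vs factored
```

For this rank-16 block the factored form stores 8× fewer entries with exact reconstruction; real weight matrices are only approximately low-rank, which is where the 2–8× range comes from.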

N3

Winner-Take-All

Two-pass hardware WTA with configurable groups and k-winners. Competitive coding for sparse representations.
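
A software sketch of the two-pass scheme: pass one finds each group's k-th-largest potential, pass two marks neurons at or above it as winners. The grouping and tie-breaking details are assumptions, not the hardware behaviour.

```python
import numpy as np

def k_wta(potentials, groups, k=1):
    """Two-pass k-winner-take-all over competition groups."""
    winners = np.zeros(len(potentials), dtype=bool)
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        thr = np.sort(potentials[idx])[-k]         # pass 1: group threshold
        win = idx[potentials[idx] >= thr][:k]      # pass 2: first k at/above it
        winners[win] = True
    return winners

w = k_wta(np.array([0.1, 0.9, 0.5, 0.2]), np.array([0, 0, 1, 1]), k=1)
```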

N3

Spike Compression

DELTA, BURST, and ADAPTIVE encoding on inter-chip links. 2–8× effective bandwidth for multi-chip systems.
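
DELTA encoding can be illustrated in a few lines: transmit the first spike address in full, then only the gaps between sorted addresses, which fit in far fewer bits when events are clustered. BURST and ADAPTIVE modes are not modelled here.

```python
def delta_encode(addrs):
    """First neuron address in full, then gaps between sorted addresses."""
    addrs = sorted(addrs)
    return [addrs[0]] + [b - a for a, b in zip(addrs, addrs[1:])]

def delta_decode(deltas):
    """Rebuild absolute addresses by accumulating the gaps."""
    addrs, cur = [], 0
    for d in deltas:
        cur += d
        addrs.append(cur)
    return addrs

packet = delta_encode([100, 101, 103, 200])
```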

Same SDK. New hardware.

The neurocore SDK targets N3 with the same Python API you already know. Hardware-accurate simulation on CPU, GPU, and FPGA.

$ pip install catalyst-cloud

import catalyst_cloud as cc

# Target N3 hardware
net = cc.Network(chip="n3")
inp = net.add_population("input", 784)

# N3: hybrid ANN layer (INT8 MAC)
ann = net.add_population("encoder", 256,
    neuron_model="ann_int8")

# N3: spiking recurrent layer
exc = net.add_population("hidden", 512,
    neuron_model="adaptive_lif")
out = net.add_population("output", 10)

net.connect(inp, ann, weight=0.3)
net.connect(ann, exc, learning="stdp")
net.connect(exc, out)

result = cc.run(net, timesteps=1000)
print(result.spike_trains)

How N3 compares.

                        Catalyst N3         Intel Loihi 2    BrainChip Akida 2   SpiNNaker 2
Cores                   128 (16 tiles)      128              8 NPUs              152 (ARM)
Neurons (24-bit)*       524,288             ~1,000,000       2,048               Up to 16M
Neurons (8-bit)         1,048,576           -                -                   -
Virtual neurons (TDM)   4.2M (24-bit)       -                -                   -
Neuron models           8 (7 + custom ISA)  3+               1 (LIF)             Software
Weight precision        1–16-bit            1–8-bit          1/2/4/8-bit         Software
ANN mode                INT8 MAC            -                Yes                 -
On-chip learning        16 accelerators     1 global         Limited             ARM cores
Learning ISA            28 opcodes, 80 reg  Microcode        Fixed               Software
Memory levels           4 (L1–L4)           2                2                   External
Virtualization          NeurOS (680+)       -                -                   -
Spike compression       DELTA/BURST         -                -                   -
Metaplasticity          Hardware (3-bit)    -                -                   -
Hardware validated      FPGA validated      ASIC (Intel 4)   ASIC (28nm)         ASIC (22nm)
Open design             Yes (BSL 1.1)       No               No                  Partial

* Neuron counts are not directly comparable across platforms. Loihi 2 neurons are 1-bit compartments; N3 neurons are 24-bit with full state (potential, current, traces). N3 at 8-bit precision supports 1,048,576 physical neurons, or up to 8.4M virtual with TDM.

Technical Specifications

Compute
Cores: 128 (16 tiles × 8 cores)
Neurons per core (24-bit): 4,096
Neurons per core (16-bit): 5,120
Neurons per core (8-bit): 8,192
Total physical neurons (24-bit): 524,288
Total physical neurons (16-bit): 655,360
Total physical neurons (8-bit): 1,048,576
Virtual neurons (24-bit, TDM ×8): 4,194,304
Virtual neurons (8-bit, TDM ×8): 8,388,608
Neuron models: CUBA LIF, ALIF, Sigma-Delta, Gated, Graded, WTA, ANN INT8, Custom ISA
Compartments per neuron: 8 addressable (4 implemented)
Parameter groups per core: 32
Management CPUs: 4 × RISC-V RV32IMC with neuro ISA extensions
Memory
L1 SRAM per core: 96 KB
L2 shared tile cache: 1 MB per tile
Shadow SRAM (TDM): 8 MB total
Total on-chip SRAM: ~36 MB
L3 external memory: LPDDR5X / HBM
Addressable synapses: 500M+ (L3-backed)
L4 fabric: CXL interconnect
Synapse formats: 4 modes (Full, Inference, Compact, FACTOR)
Learning
Learning accelerators: 16 (1 per tile, independent)
ISA opcodes: 28
Registers per accelerator: 80
Parallel threads: 4 per accelerator
Eligibility traces: 6 types (24-bit mode)
Metaplasticity: 3-bit consolidation per synapse
Fatigue: 4-bit per synapse with recovery
Homeostatic plasticity: hardware EWMA rate tracking + synaptic scaling
Neuromodulation: 16 channels per tile + per-core override
Communication
NoC type: async hybrid (event-driven routers + synchronous cores)
Intra-tile routing: 4-phase handshake mesh
Inter-tile routing: fat tree + 8 express links
Spike compression: DELTA / BURST / ADAPTIVE
Multicast groups: 256 per tile × 32 destinations
Spike payloads: binary + 8-bit graded (16-bit extended mode)
Axonal delays: 1–255 timesteps (24-bit mode)
Power & Reliability
Clock gating: per-core ICG
Power gating: per-tile (async router stays powered)
DVFS: 4 clock domains, adaptive frequency
Thermal management: 4 diodes + throttle FSM
ECC: per-SRAM, with scrubber
Deterministic mode: full-chip reproducibility (LFSR seeding, barrier sync)
Performance counters: 32 per tile (spikes, FIFO, cache, DMA, NoC)
FPGA Performance
Platform: AWS F2 (Xilinx UltraScale+ VU47P)
Neuromorphic clock: 62.5 MHz (tile), 250 MHz (AXI bus)
Peak throughput: 14,512 timesteps/sec
Energy per neuron-op: 4.04 nJ
Efficiency vs N2: 3.7× (nJ/neuron-op)
SHD accuracy: 91.0%
SSC accuracy: 76.4%
N-MNIST accuracy: 99.1%
GSC-12 accuracy: 88.0%
DVS Gesture accuracy: 89.0%
Total power (FPGA): 1.91 W (full tile active)
Kria K26 (2-core): 51,381 LUTs (43.9%), 0.867 W, ~58.5 MHz
ASIC Projections (28nm)
Target process: TSMC 28nm HPC+
Die area per tile (8 cores): ~58 mm²
Full 128-core die: ~1,070 mm² (not feasible at 28nm; multi-chip or advanced node required)
Initial ASIC target: single tile (8 cores) or 4-tile (32 cores) configuration
Energy per synaptic event: 12–18 pJ
Projected efficiency: ~57 GSOPs/J
Power (8-core tile, projected): ~0.2 W (typical activity)
Scaling path: multi-chip AER (8 links/chip) or advanced node (≤7nm)
Intellectual Property
RTL: proprietary (BSL 1.1)
Architecture spec: 78 pages, v3.0
RTL modules: 46 modules, ~17,700 lines

N3 Paper

Full architecture specification, benchmark results, and FPGA validation.

Read on Zenodo Download PDF

Interested in N3?

Research partnerships, early access, and integration enquiries.