macOS Arm64 Apple M4 Cipher Benchmarks

Machine Profile

Machine Specification

The benchmarks were run on the following machine:

BenchmarkDotNet v0.15.8, macOS Tahoe 26.3.1 (a) (25D771280a) [Darwin 25.3.0]
Apple M4, 1 CPU, 10 logical and 10 physical cores
.NET SDK 10.0.201
[Host]    : .NET 10.0.5 (10.0.5, 10.0.526.15411), Arm64 RyuJIT armv8.0-a
.NET 10.0 : .NET 10.0.5 (10.0.5, 10.0.526.15411), Arm64 RyuJIT armv8.0-a
Method=TryComputeHash  Job=.NET 10.0  Runtime=.NET 10.0
Toolchain=net10.0

Note: Results are machine-specific and may vary between systems. Run benchmarks locally for your specific hardware.

BenchmarkDotNet measurements for all cipher algorithm implementations in CryptoHives.Foundation.Security.Cryptography. Each algorithm is benchmarked across representative payload sizes (17 bytes through 128 KiB) to capture both latency and throughput characteristics.

Implementation Variants

Each cipher family exposes multiple acceleration tiers. The runtime automatically selects the fastest tier supported by the host CPU via SimdSupport detection. Callers can also force a specific tier through the Create(SimdSupport) factory for testing or compatibility.
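
The selection rule described above can be sketched as a fastest-first search over tiers. This is a minimal illustration in Python, not the library's code: the flag names and the `select_tier` helper are hypothetical, and the real `SimdSupport` enum may have different members.

```python
from enum import IntFlag


class SimdSupport(IntFlag):
    """Hypothetical feature flags; the real enum's members may differ."""
    NONE = 0
    NEON = 1
    ARM_AES = 2
    ARM_PMULL = 4


# Tiers ordered fastest-first, each paired with the features it requires.
AES_GCM_TIERS = [
    ("ArmAes+ArmPmull", SimdSupport.ARM_AES | SimdSupport.ARM_PMULL),
    ("Managed", SimdSupport.NONE),
]


def select_tier(tiers, host, forced=None):
    """Return the first tier whose required features are all usable.

    `forced` narrows the usable feature set, mimicking a caller that
    requests a specific tier for testing or compatibility.
    """
    usable = host if forced is None else host & forced
    for name, required in tiers:
        if (required & usable) == required:
            return name
    raise RuntimeError("no usable tier")


m4 = SimdSupport.NEON | SimdSupport.ARM_AES | SimdSupport.ARM_PMULL
print(select_tier(AES_GCM_TIERS, m4))                    # ArmAes+ArmPmull
print(select_tier(AES_GCM_TIERS, m4, SimdSupport.NONE))  # Managed
```

Because the scalar tier requires no features, the search always terminates with a portable fallback.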

AES Family

Variant | Instructions | .NET Target | When Selected | Description
Managed | Scalar | All | No ARM Crypto support | T-table AES using scalar uint arithmetic. Fully portable, zero-allocation. In the CBC and GCM tables below it is ~3–39× slower than the hardware tier, depending on mode, direction, and payload size.
ArmAes | AES (ARM Crypto Ext.) | .NET 8+ | ArmBase.IsSupported | Hardware AES round instructions (AESD, AESE, AESMC, AESIMC). For CBC, decrypt interleaves 8 blocks so that eight independent AESD chains run in parallel for maximum instruction-level parallelism. For GCM/CCM, accelerates counter-mode encryption and CBC-MAC. CBC decrypt is ~8.5× faster than OS at 128 B; at bulk sizes Apple CommonCrypto leads, likely via Apple Silicon–specific AES pipelining.
ArmAes+ArmPmull | AES + PMULL (ARM Crypto Ext.) | .NET 8+ | AdvSimd.Arm64.IsSupported | Adds carry-less polynomial multiplication (PMULL/PMULL2) for hardware-accelerated GHASH over GF(2¹²⁸). PMULL operates on 64-bit polynomial operands to produce 128-bit products; PMULL2 reads from the upper halves of 128-bit NEON registers (a free lane-select requiring no additional instruction). Uses the same 8-block stitched AES+GHASH pipeline as the x86 PClMul path. Modular reduction uses a 2-PMULL SymCrypt-style MODREDUCE. Pre-computes Karatsuba cross-term halves for H¹–H⁸ powers. ~32× faster than OS at 17 B; OS leads at ≥8 KiB, likely due to Apple Silicon–specific bulk AES acceleration.

ChaCha20 Family

Variant | Instructions | .NET Target | When Selected | Description
Managed | Scalar | All | No NEON support | Quarter-round operations using scalar uint arithmetic. Fully portable. ~4× slower than Neon at all payload sizes.
Neon | AdvSIMD (NEON) | .NET 8+ | AdvSimd.IsSupported | Maps the 4×4 ChaCha state to four Vector128<uint> rows. Uses ARM NEON shift-left, shift-right, and byte-table permute instructions for the 16-bit, 12-bit, 8-bit, and 7-bit rotations. Diagonal rounds use AdvSimd.ExtractVector128 to rotate rows by one element. Processes one 64-byte keystream block per iteration. ~4× faster than Managed; ~1.24× faster than BouncyCastle at 128 KiB.

When to Use Each Variant

  • Small messages (≤256 B): AES-GCM with ArmAes+ArmPmull is ~32× faster than OS at 17 B and ~14× at 128 B. Staying in managed code avoids the ~1.7–1.9 μs fixed per-call cost of the OS path (managed-to-native interop and native context setup), which dominates at these sizes. ChaCha20-Poly1305 NEON is ~3× faster than OS at 128 B.
  • Medium messages (256 B–4 KB): ArmAes+ArmPmull leads through ~1 KiB. ChaCha20-Poly1305 NEON remains competitive at 1 KiB (~1.25× faster than OS). This range covers QUIC (~1.4 KB), WireGuard (~1.4 KB), and IPsec packets.
  • Large messages (8 KB–128 KB): Apple CommonCrypto dominates. For AES-GCM the OS path is ~2× faster at 8 KiB and ~4.8× faster at 128 KiB; for ChaCha20-Poly1305 it is ~1.7× faster. This is likely due to Apple Silicon–specific AES/PMULL micro-architectural pipelining that .NET's current ARMv8 paths do not yet fully exploit. This range covers TLS records (1–16 KB) and OPC UA chunks (8 KB default).
  • No hardware AES: Use ChaCha20-Poly1305 NEON — it outperforms Managed AES-GCM by 3–10× depending on payload size and is always zero-allocation.
  • IoT / constrained devices: AES-CCM with ArmAes provides ~4× speedup over BouncyCastle at 128 KiB. Supports variable nonce (7–13 bytes) and tag sizes.
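
The guidance above can be condensed into a small decision helper. This is only a sketch of the recommendations on this page for this machine: the function name and return strings are hypothetical, and the 1 KiB and 8 KiB thresholds are the measured points quoted above, not tuned crossover values.

```python
KIB = 1024


def recommend_aead(payload_bytes, has_hw_aes=True):
    """Map a payload size to the variant this page found fastest on Apple M4."""
    if not has_hw_aes:
        return "ChaCha20-Poly1305 (Neon)"
    if payload_bytes <= 1 * KIB:
        return "AES-GCM (ArmAes+ArmPmull)"
    if payload_bytes >= 8 * KIB:
        return "AES-GCM (OS)"
    # Between the measured 1 KiB and 8 KiB points the winner is not established.
    return "benchmark locally"


print(recommend_aead(128))        # AES-GCM (ArmAes+ArmPmull)
print(recommend_aead(16 * KIB))   # AES-GCM (OS)
```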

Highlights

Family | Leader | Key Insight
ChaCha20 | Neon | NEON ~4× faster than Managed; ~1.24× faster than BouncyCastle at 128 KiB; zero allocation
ChaCha20-Poly1305 | Neon | ~3× faster than OS at 128 B; OS leads at ≥8 KiB; zero allocation
XChaCha20-Poly1305 | Neon | ~3.3× faster than Managed at 128 KiB; zero allocation
AES-CBC | ArmAes | Decrypt ~8.5× faster than OS at 128 B; OS leads at ≥8 KiB (Apple Silicon bulk path); zero allocation
AES-GCM | ArmAes+ArmPmull | ~32× faster than OS at 17 B; ~14× at 128 B; OS leads at ≥8 KiB; 8-block stitched AES+GHASH pipeline
AES-CCM | ArmAes | ~4× faster than BouncyCastle at 128 KiB; zero allocation; no OS adapter available

Stream Ciphers

ChaCha20

ChaCha20 is a stream cipher designed by Daniel J. Bernstein. Two acceleration tiers are available on ARM:

  • Neon: Single-block processing — maps the 4×4 ChaCha state matrix to four Vector128<uint> rows. Uses ARM NEON vshl/vsri (shift-and-insert) and vtbl (byte-table permute) instructions for the four rotation widths (16-bit, 12-bit, 8-bit, 7-bit). Diagonal rounds use AdvSimd.ExtractVector128 to rotate rows by one element. Yields ~750 MB/s throughput at 128 KiB; ~1.24× faster than BouncyCastle.
  • Managed: Scalar uint quarter-round arithmetic. Fully portable across all .NET targets. ~4.1× slower than Neon at 128 KiB.
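
For reference, the scalar quarter-round that the Managed tier is described as computing can be sketched as follows. This is plain Python following RFC 8439, not the library's code; `rotl32` and `quarter_round` are illustrative names. The rotation is written as a left shift combined with a right shift, which is the same decomposition the NEON tier performs with shift-left and shift-right-insert.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits (left shift OR right shift)."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(a, b, c, d):
    """One ChaCha quarter-round with the 16/12/8/7-bit rotations."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


# Test vector from RFC 8439, section 2.1.1.
out = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
print([hex(w) for w in out])
# ['0xea2a92f4', '0xcb1cf8ce', '0x4581472e', '0x5881c4bb']
```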

Key observations:

  • Neon is the fastest at all sizes; ~1.24× faster than BouncyCastle at 128 KiB; ~1.35× at 1 KiB
  • BouncyCastle allocates 96 B per call; NaCl.Core allocates 24 B per call
  • Managed and Neon paths are zero-allocation

Description TestDataSize Mean Error StdDev Allocated
Decrypt · ChaCha20 (Neon) 128B 170.2 ns 0.05 ns 0.05 ns -
Decrypt · ChaCha20 (BouncyCastle) 128B 304.1 ns 1.49 ns 1.32 ns 96 B
Decrypt · ChaCha20 (NaCl.Core) 128B 521.2 ns 0.19 ns 0.18 ns 24 B
Decrypt · ChaCha20 (Managed) 128B 692.3 ns 2.27 ns 2.12 ns -
Encrypt · ChaCha20 (Neon) 128B 170.2 ns 0.07 ns 0.06 ns -
Encrypt · ChaCha20 (BouncyCastle) 128B 299.9 ns 4.36 ns 4.08 ns 96 B
Encrypt · ChaCha20 (NaCl.Core) 128B 521.1 ns 0.20 ns 0.19 ns 24 B
Encrypt · ChaCha20 (Managed) 128B 698.0 ns 2.51 ns 2.35 ns -
Decrypt · ChaCha20 (Neon) 1KB 1,336.0 ns 0.72 ns 0.64 ns -
Decrypt · ChaCha20 (BouncyCastle) 1KB 1,812.7 ns 22.35 ns 19.81 ns 96 B
Decrypt · ChaCha20 (NaCl.Core) 1KB 2,935.6 ns 0.94 ns 0.78 ns 24 B
Decrypt · ChaCha20 (Managed) 1KB 5,466.8 ns 13.05 ns 11.57 ns -
Encrypt · ChaCha20 (Neon) 1KB 1,335.9 ns 0.90 ns 0.84 ns -
Encrypt · ChaCha20 (BouncyCastle) 1KB 1,868.0 ns 36.32 ns 37.29 ns 96 B
Encrypt · ChaCha20 (NaCl.Core) 1KB 2,935.4 ns 0.89 ns 0.83 ns 24 B
Encrypt · ChaCha20 (Managed) 1KB 5,495.7 ns 17.69 ns 16.55 ns -
Decrypt · ChaCha20 (Neon) 8KB 10,652.9 ns 2.54 ns 2.12 ns -
Decrypt · ChaCha20 (BouncyCastle) 8KB 13,452.3 ns 196.17 ns 183.50 ns 96 B
Decrypt · ChaCha20 (NaCl.Core) 8KB 22,244.7 ns 11.08 ns 10.37 ns 24 B
Decrypt · ChaCha20 (Managed) 8KB 43,589.3 ns 170.58 ns 159.56 ns -
Encrypt · ChaCha20 (Neon) 8KB 10,655.7 ns 5.51 ns 5.15 ns -
Encrypt · ChaCha20 (BouncyCastle) 8KB 13,947.8 ns 22.50 ns 21.05 ns 96 B
Encrypt · ChaCha20 (NaCl.Core) 8KB 22,251.6 ns 5.66 ns 4.72 ns 24 B
Encrypt · ChaCha20 (Managed) 8KB 43,758.3 ns 189.25 ns 177.03 ns -
Decrypt · ChaCha20 (Neon) 128KB 170,370.4 ns 30.76 ns 27.27 ns -
Decrypt · ChaCha20 (BouncyCastle) 128KB 211,614.5 ns 273.29 ns 255.64 ns 96 B
Decrypt · ChaCha20 (NaCl.Core) 128KB 353,412.1 ns 32.71 ns 27.31 ns 24 B
Decrypt · ChaCha20 (Managed) 128KB 697,996.7 ns 2,113.03 ns 1,976.53 ns -
Encrypt · ChaCha20 (Neon) 128KB 170,355.3 ns 79.47 ns 66.36 ns -
Encrypt · ChaCha20 (BouncyCastle) 128KB 212,054.6 ns 185.30 ns 173.33 ns 96 B
Encrypt · ChaCha20 (NaCl.Core) 128KB 353,326.8 ns 199.21 ns 186.34 ns 24 B
Encrypt · ChaCha20 (Managed) 128KB 699,802.4 ns 3,056.58 ns 2,859.13 ns -
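
The speedup and throughput figures quoted for ChaCha20 can be re-derived from the table. The sketch below uses the 128KB encrypt rows and assumes that "128KB" means 131,072 bytes; the helper names are illustrative. Under that assumption the Neon throughput comes out slightly above the ~750 MB/s quoted earlier.

```python
def throughput_mb_s(payload_bytes, mean_ns):
    """Bytes per nanosecond equals GB/s; scale to MB/s (10^6 bytes per second)."""
    return payload_bytes / mean_ns * 1000.0


def speedup(baseline_ns, candidate_ns):
    """How many times faster the candidate is than the baseline."""
    return baseline_ns / candidate_ns


PAYLOAD = 128 * 1024   # assumption: "128KB" means 131,072 bytes

neon = 170_355.3       # Encrypt · ChaCha20 (Neon), 128KB, mean in ns
bouncy = 212_054.6     # Encrypt · ChaCha20 (BouncyCastle), 128KB
managed = 699_802.4    # Encrypt · ChaCha20 (Managed), 128KB

print(round(throughput_mb_s(PAYLOAD, neon)))   # 769
print(round(speedup(bouncy, neon), 2))         # 1.24
print(round(speedup(managed, neon), 2))        # 4.11
```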

Block Ciphers

AES-128-CBC

AES-CBC (Cipher Block Chaining) is the most widely deployed AES mode. Two acceleration tiers are available on Apple M4:

  • ArmAes: Uses ARM Cryptography Extension AESD/AESE/AESMC/AESIMC instructions. Decrypt uses 8-block interleaving: 8 ciphertext blocks are loaded and decrypted together through independent AESD chains. Each block decrypts independently and needs only the preceding ciphertext block as an XOR mask, so an AES-128 batch issues 10 rounds × 8 blocks = 80 AESD instructions with no dependencies between blocks. Encrypt remains serial because each plaintext block must be XORed with the previous ciphertext block before its first AESE can proceed.
  • Managed: T-table AES using four 256-entry lookup tables per round. Fully portable, zero-allocation. Comparable to BouncyCastle at large sizes.
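
The asymmetry between serial encrypt and batched decrypt follows directly from the CBC data flow. The sketch below illustrates it with a toy invertible block function standing in for AES (it is not AES and not secure); all function names are hypothetical, and inputs are assumed to be a whole number of 16-byte blocks with no padding.

```python
BLOCK = 16


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key, block):
    # Stand-in for AES: any invertible 16-byte permutation works here.
    return bytes(((x ^ k) * 7 + 3) % 256 for x, k in zip(block, key))


def toy_decrypt_block(key, block):
    # 183 is the inverse of 7 modulo 256 (7 * 183 = 1281 = 5 * 256 + 1).
    return bytes((((y - 3) * 183) % 256) ^ k for y, k in zip(block, key))


def split(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def cbc_encrypt(key, iv, plaintext):
    prev, out = iv, []
    for block in split(plaintext):
        # Serial chain: this block cannot start until `prev` is known.
        prev = toy_encrypt_block(key, xor(block, prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt_batched(key, iv, ciphertext, lanes=8):
    blocks = split(ciphertext)
    masks = [iv] + blocks[:-1]   # XOR mask for block i is ciphertext block i-1
    out = []
    for start in range(0, len(blocks), lanes):
        batch = blocks[start:start + lanes]
        # The block decryptions inside a batch do not depend on each other.
        decrypted = [toy_decrypt_block(key, c) for c in batch]
        out += [xor(d, m) for d, m in zip(decrypted, masks[start:start + lanes])]
    return b"".join(out)


key = bytes(range(16))
iv = bytes(16)
message = bytes(range(256))      # 16 blocks
assert cbc_decrypt_batched(key, iv, cbc_encrypt(key, iv, message)) == message
```

Every mask needed for decryption is already present in the ciphertext, which is why a hardware implementation can keep several block decryptions in flight at once.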

Key observations:

  • ArmAes Decrypt: ~8.5× faster than OS at 128 B and ~2.6× at 1 KiB; OS edges ahead at 8 KiB (~1.1×) and leads by ~1.5× at 128 KiB (Apple Silicon likely uses a wider AES pipeline at bulk sizes)
  • ArmAes Encrypt: ~1.5× faster than OS at 128 B; OS leads from 1 KiB (~1.7× at 1 KiB, ~2.4× at 128 KiB). CBC encrypt is inherently serial; CommonCrypto uses NEON-assisted interleaving for partial parallelism
  • Managed: Zero-allocation T-table AES; comparable to BouncyCastle at large sizes
  • OS: Allocates 72 B per call (P/Invoke marshalling overhead)

Description TestDataSize Mean Error StdDev Allocated
Decrypt · AES-128-CBC (ArmAes) 128B 22.08 ns 0.068 ns 0.061 ns -
Decrypt · AES-128-CBC (OS) 128B 187.81 ns 1.061 ns 0.828 ns 72 B
Decrypt · AES-128-CBC (Managed) 128B 386.28 ns 0.583 ns 0.545 ns -
Decrypt · AES-128-CBC (BouncyCastle) 128B 604.78 ns 1.863 ns 1.652 ns 832 B
Encrypt · AES-128-CBC (ArmAes) 128B 129.02 ns 0.636 ns 0.564 ns -
Encrypt · AES-128-CBC (OS) 128B 192.84 ns 1.322 ns 1.104 ns 72 B
Encrypt · AES-128-CBC (Managed) 128B 428.98 ns 0.876 ns 0.777 ns -
Encrypt · AES-128-CBC (BouncyCastle) 128B 558.80 ns 3.133 ns 2.931 ns 832 B
Decrypt · AES-128-CBC (ArmAes) 1KB 87.13 ns 0.535 ns 0.500 ns -
Decrypt · AES-128-CBC (OS) 1KB 229.65 ns 2.508 ns 2.346 ns 72 B
Decrypt · AES-128-CBC (Managed) 1KB 2,702.80 ns 1.275 ns 1.193 ns -
Decrypt · AES-128-CBC (BouncyCastle) 1KB 3,378.77 ns 3.735 ns 3.494 ns 832 B
Encrypt · AES-128-CBC (OS) 1KB 541.56 ns 1.345 ns 1.192 ns 72 B
Encrypt · AES-128-CBC (ArmAes) 1KB 914.69 ns 5.226 ns 4.888 ns -
Encrypt · AES-128-CBC (Managed) 1KB 3,112.53 ns 6.331 ns 5.922 ns -
Encrypt · AES-128-CBC (BouncyCastle) 1KB 3,241.94 ns 3.052 ns 2.549 ns 832 B
Decrypt · AES-128-CBC (OS) 8KB 560.77 ns 4.027 ns 3.767 ns 72 B
Decrypt · AES-128-CBC (ArmAes) 8KB 610.05 ns 3.382 ns 3.164 ns -
Decrypt · AES-128-CBC (Managed) 8KB 21,239.52 ns 7.107 ns 6.648 ns -
Decrypt · AES-128-CBC (BouncyCastle) 8KB 25,311.13 ns 45.050 ns 42.140 ns 832 B
Encrypt · AES-128-CBC (OS) 8KB 3,286.63 ns 25.573 ns 22.669 ns 72 B
Encrypt · AES-128-CBC (ArmAes) 8KB 7,177.20 ns 18.455 ns 16.360 ns -
Encrypt · AES-128-CBC (Managed) 8KB 24,539.18 ns 25.378 ns 21.192 ns -
Encrypt · AES-128-CBC (BouncyCastle) 8KB 24,691.65 ns 18.235 ns 17.057 ns 832 B
Decrypt · AES-128-CBC (OS) 128KB 6,436.87 ns 25.556 ns 23.905 ns 72 B
Decrypt · AES-128-CBC (ArmAes) 128KB 9,613.76 ns 33.784 ns 31.601 ns -
Decrypt · AES-128-CBC (Managed) 128KB 341,935.68 ns 504.550 ns 471.956 ns -
Decrypt · AES-128-CBC (BouncyCastle) 128KB 402,159.16 ns 961.030 ns 898.948 ns 832 B
Encrypt · AES-128-CBC (OS) 128KB 50,556.34 ns 34.506 ns 30.589 ns 72 B
Encrypt · AES-128-CBC (ArmAes) 128KB 119,683.72 ns 506.758 ns 395.644 ns -
Encrypt · AES-128-CBC (Managed) 128KB 393,501.61 ns 265.260 ns 221.504 ns -
Encrypt · AES-128-CBC (BouncyCastle) 128KB 393,912.18 ns 499.123 ns 466.880 ns 832 B

AES-256-CBC

AES-256-CBC uses 14 rounds (vs 10 for AES-128), i.e. 40% more rounds. Measured overhead over AES-128-CBC is ~11–23% for ArmAes and ~31–36% for Managed. The same 8-block interleaved decrypt and serial encrypt architecture applies via ArmAes. Decrypt is ~9.1× faster than OS at 128 B and encrypt is ~1.65× faster; for decrypt, OS leads from ~8 KiB. Encrypt is slower than OS from 1 KiB (serial CBC encrypt bottleneck on Apple Silicon).

Description TestDataSize Mean Error StdDev Allocated
Decrypt · AES-256-CBC (ArmAes) 128B 24.49 ns 0.058 ns 0.054 ns -
Decrypt · AES-256-CBC (OS) 128B 223.79 ns 2.652 ns 2.481 ns 72 B
Decrypt · AES-256-CBC (Managed) 128B 518.96 ns 0.478 ns 0.447 ns -
Decrypt · AES-256-CBC (BouncyCastle) 128B 795.90 ns 0.699 ns 0.584 ns 1024 B
Encrypt · AES-256-CBC (ArmAes) 128B 147.67 ns 0.518 ns 0.484 ns -
Encrypt · AES-256-CBC (OS) 128B 244.19 ns 1.642 ns 1.536 ns 72 B
Encrypt · AES-256-CBC (Managed) 128B 568.90 ns 0.157 ns 0.140 ns -
Encrypt · AES-256-CBC (BouncyCastle) 128B 727.30 ns 3.226 ns 3.018 ns 1024 B
Decrypt · AES-256-CBC (ArmAes) 1KB 105.71 ns 0.536 ns 0.501 ns -
Decrypt · AES-256-CBC (OS) 1KB 278.84 ns 1.092 ns 1.022 ns 72 B
Decrypt · AES-256-CBC (Managed) 1KB 3,658.68 ns 1.154 ns 0.963 ns -
Decrypt · AES-256-CBC (BouncyCastle) 1KB 4,423.07 ns 2.201 ns 1.951 ns 1024 B
Encrypt · AES-256-CBC (OS) 1KB 726.35 ns 1.198 ns 1.121 ns 72 B
Encrypt · AES-256-CBC (ArmAes) 1KB 1,089.55 ns 3.812 ns 3.379 ns -
Encrypt · AES-256-CBC (Managed) 1KB 4,093.74 ns 2.524 ns 2.361 ns -
Encrypt · AES-256-CBC (BouncyCastle) 1KB 4,264.13 ns 4.516 ns 3.771 ns 1024 B
Decrypt · AES-256-CBC (OS) 8KB 713.66 ns 3.539 ns 3.137 ns 72 B
Decrypt · AES-256-CBC (ArmAes) 8KB 750.17 ns 2.570 ns 2.404 ns -
Decrypt · AES-256-CBC (Managed) 8KB 28,820.45 ns 3.599 ns 3.005 ns -
Decrypt · AES-256-CBC (BouncyCastle) 8KB 33,282.54 ns 35.367 ns 31.352 ns 1024 B
Encrypt · AES-256-CBC (OS) 8KB 4,420.51 ns 3.907 ns 3.463 ns 72 B
Encrypt · AES-256-CBC (ArmAes) 8KB 8,531.38 ns 45.876 ns 42.912 ns -
Encrypt · AES-256-CBC (Managed) 8KB 32,252.99 ns 15.854 ns 14.830 ns -
Encrypt · AES-256-CBC (BouncyCastle) 8KB 32,451.74 ns 19.617 ns 18.350 ns 1024 B
Decrypt · AES-256-CBC (OS) 128KB 8,453.37 ns 31.046 ns 29.040 ns 72 B
Decrypt · AES-256-CBC (ArmAes) 128KB 11,843.55 ns 24.484 ns 22.902 ns -
Decrypt · AES-256-CBC (Managed) 128KB 461,650.22 ns 193.805 ns 171.803 ns -
Decrypt · AES-256-CBC (BouncyCastle) 128KB 527,391.86 ns 2,011.466 ns 1,881.527 ns 1024 B
Encrypt · AES-256-CBC (OS) 128KB 68,785.66 ns 68.572 ns 60.787 ns 72 B
Encrypt · AES-256-CBC (ArmAes) 128KB 136,660.36 ns 528.923 ns 494.755 ns -
Encrypt · AES-256-CBC (Managed) 128KB 515,053.11 ns 540.111 ns 505.221 ns -
Encrypt · AES-256-CBC (BouncyCastle) 128KB 518,864.32 ns 247.465 ns 231.479 ns 1024 B

AEAD Ciphers (Authenticated Encryption)

Authenticated Encryption with Associated Data (AEAD) ciphers provide both confidentiality and authenticity in a single operation. All CryptoHives AEAD implementations are zero-allocation.

AES-128-GCM

AES-GCM combines counter-mode AES encryption (GCTR) with GHASH polynomial authentication over GF(2¹²⁸). Two acceleration tiers are available on Apple M4:

  • ArmAes+ArmPmull (.NET 8+): Uses ARM Cryptography Extension AESE/AESMC for counter-mode encryption (GCM only needs the forward cipher) and PMULL/PMULL2 for GHASH polynomial multiplication. PMULL operates on 64-bit polynomial operands to produce 128-bit products; PMULL2 reads from the upper halves of 128-bit NEON registers (a free lane-select requiring no additional instruction). Uses an 8-block stitched loop that interleaves AES rounds with lagged GHASH of the previous 8 blocks. Modular reduction uses a 2-PMULL SymCrypt-style MODREDUCE. Pre-computes Karatsuba cross-term halves for H¹–H⁸ powers. Small payloads use the non-stitched path (≤8 blocks). ~32× faster than OS at 17 B; ~14× at 128 B. At bulk sizes (≥8 KiB), Apple CommonCrypto leads, likely due to Apple Silicon–specific AES pipelining not yet accessible to the .NET ARM intrinsics layer.
  • Managed: Scalar T-table AES with 4-bit Shoup table GHASH (16-entry reduction table, byte-by-byte multiplication). Fully portable, zero-allocation.
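
The Karatsuba step mentioned above can be illustrated with integer bit operations. The sketch below models one PMULL as a 64×64 carry-less multiply and builds a 128×128 product from three of them. It deliberately omits the modular reduction and GCM's bit ordering, so it is not a GHASH implementation; the function names are illustrative.

```python
MASK64 = (1 << 64) - 1


def clmul64(a, b):
    """Carry-less 64x64 -> 128-bit multiply (the operation one PMULL performs)."""
    result = 0
    for i in range(64):
        if (b >> i) & 1:
            result ^= a << i
    return result


def clmul128_karatsuba(x, y):
    """Carry-less 128x128 -> 256-bit multiply using three 64-bit multiplies."""
    x_lo, x_hi = x & MASK64, x >> 64
    y_lo, y_hi = y & MASK64, y >> 64
    lo = clmul64(x_lo, y_lo)
    hi = clmul64(x_hi, y_hi)
    # (y_lo ^ y_hi) depends only on y, so for a fixed hash key power it can
    # be computed once and reused for every block.
    mid = clmul64(x_lo ^ x_hi, y_lo ^ y_hi) ^ lo ^ hi
    return (hi << 128) ^ (mid << 64) ^ lo


def clmul_schoolbook(x, y, bits=128):
    """Reference bit-by-bit carry-less multiply used to check the result."""
    result = 0
    for i in range(bits):
        if (y >> i) & 1:
            result ^= x << i
    return result


h = 0x66E94BD4EF8A2C3B884CFA59CA342B2E       # arbitrary 128-bit operands
block = 0x0388DACE60B6A392F328C2B971B2FE78
assert clmul128_karatsuba(block, h) == clmul_schoolbook(block, h)
```

Caching the XOR of the two key halves for each of H¹ to H⁸ is one plausible reading of "pre-computes Karatsuba cross-term halves".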

Key observations:

  • ArmAes+ArmPmull: ~32× faster than OS at 17 B encrypt; ~14× at 128 B; ~2.4× at 1 KiB; OS leads at 8 KiB (~2×), so the crossover lies between the measured 1 KiB and 8 KiB points
  • ArmAes+ArmPmull at 128 KiB: OS is ~4.8× faster for both encrypt and decrypt
  • Managed: Uses 4-bit Shoup table GHASH, T-table AES; zero allocation
  • BouncyCastle: Uses ARM AES + PMULL internally on ARM64; allocates ~1.5 KB per call

Description TestDataSize Mean Error StdDev Allocated
Decrypt · AES-128-GCM (ArmAes+ArmPmull) 17B 83.46 ns 0.140 ns 0.131 ns -
Decrypt · AES-128-GCM (Managed) 17B 349.10 ns 0.676 ns 0.599 ns -
Decrypt · AES-128-GCM (BouncyCastle) 17B 571.92 ns 1.796 ns 1.592 ns 1536 B
Decrypt · AES-128-GCM (OS) 17B 1,876.11 ns 7.974 ns 7.459 ns -
Encrypt · AES-128-GCM (ArmAes+ArmPmull) 17B 52.02 ns 0.134 ns 0.125 ns -
Encrypt · AES-128-GCM (Managed) 17B 313.23 ns 0.600 ns 0.561 ns -
Encrypt · AES-128-GCM (BouncyCastle) 17B 489.69 ns 2.913 ns 2.725 ns 1520 B
Encrypt · AES-128-GCM (OS) 17B 1,667.75 ns 5.798 ns 4.841 ns -
Decrypt · AES-128-GCM (ArmAes+ArmPmull) 65B 117.63 ns 0.446 ns 0.417 ns -
Decrypt · AES-128-GCM (Managed) 65B 608.52 ns 0.821 ns 0.768 ns -
Decrypt · AES-128-GCM (BouncyCastle) 65B 770.15 ns 2.003 ns 1.874 ns 1536 B
Decrypt · AES-128-GCM (OS) 65B 1,878.06 ns 10.164 ns 9.507 ns -
Encrypt · AES-128-GCM (ArmAes+ArmPmull) 65B 81.43 ns 0.412 ns 0.385 ns -
Encrypt · AES-128-GCM (Managed) 65B 571.55 ns 0.691 ns 0.647 ns -
Encrypt · AES-128-GCM (BouncyCastle) 65B 700.34 ns 1.246 ns 1.165 ns 1520 B
Encrypt · AES-128-GCM (OS) 65B 1,670.34 ns 8.758 ns 8.192 ns -
Decrypt · AES-128-GCM (ArmAes+ArmPmull) 128B 150.28 ns 0.719 ns 0.673 ns -
Decrypt · AES-128-GCM (Managed) 128B 865.62 ns 1.100 ns 0.918 ns -
Decrypt · AES-128-GCM (BouncyCastle) 128B 973.42 ns 1.716 ns 1.521 ns 1536 B
Decrypt · AES-128-GCM (OS) 128B 1,892.75 ns 15.790 ns 14.770 ns -
Encrypt · AES-128-GCM (ArmAes+ArmPmull) 128B 114.82 ns 0.541 ns 0.506 ns -
Encrypt · AES-128-GCM (Managed) 128B 834.97 ns 0.197 ns 0.174 ns -
Encrypt · AES-128-GCM (BouncyCastle) 128B 916.07 ns 1.968 ns 1.840 ns 1520 B
Encrypt · AES-128-GCM (OS) 128B 1,675.67 ns 7.713 ns 7.215 ns -
Decrypt · AES-128-GCM (ArmAes+ArmPmull) 152B 180.33 ns 0.965 ns 0.903 ns -
Decrypt · AES-128-GCM (Managed) 152B 1,049.62 ns 2.411 ns 2.138 ns -
Decrypt · AES-128-GCM (BouncyCastle) 152B 1,097.85 ns 1.020 ns 0.954 ns 1536 B
Decrypt · AES-128-GCM (OS) 152B 1,915.24 ns 23.454 ns 20.792 ns -
Encrypt · AES-128-GCM (ArmAes+ArmPmull) 152B 141.78 ns 0.931 ns 0.871 ns -
Encrypt · AES-128-GCM (Managed) 152B 998.72 ns 1.057 ns 0.937 ns -
Encrypt · AES-128-GCM (BouncyCastle) 152B 1,044.79 ns 2.014 ns 1.884 ns 1520 B
Encrypt · AES-128-GCM (OS) 152B 1,693.58 ns 9.880 ns 9.242 ns -
Decrypt · AES-128-GCM (ArmAes+ArmPmull) 256B 245.16 ns 2.051 ns 1.919 ns -
Decrypt · AES-128-GCM (BouncyCastle) 256B 1,505.57 ns 21.598 ns 20.202 ns 1536 B
Decrypt · AES-128-GCM (Managed) 256B 1,576.52 ns 7.472 ns 6.623 ns -
Decrypt · AES-128-GCM (OS) 256B 1,928.80 ns 9.036 ns 8.011 ns -
Encrypt · AES-128-GCM (ArmAes+ArmPmull) 256B 206.31 ns 0.522 ns 0.488 ns -
Encrypt · AES-128-GCM (BouncyCastle) 256B 1,457.01 ns 2.930 ns 2.741 ns 1520 B
Encrypt · AES-128-GCM (Managed) 256B 1,537.20 ns 0.764 ns 0.677 ns -
Encrypt · AES-128-GCM (OS) 256B 1,699.98 ns 9.487 ns 8.874 ns -
Decrypt · AES-128-GCM (ArmAes+ArmPmull) 1KB 825.49 ns 8.153 ns 7.626 ns -
Decrypt · AES-128-GCM (OS) 1KB 2,043.39 ns 11.923 ns 11.152 ns -
Decrypt · AES-128-GCM (BouncyCastle) 1KB 4,507.99 ns 4.555 ns 4.038 ns 1536 B
Decrypt · AES-128-GCM (Managed) 1KB 5,632.19 ns 3.065 ns 2.559 ns -
Encrypt · AES-128-GCM (ArmAes+ArmPmull) 1KB 773.43 ns 0.084 ns 0.070 ns -
Encrypt · AES-128-GCM (OS) 1KB 1,846.66 ns 8.340 ns 7.802 ns -
Encrypt · AES-128-GCM (BouncyCastle) 1KB 4,720.98 ns 2.953 ns 2.618 ns 1520 B
Encrypt · AES-128-GCM (Managed) 1KB 5,488.35 ns 1.785 ns 1.669 ns -
Decrypt · AES-128-GCM (OS) 8KB 2,980.11 ns 8.988 ns 7.968 ns -
Decrypt · AES-128-GCM (ArmAes+ArmPmull) 8KB 6,154.80 ns 54.469 ns 48.286 ns -
Decrypt · AES-128-GCM (BouncyCastle) 8KB 32,353.34 ns 14.646 ns 12.230 ns 1536 B
Decrypt · AES-128-GCM (Managed) 8KB 43,234.55 ns 74.420 ns 65.972 ns -
Encrypt · AES-128-GCM (OS) 8KB 2,759.67 ns 23.114 ns 21.621 ns -
Encrypt · AES-128-GCM (ArmAes+ArmPmull) 8KB 6,031.61 ns 1.374 ns 1.218 ns -
Encrypt · AES-128-GCM (BouncyCastle) 8KB 34,591.45 ns 15.798 ns 14.004 ns 1520 B
Encrypt · AES-128-GCM (Managed) 8KB 42,968.62 ns 31.414 ns 29.385 ns -
Decrypt · AES-128-GCM (OS) 128KB 20,417.54 ns 92.491 ns 81.991 ns -
Decrypt · AES-128-GCM (ArmAes+ArmPmull) 128KB 98,744.15 ns 770.397 ns 720.630 ns -
Decrypt · AES-128-GCM (BouncyCastle) 128KB 509,846.38 ns 3,538.195 ns 2,954.553 ns 1536 B
Decrypt · AES-128-GCM (Managed) 128KB 687,721.49 ns 520.607 ns 434.731 ns -
Encrypt · AES-128-GCM (OS) 128KB 20,606.77 ns 188.091 ns 175.940 ns -
Encrypt · AES-128-GCM (ArmAes+ArmPmull) 128KB 97,818.19 ns 888.611 ns 742.030 ns -
Encrypt · AES-128-GCM (BouncyCastle) 128KB 548,420.82 ns 578.240 ns 540.886 ns 1520 B
Encrypt · AES-128-GCM (Managed) 128KB 686,949.55 ns 4,014.906 ns 3,352.628 ns -
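
Since the table has no rows between 1KB and 8KB, the point where the OS path overtakes ArmAes+ArmPmull can only be estimated. The sketch below interpolates linearly between the two measured encrypt rows, assuming "1KB" and "8KB" mean 1,024 and 8,192 bytes. Real cost curves are unlikely to be exactly linear across the stitched-loop threshold, so treat the result as a rough guide and measure locally.

```python
def crossover_bytes(size_a, size_b, wins_small, wins_large):
    """Estimate where two linearly interpolated cost curves intersect.

    wins_small / wins_large: (mean ns at size_a, mean ns at size_b) for the
    implementation that is faster at size_a and at size_b respectively.
    """
    span = size_b - size_a
    slope_small = (wins_small[1] - wins_small[0]) / span
    slope_large = (wins_large[1] - wins_large[0]) / span
    gap = wins_large[0] - wins_small[0]
    return size_a + gap / (slope_small - slope_large)


arm = (773.43, 6031.61)    # Encrypt · AES-128-GCM (ArmAes+ArmPmull), 1KB and 8KB
os_ = (1846.66, 2759.67)   # Encrypt · AES-128-GCM (OS), 1KB and 8KB

estimate = crossover_bytes(1024, 8192, arm, os_)
print(f"{estimate / 1024:.1f} KiB")
```

With these inputs the estimate lands a little under 3 KiB.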

AES-192-GCM

AES-192-GCM uses 12 rounds (vs 10 for AES-128), adding ~10-15% overhead. The same ArmAes+ArmPmull pipeline applies. The performance pattern mirrors AES-128-GCM: dominant over OS at small payloads, OS leads at bulk sizes.

Description TestDataSize Mean Error StdDev Median Allocated
Decrypt · AES-192-GCM (ArmAes+ArmPmull) 17B 88.64 ns 0.816 ns 1.386 ns 88.48 ns -
Decrypt · AES-192-GCM (Managed) 17B 384.55 ns 2.477 ns 2.317 ns 383.82 ns -
Decrypt · AES-192-GCM (BouncyCastle) 17B 644.88 ns 2.070 ns 1.936 ns 644.89 ns 1640 B
Decrypt · AES-192-GCM (OS) 17B 1,961.31 ns 9.998 ns 9.352 ns 1,963.25 ns -
Encrypt · AES-192-GCM (ArmAes+ArmPmull) 17B 54.83 ns 0.363 ns 0.339 ns 54.80 ns -
Encrypt · AES-192-GCM (Managed) 17B 337.99 ns 0.361 ns 0.301 ns 337.96 ns -
Encrypt · AES-192-GCM (BouncyCastle) 17B 539.33 ns 1.369 ns 1.213 ns 539.13 ns 1624 B
Encrypt · AES-192-GCM (OS) 17B 1,724.67 ns 11.283 ns 10.002 ns 1,725.28 ns -
Decrypt · AES-192-GCM (ArmAes+ArmPmull) 65B 127.00 ns 0.526 ns 0.466 ns 126.98 ns -
Decrypt · AES-192-GCM (Managed) 65B 678.65 ns 3.988 ns 3.535 ns 678.16 ns -
Decrypt · AES-192-GCM (BouncyCastle) 65B 875.62 ns 6.169 ns 5.469 ns 873.15 ns 1640 B
Decrypt · AES-192-GCM (OS) 65B 1,961.01 ns 8.769 ns 8.203 ns 1,960.41 ns -
Encrypt · AES-192-GCM (ArmAes+ArmPmull) 65B 85.93 ns 0.357 ns 0.334 ns 86.04 ns -
Encrypt · AES-192-GCM (Managed) 65B 621.38 ns 5.008 ns 4.439 ns 620.10 ns -
Encrypt · AES-192-GCM (BouncyCastle) 65B 768.43 ns 1.952 ns 1.630 ns 767.94 ns 1624 B
Encrypt · AES-192-GCM (OS) 65B 1,699.25 ns 6.290 ns 5.884 ns 1,697.84 ns -
Decrypt · AES-192-GCM (ArmAes+ArmPmull) 128B 165.14 ns 0.953 ns 0.892 ns 165.07 ns -
Decrypt · AES-192-GCM (Managed) 128B 963.21 ns 5.582 ns 4.948 ns 964.87 ns -
Decrypt · AES-192-GCM (BouncyCastle) 128B 1,123.41 ns 2.551 ns 2.386 ns 1,123.69 ns 1640 B
Decrypt · AES-192-GCM (OS) 128B 2,013.48 ns 25.533 ns 23.884 ns 2,016.24 ns -
Encrypt · AES-192-GCM (ArmAes+ArmPmull) 128B 119.79 ns 0.434 ns 0.406 ns 119.74 ns -
Encrypt · AES-192-GCM (Managed) 128B 900.18 ns 1.336 ns 1.185 ns 899.74 ns -
Encrypt · AES-192-GCM (BouncyCastle) 128B 1,011.73 ns 1.451 ns 1.357 ns 1,011.96 ns 1624 B
Encrypt · AES-192-GCM (OS) 128B 1,721.26 ns 9.321 ns 8.719 ns 1,720.09 ns -
Decrypt · AES-192-GCM (ArmAes+ArmPmull) 152B 197.90 ns 0.479 ns 0.425 ns 197.89 ns -
Decrypt · AES-192-GCM (Managed) 152B 1,186.42 ns 3.950 ns 3.695 ns 1,187.44 ns -
Decrypt · AES-192-GCM (BouncyCastle) 152B 1,269.90 ns 7.636 ns 7.143 ns 1,269.82 ns 1640 B
Decrypt · AES-192-GCM (OS) 152B 2,026.10 ns 21.096 ns 19.733 ns 2,028.72 ns -
Encrypt · AES-192-GCM (ArmAes+ArmPmull) 152B 148.07 ns 0.826 ns 0.773 ns 147.96 ns -
Encrypt · AES-192-GCM (Managed) 152B 1,083.81 ns 2.450 ns 2.172 ns 1,083.46 ns -
Encrypt · AES-192-GCM (BouncyCastle) 152B 1,156.93 ns 1.140 ns 1.011 ns 1,156.88 ns 1624 B
Encrypt · AES-192-GCM (OS) 152B 1,886.51 ns 37.631 ns 77.714 ns 1,921.00 ns -
Decrypt · AES-192-GCM (ArmAes+ArmPmull) 256B 274.20 ns 1.219 ns 1.140 ns 274.24 ns -
Decrypt · AES-192-GCM (BouncyCastle) 256B 1,716.26 ns 5.456 ns 5.104 ns 1,715.83 ns 1640 B
Decrypt · AES-192-GCM (Managed) 256B 1,777.79 ns 4.789 ns 4.480 ns 1,777.46 ns -
Decrypt · AES-192-GCM (OS) 256B 2,006.89 ns 5.787 ns 5.414 ns 2,006.84 ns -
Encrypt · AES-192-GCM (ArmAes+ArmPmull) 256B 245.46 ns 3.782 ns 3.538 ns 246.90 ns -
Encrypt · AES-192-GCM (BouncyCastle) 256B 1,784.07 ns 13.793 ns 12.902 ns 1,785.44 ns 1624 B
Encrypt · AES-192-GCM (Managed) 256B 1,826.82 ns 16.124 ns 15.082 ns 1,832.15 ns -
Encrypt · AES-192-GCM (OS) 256B 1,942.43 ns 26.472 ns 23.467 ns 1,944.04 ns -
Decrypt · AES-192-GCM (ArmAes+ArmPmull) 1KB 919.98 ns 17.348 ns 23.747 ns 910.43 ns -
Decrypt · AES-192-GCM (OS) 1KB 2,129.58 ns 29.742 ns 24.836 ns 2,125.08 ns -
Decrypt · AES-192-GCM (BouncyCastle) 1KB 5,155.56 ns 83.072 ns 119.140 ns 5,093.43 ns 1640 B
Decrypt · AES-192-GCM (Managed) 1KB 6,220.61 ns 20.487 ns 19.164 ns 6,220.82 ns -
Encrypt · AES-192-GCM (ArmAes+ArmPmull) 1KB 911.15 ns 15.261 ns 14.275 ns 914.45 ns -
Encrypt · AES-192-GCM (OS) 1KB 2,084.34 ns 18.503 ns 16.402 ns 2,084.09 ns -
Encrypt · AES-192-GCM (BouncyCastle) 1KB 5,806.63 ns 41.236 ns 38.572 ns 5,816.79 ns 1624 B
Encrypt · AES-192-GCM (Managed) 1KB 6,625.54 ns 56.436 ns 52.790 ns 6,646.75 ns -
Decrypt · AES-192-GCM (OS) 8KB 3,192.92 ns 41.167 ns 38.507 ns 3,197.00 ns -
Decrypt · AES-192-GCM (ArmAes+ArmPmull) 8KB 6,861.46 ns 96.269 ns 85.340 ns 6,846.07 ns -
Decrypt · AES-192-GCM (BouncyCastle) 8KB 36,794.07 ns 157.620 ns 147.438 ns 36,746.45 ns 1640 B
Decrypt · AES-192-GCM (Managed) 8KB 47,855.32 ns 115.413 ns 107.957 ns 47,853.18 ns -
Encrypt · AES-192-GCM (OS) 8KB 3,216.31 ns 25.133 ns 22.280 ns 3,216.80 ns -
Encrypt · AES-192-GCM (ArmAes+ArmPmull) 8KB 7,275.47 ns 143.373 ns 147.234 ns 7,293.49 ns -
Encrypt · AES-192-GCM (BouncyCastle) 8KB 42,470.61 ns 342.962 ns 320.807 ns 42,536.57 ns 1624 B
Encrypt · AES-192-GCM (Managed) 8KB 51,352.26 ns 1,025.780 ns 1,566.475 ns 51,974.95 ns -
Decrypt · AES-192-GCM (OS) 128KB 21,617.22 ns 41.413 ns 38.738 ns 21,625.07 ns -
Decrypt · AES-192-GCM (ArmAes+ArmPmull) 128KB 108,670.67 ns 1,473.814 ns 3,502.675 ns 108,040.67 ns -
Decrypt · AES-192-GCM (BouncyCastle) 128KB 569,932.15 ns 197.979 ns 175.504 ns 569,919.62 ns 1640 B
Decrypt · AES-192-GCM (Managed) 128KB 747,764.92 ns 339.076 ns 283.143 ns 747,864.68 ns -
Encrypt · AES-192-GCM (OS) 128KB 23,587.42 ns 129.504 ns 114.802 ns 23,599.07 ns -
Encrypt · AES-192-GCM (ArmAes+ArmPmull) 128KB 112,051.01 ns 2,663.548 ns 7,246.378 ns 113,045.82 ns -
Encrypt · AES-192-GCM (BouncyCastle) 128KB 643,812.16 ns 7,453.016 ns 6,971.556 ns 644,847.05 ns 1624 B
Encrypt · AES-192-GCM (Managed) 128KB 796,955.76 ns 5,329.920 ns 4,450.725 ns 797,598.18 ns -

AES-256-GCM

AES-256-GCM uses 14 rounds (vs 10 for AES-128). In these measurements the extra rounds add no more than ~7% for ArmAes+ArmPmull and ~12–18% for Managed. The same 2-tier architecture (ArmAes+ArmPmull → Managed) applies. At 128 B, encrypt is ~14× and decrypt ~12.5× faster than OS; OS leads at 8 KiB (~2×). The large-payload gap mirrors AES-128-GCM: Apple CommonCrypto likely exploits Apple Silicon–specific AES/PMULL execution units that are not yet accessible through the .NET ARMv8 intrinsics layer.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · AES-256-GCM (ArmAes+ArmPmull) 17B 85.36 ns 0.119 ns 0.112 ns -
Decrypt · AES-256-GCM (Managed) 17B 390.48 ns 0.448 ns 0.397 ns -
Decrypt · AES-256-GCM (BouncyCastle) 17B 663.59 ns 0.935 ns 0.875 ns 1744 B
Decrypt · AES-256-GCM (OS) 17B 1,922.85 ns 9.287 ns 8.687 ns -
Encrypt · AES-256-GCM (ArmAes+ArmPmull) 17B 55.83 ns 0.110 ns 0.098 ns -
Encrypt · AES-256-GCM (Managed) 17B 360.85 ns 0.454 ns 0.379 ns -
Encrypt · AES-256-GCM (BouncyCastle) 17B 591.05 ns 3.246 ns 2.711 ns 1728 B
Encrypt · AES-256-GCM (OS) 17B 1,755.88 ns 7.393 ns 6.915 ns -
Decrypt · AES-256-GCM (ArmAes+ArmPmull) 65B 119.10 ns 0.537 ns 0.503 ns -
Decrypt · AES-256-GCM (Managed) 65B 701.72 ns 0.245 ns 0.217 ns -
Decrypt · AES-256-GCM (BouncyCastle) 65B 902.60 ns 1.008 ns 0.943 ns 1744 B
Decrypt · AES-256-GCM (OS) 65B 1,915.24 ns 11.849 ns 11.083 ns -
Encrypt · AES-256-GCM (ArmAes+ArmPmull) 65B 86.45 ns 0.318 ns 0.298 ns -
Encrypt · AES-256-GCM (Managed) 65B 663.52 ns 0.391 ns 0.305 ns -
Encrypt · AES-256-GCM (BouncyCastle) 65B 843.46 ns 1.208 ns 1.009 ns 1728 B
Encrypt · AES-256-GCM (OS) 65B 1,748.38 ns 3.972 ns 3.317 ns -
Decrypt · AES-256-GCM (ArmAes+ArmPmull) 128B 154.90 ns 1.016 ns 0.950 ns -
Decrypt · AES-256-GCM (Managed) 128B 1,001.89 ns 0.404 ns 0.378 ns -
Decrypt · AES-256-GCM (BouncyCastle) 128B 1,153.02 ns 0.897 ns 0.839 ns 1744 B
Decrypt · AES-256-GCM (OS) 128B 1,935.36 ns 11.800 ns 11.038 ns -
Encrypt · AES-256-GCM (ArmAes+ArmPmull) 128B 122.88 ns 0.824 ns 0.730 ns -
Encrypt · AES-256-GCM (Managed) 128B 965.58 ns 0.905 ns 0.707 ns -
Encrypt · AES-256-GCM (BouncyCastle) 128B 1,105.18 ns 1.522 ns 1.271 ns 1728 B
Encrypt · AES-256-GCM (OS) 128B 1,765.75 ns 6.769 ns 6.001 ns -
Decrypt · AES-256-GCM (ArmAes+ArmPmull) 152B 185.18 ns 1.329 ns 1.179 ns -
Decrypt · AES-256-GCM (Managed) 152B 1,206.84 ns 0.975 ns 0.912 ns -
Decrypt · AES-256-GCM (BouncyCastle) 152B 1,308.12 ns 1.040 ns 0.973 ns 1744 B
Decrypt · AES-256-GCM (OS) 152B 1,940.98 ns 15.149 ns 14.171 ns -
Encrypt · AES-256-GCM (ArmAes+ArmPmull) 152B 150.99 ns 0.686 ns 0.642 ns -
Encrypt · AES-256-GCM (Managed) 152B 1,169.38 ns 0.932 ns 0.871 ns -
Encrypt · AES-256-GCM (BouncyCastle) 152B 1,267.30 ns 1.184 ns 1.107 ns 1728 B
Encrypt · AES-256-GCM (OS) 152B 1,775.11 ns 10.951 ns 9.708 ns -
Decrypt · AES-256-GCM (ArmAes+ArmPmull) 256B 251.56 ns 1.575 ns 1.473 ns -
Decrypt · AES-256-GCM (BouncyCastle) 256B 1,781.33 ns 0.660 ns 0.618 ns 1744 B
Decrypt · AES-256-GCM (Managed) 256B 1,819.14 ns 0.465 ns 0.435 ns -
Decrypt · AES-256-GCM (OS) 256B 1,927.66 ns 10.525 ns 8.789 ns -
Encrypt · AES-256-GCM (ArmAes+ArmPmull) 256B 221.30 ns 0.766 ns 0.716 ns -
Encrypt · AES-256-GCM (BouncyCastle) 256B 1,767.94 ns 0.785 ns 0.735 ns 1728 B
Encrypt · AES-256-GCM (Managed) 256B 1,780.59 ns 0.407 ns 0.361 ns -
Encrypt · AES-256-GCM (OS) 256B 1,786.94 ns 8.376 ns 7.835 ns -
Decrypt · AES-256-GCM (ArmAes+ArmPmull) 1KB 821.40 ns 4.078 ns 3.615 ns -
Decrypt · AES-256-GCM (OS) 1KB 2,066.99 ns 13.375 ns 12.511 ns -
Decrypt · AES-256-GCM (BouncyCastle) 1KB 5,529.15 ns 1.817 ns 1.700 ns 1744 B
Decrypt · AES-256-GCM (Managed) 1KB 6,580.74 ns 2.554 ns 2.264 ns -
Encrypt · AES-256-GCM (ArmAes+ArmPmull) 1KB 798.05 ns 4.140 ns 3.873 ns -
Encrypt · AES-256-GCM (OS) 1KB 1,899.13 ns 17.089 ns 15.985 ns -
Encrypt · AES-256-GCM (BouncyCastle) 1KB 5,748.85 ns 1.430 ns 1.338 ns 1728 B
Encrypt · AES-256-GCM (Managed) 1KB 6,458.19 ns 1.358 ns 1.270 ns -
Decrypt · AES-256-GCM (OS) 8KB 3,085.76 ns 25.035 ns 23.417 ns -
Decrypt · AES-256-GCM (ArmAes+ArmPmull) 8KB 6,200.93 ns 5.322 ns 4.718 ns -
Decrypt · AES-256-GCM (BouncyCastle) 8KB 39,942.50 ns 28.876 ns 27.010 ns 1744 B
Decrypt · AES-256-GCM (Managed) 8KB 50,698.65 ns 35.145 ns 29.347 ns -
Encrypt · AES-256-GCM (OS) 8KB 2,974.20 ns 11.999 ns 11.224 ns -
Encrypt · AES-256-GCM (ArmAes+ArmPmull) 8KB 6,201.55 ns 15.969 ns 14.156 ns -
Encrypt · AES-256-GCM (BouncyCastle) 8KB 42,596.76 ns 13.218 ns 12.364 ns 1728 B
Encrypt · AES-256-GCM (Managed) 8KB 50,671.48 ns 26.373 ns 24.670 ns -
Decrypt · AES-256-GCM (OS) 128KB 22,147.61 ns 96.588 ns 90.349 ns -
Decrypt · AES-256-GCM (ArmAes+ArmPmull) 128KB 98,473.75 ns 154.333 ns 120.493 ns -
Decrypt · AES-256-GCM (BouncyCastle) 128KB 631,948.75 ns 279.856 ns 261.777 ns 1744 B
Decrypt · AES-256-GCM (Managed) 128KB 808,117.93 ns 135.905 ns 120.477 ns -
Encrypt · AES-256-GCM (OS) 128KB 22,816.94 ns 86.248 ns 80.676 ns -
Encrypt · AES-256-GCM (ArmAes+ArmPmull) 128KB 99,847.51 ns 1,357.781 ns 1,270.069 ns -
Encrypt · AES-256-GCM (BouncyCastle) 128KB 672,293.47 ns 362.406 ns 302.626 ns 1728 B
Encrypt · AES-256-GCM (Managed) 128KB 806,702.94 ns 324.443 ns 287.610 ns -

AES-128-CCM

AES-CCM (Counter with CBC-MAC) combines CTR mode encryption with CBC-MAC authentication. Unlike GCM, CCM runs two AES passes over the payload (a CBC-MAC over the plaintext plus CTR encryption), and the CBC-MAC chain is inherently serial, making it less parallelizable. It is widely used in IoT protocols (Bluetooth LE, ZigBee, Thread) and supports variable nonce (7–13 bytes) and tag sizes (4–16 bytes). Two acceleration tiers are available:

  • ArmAes: ARM Cryptography Extension AESD/AESE instructions for all block operations — counter-mode encryption, CBC-MAC computation, and AAD processing. Uses Vector128<byte> round keys via MemoryMarshal.Cast from the shared uint[] key schedule. Dispatched via _useAesNi bool flag (shared with x86 dispatch; indicates hardware AES availability on any ISA).
  • Managed: T-table AES for all block operations. Fully portable, zero-allocation.
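
The nonce and tag ranges quoted above follow from CCM's block formatting (RFC 3610 / NIST SP 800-38C): the nonce and the payload-length field share 15 bytes of the first block, so a longer nonce leaves fewer bytes to encode the payload length. The following is a minimal Python sketch of that relationship, not the CryptoHives API; `ccm_params` is a hypothetical helper name.

```python
def ccm_params(nonce_len, tag_len):
    """Derive the CCM length-field size and payload cap from the nonce length.

    Hypothetical helper for illustration only.
    """
    if not 7 <= nonce_len <= 13:
        raise ValueError("CCM nonce must be 7 to 13 bytes")
    if tag_len not in (4, 6, 8, 10, 12, 14, 16):
        raise ValueError("CCM tag must be an even length from 4 to 16 bytes")
    length_field = 15 - nonce_len                  # bytes that encode the payload length
    max_payload = (1 << (8 * length_field)) - 1    # largest encodable payload length
    return {"length_field": length_field, "max_payload_bytes": max_payload}


for nonce_len in (13, 12, 7):
    info = ccm_params(nonce_len, 16)
    print(nonce_len, info["length_field"], info["max_payload_bytes"])
```

With a 13-byte nonce the payload is capped at 65,535 bytes, so a 128 KiB message needs a nonce of 12 bytes or shorter.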

Key observations:

  • ArmAes: ~4× faster than Managed at 128 KiB; ~4.3× faster than BouncyCastle; zero allocation
  • Managed: T-table AES; ~1.5× faster than BouncyCastle at 128 B, narrowing to ~8–9% faster at 128 KiB
  • BouncyCastle: Allocates ~2.4–2.5 KB per call
  • No OS adapter available for comparison (System.Security.Cryptography does not expose AES-CCM on all platforms)

Description TestDataSize Mean Error StdDev Allocated
Decrypt · AES-128-CCM (ArmAes) 128B 273.4 ns 1.13 ns 1.05 ns -
Decrypt · AES-128-CCM (Managed) 128B 957.4 ns 0.69 ns 0.65 ns -
Decrypt · AES-128-CCM (BouncyCastle) 128B 1,427.7 ns 2.83 ns 2.65 ns 2424 B
Encrypt · AES-128-CCM (ArmAes) 128B 238.9 ns 1.22 ns 1.14 ns -
Encrypt · AES-128-CCM (Managed) 128B 912.3 ns 0.48 ns 0.40 ns -
Encrypt · AES-128-CCM (BouncyCastle) 128B 1,384.6 ns 3.04 ns 2.85 ns 2464 B
Decrypt · AES-128-CCM (ArmAes) 1KB 1,538.2 ns 3.70 ns 3.46 ns -
Decrypt · AES-128-CCM (Managed) 1KB 5,995.5 ns 2.76 ns 2.58 ns -
Decrypt · AES-128-CCM (BouncyCastle) 1KB 6,843.3 ns 18.32 ns 17.14 ns 2424 B
Encrypt · AES-128-CCM (ArmAes) 1KB 1,502.6 ns 3.89 ns 3.64 ns -
Encrypt · AES-128-CCM (Managed) 1KB 5,953.0 ns 1.37 ns 1.28 ns -
Encrypt · AES-128-CCM (BouncyCastle) 1KB 6,744.4 ns 11.18 ns 10.46 ns 2464 B
Decrypt · AES-128-CCM (ArmAes) 8KB 11,687.0 ns 36.11 ns 33.78 ns -
Decrypt · AES-128-CCM (Managed) 8KB 46,739.8 ns 396.13 ns 370.54 ns -
Decrypt · AES-128-CCM (BouncyCastle) 8KB 49,745.9 ns 87.41 ns 81.76 ns 2424 B
Encrypt · AES-128-CCM (ArmAes) 8KB 11,540.1 ns 33.28 ns 31.13 ns -
Encrypt · AES-128-CCM (Managed) 8KB 46,157.0 ns 39.38 ns 32.89 ns -
Encrypt · AES-128-CCM (BouncyCastle) 8KB 49,590.7 ns 112.17 ns 99.44 ns 2464 B
Decrypt · AES-128-CCM (ArmAes) 128KB 184,210.5 ns 452.86 ns 423.60 ns -
Decrypt · AES-128-CCM (Managed) 128KB 736,430.0 ns 459.91 ns 430.20 ns -
Decrypt · AES-128-CCM (BouncyCastle) 128KB 792,925.7 ns 809.34 ns 717.46 ns 2424 B
Encrypt · AES-128-CCM (ArmAes) 128KB 183,919.0 ns 630.04 ns 589.34 ns -
Encrypt · AES-128-CCM (Managed) 128KB 736,115.3 ns 191.11 ns 169.41 ns -
Encrypt · AES-128-CCM (BouncyCastle) 128KB 801,690.1 ns 591.05 ns 493.55 ns 2464 B
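
The ratios quoted in the key observations can be reproduced directly from the table. The sketch below copies the 128 KiB decrypt means and derives the relative times and the effective throughput; the helper names are illustrative and not part of any benchmark tooling.

```python
# Mean times in nanoseconds, copied from the 128 KiB decrypt rows above.
MEAN_NS = {
    "ArmAes": 184_210.5,
    "Managed": 736_430.0,
    "BouncyCastle": 792_925.7,
}
PAYLOAD_BYTES = 128 * 1024


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline`."""
    return MEAN_NS[baseline] / MEAN_NS[candidate]


def throughput_gb_per_s(name):
    """Bytes per nanosecond is numerically equal to GB/s (10^9 bytes per second)."""
    return PAYLOAD_BYTES / MEAN_NS[name]


for name in MEAN_NS:
    print(f"{name:>12}: {throughput_gb_per_s(name):.2f} GB/s, "
          f"{speedup(name, 'ArmAes'):.2f}x the ArmAes time")
```

The same two helpers apply to any other table on this page once the corresponding rows are substituted.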

AES-256-CCM

AES-256-CCM uses 14 rounds (vs 10 for AES-128). The same ArmAes / Managed dispatch applies. On the Apple M4 the additional rounds add ~10–12% to the ArmAes times and ~31–33% to the Managed times.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · AES-256-CCM (ArmAes) 128B 299.9 ns 1.13 ns 1.06 ns -
Decrypt · AES-256-CCM (Managed) 128B 1,252.1 ns 1.36 ns 1.27 ns -
Decrypt · AES-256-CCM (BouncyCastle) 128B 1,785.1 ns 3.98 ns 3.73 ns 2808 B
Encrypt · AES-256-CCM (ArmAes) 128B 265.4 ns 1.55 ns 1.45 ns -
Encrypt · AES-256-CCM (Managed) 128B 1,208.7 ns 0.83 ns 0.65 ns -
Encrypt · AES-256-CCM (BouncyCastle) 128B 1,743.7 ns 4.22 ns 3.94 ns 2848 B
Decrypt · AES-256-CCM (ArmAes) 1KB 1,707.3 ns 5.60 ns 5.24 ns -
Decrypt · AES-256-CCM (Managed) 1KB 7,946.1 ns 3.66 ns 3.42 ns -
Decrypt · AES-256-CCM (BouncyCastle) 1KB 8,898.1 ns 2.81 ns 2.49 ns 2808 B
Encrypt · AES-256-CCM (ArmAes) 1KB 1,670.1 ns 6.02 ns 5.63 ns -
Encrypt · AES-256-CCM (Managed) 1KB 7,898.7 ns 2.85 ns 2.53 ns -
Encrypt · AES-256-CCM (BouncyCastle) 1KB 8,859.8 ns 1.49 ns 1.32 ns 2848 B
Decrypt · AES-256-CCM (ArmAes) 8KB 12,868.8 ns 32.20 ns 30.12 ns -
Decrypt · AES-256-CCM (Managed) 8KB 61,446.4 ns 25.35 ns 23.72 ns -
Decrypt · AES-256-CCM (BouncyCastle) 8KB 65,644.3 ns 34.18 ns 30.30 ns 2808 B
Encrypt · AES-256-CCM (ArmAes) 8KB 12,807.3 ns 44.59 ns 41.71 ns -
Encrypt · AES-256-CCM (Managed) 8KB 61,295.6 ns 22.59 ns 20.03 ns -
Encrypt · AES-256-CCM (BouncyCastle) 8KB 65,412.6 ns 33.97 ns 31.78 ns 2848 B
Decrypt · AES-256-CCM (ArmAes) 128KB 205,913.9 ns 612.81 ns 543.24 ns -
Decrypt · AES-256-CCM (Managed) 128KB 979,175.3 ns 592.66 ns 554.37 ns -
Decrypt · AES-256-CCM (BouncyCastle) 128KB 1,040,518.4 ns 673.52 ns 630.01 ns 2808 B
Encrypt · AES-256-CCM (ArmAes) 128KB 204,195.7 ns 643.92 ns 602.32 ns -
Encrypt · AES-256-CCM (Managed) 128KB 977,042.6 ns 740.09 ns 656.07 ns -
Encrypt · AES-256-CCM (BouncyCastle) 128KB 1,038,874.9 ns 506.46 ns 473.74 ns 2848 B

ChaCha20-Poly1305

ChaCha20-Poly1305 is a software-friendly AEAD cipher (RFC 8439) that combines ChaCha20 stream encryption with Poly1305 MAC authentication. It is the recommended AEAD cipher when hardware AES acceleration is unavailable. Two acceleration tiers are available on ARM:

  • Neon: Single-block ChaCha20 via Vector128<uint> combined with Poly1305 donna-64 MAC (3×44-bit limbs, 9 multiplications per 16-byte block using Math.BigMul). ~3× faster than OS at 128 B; ~1.25–1.35× faster than OS at 1 KiB; OS leads at ≥8 KiB.
  • Managed: Scalar ChaCha20 + Poly1305 donna-32 (5×26-bit limbs, 25 multiplications per block on .NET Framework / .NET Standard). Fully portable.
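
The multiplication counts in the two tiers come from the limb layout: a schoolbook product of two n-limb numbers needs n² limb multiplications, so 5 limbs cost 25 and 3 limbs cost 9. The sketch below illustrates this with Python integers. It is not the library's code, and it reduces with a plain `%` where real donna-style code folds the high limbs back in using a multiply by 5.

```python
P1305 = (1 << 130) - 5  # the Poly1305 prime


def to_limbs(value, limb_bits, limb_count):
    """Split a non-negative integer into little-endian limbs of limb_bits each."""
    mask = (1 << limb_bits) - 1
    return [(value >> (limb_bits * i)) & mask for i in range(limb_count)]


def limb_mulmod(a, b, limb_bits, limb_count):
    """Schoolbook product of two limb vectors, reduced mod 2^130 - 5.

    Returns (product mod P1305, number of limb multiplications performed).
    """
    limbs_a = to_limbs(a, limb_bits, limb_count)
    limbs_b = to_limbs(b, limb_bits, limb_count)
    acc, muls = 0, 0
    for i, x in enumerate(limbs_a):
        for j, y in enumerate(limbs_b):
            acc += (x * y) << (limb_bits * (i + j))
            muls += 1
    return acc % P1305, muls


h = P1305 - 1                            # largest reduced accumulator value
r = 0x0ffffffc0ffffffc0ffffffc0fffffff   # largest clamped r (RFC 8439 clamp mask)
for bits, count in ((26, 5), (44, 3)):
    product, muls = limb_mulmod(h, r, bits, count)
    assert product == (h * r) % P1305    # both layouts agree with big-integer math
    print(f"{count} x {bits}-bit limbs: {muls} limb multiplications")
```

Products of two 26-bit limbs fit in 64 bits, whereas products of two 44-bit limbs need a 128-bit result, which is why the 64-bit variant depends on a widening multiply such as Math.BigMul.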

Key observations:

  • Neon vs OS (encrypt): Neon ~3× faster at 128 B and ~1.25× faster at 1 KiB; OS ~1.45× faster at 8 KiB and ~1.67× faster at 128 KiB
  • BouncyCastle is faster than Neon at 128 B (~1.3× on encrypt, ~6% on decrypt) and marginally faster on 1 KiB encrypt; fixed NEON setup overhead is the likely cause at that granularity
  • Managed and Neon paths are zero-allocation
  • BouncyCastle allocates 336–416 B per call; NaCl.Core allocates 48–72 B per call

Description TestDataSize Mean Error StdDev Allocated
Decrypt · ChaCha20-Poly1305 (BouncyCastle) 128B 684.7 ns 0.99 ns 0.93 ns 416 B
Decrypt · ChaCha20-Poly1305 (Neon) 128B 725.1 ns 3.68 ns 3.44 ns -
Decrypt · ChaCha20-Poly1305 (NaCl.Core) 128B 822.1 ns 0.42 ns 0.33 ns 48 B
Decrypt · ChaCha20-Poly1305 (Managed) 128B 1,314.5 ns 9.14 ns 8.55 ns -
Decrypt · ChaCha20-Poly1305 (OS) 128B 2,262.0 ns 21.27 ns 18.85 ns -
Encrypt · ChaCha20-Poly1305 (BouncyCastle) 128B 493.7 ns 1.23 ns 1.15 ns 336 B
Encrypt · ChaCha20-Poly1305 (Neon) 128B 655.5 ns 2.59 ns 2.29 ns -
Encrypt · ChaCha20-Poly1305 (NaCl.Core) 128B 790.9 ns 0.15 ns 0.14 ns 48 B
Encrypt · ChaCha20-Poly1305 (Managed) 128B 1,207.0 ns 18.02 ns 16.85 ns -
Encrypt · ChaCha20-Poly1305 (OS) 128B 1,933.5 ns 21.88 ns 20.47 ns -
Decrypt · ChaCha20-Poly1305 (Neon) 1KB 2,333.8 ns 0.98 ns 0.82 ns -
Decrypt · ChaCha20-Poly1305 (BouncyCastle) 1KB 2,395.5 ns 4.30 ns 4.02 ns 416 B
Decrypt · ChaCha20-Poly1305 (OS) 1KB 3,155.5 ns 18.22 ns 17.04 ns -
Decrypt · ChaCha20-Poly1305 (NaCl.Core) 1KB 3,668.8 ns 1.21 ns 1.13 ns 72 B
Decrypt · ChaCha20-Poly1305 (Managed) 1KB 6,721.7 ns 20.85 ns 19.50 ns -
Encrypt · ChaCha20-Poly1305 (BouncyCastle) 1KB 2,195.2 ns 4.44 ns 3.70 ns 336 B
Encrypt · ChaCha20-Poly1305 (Neon) 1KB 2,278.3 ns 0.86 ns 0.80 ns -
Encrypt · ChaCha20-Poly1305 (OS) 1KB 2,848.6 ns 21.18 ns 19.81 ns -
Encrypt · ChaCha20-Poly1305 (NaCl.Core) 1KB 3,625.7 ns 0.83 ns 0.77 ns 72 B
Encrypt · ChaCha20-Poly1305 (Managed) 1KB 6,658.5 ns 22.85 ns 21.37 ns -
Decrypt · ChaCha20-Poly1305 (OS) 8KB 10,635.2 ns 42.47 ns 39.73 ns -
Decrypt · ChaCha20-Poly1305 (Neon) 8KB 14,726.0 ns 9.83 ns 9.19 ns -
Decrypt · ChaCha20-Poly1305 (BouncyCastle) 8KB 15,766.2 ns 42.70 ns 39.94 ns 416 B
Decrypt · ChaCha20-Poly1305 (NaCl.Core) 8KB 26,270.6 ns 9.37 ns 8.76 ns 72 B
Decrypt · ChaCha20-Poly1305 (Managed) 8KB 48,278.6 ns 137.24 ns 128.37 ns -
Encrypt · ChaCha20-Poly1305 (OS) 8KB 10,135.5 ns 49.14 ns 45.96 ns -
Encrypt · ChaCha20-Poly1305 (Neon) 8KB 14,754.8 ns 4.52 ns 3.77 ns -
Encrypt · ChaCha20-Poly1305 (BouncyCastle) 8KB 15,683.5 ns 40.95 ns 38.30 ns 336 B
Encrypt · ChaCha20-Poly1305 (NaCl.Core) 8KB 26,311.4 ns 8.07 ns 7.15 ns 72 B
Encrypt · ChaCha20-Poly1305 (Managed) 8KB 47,890.3 ns 173.96 ns 162.72 ns -
Decrypt · ChaCha20-Poly1305 (OS) 128KB 147,445.9 ns 855.86 ns 800.57 ns -
Decrypt · ChaCha20-Poly1305 (Neon) 128KB 228,291.2 ns 170.33 ns 159.33 ns -
Decrypt · ChaCha20-Poly1305 (BouncyCastle) 128KB 247,358.2 ns 713.67 ns 667.57 ns 416 B
Decrypt · ChaCha20-Poly1305 (NaCl.Core) 128KB 414,167.8 ns 183.58 ns 171.72 ns 72 B
Decrypt · ChaCha20-Poly1305 (Managed) 128KB 761,690.2 ns 3,038.17 ns 2,841.91 ns -
Encrypt · ChaCha20-Poly1305 (OS) 128KB 136,972.2 ns 815.14 ns 762.49 ns -
Encrypt · ChaCha20-Poly1305 (Neon) 128KB 228,956.6 ns 57.95 ns 54.20 ns -
Encrypt · ChaCha20-Poly1305 (BouncyCastle) 128KB 249,403.3 ns 567.57 ns 530.90 ns 336 B
Encrypt · ChaCha20-Poly1305 (NaCl.Core) 128KB 414,330.5 ns 165.82 ns 147.00 ns 72 B
Encrypt · ChaCha20-Poly1305 (Managed) 128KB 760,649.4 ns 3,111.39 ns 2,910.40 ns -

XChaCha20-Poly1305

XChaCha20-Poly1305 extends ChaCha20-Poly1305 with a 24-byte nonce (vs 12 bytes), which makes randomly generated nonces safe against collisions: the birthday bound rises from about 2⁴⁸ messages for a 96-bit nonce to about 2⁹⁶ for a 192-bit nonce. The implementation prepends an HChaCha20 key derivation step that derives a subkey from the key and the first 16 bytes of the nonce; the remaining 8 bytes become the nonce of the inner ChaCha20-Poly1305 operation. The same Neon / Managed acceleration tiers apply to that inner operation.
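
The HChaCha20 step can be sketched in a few lines of Python. This is an illustrative, non-constant-time rendering of the construction described in the XChaCha draft specification, not the CryptoHives implementation; the function names are assumptions made for this example.

```python
import struct

MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def quarter_round(s, a, b, c, d):
    """ChaCha quarter round on four words of the state list s (in place)."""
    s[a] = (s[a] + s[b]) & MASK32; s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32; s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = rotl32(s[b] ^ s[c], 7)


def hchacha20(key, nonce16):
    """Derive a 32-byte subkey from a 32-byte key and 16 nonce bytes."""
    if len(key) != 32 or len(nonce16) != 16:
        raise ValueError("expected a 32-byte key and 16 nonce bytes")
    s = list(struct.unpack("<4I", b"expand 32-byte k"))
    s += struct.unpack("<8I", key)
    s += struct.unpack("<4I", nonce16)
    for _ in range(10):                      # 10 double rounds = 20 rounds
        quarter_round(s, 0, 4, 8, 12)        # column rounds
        quarter_round(s, 1, 5, 9, 13)
        quarter_round(s, 2, 6, 10, 14)
        quarter_round(s, 3, 7, 11, 15)
        quarter_round(s, 0, 5, 10, 15)       # diagonal rounds
        quarter_round(s, 1, 6, 11, 12)
        quarter_round(s, 2, 7, 8, 13)
        quarter_round(s, 3, 4, 9, 14)
    # Unlike the ChaCha20 block function, the initial state is not added back.
    return struct.pack("<8I", *(s[0:4] + s[12:16]))


def xchacha20_inner_params(key, nonce24):
    """Return (subkey, 12-byte nonce) for the inner ChaCha20-Poly1305 call."""
    if len(nonce24) != 24:
        raise ValueError("expected a 24-byte nonce")
    subkey = hchacha20(key, nonce24[:16])
    return subkey, b"\x00" * 4 + nonce24[16:]
```

Because this derivation runs once per message regardless of payload length, its cost appears as a fixed per-call overhead in the measurements.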

Key observations:

  • HChaCha20 adds a roughly constant ~400 ns per call over ChaCha20-Poly1305: noticeable at 128 B (Neon encrypt 1.096 μs vs 655.5 ns), negligible at 128 KiB (229.5 μs vs 229.0 μs)
  • Neon ~3.3× faster than Managed at 128 KiB; ~3.3× faster than NaCl.Core at 128 KiB
  • No OS or BouncyCastle implementations available for comparison
  • NaCl.Core allocates 48–72 B per call
  • Managed and Neon paths are zero-allocation

Description TestDataSize Mean Error StdDev Allocated
Decrypt · XChaCha20-Poly1305 (Neon) 128B 1.167 μs 0.0064 μs 0.0060 μs -
Decrypt · XChaCha20-Poly1305 (NaCl.Core) 128B 1.480 μs 0.0002 μs 0.0002 μs 48 B
Decrypt · XChaCha20-Poly1305 (Managed) 128B 1.722 μs 0.0056 μs 0.0052 μs -
Encrypt · XChaCha20-Poly1305 (Neon) 128B 1.096 μs 0.0075 μs 0.0070 μs -
Encrypt · XChaCha20-Poly1305 (NaCl.Core) 128B 1.448 μs 0.0003 μs 0.0002 μs 48 B
Encrypt · XChaCha20-Poly1305 (Managed) 128B 1.621 μs 0.0049 μs 0.0045 μs -
Decrypt · XChaCha20-Poly1305 (Neon) 1KB 2.699 μs 0.0012 μs 0.0011 μs -
Decrypt · XChaCha20-Poly1305 (NaCl.Core) 1KB 6.635 μs 0.0023 μs 0.0021 μs 72 B
Decrypt · XChaCha20-Poly1305 (Managed) 1KB 7.086 μs 0.0254 μs 0.0238 μs -
Encrypt · XChaCha20-Poly1305 (Neon) 1KB 2.671 μs 0.0012 μs 0.0012 μs -
Encrypt · XChaCha20-Poly1305 (NaCl.Core) 1KB 6.597 μs 0.0015 μs 0.0013 μs 72 B
Encrypt · XChaCha20-Poly1305 (Managed) 1KB 7.048 μs 0.0183 μs 0.0171 μs -
Decrypt · XChaCha20-Poly1305 (Neon) 8KB 15.129 μs 0.0049 μs 0.0041 μs -
Decrypt · XChaCha20-Poly1305 (NaCl.Core) 8KB 47.608 μs 0.0093 μs 0.0087 μs 72 B
Decrypt · XChaCha20-Poly1305 (Managed) 8KB 48.467 μs 0.2036 μs 0.1905 μs -
Encrypt · XChaCha20-Poly1305 (Neon) 8KB 15.204 μs 0.0069 μs 0.0065 μs -
Encrypt · XChaCha20-Poly1305 (NaCl.Core) 8KB 47.521 μs 0.0060 μs 0.0050 μs 72 B
Encrypt · XChaCha20-Poly1305 (Managed) 8KB 48.376 μs 0.2060 μs 0.1927 μs -
Decrypt · XChaCha20-Poly1305 (Neon) 128KB 228.965 μs 0.1020 μs 0.0954 μs -
Decrypt · XChaCha20-Poly1305 (NaCl.Core) 128KB 751.208 μs 0.1381 μs 0.1292 μs 72 B
Decrypt · XChaCha20-Poly1305 (Managed) 128KB 757.817 μs 3.0047 μs 2.8106 μs -
Encrypt · XChaCha20-Poly1305 (Neon) 128KB 229.493 μs 0.0862 μs 0.0806 μs -
Encrypt · XChaCha20-Poly1305 (NaCl.Core) 128KB 750.458 μs 0.1387 μs 0.1298 μs 72 B
Encrypt · XChaCha20-Poly1305 (Managed) 128KB 756.772 μs 2.3731 μs 2.2198 μs -

Regional Block Ciphers

Regional block ciphers implement national cryptographic standards. All operate on 128-bit blocks in CBC mode. Benchmarks compare Managed implementations against BouncyCastle where available.
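
All of the ciphers in this section are measured through the same CBC chaining, which can be sketched independently of the block cipher. The example below is a minimal sketch that assumes block-aligned input and uses a deliberately trivial, insecure stand-in (`toy_encrypt` / `toy_decrypt`, both hypothetical) in place of a real 128-bit block cipher.

```python
BLOCK = 16  # 128-bit blocks, as used by every cipher in this section


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_encrypt(encrypt_block, iv, plaintext):
    """Each block is XORed with the previous ciphertext block, then encrypted."""
    if len(iv) != BLOCK or len(plaintext) % BLOCK:
        raise ValueError("IV must be 16 bytes and input must be block-aligned")
    out, prev = bytearray(), iv
    for i in range(0, len(plaintext), BLOCK):
        prev = encrypt_block(xor_bytes(plaintext[i:i + BLOCK], prev))
        out += prev
    return bytes(out)


def cbc_decrypt(decrypt_block, iv, ciphertext):
    """Each block is decrypted, then XORed with the previous ciphertext block."""
    if len(iv) != BLOCK or len(ciphertext) % BLOCK:
        raise ValueError("IV must be 16 bytes and input must be block-aligned")
    out, prev = bytearray(), iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += xor_bytes(decrypt_block(block), prev)
        prev = block
    return bytes(out)


# Toy stand-in for a block cipher: XOR with a key, then rotate by one byte.
# NOT secure; it exists only so the chaining above can be exercised.
KEY = bytes(range(16))


def toy_encrypt(block):
    mixed = xor_bytes(block, KEY)
    return mixed[1:] + mixed[:1]


def toy_decrypt(block):
    return xor_bytes(block[-1:] + block[:-1], KEY)


iv = bytes(16)
message = b"A" * 32
ciphertext = cbc_encrypt(toy_encrypt, iv, message)
assert cbc_decrypt(toy_decrypt, iv, ciphertext) == message
```

Encryption is serial because each block depends on the previous ciphertext block, whereas decryption only needs ciphertext that is already available, so the block decryptions are independent of each other.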

SM4-CBC (China)

SM4 is the Chinese national block cipher (GB/T 32907-2016). It uses a 128-bit key and 32 rounds of an unbalanced Feistel structure with a nonlinear S-box substitution in each round.

  • Managed: Lookup-table implementation with 32-bit word operations. Zero allocation.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · SM4-CBC (Managed) 128B 1.329 μs 0.0058 μs 0.0054 μs -
Decrypt · SM4-CBC (BouncyCastle) 128B 1.418 μs 0.0106 μs 0.0099 μs 40 B
Encrypt · SM4-CBC (Managed) 128B 1.447 μs 0.0049 μs 0.0046 μs -
Encrypt · SM4-CBC (BouncyCastle) 128B 1.486 μs 0.0035 μs 0.0031 μs 40 B
Decrypt · SM4-CBC (BouncyCastle) 1KB 8.802 μs 0.0361 μs 0.0338 μs 40 B
Decrypt · SM4-CBC (Managed) 1KB 9.392 μs 0.0300 μs 0.0280 μs -
Encrypt · SM4-CBC (BouncyCastle) 1KB 9.618 μs 0.0402 μs 0.0356 μs 40 B
Encrypt · SM4-CBC (Managed) 1KB 10.431 μs 0.0371 μs 0.0329 μs -
Decrypt · SM4-CBC (BouncyCastle) 8KB 67.695 μs 0.2983 μs 0.2790 μs 40 B
Decrypt · SM4-CBC (Managed) 8KB 73.865 μs 0.3488 μs 0.3262 μs -
Encrypt · SM4-CBC (BouncyCastle) 8KB 74.971 μs 0.2233 μs 0.2089 μs 40 B
Encrypt · SM4-CBC (Managed) 8KB 82.340 μs 0.4462 μs 0.4173 μs -
Decrypt · SM4-CBC (BouncyCastle) 128KB 1,078.473 μs 6.0526 μs 5.3655 μs 40 B
Decrypt · SM4-CBC (Managed) 128KB 1,179.205 μs 5.2554 μs 4.9159 μs -
Encrypt · SM4-CBC (BouncyCastle) 128KB 1,195.655 μs 3.5399 μs 3.1381 μs 40 B
Encrypt · SM4-CBC (Managed) 128KB 1,317.563 μs 7.5847 μs 7.0948 μs -

ARIA-128-CBC (Korea)

ARIA is a Korean national cipher (KS X 1213) with an involutional SPN structure. ARIA-128 uses 12 rounds.

  • Managed: S-box substitution with byte-level diffusion layer. Zero allocation.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · ARIA-128-CBC (Managed) 128B 2.221 μs 0.0083 μs 0.0073 μs -
Decrypt · ARIA-128-CBC (BouncyCastle) 128B 2.339 μs 0.0087 μs 0.0073 μs 1288 B
Encrypt · ARIA-128-CBC (Managed) 128B 2.197 μs 0.0079 μs 0.0070 μs -
Encrypt · ARIA-128-CBC (BouncyCastle) 128B 2.228 μs 0.0071 μs 0.0059 μs 1288 B
Decrypt · ARIA-128-CBC (BouncyCastle) 1KB 14.343 μs 0.0478 μs 0.0424 μs 3528 B
Decrypt · ARIA-128-CBC (Managed) 1KB 15.985 μs 0.0751 μs 0.0703 μs -
Encrypt · ARIA-128-CBC (BouncyCastle) 1KB 14.107 μs 0.2760 μs 0.2711 μs 3528 B
Encrypt · ARIA-128-CBC (Managed) 1KB 15.848 μs 0.0734 μs 0.0613 μs -
Decrypt · ARIA-128-CBC (BouncyCastle) 8KB 109.475 μs 0.3934 μs 0.3487 μs 21448 B
Decrypt · ARIA-128-CBC (Managed) 8KB 126.115 μs 0.3413 μs 0.2850 μs -
Encrypt · ARIA-128-CBC (BouncyCastle) 8KB 106.352 μs 0.2397 μs 0.2002 μs 21448 B
Encrypt · ARIA-128-CBC (Managed) 8KB 125.575 μs 0.4582 μs 0.4062 μs -
Decrypt · ARIA-128-CBC (BouncyCastle) 128KB 1,719.477 μs 5.5217 μs 4.8948 μs 328648 B
Decrypt · ARIA-128-CBC (Managed) 128KB 2,023.756 μs 9.2452 μs 8.1956 μs -
Encrypt · ARIA-128-CBC (BouncyCastle) 128KB 1,696.951 μs 5.7161 μs 5.0672 μs 328648 B
Encrypt · ARIA-128-CBC (Managed) 128KB 2,021.430 μs 8.8185 μs 7.8173 μs -

ARIA-256-CBC (Korea)

ARIA-256 uses 16 rounds for 256-bit key security. The same SPN structure applies with additional rounds.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · ARIA-256-CBC (Managed) 128B 2.969 μs 0.0142 μs 0.0126 μs -
Decrypt · ARIA-256-CBC (BouncyCastle) 128B 2.991 μs 0.0063 μs 0.0053 μs 1496 B
Encrypt · ARIA-256-CBC (BouncyCastle) 128B 2.907 μs 0.0053 μs 0.0047 μs 1496 B
Encrypt · ARIA-256-CBC (Managed) 128B 2.974 μs 0.0052 μs 0.0046 μs -
Decrypt · ARIA-256-CBC (BouncyCastle) 1KB 18.676 μs 0.0366 μs 0.0306 μs 3736 B
Decrypt · ARIA-256-CBC (Managed) 1KB 21.258 μs 0.0321 μs 0.0284 μs -
Encrypt · ARIA-256-CBC (BouncyCastle) 1KB 18.359 μs 0.0685 μs 0.0572 μs 3736 B
Encrypt · ARIA-256-CBC (Managed) 1KB 21.345 μs 0.0424 μs 0.0397 μs -
Decrypt · ARIA-256-CBC (BouncyCastle) 8KB 139.550 μs 0.2432 μs 0.1899 μs 21656 B
Decrypt · ARIA-256-CBC (Managed) 8KB 168.287 μs 0.6729 μs 0.5965 μs -
Encrypt · ARIA-256-CBC (BouncyCastle) 8KB 140.725 μs 0.3938 μs 0.3491 μs 21656 B
Encrypt · ARIA-256-CBC (Managed) 8KB 168.559 μs 0.2917 μs 0.2586 μs -
Decrypt · ARIA-256-CBC (BouncyCastle) 128KB 2,275.573 μs 6.0435 μs 5.3574 μs 328856 B
Decrypt · ARIA-256-CBC (Managed) 128KB 2,691.327 μs 9.0075 μs 8.4256 μs -
Encrypt · ARIA-256-CBC (BouncyCastle) 128KB 2,247.459 μs 6.1531 μs 5.7556 μs 328856 B
Encrypt · ARIA-256-CBC (Managed) 128KB 2,704.634 μs 8.3292 μs 6.9553 μs -

Camellia-128-CBC (Japan)

Camellia is a Japanese CRYPTREC/NESSIE cipher (RFC 3713) with a Feistel structure and FL/FL⁻¹ key-dependent layers.

  • Managed: Pre-computed SP-box tables with 6 S-boxes. Zero allocation.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · Camellia-128-CBC (BouncyCastle) 128B 930.9 ns 1.39 ns 1.23 ns 576 B
Decrypt · Camellia-128-CBC (Managed) 128B 1,442.9 ns 5.45 ns 5.10 ns -
Encrypt · Camellia-128-CBC (BouncyCastle) 128B 906.0 ns 0.89 ns 0.74 ns 576 B
Encrypt · Camellia-128-CBC (Managed) 128B 1,541.0 ns 18.37 ns 17.18 ns -
Decrypt · Camellia-128-CBC (BouncyCastle) 1KB 5,819.7 ns 12.68 ns 11.86 ns 2816 B
Decrypt · Camellia-128-CBC (Managed) 1KB 10,318.2 ns 59.01 ns 55.20 ns -
Encrypt · Camellia-128-CBC (BouncyCastle) 1KB 5,935.5 ns 12.76 ns 11.93 ns 2816 B
Encrypt · Camellia-128-CBC (Managed) 1KB 10,840.3 ns 21.60 ns 20.20 ns -
Decrypt · Camellia-128-CBC (BouncyCastle) 8KB 45,077.4 ns 123.42 ns 115.45 ns 20736 B
Decrypt · Camellia-128-CBC (Managed) 8KB 81,270.2 ns 575.14 ns 537.99 ns -
Encrypt · Camellia-128-CBC (BouncyCastle) 8KB 45,478.1 ns 138.37 ns 129.43 ns 20736 B
Encrypt · Camellia-128-CBC (Managed) 8KB 85,170.7 ns 244.50 ns 228.70 ns -
Decrypt · Camellia-128-CBC (BouncyCastle) 128KB 737,510.6 ns 1,487.94 ns 1,391.82 ns 327936 B
Decrypt · Camellia-128-CBC (Managed) 128KB 1,299,903.3 ns 5,020.73 ns 4,450.74 ns -
Encrypt · Camellia-128-CBC (BouncyCastle) 128KB 719,839.6 ns 2,435.91 ns 2,278.56 ns 327936 B
Encrypt · Camellia-128-CBC (Managed) 128KB 1,377,271.3 ns 8,525.38 ns 7,974.65 ns -

Camellia-256-CBC (Japan)

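Camellia-256 uses 24 rounds (vs 18 for 128-bit) and one additional FL/FL⁻¹ layer. At 128 KiB it is ~27–33% slower than Camellia-128, in line with the 24/18 round ratio, so the extra FL/FL⁻¹ layer itself adds minimal overhead.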

Description TestDataSize Mean Error StdDev Allocated
Decrypt · Camellia-256-CBC (BouncyCastle) 128B 1.169 μs 0.0037 μs 0.0035 μs 592 B
Decrypt · Camellia-256-CBC (Managed) 128B 1.904 μs 0.0088 μs 0.0082 μs -
Encrypt · Camellia-256-CBC (BouncyCastle) 128B 1.160 μs 0.0040 μs 0.0035 μs 592 B
Encrypt · Camellia-256-CBC (Managed) 128B 2.014 μs 0.0070 μs 0.0066 μs -
Decrypt · Camellia-256-CBC (BouncyCastle) 1KB 7.787 μs 0.0386 μs 0.0361 μs 2832 B
Decrypt · Camellia-256-CBC (Managed) 1KB 13.417 μs 0.0418 μs 0.0370 μs -
Encrypt · Camellia-256-CBC (BouncyCastle) 1KB 8.160 μs 0.1230 μs 0.1027 μs 2832 B
Encrypt · Camellia-256-CBC (Managed) 1KB 14.476 μs 0.0520 μs 0.0461 μs -
Decrypt · Camellia-256-CBC (BouncyCastle) 8KB 58.380 μs 0.1671 μs 0.1396 μs 20752 B
Decrypt · Camellia-256-CBC (Managed) 8KB 107.567 μs 0.6596 μs 0.6170 μs -
Encrypt · Camellia-256-CBC (BouncyCastle) 8KB 59.039 μs 0.1537 μs 0.1362 μs 20752 B
Encrypt · Camellia-256-CBC (Managed) 8KB 114.540 μs 0.4344 μs 0.4063 μs -
Decrypt · Camellia-256-CBC (BouncyCastle) 128KB 933.643 μs 1.8411 μs 1.6321 μs 327952 B
Decrypt · Camellia-256-CBC (Managed) 128KB 1,697.288 μs 8.4801 μs 7.9323 μs -
Encrypt · Camellia-256-CBC (BouncyCastle) 128KB 935.850 μs 2.3988 μs 2.2438 μs 327952 B
Encrypt · Camellia-256-CBC (Managed) 128KB 1,830.016 μs 6.0824 μs 5.6895 μs -

Kuznyechik-CBC (Russia)

Kuznyechik (GOST R 34.12-2015) is the modern Russian cipher with a 256-bit key and 10 rounds. It replaces the older GOST 28147-89.

  • Managed: Pre-computed S-box and linear transformation tables. Zero allocation.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · Kuznyechik-CBC (Managed) 128B 384.4 μs 7.64 μs 10.95 μs -
Encrypt · Kuznyechik-CBC (Managed) 128B 377.6 μs 7.01 μs 6.22 μs -
Decrypt · Kuznyechik-CBC (Managed) 1KB 3,181.6 μs 12.86 μs 12.03 μs -
Encrypt · Kuznyechik-CBC (Managed) 1KB 2,957.0 μs 20.45 μs 18.13 μs -
Decrypt · Kuznyechik-CBC (Managed) 8KB 26,392.5 μs 41.45 μs 38.77 μs -
Encrypt · Kuznyechik-CBC (Managed) 8KB 25,439.2 μs 29.89 μs 26.50 μs -
Decrypt · Kuznyechik-CBC (Managed) 128KB 412,074.8 μs 732.89 μs 685.54 μs -
Encrypt · Kuznyechik-CBC (Managed) 128KB 404,593.0 μs 472.23 μs 418.62 μs -

Kalyna-128-CBC (Ukraine)

Kalyna (DSTU 7624:2014) is the Ukrainian national cipher, paired with the Kupyna hash family. It uses MDS matrix diffusion.

  • Managed: S-box substitution with MDS matrix multiplication. Zero allocation.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · Kalyna-128-CBC (Managed) 128B 2.250 μs 0.0008 μs 0.0007 μs -
Decrypt · Kalyna-128-CBC (BouncyCastle) 128B 2.417 μs 0.0038 μs 0.0033 μs 872 B
Encrypt · Kalyna-128-CBC (BouncyCastle) 128B 1.270 μs 0.0019 μs 0.0018 μs 872 B
Encrypt · Kalyna-128-CBC (Managed) 128B 2.037 μs 0.0020 μs 0.0016 μs -
Decrypt · Kalyna-128-CBC (BouncyCastle) 1KB 15.392 μs 0.0216 μs 0.0202 μs 872 B
Decrypt · Kalyna-128-CBC (Managed) 1KB 16.165 μs 0.0157 μs 0.0131 μs -
Encrypt · Kalyna-128-CBC (BouncyCastle) 1KB 7.156 μs 0.0095 μs 0.0084 μs 872 B
Encrypt · Kalyna-128-CBC (Managed) 1KB 14.600 μs 0.0194 μs 0.0172 μs -
Decrypt · Kalyna-128-CBC (BouncyCastle) 8KB 119.049 μs 0.0755 μs 0.0669 μs 872 B
Decrypt · Kalyna-128-CBC (Managed) 8KB 127.465 μs 0.0254 μs 0.0199 μs -
Encrypt · Kalyna-128-CBC (BouncyCastle) 8KB 54.206 μs 0.0761 μs 0.0712 μs 872 B
Encrypt · Kalyna-128-CBC (Managed) 8KB 114.945 μs 0.1831 μs 0.1623 μs -
Decrypt · Kalyna-128-CBC (BouncyCastle) 128KB 1,898.162 μs 1.5546 μs 1.4542 μs 872 B
Decrypt · Kalyna-128-CBC (Managed) 128KB 2,040.730 μs 0.7159 μs 0.5978 μs -
Encrypt · Kalyna-128-CBC (BouncyCastle) 128KB 861.432 μs 0.6880 μs 0.6099 μs 872 B
Encrypt · Kalyna-128-CBC (Managed) 128KB 1,840.886 μs 2.2577 μs 2.0014 μs -

Kalyna-256-CBC (Ukraine)

Kalyna-256 uses 14 rounds (vs 10 for 128-bit key). The same MDS-based architecture applies.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · Kalyna-256-CBC (Managed) 128B 3.098 μs 0.0014 μs 0.0012 μs -
Decrypt · Kalyna-256-CBC (BouncyCastle) 128B 3.292 μs 0.0033 μs 0.0031 μs 1112 B
Encrypt · Kalyna-256-CBC (BouncyCastle) 128B 1.706 μs 0.0020 μs 0.0018 μs 1112 B
Encrypt · Kalyna-256-CBC (Managed) 128B 2.787 μs 0.0021 μs 0.0016 μs -
Decrypt · Kalyna-256-CBC (BouncyCastle) 1KB 21.163 μs 0.0130 μs 0.0115 μs 1112 B
Decrypt · Kalyna-256-CBC (Managed) 1KB 22.254 μs 0.0094 μs 0.0079 μs -
Encrypt · Kalyna-256-CBC (BouncyCastle) 1KB 9.790 μs 0.0135 μs 0.0126 μs 1112 B
Encrypt · Kalyna-256-CBC (Managed) 1KB 20.026 μs 0.0286 μs 0.0267 μs -
Decrypt · Kalyna-256-CBC (BouncyCastle) 8KB 163.975 μs 0.0928 μs 0.0775 μs 1112 B
Decrypt · Kalyna-256-CBC (Managed) 8KB 175.451 μs 0.0991 μs 0.0828 μs -
Encrypt · Kalyna-256-CBC (BouncyCastle) 8KB 74.237 μs 0.1661 μs 0.1473 μs 1112 B
Encrypt · Kalyna-256-CBC (Managed) 8KB 156.759 μs 0.1026 μs 0.0909 μs -
Decrypt · Kalyna-256-CBC (BouncyCastle) 128KB 2,612.778 μs 2.0607 μs 1.8268 μs 1112 B
Decrypt · Kalyna-256-CBC (Managed) 128KB 2,807.778 μs 1.5034 μs 1.2554 μs -
Encrypt · Kalyna-256-CBC (BouncyCastle) 128KB 1,177.886 μs 1.7797 μs 1.4862 μs 1112 B
Encrypt · Kalyna-256-CBC (Managed) 128KB 2,522.515 μs 2.0303 μs 1.6954 μs -

SEED-CBC (Korea)

SEED is a Korean cipher (RFC 4269, KISA) with a 128-bit key and a 16-round Feistel structure. Its key-schedule constants are derived from the golden ratio.
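
A Feistel structure has the useful property that decryption reuses the encryption routine with the round keys in reverse order, so the round function never needs to be inverted. The sketch below shows this for a 16-round network over a 128-bit block split into two 64-bit halves. The round function and round keys are placeholders invented for this example; they are not SEED's F function or key schedule.

```python
import hashlib


def round_fn(half, round_key):
    """Placeholder round function (an assumption, NOT SEED's F function)."""
    return hashlib.sha256(round_key + half).digest()[:8]


def xor8(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def feistel(block, round_keys):
    """Run a balanced Feistel network over one 16-byte block."""
    if len(block) != 16:
        raise ValueError("expected a 16-byte block")
    left, right = block[:8], block[8:]
    for key in round_keys:
        left, right = right, xor8(left, round_fn(right, key))
    return right + left  # undo the final swap so decryption mirrors encryption


# Placeholder round keys; a real cipher derives these from its key schedule.
ROUND_KEYS = [bytes([i]) * 8 for i in range(16)]


def encrypt_block(block):
    return feistel(block, ROUND_KEYS)


def decrypt_block(block):
    return feistel(block, ROUND_KEYS[::-1])  # same routine, reversed key order


sample = bytes(range(16))
assert decrypt_block(encrypt_block(sample)) == sample
```

Camellia follows the same pattern, with its FL/FL⁻¹ layers inserted between groups of rounds.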

  • Managed: Pre-computed 32-bit SS-boxes (SS0–SS3). Zero allocation.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · SEED-CBC (Managed) 128B 1.316 μs 0.0142 μs 0.0126 μs -
Decrypt · SEED-CBC (BouncyCastle) 128B 1.400 μs 0.0069 μs 0.0064 μs 152 B
Encrypt · SEED-CBC (BouncyCastle) 128B 1.428 μs 0.0050 μs 0.0044 μs 152 B
Encrypt · SEED-CBC (Managed) 128B 1.439 μs 0.0052 μs 0.0049 μs -
Decrypt · SEED-CBC (Managed) 1KB 9.363 μs 0.0453 μs 0.0424 μs -
Decrypt · SEED-CBC (BouncyCastle) 1KB 9.601 μs 0.0523 μs 0.0489 μs 152 B
Encrypt · SEED-CBC (BouncyCastle) 1KB 9.960 μs 0.0510 μs 0.0477 μs 152 B
Encrypt · SEED-CBC (Managed) 1KB 10.463 μs 0.0413 μs 0.0386 μs -
Decrypt · SEED-CBC (Managed) 8KB 73.523 μs 0.2633 μs 0.2463 μs -
Decrypt · SEED-CBC (BouncyCastle) 8KB 75.218 μs 0.3217 μs 0.3009 μs 152 B
Encrypt · SEED-CBC (BouncyCastle) 8KB 78.222 μs 0.4169 μs 0.3899 μs 152 B
Encrypt · SEED-CBC (Managed) 8KB 82.674 μs 0.3556 μs 0.3327 μs -
Decrypt · SEED-CBC (Managed) 128KB 1,178.190 μs 5.9217 μs 5.5392 μs -
Decrypt · SEED-CBC (BouncyCastle) 128KB 1,200.086 μs 5.2922 μs 4.9504 μs 152 B
Encrypt · SEED-CBC (BouncyCastle) 128KB 1,250.964 μs 5.3121 μs 4.9690 μs 152 B
Encrypt · SEED-CBC (Managed) 128KB 1,324.827 μs 6.9881 μs 6.5367 μs -

Allocation Summary

All CryptoHives cipher implementations achieve zero heap allocation for both encrypt and decrypt operations across all payload sizes. This is critical for high-throughput scenarios such as network packet processing, where GC pressure directly impacts tail latency.

Implementation Allocation Notes
CryptoHives (all variants) 0 B All tiers (Managed, ArmAes, ArmAes+ArmPmull, Neon) are zero-allocation at all payload sizes
OS (.NET) — GCM / ChaCha20-Poly1305 0 B OS AEAD implementations are zero-allocation
OS (.NET) — CBC 72 B Fixed P/Invoke marshalling overhead per call, independent of payload size
BouncyCastle — AES-CBC 832–1,024 B Fixed per-call allocation (832 B for AES-128, 1,024 B for AES-256)
BouncyCastle — GCM 1,520–1,744 B Fixed per-call allocation (1,520 B for AES-128 encrypt, 1,744 B for AES-256 decrypt)
BouncyCastle — CCM 2,424–2,848 B Fixed per-call allocation (2,424 B for AES-128 decrypt, 2,848 B for AES-256 encrypt)
BouncyCastle — ChaCha20-Poly1305 336–416 B 336 B for encrypt, 416 B for decrypt; independent of payload size
BouncyCastle — ChaCha20 96 B Fixed per-call allocation
NaCl.Core — ChaCha20 24 B Small fixed allocation
NaCl.Core — ChaCha20-Poly1305 / XChaCha20 48–72 B Small allocation, varies by payload size
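
To put the per-call figures in perspective, an allocation size and a mean time together give the rate at which garbage is produced. The sketch below assumes a single thread issuing calls back to back, which is an idealization, and uses the BouncyCastle AES-128-CCM encrypt row at 128 B (2,464 B allocated, 1,384.6 ns mean) as input; `allocation_rate` is a hypothetical helper.

```python
def allocation_rate(bytes_per_call, mean_ns):
    """Return (calls per second, bytes allocated per second) for back-to-back calls."""
    calls_per_s = 1e9 / mean_ns
    return calls_per_s, bytes_per_call * calls_per_s


# BouncyCastle AES-128-CCM encrypt, 128 B payload (values from the table above).
calls, rate = allocation_rate(2464, 1384.6)
print(f"{calls:,.0f} calls/s -> {rate / 1e9:.2f} GB/s of short-lived allocations")
```

For this row the result is roughly 1.78 GB/s of short-lived allocations per thread, which a zero-allocation path avoids entirely.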