macOS Arm64 Apple M4 Cipher Benchmarks

Machine Profile

Machine Specification

The benchmarks were run on the following machine:

BenchmarkDotNet v0.15.8, macOS Tahoe 26.4.1 (25E253) [Darwin 25.4.0]
Apple M4, 1 CPU, 10 logical and 10 physical cores
.NET SDK 10.0.201
[Host]    : .NET 10.0.5 (10.0.5, 10.0.526.15411), Arm64 RyuJIT armv8.0-a
.NET 10.0 : .NET 10.0.5 (10.0.5, 10.0.526.15411), Arm64 RyuJIT armv8.0-a
Method=TryComputeHash  Job=.NET 10.0  Runtime=.NET 10.0
Toolchain=net10.0

Note: Results are machine-specific and may vary between systems. Run benchmarks locally for your specific hardware.

BenchmarkDotNet measurements for all cipher algorithm implementations in CryptoHives.Foundation.Security.Cryptography. Each algorithm is benchmarked across representative payload sizes (17 bytes through 128 KiB) to capture both latency and throughput characteristics.

Implementation Variants

Each cipher family exposes multiple acceleration tiers. The runtime automatically selects the fastest tier supported by the host CPU via SimdSupport detection. Callers can also force a specific tier through the Create(SimdSupport) factory for testing or compatibility.
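
The selection rule can be pictured as a priority list walked from fastest to slowest. The following is a minimal Python sketch, not the library's code: the `SimdSupport` member names, the `AES_GCM_TIERS` table, and `select_tier` are hypothetical stand-ins for whatever the .NET implementation does internally.

```python
from enum import Flag


class SimdSupport(Flag):
    """Hypothetical capability flags; the real enum's members may differ."""
    NONE = 0
    ARM_AES = 1
    ARM_PMULL = 2
    NEON = 4


# Tiers ordered fastest first, each paired with the capabilities it requires.
AES_GCM_TIERS = [
    ("ArmAes+ArmPmull", SimdSupport.ARM_AES | SimdSupport.ARM_PMULL),
    ("Managed", SimdSupport.NONE),
]


def select_tier(tiers, host, forced=None):
    """Return the first tier the host supports.

    When `forced` names a tier, return that tier instead, but only if the
    host actually provides the capabilities it requires.
    """
    for name, required in tiers:
        supported = (required & host) == required
        if forced is not None:
            if name != forced:
                continue
            if not supported:
                raise ValueError(f"{forced} is not supported on this host")
            return name
        if supported:
            return name
    raise ValueError("no matching tier")
```

Forcing a slower tier (for example `forced="Managed"`) is the kind of override the factory is described as offering for testing or compatibility.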

AES Family

| Variant | Instructions | .NET Target | When Selected | Description |
| --- | --- | --- | --- | --- |
| Managed | Scalar | All | No ARM Crypto support | T-table AES using scalar uint arithmetic. Fully portable, zero-allocation. In the tables below it is roughly 3× (CBC encrypt) to 35× (bulk CBC decrypt) slower than ArmAes, depending on mode and payload size. |
| ArmAes | AES (ARM Crypto Ext.) | .NET 8+ | ArmBase.IsSupported | Hardware AES round instructions (AESD, AESE, AESMC, AESIMC). For CBC, uses 8-block interleaved decrypt for maximum instruction-level parallelism: all 8 ciphertext blocks are decrypted simultaneously via parallel AESD dispatch. For GCM/CCM, accelerates counter-mode encryption and CBC-MAC. CBC decrypt is ~8.7× faster than OS at 128 B; at bulk sizes Apple CommonCrypto leads via Apple Silicon–specific AES pipelining. |
| ArmAes+ArmPmull | AES + PMULL (ARM Crypto Ext.) | .NET 8+ | AdvSimd.Arm64.IsSupported | Adds carry-less polynomial multiplication (PMULL/PMULL2) for hardware-accelerated GHASH over GF(2¹²⁸). PMULL operates on 64-bit polynomial operands to produce 128-bit products; PMULL2 reads from the upper halves of 128-bit NEON registers (a free lane-select requiring no additional instruction). Uses the same 8-block stitched AES+GHASH pipeline as the x86 PClMul path. Modular reduction uses a 2-PMULL SymCrypt-style MODREDUCE. Pre-computes Karatsuba cross-term halves for H¹–H⁸ powers. Encrypt: ~120× faster than OS at 17 B; ~14× at 128 B (OS CommonCrypto incurs ~8 μs per-call overhead at these sizes). OS leads at ≥8 KiB due to Apple Silicon–specific bulk AES acceleration. |

ChaCha20 Family

| Variant | Instructions | .NET Target | When Selected | Description |
| --- | --- | --- | --- | --- |
| Managed | Scalar | All | No NEON support | Quarter-round operations using scalar uint arithmetic. Fully portable. ~4× slower than Neon at all payload sizes. |
| Neon | AdvSIMD (NEON) | .NET 8+ | AdvSimd.IsSupported | Maps the 4×4 ChaCha state to four Vector128<uint> rows. Uses ARM NEON shift-left, shift-right, and byte-table permute instructions for the 16-bit, 12-bit, 8-bit, and 7-bit rotations. Diagonal rounds use AdvSimd.ExtractVector128 to rotate the second, third, and fourth rows by one, two, and three elements. Processes one 64-byte keystream block per iteration. ~4× faster than Managed; faster than BouncyCastle at every measured size (128 B through 128 KiB). For ChaCha20-Poly1305, OS leads from ~8 KiB. |

When to Use Each Variant

  • Small messages (≤256 B): AES-GCM with ArmAes+ArmPmull eliminates the ~8 μs CommonCrypto per-call overhead entirely — ~120× faster than OS at 17 B encrypt and ~14× at 128 B. ChaCha20-Poly1305 NEON is ~5× faster than OS at 128 B.
  • Medium messages (256 B–4 KB): ArmAes+ArmPmull leads through ~1 KiB. ChaCha20-Poly1305 NEON remains competitive at 1 KiB (~1.4× faster than OS). This range covers QUIC (~1.4 KB), WireGuard (~1.4 KB), and IPsec packets.
  • Large messages (8 KB–128 KB): Apple CommonCrypto dominates. For AES-GCM, OS is ~2× faster at 8 KiB and ~5× faster at 128 KiB; for ChaCha20-Poly1305, OS is ~1.54× faster. This is attributed to Apple Silicon–specific AES/PMULL micro-architectural pipelining that .NET's current ARMv8 paths do not yet fully exploit. This range covers TLS records (1–16 KB) and OPC UA chunks (8 KB default).
  • No hardware AES: Use ChaCha20-Poly1305 NEON — it outperforms Managed AES-GCM by 3–10× depending on payload size and is always zero-allocation.
  • IoT / constrained devices: AES-CCM with ArmAes provides ~4× speedup over BouncyCastle at 128 KiB. Supports variable nonce (7–13 bytes) and tag sizes.

Highlights

| Family | Leader | Key Insight |
| --- | --- | --- |
| ChaCha20 | Neon | NEON ~4× faster than Managed; faster than BouncyCastle at every measured size (128 B through 128 KiB); zero allocation |
| ChaCha20-Poly1305 | Neon | ~5× faster than OS at 128 B; OS leads at ≥8 KiB; Neon on par with BouncyCastle at 128 KiB; zero allocation |
| XChaCha20-Poly1305 | Neon | ~3.3× faster than Managed at 128 KiB; zero allocation |
| AES-CBC | ArmAes | Decrypt ~8.7× faster than OS at 128 B; OS leads at ≥8 KiB (Apple Silicon bulk path); zero allocation |
| AES-GCM | ArmAes+ArmPmull | ~120× faster than OS encrypt at 17 B; ~14× at 128 B; OS leads at ≥8 KiB; 8-block stitched AES+GHASH pipeline |
| AES-CCM | ArmAes | ~4× faster than BouncyCastle at 128 KiB; zero allocation; no OS adapter available |

Stream Ciphers

ChaCha20

ChaCha20 is a stream cipher designed by Daniel J. Bernstein. Two acceleration tiers are available on ARM:

  • Neon: Single-block processing that maps the 4×4 ChaCha state matrix to four Vector128<uint> rows. Uses ARM NEON vshl/vsri (shift-and-insert) and vtbl (byte-table permute) instructions for the four rotation widths (16-bit, 12-bit, 8-bit, 7-bit). Diagonal rounds use AdvSimd.ExtractVector128 to rotate the second, third, and fourth rows by one, two, and three elements. In the table below, 128 KiB takes ~888 µs, which is ~148 MB/s and ~1.13× faster than BouncyCastle.
  • Managed: Scalar uint quarter-round arithmetic. Fully portable across all .NET targets. ~4× slower than Neon at 128 KiB.
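
To make the row layout concrete, here is a small Python sketch of the ChaCha20 quarter-round and of a double round computed two ways: with the standard index pattern, and with the row-rotation trick that the Neon tier is described as using. It is an illustration only; plain lists stand in for `Vector128<uint>` rows and `rotate_row` stands in for the extract-based lane rotation.

```python
MASK = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def quarter_round(a, b, c, d):
    """ChaCha quarter-round with rotations of 16, 12, 8, and 7 bits."""
    a = (a + b) & MASK; d = rotl32(d ^ a, 16)
    c = (c + d) & MASK; b = rotl32(b ^ c, 12)
    a = (a + b) & MASK; d = rotl32(d ^ a, 8)
    c = (c + d) & MASK; b = rotl32(b ^ c, 7)
    return a, b, c, d


def rotate_row(row, k):
    """Rotate a 4-word row left by k lanes."""
    return row[k:] + row[:k]


def column_round(rows):
    """Apply the quarter-round to all four columns, lane by lane."""
    cols = [quarter_round(rows[0][i], rows[1][i], rows[2][i], rows[3][i])
            for i in range(4)]
    return [[cols[i][r] for i in range(4)] for r in range(4)]


def double_round_rows(rows):
    """Column round, then a diagonal round done by rotating rows 1..3."""
    rows = column_round(rows)
    rows = [rows[0], rotate_row(rows[1], 1), rotate_row(rows[2], 2), rotate_row(rows[3], 3)]
    rows = column_round(rows)
    # Rotate back so the state returns to its normal layout.
    return [rows[0], rotate_row(rows[1], 3), rotate_row(rows[2], 2), rotate_row(rows[3], 1)]


def double_round_indexed(state):
    """Reference double round using explicit state indices."""
    s = list(state)
    pattern = [(0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
               (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)]
    for a, b, c, d in pattern:
        s[a], s[b], s[c], s[d] = quarter_round(s[a], s[b], s[c], s[d])
    return s
```

Because the diagonal round becomes another column round after the rows are rotated, a SIMD implementation can reuse the same four-lane code for both halves of the double round.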

Key observations:

  • Neon is the fastest at all sizes; ~1.66× faster than BouncyCastle at 128 B, ~1.25× at 1 KiB, and ~1.13× at 128 KiB
  • BouncyCastle allocates 96 B per call; NaCl.Core allocates 24 B per call
  • Managed and Neon paths are zero-allocation

Description TestDataSize Mean Error StdDev Median Allocated
Decrypt · ChaCha20 (CryptoHives-Neon) 128B 885.6 ns 0.98 ns 0.91 ns 885.4 ns -
Decrypt · ChaCha20 (BouncyCastle) 128B 1,472.0 ns 28.32 ns 26.49 ns 1,478.2 ns 96 B
Decrypt · ChaCha20 (NaCl.Core) 128B 2,712.0 ns 1.87 ns 1.56 ns 2,711.6 ns 24 B
Decrypt · ChaCha20 (CryptoHives-Scalar) 128B 3,487.6 ns 1.19 ns 0.99 ns 3,487.3 ns -
Encrypt · ChaCha20 (CryptoHives-Neon) 128B 885.6 ns 0.56 ns 0.52 ns 885.5 ns -
Encrypt · ChaCha20 (BouncyCastle) 128B 1,465.2 ns 20.58 ns 17.19 ns 1,467.5 ns 96 B
Encrypt · ChaCha20 (NaCl.Core) 128B 2,713.6 ns 3.25 ns 3.04 ns 2,712.2 ns 24 B
Encrypt · ChaCha20 (CryptoHives-Scalar) 128B 3,479.6 ns 1.46 ns 1.29 ns 3,479.4 ns -
Decrypt · ChaCha20 (CryptoHives-Neon) 1KB 6,969.0 ns 6.62 ns 5.87 ns 6,967.8 ns -
Decrypt · ChaCha20 (BouncyCastle) 1KB 8,688.3 ns 173.89 ns 374.31 ns 8,979.5 ns 96 B
Decrypt · ChaCha20 (NaCl.Core) 1KB 15,278.2 ns 1.60 ns 1.25 ns 15,277.8 ns 24 B
Decrypt · ChaCha20 (CryptoHives-Scalar) 1KB 27,515.6 ns 23.72 ns 19.81 ns 27,508.5 ns -
Encrypt · ChaCha20 (CryptoHives-Neon) 1KB 6,971.7 ns 5.91 ns 5.24 ns 6,972.1 ns -
Encrypt · ChaCha20 (BouncyCastle) 1KB 8,747.7 ns 174.74 ns 352.98 ns 8,948.1 ns 96 B
Encrypt · ChaCha20 (NaCl.Core) 1KB 15,280.2 ns 3.34 ns 2.96 ns 15,279.7 ns 24 B
Encrypt · ChaCha20 (CryptoHives-Scalar) 1KB 27,505.9 ns 11.81 ns 10.47 ns 27,502.4 ns -
Decrypt · ChaCha20 (CryptoHives-Neon) 8KB 55,596.4 ns 57.71 ns 48.19 ns 55,597.1 ns -
Decrypt · ChaCha20 (BouncyCastle) 8KB 63,145.6 ns 25.00 ns 22.17 ns 63,150.0 ns 96 B
Decrypt · ChaCha20 (NaCl.Core) 8KB 116,123.6 ns 68.27 ns 57.01 ns 116,113.7 ns 24 B
Decrypt · ChaCha20 (CryptoHives-Scalar) 8KB 219,494.8 ns 59.29 ns 52.56 ns 219,487.6 ns -
Encrypt · ChaCha20 (CryptoHives-Neon) 8KB 55,590.6 ns 33.51 ns 27.99 ns 55,579.6 ns -
Encrypt · ChaCha20 (BouncyCastle) 8KB 64,638.2 ns 1,286.61 ns 1,885.90 ns 63,799.6 ns 96 B
Encrypt · ChaCha20 (NaCl.Core) 8KB 116,157.5 ns 32.48 ns 28.79 ns 116,150.5 ns 24 B
Encrypt · ChaCha20 (CryptoHives-Scalar) 8KB 219,433.8 ns 97.01 ns 81.01 ns 219,461.3 ns -
Decrypt · ChaCha20 (CryptoHives-Neon) 128KB 888,142.6 ns 290.36 ns 226.69 ns 888,114.3 ns -
Decrypt · ChaCha20 (BouncyCastle) 128KB 1,007,533.6 ns 2,335.69 ns 2,184.80 ns 1,008,171.6 ns 96 B
Decrypt · ChaCha20 (NaCl.Core) 128KB 1,836,344.0 ns 880.80 ns 735.51 ns 1,836,603.8 ns 24 B
Decrypt · ChaCha20 (CryptoHives-Scalar) 128KB 3,512,558.6 ns 2,277.03 ns 2,129.93 ns 3,511,660.2 ns -
Encrypt · ChaCha20 (CryptoHives-Neon) 128KB 888,519.1 ns 795.00 ns 743.64 ns 888,202.7 ns -
Encrypt · ChaCha20 (BouncyCastle) 128KB 1,007,309.4 ns 1,247.26 ns 1,166.69 ns 1,007,598.7 ns 96 B
Encrypt · ChaCha20 (NaCl.Core) 128KB 1,839,740.1 ns 625.78 ns 522.56 ns 1,839,583.2 ns 24 B
Encrypt · ChaCha20 (CryptoHives-Scalar) 128KB 3,511,311.3 ns 1,720.62 ns 1,525.28 ns 3,510,710.0 ns -
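
The Mean column converts to throughput by dividing payload bytes by elapsed time. The snippet below is a small sketch that applies this to the 128 KiB decrypt rows above; `throughput_mb_per_s` is a helper name introduced here for illustration, and MB means 10⁶ bytes.

```python
def throughput_mb_per_s(payload_bytes, mean_ns):
    """Convert a payload size and a mean time in nanoseconds to MB/s."""
    seconds = mean_ns * 1e-9
    return payload_bytes / seconds / 1e6


PAYLOAD = 128 * 1024  # the 128KB rows

# Mean values (ns) copied from the 128KB decrypt rows of the table above.
decrypt_128k_ns = {
    "CryptoHives-Neon": 888_142.6,
    "BouncyCastle": 1_007_533.6,
    "NaCl.Core": 1_836_344.0,
    "CryptoHives-Scalar": 3_512_558.6,
}

throughput = {name: throughput_mb_per_s(PAYLOAD, ns)
              for name, ns in decrypt_128k_ns.items()}

for name, mbps in throughput.items():
    print(f"{name:20s} {mbps:7.1f} MB/s")
```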

Block Ciphers

AES-128-CBC

AES-CBC (Cipher Block Chaining) is the most widely deployed AES mode. Two acceleration tiers are available on Apple M4:

  • ArmAes: Uses ARM Cryptography Extension AESD/AESE/AESMC/AESIMC instructions. Decrypt uses 8-block interleaving — 8 ciphertext blocks are loaded and decrypted simultaneously via parallel AESD dispatch. Each block decrypts independently, requiring only the preceding ciphertext block as an XOR mask (10 rounds × 8 blocks = 80 AESD instructions in flight). Encrypt remains serial because each plaintext block must be XORed with the previous ciphertext before the next AESE can proceed.
  • Managed: T-table AES using four 256-entry lookup tables per round. Fully portable, zero-allocation. Comparable to BouncyCastle at large sizes.
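
The asymmetry between CBC encrypt and decrypt comes from the data dependencies, which the following Python sketch tries to show. It deliberately uses a toy, insecure block function in place of AES (`toy_encrypt_block` and `toy_decrypt_block` are made up for this example), so only the chaining structure is meaningful.

```python
BLOCK = 16


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(block, key):
    """Stand-in for AES: XOR with the key, then rotate bytes. NOT secure."""
    x = xor_bytes(block, key)
    return x[1:] + x[:1]


def toy_decrypt_block(block, key):
    """Inverse of toy_encrypt_block."""
    x = block[-1:] + block[:-1]
    return xor_bytes(x, key)


def cbc_encrypt(plaintext, key, iv):
    """Serial by construction: block i needs ciphertext block i-1."""
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(xor_bytes(plaintext[i:i + BLOCK], prev), key)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt_batched(ciphertext, key, iv, lanes=8):
    """Decrypt `lanes` blocks at a time; every block in a batch is independent."""
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    masks = [iv] + blocks[:-1]  # XOR mask for block i is ciphertext block i-1
    out = []
    for start in range(0, len(blocks), lanes):
        batch = blocks[start:start + lanes]
        # These block decryptions share no state, so hardware can overlap them.
        decrypted = [toy_decrypt_block(c, key) for c in batch]
        out.extend(xor_bytes(d, m)
                   for d, m in zip(decrypted, masks[start:start + lanes]))
    return b"".join(out)
```

In `cbc_decrypt_batched` all masks are known before any block is decrypted, which is what makes an 8-block interleave possible; `cbc_encrypt` has no equivalent because each input depends on the previous output.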

Key observations:

  • ArmAes Decrypt: ~8.7× faster than OS at 128 B and ~2.6× at 1 KiB; near parity at 8 KiB (OS ~6% faster); OS ~1.5× faster at 128 KiB (Apple Silicon uses a wider AES pipeline at bulk sizes)
  • ArmAes Encrypt: ~1.5× faster than OS at 128 B; OS leads from 1 KiB (CBC encrypt is inherently serial; CommonCrypto uses NEON-assisted interleaving for partial parallelism)
  • Managed: Zero-allocation T-table AES; comparable to BouncyCastle at large sizes
  • OS: Allocates 72 B per call (P/Invoke marshalling overhead)

Description TestDataSize Mean Error StdDev Allocated
Decrypt · AES-128-CBC (CryptoHives-ARM-AES) 128B 22.59 ns 0.039 ns 0.037 ns -
Decrypt · AES-128-CBC (OS) 128B 197.10 ns 0.822 ns 0.729 ns 72 B
Decrypt · AES-128-CBC (CryptoHives-Scalar) 128B 385.61 ns 0.187 ns 0.175 ns -
Decrypt · AES-128-CBC (BouncyCastle) 128B 615.04 ns 0.511 ns 0.478 ns 832 B
Encrypt · AES-128-CBC (CryptoHives-ARM-AES) 128B 136.85 ns 0.727 ns 0.680 ns -
Encrypt · AES-128-CBC (OS) 128B 202.67 ns 0.612 ns 0.542 ns 72 B
Encrypt · AES-128-CBC (CryptoHives-Scalar) 128B 436.00 ns 0.121 ns 0.107 ns -
Encrypt · AES-128-CBC (BouncyCastle) 128B 574.91 ns 0.358 ns 0.335 ns 832 B
Decrypt · AES-128-CBC (CryptoHives-ARM-AES) 1KB 88.84 ns 0.117 ns 0.098 ns -
Decrypt · AES-128-CBC (OS) 1KB 234.61 ns 0.825 ns 0.772 ns 72 B
Decrypt · AES-128-CBC (CryptoHives-Scalar) 1KB 2,705.18 ns 0.601 ns 0.533 ns -
Decrypt · AES-128-CBC (BouncyCastle) 1KB 3,382.96 ns 4.409 ns 4.124 ns 832 B
Encrypt · AES-128-CBC (OS) 1KB 557.71 ns 2.487 ns 2.326 ns 72 B
Encrypt · AES-128-CBC (CryptoHives-ARM-AES) 1KB 984.41 ns 3.148 ns 2.945 ns -
Encrypt · AES-128-CBC (CryptoHives-Scalar) 1KB 3,133.08 ns 0.250 ns 0.195 ns -
Encrypt · AES-128-CBC (BouncyCastle) 1KB 3,271.53 ns 0.931 ns 0.826 ns 832 B
Decrypt · AES-128-CBC (OS) 8KB 591.12 ns 3.471 ns 3.247 ns 72 B
Decrypt · AES-128-CBC (CryptoHives-ARM-AES) 8KB 627.20 ns 1.189 ns 1.112 ns -
Decrypt · AES-128-CBC (CryptoHives-Scalar) 8KB 21,309.84 ns 33.386 ns 29.596 ns -
Decrypt · AES-128-CBC (BouncyCastle) 8KB 25,321.70 ns 43.447 ns 40.641 ns 832 B
Encrypt · AES-128-CBC (OS) 8KB 3,284.32 ns 8.210 ns 7.679 ns 72 B
Encrypt · AES-128-CBC (CryptoHives-ARM-AES) 8KB 9,902.54 ns 49.818 ns 46.600 ns -
Encrypt · AES-128-CBC (BouncyCastle) 8KB 24,664.10 ns 3.629 ns 3.217 ns 832 B
Encrypt · AES-128-CBC (CryptoHives-Scalar) 8KB 24,696.69 ns 3.173 ns 2.968 ns -
Decrypt · AES-128-CBC (OS) 128KB 6,686.33 ns 60.404 ns 53.546 ns 72 B
Decrypt · AES-128-CBC (CryptoHives-ARM-AES) 128KB 9,844.99 ns 4.951 ns 4.631 ns -
Decrypt · AES-128-CBC (CryptoHives-Scalar) 128KB 341,916.59 ns 89.519 ns 83.736 ns -
Decrypt · AES-128-CBC (BouncyCastle) 128KB 402,781.64 ns 1,109.311 ns 983.375 ns 832 B
Encrypt · AES-128-CBC (OS) 128KB 50,596.49 ns 26.602 ns 23.582 ns 72 B
Encrypt · AES-128-CBC (CryptoHives-ARM-AES) 128KB 123,352.94 ns 940.343 ns 879.597 ns -
Encrypt · AES-128-CBC (BouncyCastle) 128KB 394,324.31 ns 81.897 ns 68.387 ns 832 B
Encrypt · AES-128-CBC (CryptoHives-Scalar) 128KB 394,511.75 ns 103.156 ns 96.492 ns -
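
One way to read the decrypt rows is as a fixed per-call cost plus a per-byte cost. The sketch below fits a straight line through the 128 B and 128 KiB means for the OS and ArmAes decrypt paths and solves for the payload size where the two lines cross. The two-point linear model is an assumption made for illustration, not something the benchmark itself reports.

```python
def fit_line(small, large):
    """Fit time = fixed + per_byte * n through two (bytes, mean_ns) points."""
    (n1, t1), (n2, t2) = small, large
    per_byte = (t2 - t1) / (n2 - n1)
    fixed = t1 - per_byte * n1
    return fixed, per_byte


# Decrypt means (ns) from the 128B and 128KB rows of the table above.
os_fixed, os_per_byte = fit_line((128, 197.10), (128 * 1024, 6_686.33))
arm_fixed, arm_per_byte = fit_line((128, 22.59), (128 * 1024, 9_844.99))

# Payload size at which the two modelled costs are equal.
crossover_bytes = (os_fixed - arm_fixed) / (arm_per_byte - os_per_byte)

print(f"OS:     {os_fixed:6.1f} ns fixed + {os_per_byte:.4f} ns/byte")
print(f"ArmAes: {arm_fixed:6.1f} ns fixed + {arm_per_byte:.4f} ns/byte")
print(f"modelled crossover near {crossover_bytes:.0f} bytes")
```

Under this model the OS path carries a much larger fixed cost but a lower per-byte cost, which is consistent with ArmAes winning at small payloads and the OS pulling ahead around 8 KiB.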

AES-256-CBC

AES-256-CBC uses 14 rounds (vs 10 for AES-128), adding ~25–30% overhead. The same 8-block interleaved decrypt and serial encrypt architecture applies via ArmAes. Decrypt is ~9.1× faster than OS at 128 B; OS leads from ~8 KiB (by ~5% at 8 KiB and ~1.4× at 128 KiB). Encrypt is slower than OS from 1 KiB (serial CBC encrypt bottleneck on Apple Silicon).

Description TestDataSize Mean Error StdDev Allocated
Decrypt · AES-256-CBC (CryptoHives-ARM-AES) 128B 24.89 ns 0.022 ns 0.018 ns -
Decrypt · AES-256-CBC (OS) 128B 226.71 ns 1.738 ns 1.452 ns 72 B
Decrypt · AES-256-CBC (CryptoHives-Scalar) 128B 519.76 ns 0.421 ns 0.328 ns -
Decrypt · AES-256-CBC (BouncyCastle) 128B 796.15 ns 2.271 ns 1.773 ns 1024 B
Encrypt · AES-256-CBC (CryptoHives-ARM-AES) 128B 152.05 ns 0.659 ns 0.617 ns -
Encrypt · AES-256-CBC (OS) 128B 254.59 ns 0.770 ns 0.683 ns 72 B
Encrypt · AES-256-CBC (CryptoHives-Scalar) 128B 569.41 ns 0.174 ns 0.162 ns -
Encrypt · AES-256-CBC (BouncyCastle) 128B 739.33 ns 0.335 ns 0.261 ns 1024 B
Decrypt · AES-256-CBC (CryptoHives-ARM-AES) 1KB 103.67 ns 1.002 ns 0.888 ns -
Decrypt · AES-256-CBC (OS) 1KB 277.33 ns 1.808 ns 1.510 ns 72 B
Decrypt · AES-256-CBC (CryptoHives-Scalar) 1KB 3,665.74 ns 0.654 ns 0.546 ns -
Decrypt · AES-256-CBC (BouncyCastle) 1KB 4,431.11 ns 4.299 ns 3.811 ns 1024 B
Encrypt · AES-256-CBC (OS) 1KB 725.48 ns 6.036 ns 5.041 ns 72 B
Encrypt · AES-256-CBC (CryptoHives-ARM-AES) 1KB 1,111.63 ns 1.337 ns 1.250 ns -
Encrypt · AES-256-CBC (CryptoHives-Scalar) 1KB 4,080.11 ns 6.491 ns 5.754 ns -
Encrypt · AES-256-CBC (BouncyCastle) 1KB 4,278.77 ns 1.812 ns 1.415 ns 1024 B
Decrypt · AES-256-CBC (OS) 8KB 718.94 ns 5.224 ns 4.631 ns 72 B
Decrypt · AES-256-CBC (CryptoHives-ARM-AES) 8KB 757.18 ns 7.508 ns 6.656 ns -
Decrypt · AES-256-CBC (CryptoHives-Scalar) 8KB 28,852.85 ns 11.885 ns 9.279 ns -
Decrypt · AES-256-CBC (BouncyCastle) 8KB 33,210.94 ns 35.105 ns 32.837 ns 1024 B
Encrypt · AES-256-CBC (OS) 8KB 4,427.06 ns 2.474 ns 2.193 ns 72 B
Encrypt · AES-256-CBC (CryptoHives-ARM-AES) 8KB 8,449.00 ns 125.718 ns 104.980 ns -
Encrypt · AES-256-CBC (CryptoHives-Scalar) 8KB 32,271.16 ns 15.984 ns 14.951 ns -
Encrypt · AES-256-CBC (BouncyCastle) 8KB 32,441.36 ns 9.565 ns 8.479 ns 1024 B
Decrypt · AES-256-CBC (OS) 128KB 8,427.32 ns 66.965 ns 55.919 ns 72 B
Decrypt · AES-256-CBC (CryptoHives-ARM-AES) 128KB 12,018.59 ns 19.040 ns 14.865 ns -
Decrypt · AES-256-CBC (CryptoHives-Scalar) 128KB 460,842.07 ns 2,400.630 ns 2,004.635 ns -
Decrypt · AES-256-CBC (BouncyCastle) 128KB 527,934.17 ns 2,460.032 ns 2,054.238 ns 1024 B
Encrypt · AES-256-CBC (OS) 128KB 69,445.56 ns 565.076 ns 528.573 ns 72 B
Encrypt · AES-256-CBC (CryptoHives-ARM-AES) 128KB 140,490.51 ns 1,373.498 ns 1,146.933 ns -
Encrypt · AES-256-CBC (CryptoHives-Scalar) 128KB 515,748.26 ns 457.872 ns 357.476 ns -
Encrypt · AES-256-CBC (BouncyCastle) 128KB 517,875.88 ns 244.278 ns 203.984 ns 1024 B
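
The round counts follow the AES rule Nr = Nk + 6, where Nk is the key length in 32-bit words. The sketch below computes the round-count ratio and compares it with the measured AES-256 to AES-128 ratios at 128 KiB; `aes_rounds` is a helper introduced for this example, and the means are copied from the two CBC tables.

```python
def aes_rounds(key_bits):
    """Number of AES rounds for a key size: Nr = Nk + 6, Nk = key_bits / 32."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192, or 256 bits")
    return key_bits // 32 + 6


round_ratio = aes_rounds(256) / aes_rounds(128)

# (AES-256 mean ns, AES-128 mean ns) from the 128KB rows of the CBC tables.
measured_128k = {
    "ArmAes decrypt": (12_018.59, 9_844.99),
    "Scalar decrypt": (460_842.07, 341_916.59),
    "Scalar encrypt": (515_748.26, 394_511.75),
}

measured_ratio = {name: t256 / t128 for name, (t256, t128) in measured_128k.items()}

print(f"round-count ratio: {round_ratio:.2f}")
for name, ratio in measured_ratio.items():
    print(f"{name:15s} measured ratio: {ratio:.2f}")
```

The measured ratios stay below the pure round-count ratio because part of each call (loads, stores, CBC XORs, call overhead) does not scale with the number of rounds.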

AEAD Ciphers (Authenticated Encryption)

Authenticated Encryption with Associated Data (AEAD) ciphers provide both confidentiality and authenticity in a single operation. All CryptoHives AEAD implementations are zero-allocation.

AES-128-GCM

AES-GCM combines counter-mode AES encryption (GCTR) with GHASH polynomial authentication over GF(2¹²⁸). Two acceleration tiers are available on Apple M4:

  • ArmAes+ArmPmull (.NET 8+): Uses ARM Cryptography Extension AESE/AESMC for counter-mode encryption (GCM only needs the forward cipher, so AESD/AESIMC are not used) and PMULL/PMULL2 for GHASH polynomial multiplication. PMULL operates on 64-bit polynomial operands to produce 128-bit products; PMULL2 reads from the upper halves of 128-bit NEON registers (a free lane-select requiring no additional instruction). Uses an 8-block stitched loop that interleaves AES rounds with lagged GHASH of the previous 8 blocks. Modular reduction uses a 2-PMULL SymCrypt-style MODREDUCE. Pre-computes Karatsuba cross-term halves for H¹–H⁸ powers. Small payloads (≤8 blocks) use the non-stitched path. ~120× faster than OS encrypt at 17 B; ~14× at 128 B (OS CommonCrypto incurs ~8 μs per-call overhead for small payloads). At bulk sizes (≥8 KiB), Apple CommonCrypto leads, likely due to Apple Silicon–specific AES pipelining not accessible via the .NET ARM intrinsics layer.
  • Managed: Scalar T-table AES with 4-bit Shoup table GHASH (16-entry reduction table, byte-by-byte multiplication). Fully portable, zero-allocation.
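
The Karatsuba remark can be illustrated with a short Python sketch. `clmul64` models what a single PMULL computes (a 64×64 carry-less multiply giving a 128-bit product), and the two 128-bit multiplies show how Karatsuba replaces four such products with three. This covers only the multiplication step; the GF(2¹²⁸) reduction and GCM's bit ordering are left out, and all function names are chosen for this example.

```python
MASK64 = (1 << 64) - 1


def clmul64(a, b):
    """Carry-less 64x64 -> 128-bit multiply (polynomial product over GF(2))."""
    assert 0 <= a <= MASK64 and 0 <= b <= MASK64
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def clmul128_schoolbook(x, y):
    """128x128 carry-less multiply using four 64-bit products."""
    x0, x1 = x & MASK64, x >> 64
    y0, y1 = y & MASK64, y >> 64
    lo = clmul64(x0, y0)
    mid = clmul64(x0, y1) ^ clmul64(x1, y0)
    hi = clmul64(x1, y1)
    return (hi << 128) ^ (mid << 64) ^ lo


def clmul128_karatsuba(x, y):
    """Same product using three 64-bit multiplies.

    The cross term is (x0 ^ x1) * (y0 ^ y1) ^ lo ^ hi. When one operand is a
    fixed hash-key power, its (y0 ^ y1) half can be computed once up front.
    """
    x0, x1 = x & MASK64, x >> 64
    y0, y1 = y & MASK64, y >> 64
    lo = clmul64(x0, y0)
    hi = clmul64(x1, y1)
    mid = clmul64(x0 ^ x1, y0 ^ y1) ^ lo ^ hi
    return (hi << 128) ^ (mid << 64) ^ lo
```

Caching the XOR of the two halves of each hash-key power is one plausible reading of "pre-computes Karatsuba cross-term halves"; the sketch does not claim to match the library's exact layout.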

Key observations:

  • ArmAes+ArmPmull: ~120× faster than OS encrypt at 17 B; ~14× at 128 B; ~2.2× at 1 KiB; OS leads from ~4–8 KiB
  • ArmAes+ArmPmull at 128 KiB: OS is ~5.4× faster for encrypt and ~5.2× faster for decrypt
  • Managed: Uses 4-bit Shoup table GHASH, T-table AES; zero allocation
  • BouncyCastle: Uses ARM AES + PMULL internally on ARM64; allocates ~1.5 KB per call

Description TestDataSize Mean Error StdDev Allocated
Decrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 17B 389.43 ns 3.280 ns 3.068 ns -
Decrypt · AES-128-GCM (CryptoHives-Scalar) 17B 1,638.59 ns 4.881 ns 4.327 ns -
Decrypt · AES-128-GCM (BouncyCastle) 17B 2,699.87 ns 2.009 ns 1.678 ns 1536 B
Decrypt · AES-128-GCM (OS) 17B 8,929.34 ns 89.738 ns 79.550 ns -
Encrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 17B 65.32 ns 0.204 ns 0.181 ns -
Encrypt · AES-128-GCM (CryptoHives-Scalar) 17B 1,492.51 ns 4.913 ns 4.355 ns -
Encrypt · AES-128-GCM (BouncyCastle) 17B 2,332.59 ns 4.123 ns 3.443 ns 1520 B
Encrypt · AES-128-GCM (OS) 17B 7,954.80 ns 35.601 ns 31.559 ns -
Decrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 65B 552.80 ns 1.273 ns 1.063 ns -
Decrypt · AES-128-GCM (CryptoHives-Scalar) 65B 2,867.56 ns 3.519 ns 3.120 ns -
Decrypt · AES-128-GCM (BouncyCastle) 65B 3,626.11 ns 6.691 ns 6.259 ns 1536 B
Decrypt · AES-128-GCM (OS) 65B 8,821.32 ns 67.076 ns 59.461 ns -
Encrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 65B 407.28 ns 0.229 ns 0.179 ns -
Encrypt · AES-128-GCM (CryptoHives-Scalar) 65B 2,704.24 ns 3.921 ns 3.668 ns -
Encrypt · AES-128-GCM (BouncyCastle) 65B 3,340.67 ns 5.560 ns 5.201 ns 1520 B
Encrypt · AES-128-GCM (OS) 65B 7,976.92 ns 40.196 ns 31.383 ns -
Decrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 128B 727.20 ns 5.243 ns 4.904 ns -
Decrypt · AES-128-GCM (CryptoHives-Scalar) 128B 4,066.16 ns 4.142 ns 3.875 ns -
Decrypt · AES-128-GCM (BouncyCastle) 128B 4,559.98 ns 4.211 ns 3.288 ns 1536 B
Decrypt · AES-128-GCM (OS) 128B 8,866.51 ns 47.055 ns 41.713 ns -
Encrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 128B 593.66 ns 0.250 ns 0.209 ns -
Encrypt · AES-128-GCM (CryptoHives-Scalar) 128B 3,941.76 ns 2.220 ns 1.733 ns -
Encrypt · AES-128-GCM (BouncyCastle) 128B 4,371.03 ns 6.346 ns 5.299 ns 1520 B
Encrypt · AES-128-GCM (OS) 128B 8,138.36 ns 80.187 ns 75.007 ns -
Decrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 152B 903.42 ns 14.675 ns 13.727 ns -
Decrypt · AES-128-GCM (CryptoHives-Scalar) 152B 4,933.29 ns 4.163 ns 3.690 ns -
Decrypt · AES-128-GCM (BouncyCastle) 152B 5,166.28 ns 7.061 ns 6.260 ns 1536 B
Decrypt · AES-128-GCM (OS) 152B 8,954.32 ns 47.750 ns 42.329 ns -
Encrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 152B 732.69 ns 0.223 ns 0.174 ns -
Encrypt · AES-128-GCM (CryptoHives-Scalar) 152B 4,709.54 ns 7.031 ns 6.232 ns -
Encrypt · AES-128-GCM (BouncyCastle) 152B 4,975.79 ns 10.718 ns 9.502 ns 1520 B
Encrypt · AES-128-GCM (OS) 152B 8,257.26 ns 129.869 ns 121.479 ns -
Decrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 256B 250.03 ns 5.113 ns 6.087 ns -
Decrypt · AES-128-GCM (BouncyCastle) 256B 1,474.91 ns 1.244 ns 1.039 ns 1536 B
Decrypt · AES-128-GCM (CryptoHives-Scalar) 256B 1,578.76 ns 24.977 ns 23.364 ns -
Decrypt · AES-128-GCM (OS) 256B 1,851.45 ns 14.041 ns 12.447 ns -
Encrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 256B 1,072.68 ns 0.643 ns 0.537 ns -
Encrypt · AES-128-GCM (BouncyCastle) 256B 6,919.93 ns 4.011 ns 3.350 ns 1520 B
Encrypt · AES-128-GCM (CryptoHives-Scalar) 256B 7,247.23 ns 2.920 ns 2.280 ns -
Encrypt · AES-128-GCM (OS) 256B 8,277.33 ns 124.031 ns 116.019 ns -
Decrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 1KB 804.89 ns 1.422 ns 1.330 ns -
Decrypt · AES-128-GCM (OS) 1KB 2,062.07 ns 15.769 ns 13.979 ns -
Decrypt · AES-128-GCM (BouncyCastle) 1KB 4,503.30 ns 2.430 ns 2.029 ns 1536 B
Decrypt · AES-128-GCM (CryptoHives-Scalar) 1KB 5,624.93 ns 16.460 ns 12.851 ns -
Encrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 1KB 4,024.61 ns 4.514 ns 3.769 ns -
Encrypt · AES-128-GCM (OS) 1KB 8,791.51 ns 38.908 ns 32.490 ns -
Encrypt · AES-128-GCM (BouncyCastle) 1KB 22,295.05 ns 13.812 ns 11.533 ns 1520 B
Encrypt · AES-128-GCM (CryptoHives-Scalar) 1KB 25,923.08 ns 12.712 ns 9.925 ns -
Decrypt · AES-128-GCM (OS) 8KB 2,902.27 ns 21.412 ns 17.880 ns -
Decrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 8KB 6,172.11 ns 36.613 ns 34.248 ns -
Decrypt · AES-128-GCM (BouncyCastle) 8KB 32,444.51 ns 11.323 ns 10.037 ns 1536 B
Decrypt · AES-128-GCM (CryptoHives-Scalar) 8KB 43,153.23 ns 22.199 ns 17.331 ns -
Encrypt · AES-128-GCM (OS) 8KB 13,204.40 ns 115.049 ns 107.617 ns -
Encrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 8KB 31,403.84 ns 14.656 ns 11.442 ns -
Encrypt · AES-128-GCM (BouncyCastle) 8KB 163,736.64 ns 566.036 ns 501.776 ns 1520 B
Encrypt · AES-128-GCM (CryptoHives-Scalar) 8KB 202,917.77 ns 288.201 ns 255.483 ns -
Decrypt · AES-128-GCM (OS) 128KB 19,683.11 ns 129.639 ns 121.264 ns -
Decrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 128KB 101,850.67 ns 721.677 ns 639.748 ns -
Decrypt · AES-128-GCM (BouncyCastle) 128KB 509,399.33 ns 200.182 ns 187.250 ns 1536 B
Decrypt · AES-128-GCM (CryptoHives-Scalar) 128KB 686,087.41 ns 109.640 ns 97.193 ns -
Encrypt · AES-128-GCM (OS) 128KB 92,595.84 ns 1,003.219 ns 938.412 ns -
Encrypt · AES-128-GCM (CryptoHives-ARM-AES+PMULL) 128KB 503,971.26 ns 643.796 ns 502.634 ns -
Encrypt · AES-128-GCM (BouncyCastle) 128KB 2,584,321.85 ns 1,926.173 ns 1,608.442 ns 1520 B
Encrypt · AES-128-GCM (CryptoHives-Scalar) 128KB 3,232,210.80 ns 1,524.435 ns 1,190.179 ns -
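
The speedup figures quoted for AES-128-GCM encrypt can be recomputed from the Mean column. The sketch below divides the OS mean by the ArmAes+ArmPmull mean for a few payload sizes; values above 1 mean the CryptoHives path is faster, values below 1 mean the OS is faster. The dictionary names are introduced here for illustration.

```python
# Encrypt means (ns) copied from the AES-128-GCM table above, keyed by payload bytes.
os_encrypt_ns = {17: 7_954.80, 128: 8_138.36, 1024: 8_791.51,
                 8192: 13_204.40, 131072: 92_595.84}
arm_encrypt_ns = {17: 65.32, 128: 593.66, 1024: 4_024.61,
                  8192: 31_403.84, 131072: 503_971.26}

# Ratio > 1: ArmAes+ArmPmull is faster. Ratio < 1: OS is faster.
speedup_vs_os = {size: os_encrypt_ns[size] / arm_encrypt_ns[size]
                 for size in os_encrypt_ns}

for size, ratio in speedup_vs_os.items():
    if ratio >= 1:
        print(f"{size:>7} B: ArmAes+ArmPmull {ratio:6.1f}x faster than OS")
    else:
        print(f"{size:>7} B: OS {1 / ratio:6.1f}x faster than ArmAes+ArmPmull")
```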

AES-192-GCM

AES-192-GCM uses 12 rounds (vs 10 for AES-128), adding ~10-15% overhead. The same ArmAes+ArmPmull pipeline applies. The performance pattern mirrors AES-128-GCM: dominant over OS at small payloads, OS leads at bulk sizes.

Description TestDataSize Mean Error StdDev Median Allocated
Decrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 17B 392.45 ns 3.110 ns 2.909 ns 393.96 ns -
Decrypt · AES-192-GCM (CryptoHives-Scalar) 17B 1,758.46 ns 3.325 ns 3.110 ns 1,758.40 ns -
Decrypt · AES-192-GCM (BouncyCastle) 17B 2,938.89 ns 2.402 ns 2.129 ns 2,939.04 ns 1640 B
Decrypt · AES-192-GCM (OS) 17B 8,777.06 ns 50.778 ns 39.644 ns 8,784.71 ns -
Encrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 17B 57.95 ns 1.177 ns 2.182 ns 58.07 ns -
Encrypt · AES-192-GCM (CryptoHives-Scalar) 17B 369.48 ns 7.030 ns 10.945 ns 375.66 ns -
Encrypt · AES-192-GCM (BouncyCastle) 17B 633.13 ns 12.309 ns 15.567 ns 625.88 ns 1624 B
Encrypt · AES-192-GCM (OS) 17B 1,945.43 ns 38.579 ns 93.173 ns 1,923.46 ns -
Decrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 65B 553.25 ns 3.749 ns 3.507 ns 553.77 ns -
Decrypt · AES-192-GCM (CryptoHives-Scalar) 65B 3,066.36 ns 4.271 ns 3.786 ns 3,066.78 ns -
Decrypt · AES-192-GCM (BouncyCastle) 65B 3,921.21 ns 2.323 ns 1.813 ns 3,921.44 ns 1640 B
Decrypt · AES-192-GCM (OS) 65B 8,863.75 ns 93.957 ns 87.888 ns 8,857.16 ns -
Encrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 65B 93.13 ns 1.876 ns 3.832 ns 93.18 ns -
Encrypt · AES-192-GCM (CryptoHives-Scalar) 65B 704.40 ns 13.883 ns 23.195 ns 694.92 ns -
Encrypt · AES-192-GCM (BouncyCastle) 65B 928.02 ns 18.266 ns 17.086 ns 927.66 ns 1624 B
Encrypt · AES-192-GCM (OS) 65B 1,975.34 ns 38.805 ns 58.082 ns 1,964.51 ns -
Decrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 128B 742.05 ns 4.259 ns 3.557 ns 742.57 ns -
Decrypt · AES-192-GCM (CryptoHives-Scalar) 128B 4,368.73 ns 6.113 ns 5.718 ns 4,368.56 ns -
Decrypt · AES-192-GCM (BouncyCastle) 128B 4,996.76 ns 15.475 ns 14.475 ns 4,993.00 ns 1640 B
Decrypt · AES-192-GCM (OS) 128B 9,045.71 ns 149.994 ns 140.304 ns 8,985.30 ns -
Encrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 128B 124.62 ns 0.125 ns 0.111 ns 124.58 ns -
Encrypt · AES-192-GCM (CryptoHives-Scalar) 128B 1,069.40 ns 12.551 ns 11.740 ns 1,074.00 ns -
Encrypt · AES-192-GCM (BouncyCastle) 128B 1,241.79 ns 17.618 ns 16.479 ns 1,239.67 ns 1624 B
Encrypt · AES-192-GCM (OS) 128B 2,136.51 ns 22.193 ns 19.674 ns 2,130.53 ns -
Decrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 152B 918.21 ns 10.882 ns 10.179 ns 918.23 ns -
Decrypt · AES-192-GCM (CryptoHives-Scalar) 152B 5,311.08 ns 17.759 ns 16.611 ns 5,305.41 ns -
Decrypt · AES-192-GCM (BouncyCastle) 152B 5,674.82 ns 15.316 ns 14.327 ns 5,669.71 ns 1640 B
Decrypt · AES-192-GCM (OS) 152B 8,969.10 ns 79.931 ns 74.767 ns 8,949.62 ns -
Encrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 152B 164.19 ns 3.243 ns 4.952 ns 165.42 ns -
Encrypt · AES-192-GCM (CryptoHives-Scalar) 152B 5,117.36 ns 15.223 ns 13.495 ns 5,123.91 ns -
Encrypt · AES-192-GCM (BouncyCastle) 152B 5,457.81 ns 7.070 ns 6.267 ns 5,457.61 ns 1624 B
Encrypt · AES-192-GCM (OS) 152B 8,187.88 ns 82.140 ns 76.834 ns 8,206.96 ns -
Decrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 256B 1,269.01 ns 13.239 ns 12.384 ns 1,261.99 ns -
Decrypt · AES-192-GCM (BouncyCastle) 256B 7,635.70 ns 3.393 ns 2.833 ns 7,636.31 ns 1640 B
Decrypt · AES-192-GCM (CryptoHives-Scalar) 256B 7,991.88 ns 6.588 ns 6.163 ns 7,989.93 ns -
Decrypt · AES-192-GCM (OS) 256B 9,162.01 ns 59.437 ns 55.597 ns 9,152.20 ns -
Encrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 256B 1,086.05 ns 0.968 ns 0.756 ns 1,086.30 ns -
Encrypt · AES-192-GCM (BouncyCastle) 256B 7,593.12 ns 5.763 ns 5.108 ns 7,592.27 ns 1624 B
Encrypt · AES-192-GCM (CryptoHives-Scalar) 256B 7,811.69 ns 5.031 ns 4.201 ns 7,810.45 ns -
Encrypt · AES-192-GCM (OS) 256B 8,337.81 ns 69.314 ns 64.837 ns 8,349.16 ns -
Decrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 1KB 854.23 ns 17.011 ns 22.120 ns 860.14 ns -
Decrypt · AES-192-GCM (OS) 1KB 1,960.22 ns 39.188 ns 40.243 ns 1,947.46 ns -
Decrypt · AES-192-GCM (BouncyCastle) 1KB 5,056.71 ns 39.262 ns 32.786 ns 5,042.00 ns 1640 B
Decrypt · AES-192-GCM (CryptoHives-Scalar) 1KB 6,134.89 ns 73.339 ns 61.241 ns 6,102.35 ns -
Encrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 1KB 4,036.06 ns 3.988 ns 3.535 ns 4,035.07 ns -
Encrypt · AES-192-GCM (OS) 1KB 8,762.99 ns 79.900 ns 74.739 ns 8,755.16 ns -
Encrypt · AES-192-GCM (BouncyCastle) 1KB 24,634.84 ns 12.668 ns 10.578 ns 24,632.61 ns 1624 B
Encrypt · AES-192-GCM (CryptoHives-Scalar) 1KB 28,191.66 ns 11.438 ns 10.140 ns 28,191.53 ns -
Decrypt · AES-192-GCM (OS) 8KB 2,955.64 ns 17.672 ns 15.666 ns 2,952.98 ns -
Decrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 8KB 6,083.40 ns 4.775 ns 3.728 ns 6,084.46 ns -
Decrypt · AES-192-GCM (BouncyCastle) 8KB 36,111.73 ns 17.922 ns 16.764 ns 36,117.73 ns 1640 B
Decrypt · AES-192-GCM (CryptoHives-Scalar) 8KB 46,917.34 ns 9.490 ns 8.412 ns 46,917.63 ns -
Encrypt · AES-192-GCM (OS) 8KB 13,389.90 ns 71.936 ns 56.163 ns 13,401.61 ns -
Encrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 8KB 31,544.06 ns 9.340 ns 7.292 ns 31,544.54 ns -
Encrypt · AES-192-GCM (BouncyCastle) 8KB 181,571.55 ns 235.114 ns 196.331 ns 181,621.22 ns 1624 B
Encrypt · AES-192-GCM (CryptoHives-Scalar) 8KB 220,715.77 ns 117.549 ns 91.774 ns 220,697.41 ns -
Decrypt · AES-192-GCM (OS) 128KB 19,516.26 ns 88.794 ns 83.058 ns 19,509.62 ns -
Decrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 128KB 98,492.34 ns 815.194 ns 722.648 ns 98,489.86 ns -
Decrypt · AES-192-GCM (BouncyCastle) 128KB 569,394.54 ns 356.216 ns 297.456 ns 569,324.58 ns 1640 B
Decrypt · AES-192-GCM (CryptoHives-Scalar) 128KB 747,695.84 ns 550.855 ns 459.989 ns 747,488.81 ns -
Encrypt · AES-192-GCM (OS) 128KB 96,747.02 ns 119.204 ns 93.067 ns 96,742.32 ns -
Encrypt · AES-192-GCM (CryptoHives-ARM-AES+PMULL) 128KB 504,290.32 ns 347.637 ns 308.171 ns 504,308.11 ns -
Encrypt · AES-192-GCM (BouncyCastle) 128KB 2,872,363.98 ns 1,840.506 ns 1,536.906 ns 2,872,103.19 ns 1624 B
Encrypt · AES-192-GCM (CryptoHives-Scalar) 128KB 3,519,943.02 ns 2,355.070 ns 1,966.590 ns 3,518,982.10 ns -

AES-256-GCM

AES-256-GCM uses 14 rounds (vs 10 for AES-128), adding ~20-30% overhead per block. The same 2-tier architecture (ArmAes+ArmPmull → Managed) applies. Encrypt is ~14–16× faster than OS at 128 B; OS leads from ~4–8 KiB. The large-payload gap mirrors AES-128-GCM — Apple CommonCrypto likely exploits Apple Silicon–specific AES/PMULL execution units that are not yet accessible through the .NET ARMv8 intrinsics layer.

Description TestDataSize Mean Error StdDev Median Allocated
Decrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 17B 392.48 ns 4.065 ns 3.803 ns 391.99 ns -
Decrypt · AES-256-GCM (CryptoHives-Scalar) 17B 1,848.11 ns 2.697 ns 2.522 ns 1,848.36 ns -
Decrypt · AES-256-GCM (BouncyCastle) 17B 3,120.12 ns 7.685 ns 6.813 ns 3,118.81 ns 1744 B
Decrypt · AES-256-GCM (OS) 17B 9,008.65 ns 85.134 ns 79.634 ns 9,021.45 ns -
Encrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 17B 55.91 ns 0.054 ns 0.045 ns 55.92 ns -
Encrypt · AES-256-GCM (CryptoHives-Scalar) 17B 360.84 ns 0.204 ns 0.191 ns 360.85 ns -
Encrypt · AES-256-GCM (BouncyCastle) 17B 586.76 ns 0.718 ns 0.672 ns 586.60 ns 1728 B
Encrypt · AES-256-GCM (OS) 17B 1,739.71 ns 12.648 ns 11.831 ns 1,733.40 ns -
Decrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 65B 562.57 ns 2.518 ns 2.355 ns 561.77 ns -
Decrypt · AES-256-GCM (CryptoHives-Scalar) 65B 3,299.54 ns 4.311 ns 3.821 ns 3,299.90 ns -
Decrypt · AES-256-GCM (BouncyCastle) 65B 4,257.31 ns 4.490 ns 4.200 ns 4,258.70 ns 1744 B
Decrypt · AES-256-GCM (OS) 65B 9,069.33 ns 44.404 ns 41.536 ns 9,060.56 ns -
Encrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 65B 87.64 ns 0.030 ns 0.028 ns 87.63 ns -
Encrypt · AES-256-GCM (CryptoHives-Scalar) 65B 663.07 ns 0.171 ns 0.160 ns 663.07 ns -
Encrypt · AES-256-GCM (BouncyCastle) 65B 840.77 ns 0.835 ns 0.697 ns 840.93 ns 1728 B
Encrypt · AES-256-GCM (OS) 65B 1,967.93 ns 35.177 ns 80.115 ns 1,950.23 ns -
Decrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 128B 768.34 ns 5.102 ns 4.773 ns 767.17 ns -
Decrypt · AES-256-GCM (CryptoHives-Scalar) 128B 4,746.61 ns 5.049 ns 4.476 ns 4,745.02 ns -
Decrypt · AES-256-GCM (BouncyCastle) 128B 5,417.20 ns 3.699 ns 3.279 ns 5,416.63 ns 1744 B
Decrypt · AES-256-GCM (OS) 128B 9,212.45 ns 106.131 ns 99.275 ns 9,235.57 ns -
Encrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 128B 126.78 ns 0.120 ns 0.106 ns 126.80 ns -
Encrypt · AES-256-GCM (CryptoHives-Scalar) 128B 963.68 ns 0.282 ns 0.220 ns 963.66 ns -
Encrypt · AES-256-GCM (BouncyCastle) 128B 1,290.47 ns 25.576 ns 39.819 ns 1,295.46 ns 1728 B
Encrypt · AES-256-GCM (OS) 128B 1,967.29 ns 24.487 ns 20.448 ns 1,962.75 ns -
Decrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 152B 938.06 ns 4.264 ns 3.329 ns 938.90 ns -
Decrypt · AES-256-GCM (CryptoHives-Scalar) 152B 5,673.72 ns 4.763 ns 4.455 ns 5,672.28 ns -
Decrypt · AES-256-GCM (BouncyCastle) 152B 6,151.40 ns 7.209 ns 6.744 ns 6,152.46 ns 1744 B
Decrypt · AES-256-GCM (OS) 152B 9,095.95 ns 55.664 ns 52.069 ns 9,098.71 ns -
Encrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 152B 156.25 ns 0.152 ns 0.135 ns 156.29 ns -
Encrypt · AES-256-GCM (CryptoHives-Scalar) 152B 1,360.08 ns 26.254 ns 26.961 ns 1,360.72 ns -
Encrypt · AES-256-GCM (BouncyCastle) 152B 1,545.96 ns 30.671 ns 31.496 ns 1,553.18 ns 1728 B
Encrypt · AES-256-GCM (OS) 152B 2,099.86 ns 23.709 ns 22.178 ns 2,101.45 ns -
Decrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 256B 1,297.43 ns 12.461 ns 11.656 ns 1,297.96 ns -
Decrypt · AES-256-GCM (BouncyCastle) 256B 1,777.07 ns 1.360 ns 1.272 ns 1,776.82 ns 1744 B
Decrypt · AES-256-GCM (CryptoHives-Scalar) 256B 1,840.39 ns 36.503 ns 61.985 ns 1,810.38 ns -
Decrypt · AES-256-GCM (OS) 256B 1,972.13 ns 15.607 ns 14.599 ns 1,968.01 ns -
Encrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 256B 233.89 ns 0.628 ns 0.524 ns 233.80 ns -
Encrypt · AES-256-GCM (CryptoHives-Scalar) 256B 2,093.08 ns 40.431 ns 53.974 ns 2,108.54 ns -
Encrypt · AES-256-GCM (BouncyCastle) 256B 2,122.70 ns 41.953 ns 60.168 ns 2,111.18 ns 1728 B
Encrypt · AES-256-GCM (OS) 256B 2,191.74 ns 43.816 ns 38.841 ns 2,198.54 ns -
Decrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 1KB 837.58 ns 8.276 ns 6.911 ns 837.76 ns -
Decrypt · AES-256-GCM (OS) 1KB 2,099.22 ns 19.172 ns 16.995 ns 2,094.14 ns -
Decrypt · AES-256-GCM (BouncyCastle) 1KB 5,522.34 ns 4.760 ns 3.975 ns 5,521.94 ns 1744 B
Decrypt · AES-256-GCM (CryptoHives-Scalar) 1KB 6,580.40 ns 2.054 ns 1.604 ns 6,580.24 ns -
Encrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 1KB 889.83 ns 17.806 ns 16.656 ns 889.44 ns -
Encrypt · AES-256-GCM (OS) 1KB 9,040.18 ns 119.524 ns 111.802 ns 8,986.19 ns -
Encrypt · AES-256-GCM (BouncyCastle) 1KB 27,106.41 ns 11.764 ns 10.429 ns 27,103.38 ns 1728 B
Encrypt · AES-256-GCM (CryptoHives-Scalar) 1KB 30,003.83 ns 1,111.541 ns 3,153.258 ns 30,457.34 ns -
Decrypt · AES-256-GCM (OS) 8KB 3,099.95 ns 13.097 ns 12.251 ns 3,095.90 ns -
Decrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 8KB 6,362.26 ns 83.183 ns 69.462 ns 6,371.20 ns -
Decrypt · AES-256-GCM (BouncyCastle) 8KB 40,008.45 ns 8.312 ns 6.490 ns 40,009.26 ns 1744 B
Decrypt · AES-256-GCM (CryptoHives-Scalar) 8KB 50,781.50 ns 35.053 ns 27.367 ns 50,771.57 ns -
Encrypt · AES-256-GCM (OS) 8KB 13,975.05 ns 175.416 ns 164.084 ns 13,928.83 ns -
Encrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 8KB 32,094.53 ns 48.704 ns 38.025 ns 32,120.38 ns -
Encrypt · AES-256-GCM (BouncyCastle) 8KB 200,098.54 ns 122.654 ns 102.422 ns 200,085.24 ns 1728 B
Encrypt · AES-256-GCM (CryptoHives-Scalar) 8KB 238,659.34 ns 145.934 ns 129.367 ns 238,671.31 ns -
Decrypt · AES-256-GCM (OS) 128KB 20,784.60 ns 76.921 ns 71.952 ns 20,791.28 ns -
Decrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 128KB 103,150.76 ns 472.990 ns 442.435 ns 103,087.03 ns -
Decrypt · AES-256-GCM (BouncyCastle) 128KB 632,521.95 ns 243.138 ns 215.535 ns 632,514.93 ns 1744 B
Decrypt · AES-256-GCM (CryptoHives-Scalar) 128KB 808,029.66 ns 101.426 ns 94.874 ns 807,995.85 ns -
Encrypt · AES-256-GCM (OS) 128KB 102,155.07 ns 597.475 ns 529.646 ns 101,828.32 ns -
Encrypt · AES-256-GCM (CryptoHives-ARM-AES+PMULL) 128KB 511,507.58 ns 233.359 ns 182.192 ns 511,515.99 ns -
Encrypt · AES-256-GCM (BouncyCastle) 128KB 3,181,223.93 ns 1,684.265 ns 1,493.057 ns 3,181,190.92 ns 1728 B
Encrypt · AES-256-GCM (CryptoHives-Scalar) 128KB 3,806,613.70 ns 1,725.147 ns 1,529.298 ns 3,805,917.32 ns -

AES-128-CCM

AES-CCM (Counter with CBC-MAC) combines CTR mode encryption with CBC-MAC authentication. Unlike GCM, CCM requires two sequential passes (encrypt + MAC or MAC + decrypt), and the CBC-MAC pass is a serial chain, making CCM inherently less parallelizable. It is widely used in IoT protocols (Bluetooth LE, ZigBee, Thread) and supports variable nonce lengths (7–13 bytes) and tag sizes (4–16 bytes, even values only). Two acceleration tiers are available:

  • ArmAes: ARM Cryptography Extension AESD/AESE instructions for all block operations — counter-mode encryption, CBC-MAC computation, and AAD processing. Uses Vector128<byte> round keys via MemoryMarshal.Cast from the shared uint[] key schedule. Dispatched via _useAesNi bool flag (shared with x86 dispatch; indicates hardware AES availability on any ISA).
  • Managed: T-table AES for all block operations. Fully portable, zero-allocation.
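The nonce and tag ranges quoted above follow from CCM's block formatting (NIST SP 800-38C, RFC 3610): a nonce of n bytes leaves q = 15 − n bytes for the message-length field, so a longer nonce caps the maximum payload. The sketch below works through that arithmetic. The helper names are hypothetical and are not part of the CryptoHives API.

```python
# Sketch of CCM parameter arithmetic (NIST SP 800-38C / RFC 3610).
# Helper names are illustrative only; they are not CryptoHives APIs.

VALID_TAG_LENGTHS = (4, 6, 8, 10, 12, 14, 16)


def ccm_length_field_size(nonce_len: int) -> int:
    """Return q, the number of bytes used to encode the payload length."""
    if not 7 <= nonce_len <= 13:
        raise ValueError("CCM nonce length must be 7..13 bytes")
    return 15 - nonce_len


def ccm_max_payload(nonce_len: int) -> int:
    """Largest payload (in bytes) that fits in the q-byte length field."""
    q = ccm_length_field_size(nonce_len)
    return (1 << (8 * q)) - 1


def ccm_flags_byte(nonce_len: int, tag_len: int, has_aad: bool) -> int:
    """Flags byte of the first CBC-MAC block B0."""
    if tag_len not in VALID_TAG_LENGTHS:
        raise ValueError("CCM tag length must be an even value in 4..16")
    q = ccm_length_field_size(nonce_len)
    adata_bit = 0x40 if has_aad else 0x00
    return adata_bit | (((tag_len - 2) // 2) << 3) | (q - 1)


if __name__ == "__main__":
    for n in (7, 12, 13):
        print(f"nonce={n:2d} B -> max payload {ccm_max_payload(n)} B")
```

With a 13-byte nonce, common in constrained protocols, the length field is only 2 bytes, so a single CCM invocation covers at most 65,535 bytes of payload.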

Key observations:

  • ArmAes: ~4× faster than Managed at 128 KiB; ~4.3× faster than BouncyCastle; zero allocation
  • Managed: T-table AES; comparable to BouncyCastle at large sizes
  • BouncyCastle: Allocates ~2.4–2.5 KB per call (2,424 B for decrypt, 2,464 B for encrypt)
  • No OS adapter available for comparison (System.Security.Cryptography does not expose AES-CCM on all platforms)
Description TestDataSize Mean Error StdDev Allocated
Decrypt · AES-128-CCM (CryptoHives-ARM-AES) 128B 273.1 ns 1.38 ns 1.07 ns -
Decrypt · AES-128-CCM (CryptoHives-Scalar) 128B 955.0 ns 0.57 ns 0.44 ns -
Decrypt · AES-128-CCM (BouncyCastle) 128B 1,441.3 ns 2.64 ns 2.20 ns 2424 B
Encrypt · AES-128-CCM (CryptoHives-ARM-AES) 128B 237.8 ns 4.33 ns 3.84 ns -
Encrypt · AES-128-CCM (CryptoHives-Scalar) 128B 907.0 ns 2.51 ns 2.09 ns -
Encrypt · AES-128-CCM (BouncyCastle) 128B 1,381.0 ns 6.24 ns 5.84 ns 2464 B
Decrypt · AES-128-CCM (CryptoHives-ARM-AES) 1KB 1,562.1 ns 3.02 ns 2.52 ns -
Decrypt · AES-128-CCM (CryptoHives-Scalar) 1KB 5,997.1 ns 1.73 ns 1.61 ns -
Decrypt · AES-128-CCM (BouncyCastle) 1KB 6,842.8 ns 2.05 ns 1.91 ns 2424 B
Encrypt · AES-128-CCM (CryptoHives-ARM-AES) 1KB 1,509.6 ns 9.00 ns 7.98 ns -
Encrypt · AES-128-CCM (CryptoHives-Scalar) 1KB 5,953.8 ns 1.47 ns 1.31 ns -
Encrypt · AES-128-CCM (BouncyCastle) 1KB 6,562.9 ns 40.12 ns 33.50 ns 2464 B
Decrypt · AES-128-CCM (CryptoHives-ARM-AES) 8KB 11,859.9 ns 4.50 ns 4.21 ns -
Decrypt · AES-128-CCM (CryptoHives-Scalar) 8KB 46,249.3 ns 7.77 ns 6.89 ns -
Decrypt · AES-128-CCM (BouncyCastle) 8KB 50,068.6 ns 18.44 ns 17.25 ns 2424 B
Encrypt · AES-128-CCM (CryptoHives-ARM-AES) 8KB 11,361.4 ns 59.40 ns 55.56 ns -
Encrypt · AES-128-CCM (CryptoHives-Scalar) 8KB 45,878.6 ns 537.76 ns 419.84 ns -
Encrypt · AES-128-CCM (BouncyCastle) 8KB 49,030.8 ns 174.47 ns 145.69 ns 2464 B
Decrypt · AES-128-CCM (CryptoHives-ARM-AES) 128KB 188,074.5 ns 317.22 ns 296.73 ns -
Decrypt · AES-128-CCM (CryptoHives-Scalar) 128KB 736,377.3 ns 117.59 ns 109.99 ns -
Decrypt · AES-128-CCM (BouncyCastle) 128KB 793,130.6 ns 451.95 ns 422.76 ns 2424 B
Encrypt · AES-128-CCM (CryptoHives-ARM-AES) 128KB 182,973.9 ns 1,031.92 ns 965.26 ns -
Encrypt · AES-128-CCM (CryptoHives-Scalar) 128KB 726,173.1 ns 5,011.64 ns 4,184.95 ns -
Encrypt · AES-128-CCM (BouncyCastle) 128KB 799,450.8 ns 6,038.70 ns 5,042.59 ns 2464 B

AES-256-CCM

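AES-256-CCM uses 14 rounds (vs 10 for AES-128). The same ArmAes / Managed dispatch applies. On the Apple M4, the additional rounds cost the ArmAes tier roughly 14–27% relative to the AES-128-CCM figures above (about 14–24% for encrypt, 24–27% for decrypt); the Managed tier slows down by roughly 33–62%.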

Description TestDataSize Mean Error StdDev Median Allocated
Decrypt · AES-256-CCM (CryptoHives-ARM-AES) 128B 337.8 ns 6.49 ns 5.75 ns 339.0 ns -
Decrypt · AES-256-CCM (CryptoHives-Scalar) 128B 1,437.3 ns 28.78 ns 63.78 ns 1,407.8 ns -
Decrypt · AES-256-CCM (BouncyCastle) 128B 2,113.7 ns 42.01 ns 118.48 ns 2,078.6 ns 2808 B
Encrypt · AES-256-CCM (CryptoHives-ARM-AES) 128B 272.3 ns 0.33 ns 0.31 ns 272.5 ns -
Encrypt · AES-256-CCM (CryptoHives-Scalar) 128B 1,209.3 ns 0.45 ns 0.43 ns 1,209.4 ns -
Encrypt · AES-256-CCM (BouncyCastle) 128B 1,759.0 ns 1.36 ns 1.27 ns 1,758.8 ns 2848 B
Decrypt · AES-256-CCM (CryptoHives-ARM-AES) 1KB 1,988.8 ns 39.16 ns 60.97 ns 1,964.0 ns -
Decrypt · AES-256-CCM (CryptoHives-Scalar) 1KB 9,733.2 ns 193.41 ns 180.91 ns 9,743.4 ns -
Decrypt · AES-256-CCM (BouncyCastle) 1KB 10,563.3 ns 206.98 ns 383.65 ns 10,490.3 ns 2808 B
Encrypt · AES-256-CCM (CryptoHives-ARM-AES) 1KB 1,716.5 ns 1.43 ns 1.34 ns 1,716.7 ns -
Encrypt · AES-256-CCM (CryptoHives-Scalar) 1KB 7,904.6 ns 1.19 ns 1.06 ns 7,904.5 ns -
Encrypt · AES-256-CCM (BouncyCastle) 1KB 8,874.2 ns 3.47 ns 3.24 ns 8,873.8 ns 2848 B
Decrypt · AES-256-CCM (CryptoHives-ARM-AES) 8KB 14,769.1 ns 228.42 ns 368.85 ns 14,659.5 ns -
Decrypt · AES-256-CCM (CryptoHives-Scalar) 8KB 74,334.0 ns 1,429.61 ns 1,701.85 ns 74,752.2 ns -
Decrypt · AES-256-CCM (BouncyCastle) 8KB 74,908.0 ns 619.45 ns 579.43 ns 74,797.7 ns 2808 B
Encrypt · AES-256-CCM (CryptoHives-ARM-AES) 8KB 13,200.3 ns 4.76 ns 3.98 ns 13,201.8 ns -
Encrypt · AES-256-CCM (CryptoHives-Scalar) 8KB 64,871.5 ns 1,275.19 ns 2,165.37 ns 65,645.0 ns -
Encrypt · AES-256-CCM (BouncyCastle) 8KB 71,548.6 ns 1,389.83 ns 1,300.05 ns 71,899.2 ns 2848 B
Decrypt · AES-256-CCM (CryptoHives-ARM-AES) 128KB 238,909.2 ns 4,709.22 ns 8,491.69 ns 235,102.1 ns -
Decrypt · AES-256-CCM (CryptoHives-Scalar) 128KB 1,147,014.8 ns 22,832.17 ns 32,745.21 ns 1,157,165.6 ns -
Decrypt · AES-256-CCM (BouncyCastle) 128KB 1,281,403.6 ns 13,744.75 ns 12,856.85 ns 1,276,111.1 ns 2808 B
Encrypt · AES-256-CCM (CryptoHives-ARM-AES) 128KB 226,366.6 ns 1,819.51 ns 1,612.95 ns 226,620.7 ns -
Encrypt · AES-256-CCM (CryptoHives-Scalar) 128KB 1,083,294.2 ns 19,719.07 ns 24,938.32 ns 1,086,819.6 ns -
Encrypt · AES-256-CCM (BouncyCastle) 128KB 1,156,635.3 ns 17,336.29 ns 20,637.62 ns 1,156,524.3 ns 2848 B
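
The cost of the four extra AES-256 rounds can be read directly from the two CCM tables. The sketch below copies the CryptoHives-ARM-AES mean times (in ns) from those tables and computes the relative overhead per direction and payload size. The dictionary layout and function name are illustrative choices, not part of the benchmark harness.

```python
# Mean times in ns for the CryptoHives-ARM-AES rows, copied from the
# AES-128-CCM and AES-256-CCM tables in this document.
AES128_CCM_NS = {
    ("Encrypt", "128B"): 237.8,      ("Decrypt", "128B"): 273.1,
    ("Encrypt", "1KB"): 1509.6,      ("Decrypt", "1KB"): 1562.1,
    ("Encrypt", "8KB"): 11361.4,     ("Decrypt", "8KB"): 11859.9,
    ("Encrypt", "128KB"): 182973.9,  ("Decrypt", "128KB"): 188074.5,
}
AES256_CCM_NS = {
    ("Encrypt", "128B"): 272.3,      ("Decrypt", "128B"): 337.8,
    ("Encrypt", "1KB"): 1716.5,      ("Decrypt", "1KB"): 1988.8,
    ("Encrypt", "8KB"): 13200.3,     ("Decrypt", "8KB"): 14769.1,
    ("Encrypt", "128KB"): 226366.6,  ("Decrypt", "128KB"): 238909.2,
}


def overhead_percent(slower_ns: float, faster_ns: float) -> float:
    """Relative slowdown of `slower_ns` versus `faster_ns`, in percent."""
    return (slower_ns / faster_ns - 1.0) * 100.0


overheads = {
    key: overhead_percent(AES256_CCM_NS[key], AES128_CCM_NS[key])
    for key in AES128_CCM_NS
}

if __name__ == "__main__":
    for (direction, size), pct in overheads.items():
        print(f"{direction:8s}{size:>6s}: +{pct:.1f}%")
```

The AES-256-CCM runs show noticeably larger Error and StdDev values than the AES-128-CCM runs, so small differences between individual rows should be treated with caution.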

ChaCha20-Poly1305

ChaCha20-Poly1305 is a software-friendly AEAD cipher (RFC 8439) that combines ChaCha20 stream encryption with Poly1305 MAC authentication. It is the recommended AEAD cipher when hardware AES acceleration is unavailable. Two acceleration tiers are available on ARM:

  • Neon: Single-block ChaCha20 via Vector128<uint> combined with a Poly1305 donna-64 MAC (3×44-bit limbs, 9 multiplications per 16-byte block using Math.BigMul). Encrypt is ~5× faster than OS at 128 B (1.84 μs vs 9.58 μs) and ~1.4× faster at 1 KiB (10.09 μs vs 14.14 μs); OS leads from 8 KiB. At 128 KiB, Neon (~1.19 ms) is ~1.5–1.7× slower than OS (0.72–0.77 ms) and on par with BouncyCastle (~1.18 ms). A dual-block NEON path (comparable to the x86 AVX2 path) would be required to close this gap.
  • Managed: Scalar ChaCha20 + Poly1305 donna-32 (5×26-bit limbs, 25 multiplications per block on .NET Framework / .NET Standard). Fully portable.
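
Both tiers compute the same Poly1305 function and differ only in how the 130-bit accumulator is split into machine words (3×44-bit or 5×26-bit limbs). As a reference point, the sketch below computes Poly1305 as specified in RFC 8439 using Python's arbitrary-precision integers. It deliberately does not model the limb layout and is not constant-time, so treat it as a way to check outputs rather than a picture of the CryptoHives code.

```python
# Reference-style Poly1305 (RFC 8439, section 2.5) using Python big integers.
# Not constant-time and not limb-based; for illustration and cross-checking only.

P1305 = (1 << 130) - 5                              # the prime 2^130 - 5
CLAMP_MASK = 0x0FFFFFFC0FFFFFFC0FFFFFFC0FFFFFFF     # clears the bits RFC 8439 requires


def poly1305_mac(key: bytes, message: bytes) -> bytes:
    """Return the 16-byte Poly1305 tag of `message` under a 32-byte one-time key."""
    if len(key) != 32:
        raise ValueError("Poly1305 key must be 32 bytes")
    r = int.from_bytes(key[:16], "little") & CLAMP_MASK
    s = int.from_bytes(key[16:], "little")

    accumulator = 0
    for offset in range(0, len(message), 16):
        # Each block is read little-endian with an extra 0x01 byte appended.
        block = message[offset:offset + 16] + b"\x01"
        accumulator = (accumulator + int.from_bytes(block, "little")) * r % P1305

    tag = (accumulator + s) & ((1 << 128) - 1)
    return tag.to_bytes(16, "little")


if __name__ == "__main__":
    # Test vector from RFC 8439, section 2.5.2.
    key = bytes.fromhex(
        "85d6be7857556d337f4452fe42d506a8"
        "0103808afb0db2fd4abff6af4149f51b"
    )
    print(poly1305_mac(key, b"Cryptographic Forum Research Group").hex())
```

A limb-based implementation replaces the single big-integer multiply with a schoolbook product of limbs, which is where the 9 (3×3) and 25 (5×5) multiplication counts quoted above come from.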

Key observations:

  • Neon is ~5× faster than OS at 128 B and ~1.4–1.5× faster at 1 KiB; OS leads from 8 KiB and is ~1.5–1.7× faster at 128 KiB
  • Neon beats BouncyCastle at 128 B and 1 KiB; at 8 KiB and 128 KiB the two are within ~1.5% of each other (potential improvement area: a dual-block NEON path)
  • Managed and Neon paths are zero-allocation
  • BouncyCastle allocates 336–416 B per call; NaCl.Core allocates 48–72 B per call
Description TestDataSize Mean Error StdDev Median Allocated
Decrypt · ChaCha20-Poly1305 (CryptoHives-Neon) 128B 2.156 μs 0.0046 μs 0.0043 μs 2.157 μs -
Decrypt · ChaCha20-Poly1305 (BouncyCastle) 128B 3.288 μs 0.0058 μs 0.0049 μs 3.287 μs 416 B
Decrypt · ChaCha20-Poly1305 (NaCl.Core) 128B 4.272 μs 0.0011 μs 0.0010 μs 4.272 μs 48 B
Decrypt · ChaCha20-Poly1305 (CryptoHives-Scalar) 128B 6.557 μs 0.1285 μs 0.1428 μs 6.672 μs -
Decrypt · ChaCha20-Poly1305 (OS) 128B 11.071 μs 0.1061 μs 0.0941 μs 11.038 μs -
Encrypt · ChaCha20-Poly1305 (CryptoHives-Neon) 128B 1.836 μs 0.0108 μs 0.0101 μs 1.840 μs -
Encrypt · ChaCha20-Poly1305 (BouncyCastle) 128B 2.344 μs 0.0040 μs 0.0033 μs 2.344 μs 336 B
Encrypt · ChaCha20-Poly1305 (NaCl.Core) 128B 4.112 μs 0.0013 μs 0.0012 μs 4.112 μs 48 B
Encrypt · ChaCha20-Poly1305 (CryptoHives-Scalar) 128B 6.232 μs 0.0103 μs 0.0092 μs 6.229 μs -
Encrypt · ChaCha20-Poly1305 (OS) 128B 9.582 μs 0.1183 μs 0.1106 μs 9.581 μs -
Decrypt · ChaCha20-Poly1305 (CryptoHives-Neon) 1KB 10.548 μs 0.0649 μs 0.0607 μs 10.533 μs -
Decrypt · ChaCha20-Poly1305 (BouncyCastle) 1KB 11.339 μs 0.0041 μs 0.0032 μs 11.338 μs 416 B
Decrypt · ChaCha20-Poly1305 (OS) 1KB 15.822 μs 0.0718 μs 0.0672 μs 15.797 μs -
Decrypt · ChaCha20-Poly1305 (NaCl.Core) 1KB 19.073 μs 0.0068 μs 0.0056 μs 19.074 μs 72 B
Decrypt · ChaCha20-Poly1305 (CryptoHives-Scalar) 1KB 33.905 μs 0.0143 μs 0.0112 μs 33.902 μs -
Encrypt · ChaCha20-Poly1305 (CryptoHives-Neon) 1KB 10.092 μs 0.0111 μs 0.0104 μs 10.090 μs -
Encrypt · ChaCha20-Poly1305 (BouncyCastle) 1KB 10.421 μs 0.0099 μs 0.0083 μs 10.418 μs 336 B
Encrypt · ChaCha20-Poly1305 (OS) 1KB 14.138 μs 0.1184 μs 0.1108 μs 14.106 μs -
Encrypt · ChaCha20-Poly1305 (NaCl.Core) 1KB 18.914 μs 0.0143 μs 0.0111 μs 18.917 μs 72 B
Encrypt · ChaCha20-Poly1305 (CryptoHives-Scalar) 1KB 33.723 μs 0.0110 μs 0.0098 μs 33.721 μs -
Decrypt · ChaCha20-Poly1305 (OS) 8KB 54.149 μs 0.1266 μs 0.1123 μs 54.176 μs -
Decrypt · ChaCha20-Poly1305 (BouncyCastle) 8KB 75.021 μs 0.1526 μs 0.1191 μs 75.002 μs 416 B
Decrypt · ChaCha20-Poly1305 (CryptoHives-Neon) 8KB 75.338 μs 0.2684 μs 0.2095 μs 75.238 μs -
Decrypt · ChaCha20-Poly1305 (NaCl.Core) 8KB 137.033 μs 0.0552 μs 0.0517 μs 137.046 μs 72 B
Decrypt · ChaCha20-Poly1305 (CryptoHives-Scalar) 8KB 243.532 μs 0.2211 μs 0.1726 μs 243.494 μs -
Encrypt · ChaCha20-Poly1305 (OS) 8KB 51.838 μs 0.1845 μs 0.1636 μs 51.836 μs -
Encrypt · ChaCha20-Poly1305 (BouncyCastle) 8KB 74.306 μs 0.0370 μs 0.0328 μs 74.303 μs 336 B
Encrypt · ChaCha20-Poly1305 (CryptoHives-Neon) 8KB 75.394 μs 0.1183 μs 0.0988 μs 75.344 μs -
Encrypt · ChaCha20-Poly1305 (NaCl.Core) 8KB 136.922 μs 0.1618 μs 0.1434 μs 136.941 μs 72 B
Encrypt · ChaCha20-Poly1305 (CryptoHives-Scalar) 8KB 243.441 μs 0.0565 μs 0.0472 μs 243.437 μs -
Decrypt · ChaCha20-Poly1305 (OS) 128KB 773.679 μs 2.9226 μs 2.4405 μs 774.563 μs -
Decrypt · ChaCha20-Poly1305 (BouncyCastle) 128KB 1,174.941 μs 2.0728 μs 1.7309 μs 1,174.399 μs 416 B
Decrypt · ChaCha20-Poly1305 (CryptoHives-Neon) 128KB 1,189.233 μs 3.4307 μs 3.2091 μs 1,188.708 μs -
Decrypt · ChaCha20-Poly1305 (NaCl.Core) 128KB 2,173.125 μs 5.2605 μs 4.6633 μs 2,172.153 μs 72 B
Decrypt · ChaCha20-Poly1305 (CryptoHives-Scalar) 128KB 3,835.770 μs 3.5903 μs 3.1827 μs 3,835.275 μs -
Encrypt · ChaCha20-Poly1305 (OS) 128KB 720.085 μs 2.2119 μs 2.0690 μs 720.637 μs -
Encrypt · ChaCha20-Poly1305 (BouncyCastle) 128KB 1,180.734 μs 2.7441 μs 2.4326 μs 1,181.029 μs 336 B
Encrypt · ChaCha20-Poly1305 (CryptoHives-Neon) 128KB 1,193.358 μs 1.4276 μs 1.3354 μs 1,192.924 μs -
Encrypt · ChaCha20-Poly1305 (NaCl.Core) 128KB 2,158.831 μs 0.5012 μs 0.4443 μs 2,158.781 μs 72 B
Encrypt · ChaCha20-Poly1305 (CryptoHives-Scalar) 128KB 3,840.596 μs 3.5656 μs 3.1608 μs 3,840.163 μs -

XChaCha20-Poly1305

XChaCha20-Poly1305 extends ChaCha20-Poly1305 with a 24-byte nonce (vs 12 bytes), which makes randomly generated nonces safe against collisions: the birthday bound rises from about 2⁴⁸ messages for a 96-bit nonce to about 2⁹⁶ for a 192-bit nonce. The implementation prepends an HChaCha20 key derivation step that derives a subkey from the key and the first 16 bytes of the nonce; the remaining 8 nonce bytes feed the inner ChaCha20-Poly1305 operation. The same Neon / Managed acceleration tiers apply to the inner ChaCha20-Poly1305 operation.
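
The HChaCha20 step can be sketched in a few lines. The code below follows the construction in the IRTF CFRG XChaCha draft: run the 20 ChaCha rounds over a state built from the constants, the key, and the first 16 nonce bytes, then output the first and last rows without the final feed-forward addition. Function names are illustrative and do not reflect the CryptoHives internals.

```python
import struct

MASK32 = 0xFFFFFFFF
CHACHA_CONSTANTS = (0x61707865, 0x3320646E, 0x79622D32, 0x6B206574)  # "expand 32-byte k"


def _rotl32(value: int, count: int) -> int:
    return ((value << count) & MASK32) | (value >> (32 - count))


def quarter_round(state: list, a: int, b: int, c: int, d: int) -> None:
    """ChaCha quarter round (RFC 8439, section 2.1), applied in place."""
    state[a] = (state[a] + state[b]) & MASK32
    state[d] = _rotl32(state[d] ^ state[a], 16)
    state[c] = (state[c] + state[d]) & MASK32
    state[b] = _rotl32(state[b] ^ state[c], 12)
    state[a] = (state[a] + state[b]) & MASK32
    state[d] = _rotl32(state[d] ^ state[a], 8)
    state[c] = (state[c] + state[d]) & MASK32
    state[b] = _rotl32(state[b] ^ state[c], 7)


def hchacha20(key: bytes, nonce16: bytes) -> bytes:
    """Derive a 32-byte subkey from a 32-byte key and 16 nonce bytes."""
    if len(key) != 32 or len(nonce16) != 16:
        raise ValueError("HChaCha20 needs a 32-byte key and a 16-byte nonce prefix")
    state = list(CHACHA_CONSTANTS)
    state += struct.unpack("<8I", key)
    state += struct.unpack("<4I", nonce16)
    for _ in range(10):  # 10 double rounds = 20 rounds
        quarter_round(state, 0, 4, 8, 12)   # column rounds
        quarter_round(state, 1, 5, 9, 13)
        quarter_round(state, 2, 6, 10, 14)
        quarter_round(state, 3, 7, 11, 15)
        quarter_round(state, 0, 5, 10, 15)  # diagonal rounds
        quarter_round(state, 1, 6, 11, 12)
        quarter_round(state, 2, 7, 8, 13)
        quarter_round(state, 3, 4, 9, 14)
    # No feed-forward addition: output rows 0 and 3 of the permuted state.
    return struct.pack("<8I", *(state[0:4] + state[12:16]))


def xchacha20_subkey_and_nonce(key: bytes, nonce24: bytes) -> tuple:
    """Split a 24-byte nonce into the HChaCha20 input and the inner 12-byte nonce."""
    if len(nonce24) != 24:
        raise ValueError("XChaCha20 nonce must be 24 bytes")
    subkey = hchacha20(key, nonce24[:16])
    inner_nonce = b"\x00" * 4 + nonce24[16:]
    return subkey, inner_nonce
```

Because HChaCha20 processes a single 64-byte state regardless of message length, its cost is a fixed per-call term, which is why it matters most at the smallest payload sizes.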

Key observations:

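  • Expected cost is that of ChaCha20-Poly1305 plus a constant HChaCha20 step (~400 ns). The absolute timings in the table below are nevertheless much lower than those in the ChaCha20-Poly1305 table above (e.g. 232.6 μs vs 1,193.4 μs for Neon encrypt at 128 KiB), so the two tables should not be compared row by row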
  • Neon ~3.4× faster than Managed at 128 KiB; ~3.2–3.3× faster than NaCl.Core at 128 KiB
  • No OS or BouncyCastle implementations available for comparison
  • NaCl.Core allocates 48–72 B per call
  • Managed and Neon paths are zero-allocation
Description TestDataSize Mean Error StdDev Median Allocated
Decrypt · XChaCha20-Poly1305 (CryptoHives-Neon) 128B 870.8 ns 7.85 ns 6.96 ns 869.1 ns -
Decrypt · XChaCha20-Poly1305 (NaCl.Core) 128B 1,494.1 ns 1.85 ns 1.64 ns 1,493.5 ns 48 B
Decrypt · XChaCha20-Poly1305 (CryptoHives-Scalar) 128B 1,777.1 ns 10.78 ns 9.00 ns 1,774.8 ns -
Encrypt · XChaCha20-Poly1305 (CryptoHives-Neon) 128B 724.6 ns 5.29 ns 4.95 ns 726.2 ns -
Encrypt · XChaCha20-Poly1305 (NaCl.Core) 128B 1,448.5 ns 1.33 ns 1.18 ns 1,447.9 ns 48 B
Encrypt · XChaCha20-Poly1305 (CryptoHives-Scalar) 128B 1,681.4 ns 9.47 ns 8.39 ns 1,681.9 ns -
Decrypt · XChaCha20-Poly1305 (CryptoHives-Neon) 1KB 2,468.2 ns 2.50 ns 2.21 ns 2,467.3 ns -
Decrypt · XChaCha20-Poly1305 (NaCl.Core) 1KB 6,662.1 ns 54.00 ns 45.10 ns 6,647.6 ns 72 B
Decrypt · XChaCha20-Poly1305 (CryptoHives-Scalar) 1KB 7,248.1 ns 28.61 ns 23.89 ns 7,255.3 ns -
Encrypt · XChaCha20-Poly1305 (CryptoHives-Neon) 1KB 2,360.2 ns 1.34 ns 1.12 ns 2,360.2 ns -
Encrypt · XChaCha20-Poly1305 (NaCl.Core) 1KB 6,632.1 ns 91.79 ns 81.37 ns 6,590.8 ns 72 B
Encrypt · XChaCha20-Poly1305 (CryptoHives-Scalar) 1KB 7,140.9 ns 22.92 ns 20.32 ns 7,141.7 ns -
Decrypt · XChaCha20-Poly1305 (CryptoHives-Neon) 8KB 14,941.5 ns 12.66 ns 10.57 ns 14,938.7 ns -
Decrypt · XChaCha20-Poly1305 (NaCl.Core) 8KB 48,078.7 ns 590.11 ns 551.99 ns 47,752.9 ns 72 B
Decrypt · XChaCha20-Poly1305 (CryptoHives-Scalar) 8KB 49,552.9 ns 159.09 ns 141.03 ns 49,545.3 ns -
Encrypt · XChaCha20-Poly1305 (CryptoHives-Neon) 8KB 14,876.7 ns 4.58 ns 3.58 ns 14,875.9 ns -
Encrypt · XChaCha20-Poly1305 (CryptoHives-Scalar) 8KB 49,262.7 ns 197.17 ns 184.43 ns 49,166.9 ns -
Encrypt · XChaCha20-Poly1305 (NaCl.Core) 8KB 49,715.8 ns 987.01 ns 2,402.51 ns 48,239.9 ns 72 B
Decrypt · XChaCha20-Poly1305 (CryptoHives-Neon) 128KB 230,374.7 ns 1,734.54 ns 1,448.42 ns 230,310.9 ns -
Decrypt · XChaCha20-Poly1305 (NaCl.Core) 128KB 753,041.0 ns 2,679.29 ns 2,375.12 ns 752,726.2 ns 72 B
Decrypt · XChaCha20-Poly1305 (CryptoHives-Scalar) 128KB 778,248.7 ns 5,998.17 ns 5,317.22 ns 777,958.1 ns -
Encrypt · XChaCha20-Poly1305 (CryptoHives-Neon) 128KB 232,627.7 ns 3,798.88 ns 3,553.47 ns 231,496.5 ns -
Encrypt · XChaCha20-Poly1305 (NaCl.Core) 128KB 754,021.8 ns 2,403.71 ns 2,130.83 ns 753,460.4 ns 72 B
Encrypt · XChaCha20-Poly1305 (CryptoHives-Scalar) 128KB 785,492.5 ns 3,332.60 ns 2,782.87 ns 784,970.5 ns -

Regional Block Ciphers

Regional block ciphers implement national cryptographic standards. All operate on 128-bit blocks in CBC mode. Benchmarks compare Managed implementations against BouncyCastle where available.
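
Encrypt and decrypt are reported separately in every CBC table because the two directions have different data dependencies: each encrypted block depends on the previous ciphertext block, whereas each decrypted block depends only on ciphertext that is already available. The sketch below shows this with a deliberately trivial 16-byte "cipher" (XOR plus byte rotation). That stand-in is an assumption made for illustration; it is insecure and unrelated to any of the ciphers benchmarked here.

```python
BLOCK_SIZE = 16  # all ciphers in this section use 128-bit blocks


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Insecure stand-in for a block cipher: XOR with the key, rotate left by one byte."""
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt_block: rotate right by one byte, XOR with the key."""
    rotated = block[-1:] + block[:-1]
    return xor_bytes(rotated, key)


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Serial chain: block i cannot start until ciphertext block i-1 exists."""
    if len(plaintext) % BLOCK_SIZE:
        raise ValueError("plaintext must be a multiple of the block size")
    previous, out = iv, []
    for offset in range(0, len(plaintext), BLOCK_SIZE):
        block = plaintext[offset:offset + BLOCK_SIZE]
        previous = toy_encrypt_block(key, xor_bytes(block, previous))
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Every block needs only ciphertext inputs, so blocks are independent of each other."""
    if len(ciphertext) % BLOCK_SIZE:
        raise ValueError("ciphertext must be a multiple of the block size")
    blocks = [ciphertext[i:i + BLOCK_SIZE] for i in range(0, len(ciphertext), BLOCK_SIZE)]
    chain = [iv] + blocks[:-1]
    return b"".join(
        xor_bytes(toy_decrypt_block(key, block), prev)
        for block, prev in zip(blocks, chain)
    )
```

The independence of the decrypt blocks is what allows interleaved or vectorized decryption, while CBC encryption stays bound by the latency of a single block operation.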

SM4-CBC (China)

SM4 is the Chinese national block cipher (GB/T 32907-2016). It uses a 128-bit key and 32 rounds of an unbalanced Feistel structure with a nonlinear S-box-based round function.

  • Managed: Lookup-table implementation with 32-bit word operations. Zero allocation.
Description TestDataSize Mean Error StdDev Allocated
Decrypt · SM4-CBC (CryptoHives-Scalar) 128B 921.5 ns 12.28 ns 10.89 ns -
Decrypt · SM4-CBC (BouncyCastle) 128B 1,411.9 ns 8.01 ns 6.25 ns 40 B
Encrypt · SM4-CBC (CryptoHives-Scalar) 128B 1,036.8 ns 4.76 ns 4.45 ns -
Encrypt · SM4-CBC (BouncyCastle) 128B 1,472.8 ns 5.07 ns 4.49 ns 40 B
Decrypt · SM4-CBC (CryptoHives-Scalar) 1KB 6,479.2 ns 26.79 ns 22.37 ns -
Decrypt · SM4-CBC (BouncyCastle) 1KB 8,835.6 ns 115.11 ns 107.67 ns 40 B
Encrypt · SM4-CBC (CryptoHives-Scalar) 1KB 7,430.8 ns 16.52 ns 15.45 ns -
Encrypt · SM4-CBC (BouncyCastle) 1KB 9,525.7 ns 71.14 ns 63.07 ns 40 B
Decrypt · SM4-CBC (CryptoHives-Scalar) 8KB 51,202.7 ns 428.44 ns 400.76 ns -
Decrypt · SM4-CBC (BouncyCastle) 8KB 67,298.9 ns 334.28 ns 279.13 ns 40 B
Encrypt · SM4-CBC (CryptoHives-Scalar) 8KB 58,660.9 ns 176.29 ns 147.21 ns -
Encrypt · SM4-CBC (BouncyCastle) 8KB 73,962.4 ns 398.15 ns 352.95 ns 40 B
Decrypt · SM4-CBC (CryptoHives-Scalar) 128KB 826,478.1 ns 13,701.75 ns 15,778.96 ns -
Decrypt · SM4-CBC (BouncyCastle) 128KB 1,076,688.9 ns 4,939.38 ns 4,378.63 ns 40 B
Encrypt · SM4-CBC (CryptoHives-Scalar) 128KB 937,807.3 ns 1,932.93 ns 1,713.49 ns -
Encrypt · SM4-CBC (BouncyCastle) 128KB 1,176,637.9 ns 6,588.28 ns 6,162.68 ns 40 B

ARIA-128-CBC (Korea)

ARIA is a Korean national cipher (KS X 1213) with an involutional SPN structure. ARIA-128 uses 12 rounds.

  • Managed: S-box substitution with byte-level diffusion layer. Zero allocation.
Description TestDataSize Mean Error StdDev Allocated
Decrypt · ARIA-128-CBC (CryptoHives-Scalar) 128B 927.8 ns 1.82 ns 1.70 ns -
Decrypt · ARIA-128-CBC (BouncyCastle) 128B 2,324.1 ns 8.46 ns 7.91 ns 1288 B
Encrypt · ARIA-128-CBC (CryptoHives-Scalar) 128B 941.6 ns 4.68 ns 3.91 ns -
Encrypt · ARIA-128-CBC (BouncyCastle) 128B 2,240.6 ns 7.23 ns 6.76 ns 1288 B
Decrypt · ARIA-128-CBC (CryptoHives-Scalar) 1KB 6,627.7 ns 4.24 ns 3.76 ns -
Decrypt · ARIA-128-CBC (BouncyCastle) 1KB 14,412.4 ns 40.69 ns 36.07 ns 3528 B
Encrypt · ARIA-128-CBC (CryptoHives-Scalar) 1KB 6,779.2 ns 12.17 ns 10.79 ns -
Encrypt · ARIA-128-CBC (BouncyCastle) 1KB 14,089.4 ns 36.20 ns 33.86 ns 3528 B
Decrypt · ARIA-128-CBC (CryptoHives-Scalar) 8KB 52,162.7 ns 46.58 ns 43.57 ns -
Decrypt · ARIA-128-CBC (BouncyCastle) 8KB 109,308.5 ns 199.95 ns 187.04 ns 21448 B
Encrypt · ARIA-128-CBC (CryptoHives-Scalar) 8KB 52,727.6 ns 693.84 ns 649.02 ns -
Encrypt · ARIA-128-CBC (BouncyCastle) 8KB 106,426.1 ns 272.56 ns 241.62 ns 21448 B
Decrypt · ARIA-128-CBC (CryptoHives-Scalar) 128KB 830,361.3 ns 2,946.62 ns 2,756.27 ns -
Decrypt · ARIA-128-CBC (BouncyCastle) 128KB 1,746,905.1 ns 4,124.45 ns 3,858.02 ns 328648 B
Encrypt · ARIA-128-CBC (CryptoHives-Scalar) 128KB 855,540.8 ns 724.33 ns 565.51 ns -
Encrypt · ARIA-128-CBC (BouncyCastle) 128KB 1,719,585.5 ns 3,806.57 ns 3,560.67 ns 328648 B

ARIA-256-CBC (Korea)

ARIA-256 uses 16 rounds for 256-bit key security. The same SPN structure applies with additional rounds.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · ARIA-256-CBC (CryptoHives-Scalar) 128B 1.210 μs 0.0005 μs 0.0005 μs -
Decrypt · ARIA-256-CBC (BouncyCastle) 128B 3.003 μs 0.0050 μs 0.0047 μs 1496 B
Encrypt · ARIA-256-CBC (CryptoHives-Scalar) 128B 1.245 μs 0.0011 μs 0.0010 μs -
Encrypt · ARIA-256-CBC (BouncyCastle) 128B 2.914 μs 0.0058 μs 0.0054 μs 1496 B
Decrypt · ARIA-256-CBC (CryptoHives-Scalar) 1KB 8.639 μs 0.0104 μs 0.0097 μs -
Decrypt · ARIA-256-CBC (BouncyCastle) 1KB 18.666 μs 0.0537 μs 0.0476 μs 3736 B
Encrypt · ARIA-256-CBC (CryptoHives-Scalar) 1KB 8.922 μs 0.0042 μs 0.0039 μs -
Encrypt · ARIA-256-CBC (BouncyCastle) 1KB 18.424 μs 0.0485 μs 0.0454 μs 3736 B
Decrypt · ARIA-256-CBC (CryptoHives-Scalar) 8KB 68.079 μs 0.0810 μs 0.0758 μs -
Decrypt · ARIA-256-CBC (BouncyCastle) 8KB 143.145 μs 0.3001 μs 0.2807 μs 21656 B
Encrypt · ARIA-256-CBC (CryptoHives-Scalar) 8KB 70.319 μs 0.1612 μs 0.1346 μs -
Encrypt · ARIA-256-CBC (BouncyCastle) 8KB 140.554 μs 0.2828 μs 0.2645 μs 21656 B
Decrypt · ARIA-256-CBC (CryptoHives-Scalar) 128KB 1,088.538 μs 2.2789 μs 2.1317 μs -
Decrypt · ARIA-256-CBC (BouncyCastle) 128KB 2,266.721 μs 6.7677 μs 6.3305 μs 328856 B
Encrypt · ARIA-256-CBC (CryptoHives-Scalar) 128KB 1,124.559 μs 0.7575 μs 0.7085 μs -
Encrypt · ARIA-256-CBC (BouncyCastle) 128KB 2,246.388 μs 6.5943 μs 6.1683 μs 328856 B

Camellia-128-CBC (Japan)

Camellia is a Japanese CRYPTREC/NESSIE cipher (RFC 3713) with a Feistel structure and FL/FL⁻¹ key-dependent layers.

  • Managed: Pre-computed SP-box tables with 6 S-boxes. Zero allocation.
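
The Feistel structure mentioned above has one property worth illustrating: decryption reuses the encryption routine with the subkeys in reverse order, even though the round function itself is not invertible. The sketch below is a generic balanced Feistel network over two 64-bit halves. The round function and subkeys are placeholders chosen for illustration; this is not Camellia, and it omits the FL/FL⁻¹ layers entirely.

```python
MASK64 = (1 << 64) - 1


def round_function(half: int, subkey: int) -> int:
    """Placeholder round function; deliberately not invertible."""
    mixed = (half ^ subkey) & MASK64
    return ((mixed * mixed) + (mixed >> 7)) & MASK64


def feistel(left: int, right: int, subkeys: list) -> tuple:
    """Run one pass over `subkeys`; the final swap makes the routine self-inverse."""
    for subkey in subkeys:
        left, right = right, left ^ round_function(right, subkey)
    return right, left


def feistel_encrypt(left: int, right: int, subkeys: list) -> tuple:
    return feistel(left, right, subkeys)


def feistel_decrypt(left: int, right: int, subkeys: list) -> tuple:
    # Same network, subkeys applied in reverse order.
    return feistel(left, right, list(reversed(subkeys)))


if __name__ == "__main__":
    demo_subkeys = [(0x0123456789ABCDEF * (i + 1)) & MASK64 for i in range(18)]
    ciphertext = feistel_encrypt(0x1111222233334444, 0x5555666677778888, demo_subkeys)
    print([hex(half) for half in ciphertext])
    print([hex(half) for half in feistel_decrypt(*ciphertext, demo_subkeys)])
```

Since both directions run the same network, one might expect near-identical encrypt and decrypt times from the block function alone; the differences visible in the CBC tables come from the mode's chaining and the surrounding implementation.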
Description TestDataSize Mean Error StdDev Allocated
Decrypt · Camellia-128-CBC (CryptoHives-Scalar) 128B 583.4 ns 2.61 ns 2.44 ns -
Decrypt · Camellia-128-CBC (BouncyCastle) 128B 900.0 ns 3.16 ns 2.96 ns 576 B
Encrypt · Camellia-128-CBC (CryptoHives-Scalar) 128B 635.4 ns 3.71 ns 3.47 ns -
Encrypt · Camellia-128-CBC (BouncyCastle) 128B 892.1 ns 2.70 ns 2.52 ns 576 B
Decrypt · Camellia-128-CBC (CryptoHives-Scalar) 1KB 4,109.9 ns 20.70 ns 19.36 ns -
Decrypt · Camellia-128-CBC (BouncyCastle) 1KB 5,837.9 ns 17.67 ns 16.53 ns 2816 B
Encrypt · Camellia-128-CBC (CryptoHives-Scalar) 1KB 4,670.0 ns 14.73 ns 13.78 ns -
Encrypt · Camellia-128-CBC (BouncyCastle) 1KB 5,871.9 ns 20.41 ns 19.09 ns 2816 B
Decrypt · Camellia-128-CBC (CryptoHives-Scalar) 8KB 32,465.9 ns 140.94 ns 131.84 ns -
Decrypt · Camellia-128-CBC (BouncyCastle) 8KB 45,009.8 ns 137.31 ns 128.44 ns 20736 B
Encrypt · Camellia-128-CBC (CryptoHives-Scalar) 8KB 37,064.0 ns 164.05 ns 153.45 ns -
Encrypt · Camellia-128-CBC (BouncyCastle) 8KB 45,261.6 ns 154.66 ns 144.67 ns 20736 B
Decrypt · Camellia-128-CBC (CryptoHives-Scalar) 128KB 521,839.9 ns 2,451.14 ns 2,292.80 ns -
Decrypt · Camellia-128-CBC (BouncyCastle) 128KB 728,615.9 ns 2,845.84 ns 2,662.00 ns 327936 B
Encrypt · Camellia-128-CBC (CryptoHives-Scalar) 128KB 595,950.4 ns 2,549.53 ns 2,384.83 ns -
Encrypt · Camellia-128-CBC (BouncyCastle) 128KB 718,452.6 ns 2,779.84 ns 2,321.29 ns 327936 B

Camellia-256-CBC (Japan)

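Camellia-256 uses 24 rounds (vs 18 for 128-bit) and one additional FL/FL⁻¹ layer. In the tables here the CryptoHives implementation is ~37–43% slower than Camellia-128, close to the 33% increase in round count, so the additional FL/FL⁻¹ layer adds minimal overhead.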

Description TestDataSize Mean Error StdDev Allocated
Decrypt · Camellia-256-CBC (CryptoHives-Scalar) 128B 831.3 ns 3.36 ns 2.98 ns -
Decrypt · Camellia-256-CBC (BouncyCastle) 128B 1,196.5 ns 3.71 ns 3.29 ns 592 B
Encrypt · Camellia-256-CBC (CryptoHives-Scalar) 128B 888.3 ns 4.69 ns 4.39 ns -
Encrypt · Camellia-256-CBC (BouncyCastle) 128B 1,167.4 ns 5.89 ns 5.51 ns 592 B
Decrypt · Camellia-256-CBC (CryptoHives-Scalar) 1KB 5,884.4 ns 71.64 ns 63.50 ns -
Decrypt · Camellia-256-CBC (BouncyCastle) 1KB 7,653.6 ns 18.85 ns 17.63 ns 2832 B
Encrypt · Camellia-256-CBC (CryptoHives-Scalar) 1KB 6,439.5 ns 20.14 ns 18.84 ns -
Encrypt · Camellia-256-CBC (BouncyCastle) 1KB 7,675.2 ns 25.75 ns 20.11 ns 2832 B
Decrypt · Camellia-256-CBC (CryptoHives-Scalar) 8KB 46,431.4 ns 179.93 ns 168.31 ns -
Decrypt · Camellia-256-CBC (BouncyCastle) 8KB 59,034.3 ns 99.45 ns 93.03 ns 20752 B
Encrypt · Camellia-256-CBC (CryptoHives-Scalar) 8KB 51,102.3 ns 236.79 ns 209.91 ns -
Encrypt · Camellia-256-CBC (BouncyCastle) 8KB 59,348.3 ns 200.40 ns 187.46 ns 20752 B
Decrypt · Camellia-256-CBC (CryptoHives-Scalar) 128KB 742,813.1 ns 3,121.44 ns 2,606.54 ns -
Decrypt · Camellia-256-CBC (BouncyCastle) 128KB 939,621.1 ns 2,158.14 ns 1,802.14 ns 327952 B
Encrypt · Camellia-256-CBC (CryptoHives-Scalar) 128KB 818,136.6 ns 4,710.09 ns 3,933.14 ns -
Encrypt · Camellia-256-CBC (BouncyCastle) 128KB 946,057.0 ns 4,722.34 ns 3,943.36 ns 327952 B

Kuznyechik-CBC (Russia)

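Kuznyechik (GOST R 34.12-2015) is the modern Russian cipher with a 256-bit key and 10 rounds. It replaces the older GOST 28147-89. Note that the table below reports times in μs rather than ns, and that it contains no BouncyCastle rows.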

  • Managed: Pre-computed S-box and linear transformation tables. Zero allocation.
Description TestDataSize Mean Error StdDev Allocated
Decrypt · Kuznyechik-CBC (CryptoHives-Scalar) 128B 391.7 μs 7.41 μs 6.57 μs -
Encrypt · Kuznyechik-CBC (CryptoHives-Scalar) 128B 408.6 μs 8.11 μs 15.23 μs -
Decrypt · Kuznyechik-CBC (CryptoHives-Scalar) 1KB 3,107.0 μs 17.59 μs 14.68 μs -
Encrypt · Kuznyechik-CBC (CryptoHives-Scalar) 1KB 3,284.4 μs 28.57 μs 22.31 μs -
Decrypt · Kuznyechik-CBC (CryptoHives-Scalar) 8KB 26,156.3 μs 186.13 μs 155.42 μs -
Encrypt · Kuznyechik-CBC (CryptoHives-Scalar) 8KB 27,158.4 μs 532.26 μs 612.95 μs -
Decrypt · Kuznyechik-CBC (CryptoHives-Scalar) 128KB 418,542.0 μs 5,012.39 μs 4,185.57 μs -
Encrypt · Kuznyechik-CBC (CryptoHives-Scalar) 128KB 420,882.6 μs 2,505.93 μs 2,221.44 μs -
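
Because the tables mix ns and μs, converting a mean time to throughput makes rows from different tables easier to compare. The sketch below does this for three CryptoHives-Scalar encrypt rows at 128 KiB, taking the units in each table at face value. The helper name and the unit table are illustrative.

```python
UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3}


def throughput_mib_per_s(payload_bytes: int, mean: float, unit: str) -> float:
    """Convert a BenchmarkDotNet mean time into MiB/s."""
    seconds = mean * UNIT_SECONDS[unit]
    return payload_bytes / seconds / (1024 * 1024)


PAYLOAD_128_KIB = 128 * 1024

# CryptoHives-Scalar encrypt rows at 128 KiB, copied from the tables in this section.
rows = {
    "SM4-CBC": (937_807.3, "ns"),
    "Camellia-128-CBC": (595_950.4, "ns"),
    "Kuznyechik-CBC": (420_882.6, "us"),
}

throughputs = {
    name: throughput_mib_per_s(PAYLOAD_128_KIB, mean, unit)
    for name, (mean, unit) in rows.items()
}

if __name__ == "__main__":
    for name, value in throughputs.items():
        print(f"{name:18s}{value:10.2f} MiB/s")
```

Read this way, the Kuznyechik row works out to well under 1 MiB/s, about three orders of magnitude below the other regional ciphers, so it is worth confirming the unit against the raw BenchmarkDotNet output before drawing conclusions from it.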

Kalyna-128-CBC (Ukraine)

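Kalyna (DSTU 7624:2014) is the Ukrainian national cipher paired with the Kupyna hash family. It uses MDS matrix diffusion. In the table below, decryption takes roughly twice as long as encryption for both CryptoHives and BouncyCastle.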

  • Managed: S-box substitution with MDS matrix multiplication. Zero allocation.
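
MDS matrix multiplication here means multiplying each state column by a fixed matrix over GF(2⁸). The sketch below shows the two building blocks: a carry-less "Russian peasant" field multiplication and a circulant matrix-vector product. The reduction polynomial default (0x11D) and the matrix row used in the demo are assumptions made for illustration; consult DSTU 7624:2014 for the actual Kalyna constants.

```python
def gf256_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two bytes in GF(2^8) modulo `poly` (default is an assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly            # reduce when the degree reaches 8
        b >>= 1
    return result


def circulant_mds_apply(first_row: list, column: list, poly: int = 0x11D) -> list:
    """Multiply `column` by the circulant matrix whose first row is `first_row`."""
    n = len(first_row)
    if len(column) != n:
        raise ValueError("row and column must have the same length")
    out = []
    for i in range(n):
        acc = 0
        for j in range(n):
            # Row i is the first row rotated right by i positions.
            acc ^= gf256_mul(first_row[(j - i) % n], column[j], poly)
        out.append(acc)
    return out


if __name__ == "__main__":
    demo_row = [0x01, 0x02, 0x03, 0x01, 0x01, 0x02, 0x03, 0x01]  # hypothetical, not Kalyna's
    print(circulant_mds_apply(demo_row, list(range(1, 9))))
```

Table-driven implementations typically precompute the product of every S-box output with each matrix coefficient, turning the inner loop into lookups and XORs.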
Description TestDataSize Mean Error StdDev Allocated
Decrypt · Kalyna-128-CBC (CryptoHives-Scalar) 128B 798.7 ns 2.73 ns 2.42 ns -
Decrypt · Kalyna-128-CBC (BouncyCastle) 128B 2,425.8 ns 3.09 ns 2.89 ns 872 B
Encrypt · Kalyna-128-CBC (CryptoHives-Scalar) 128B 396.9 ns 1.65 ns 1.38 ns -
Encrypt · Kalyna-128-CBC (BouncyCastle) 128B 1,252.7 ns 4.56 ns 4.26 ns 872 B
Decrypt · Kalyna-128-CBC (CryptoHives-Scalar) 1KB 5,642.5 ns 17.20 ns 16.09 ns -
Decrypt · Kalyna-128-CBC (BouncyCastle) 1KB 15,461.9 ns 12.48 ns 11.06 ns 872 B
Encrypt · Kalyna-128-CBC (CryptoHives-Scalar) 1KB 2,886.6 ns 12.01 ns 11.23 ns -
Encrypt · Kalyna-128-CBC (BouncyCastle) 1KB 7,027.7 ns 23.47 ns 21.96 ns 872 B
Decrypt · Kalyna-128-CBC (CryptoHives-Scalar) 8KB 44,408.3 ns 191.24 ns 169.53 ns -
Decrypt · Kalyna-128-CBC (BouncyCastle) 8KB 119,614.1 ns 88.51 ns 78.46 ns 872 B
Encrypt · Kalyna-128-CBC (CryptoHives-Scalar) 8KB 22,890.9 ns 66.92 ns 62.60 ns -
Encrypt · Kalyna-128-CBC (BouncyCastle) 8KB 53,186.3 ns 138.19 ns 122.50 ns 872 B
Decrypt · Kalyna-128-CBC (CryptoHives-Scalar) 128KB 706,983.4 ns 2,881.66 ns 2,249.81 ns -
Decrypt · Kalyna-128-CBC (BouncyCastle) 128KB 1,902,282.2 ns 2,917.11 ns 2,277.49 ns 872 B
Encrypt · Kalyna-128-CBC (CryptoHives-Scalar) 128KB 369,102.1 ns 873.51 ns 817.08 ns -
Encrypt · Kalyna-128-CBC (BouncyCastle) 128KB 847,327.0 ns 1,901.11 ns 1,685.28 ns 872 B

Kalyna-256-CBC (Ukraine)

Kalyna-256 uses 14 rounds (vs 10 for 128-bit key). The same MDS-based architecture applies.

Description TestDataSize Mean Error StdDev Allocated
Decrypt · Kalyna-256-CBC (CryptoHives-Scalar) 128B 1,128.5 ns 3.81 ns 3.18 ns -
Decrypt · Kalyna-256-CBC (BouncyCastle) 128B 3,303.0 ns 2.29 ns 2.03 ns 1112 B
Encrypt · Kalyna-256-CBC (CryptoHives-Scalar) 128B 554.4 ns 2.69 ns 2.10 ns -
Encrypt · Kalyna-256-CBC (BouncyCastle) 128B 1,699.2 ns 3.38 ns 2.82 ns 1112 B
Decrypt · Kalyna-256-CBC (CryptoHives-Scalar) 1KB 8,036.5 ns 31.95 ns 24.95 ns -
Decrypt · Kalyna-256-CBC (BouncyCastle) 1KB 21,269.0 ns 16.57 ns 12.93 ns 1112 B
Encrypt · Kalyna-256-CBC (CryptoHives-Scalar) 1KB 4,010.7 ns 17.98 ns 15.01 ns -
Encrypt · Kalyna-256-CBC (BouncyCastle) 1KB 9,626.2 ns 19.34 ns 17.15 ns 1112 B
Decrypt · Kalyna-256-CBC (CryptoHives-Scalar) 8KB 63,253.3 ns 244.99 ns 229.16 ns -
Decrypt · Kalyna-256-CBC (BouncyCastle) 8KB 164,958.9 ns 155.34 ns 145.31 ns 1112 B
Encrypt · Kalyna-256-CBC (CryptoHives-Scalar) 8KB 31,585.1 ns 164.62 ns 145.93 ns -
Encrypt · Kalyna-256-CBC (BouncyCastle) 8KB 73,257.0 ns 363.53 ns 322.26 ns 1112 B
Decrypt · Kalyna-256-CBC (CryptoHives-Scalar) 128KB 1,005,918.9 ns 3,380.08 ns 3,161.73 ns -
Decrypt · Kalyna-256-CBC (BouncyCastle) 128KB 2,628,385.5 ns 1,761.63 ns 1,375.36 ns 1112 B
Encrypt · Kalyna-256-CBC (CryptoHives-Scalar) 128KB 506,426.0 ns 1,779.25 ns 1,577.25 ns -
Encrypt · Kalyna-256-CBC (BouncyCastle) 128KB 1,164,810.6 ns 4,690.58 ns 4,158.07 ns 1112 B

SEED-CBC (Korea)

SEED is a Korean cipher (RFC 4269, KISA) with a 128-bit key and 16-round Feistel structure. Its key-schedule constants are derived from the golden ratio.

  • Managed: Pre-computed 32-bit SS-boxes (SS0–SS3). Zero allocation.
Description TestDataSize Mean Error StdDev Allocated
Decrypt · SEED-CBC (CryptoHives-Scalar) 128B 1.352 μs 0.0154 μs 0.0128 μs -
Decrypt · SEED-CBC (BouncyCastle) 128B 1.438 μs 0.0145 μs 0.0121 μs 152 B
Encrypt · SEED-CBC (BouncyCastle) 128B 1.475 μs 0.0066 μs 0.0058 μs 152 B
Encrypt · SEED-CBC (CryptoHives-Scalar) 128B 1.493 μs 0.0072 μs 0.0067 μs -
Decrypt · SEED-CBC (CryptoHives-Scalar) 1KB 9.553 μs 0.0354 μs 0.0314 μs -
Decrypt · SEED-CBC (BouncyCastle) 1KB 9.780 μs 0.0552 μs 0.0490 μs 152 B
Encrypt · SEED-CBC (BouncyCastle) 1KB 10.390 μs 0.1815 μs 0.1698 μs 152 B
Encrypt · SEED-CBC (CryptoHives-Scalar) 1KB 11.119 μs 0.1949 μs 0.2733 μs -
Decrypt · SEED-CBC (CryptoHives-Scalar) 8KB 74.943 μs 0.3096 μs 0.2896 μs -
Decrypt · SEED-CBC (BouncyCastle) 8KB 76.362 μs 0.3769 μs 0.3342 μs 152 B
Encrypt · SEED-CBC (BouncyCastle) 8KB 80.960 μs 1.2714 μs 1.0617 μs 152 B
Encrypt · SEED-CBC (CryptoHives-Scalar) 8KB 85.853 μs 0.4465 μs 0.3958 μs -
Decrypt · SEED-CBC (CryptoHives-Scalar) 128KB 1,192.777 μs 8.1407 μs 7.2165 μs -
Decrypt · SEED-CBC (BouncyCastle) 128KB 1,225.473 μs 8.5432 μs 7.9913 μs 152 B
Encrypt · SEED-CBC (BouncyCastle) 128KB 1,286.279 μs 4.3548 μs 3.8605 μs 152 B
Encrypt · SEED-CBC (CryptoHives-Scalar) 128KB 1,368.396 μs 9.5930 μs 7.4896 μs -

Allocation Summary

All CryptoHives cipher implementations achieve zero heap allocation for both encrypt and decrypt operations across all payload sizes. This is critical for high-throughput scenarios such as network packet processing, where GC pressure directly impacts tail latency.

Implementation Allocation Notes
CryptoHives (all variants) 0 B All tiers (Managed, ArmAes, ArmAes+ArmPmull, Neon) are zero-allocation at all payload sizes
OS (.NET) — GCM / ChaCha20-Poly1305 0 B OS AEAD implementations are zero-allocation
OS (.NET) — CBC 72 B Fixed P/Invoke marshalling overhead per call, independent of payload size
BouncyCastle — CBC 832–1,024 B Fixed per-call allocation (832 B for AES-128, 1,024 B for AES-256)
BouncyCastle — GCM 1,520–1,744 B Fixed per-call allocation (1,520 B for AES-128 encrypt, 1,744 B for AES-256 decrypt)
BouncyCastle — CCM 2,424–2,848 B Fixed per-call allocation (2,424 B for AES-128 decrypt, 2,848 B for AES-256 encrypt)
BouncyCastle — ChaCha20-Poly1305 336–416 B Fixed per-call allocation (336 B for encrypt, 416 B for decrypt), independent of payload size
BouncyCastle — ChaCha20 96 B Fixed per-call allocation
NaCl.Core — ChaCha20 24 B Small fixed allocation
NaCl.Core — ChaCha20-Poly1305 / XChaCha20 48–72 B Small allocation, varies by payload size
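
To put the per-call figures in context for a packet-processing workload, the sketch below multiplies a per-call allocation by a call rate to estimate how much garbage the allocator must absorb per second. The rate of one million calls per second is a hypothetical workload chosen for illustration, not a measured value; the byte counts come from the tables in this document.

```python
def allocation_rate_mib_per_s(bytes_per_call: int, calls_per_second: float) -> float:
    """Heap bytes allocated per second, expressed in MiB/s."""
    return bytes_per_call * calls_per_second / (1024 * 1024)


HYPOTHETICAL_CALLS_PER_SECOND = 1_000_000  # assumed workload, not a benchmark result

# Per-call allocations taken from the tables in this document.
per_call_bytes = {
    "CryptoHives (all variants)": 0,
    "NaCl.Core ChaCha20-Poly1305": 72,
    "BouncyCastle ChaCha20-Poly1305 (decrypt)": 416,
    "BouncyCastle AES-256-GCM (decrypt)": 1744,
    "BouncyCastle AES-256-CCM (encrypt)": 2848,
}

rates = {
    name: allocation_rate_mib_per_s(size, HYPOTHETICAL_CALLS_PER_SECOND)
    for name, size in per_call_bytes.items()
}

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name:44s}{rate:10.1f} MiB/s")
```

At that assumed rate, a fixed allocation of a few kilobytes per call translates into gigabytes of short-lived objects per second, which is the GC pressure the zero-allocation tiers avoid.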