Three operations sit at the foundation of nearly every cryptographic system: encryption, hashing, and signing. They are distinct operations with distinct purposes, distinct mathematical properties, and distinct failure modes. Confusing them – using a hash where a signature was needed, or a signature where encryption was required – is not an academic error. It is the kind of mistake that exposes medical records, drains cryptocurrency wallets, and undermines national security infrastructure.

Yet the confusion is rampant. A 2023 survey by the SANS Institute found that 41% of software developers could not correctly distinguish between encryption and hashing when asked to identify the appropriate primitive for password storage. The same survey found that 67% of respondents could not explain when a digital signature is required versus when encryption alone is sufficient. These are not obscure edge cases. These are the foundational decisions that determine whether a system is secure or merely appears to be.

This is the definitive breakdown: what each operation does, how it works at the mathematical level, when to use each, and what happens when you choose wrong.

Encryption: Reversible Confidentiality

Encryption transforms plaintext into ciphertext using a key, such that only holders of the corresponding decryption key can reverse the process. The purpose is confidentiality – preventing unauthorized parties from reading the data.

The defining property: encryption is reversible. Given the ciphertext and the correct key, the original plaintext is recovered exactly. Without the key, the ciphertext is computationally indistinguishable from random noise.

Symmetric Encryption

In symmetric encryption, the same key encrypts and decrypts. AES in GCM mode is the dominant symmetric cipher for encrypted internet traffic. In AES-256-GCM, the key is 256 bits. The algorithm operates on 128-bit blocks. The GCM mode provides both confidentiality (via CTR-mode encryption) and integrity (via a GHASH authentication tag).

The operational constraint: both parties must share the same secret key. In a two-party conversation, this requires a secure key exchange. In a multi-party system, the number of required keys scales quadratically. This is why symmetric encryption is almost always combined with an asymmetric key exchange mechanism.
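
The quadratic growth is easy to check with a small calculation. This is a minimal sketch that assumes every pair of participants needs its own independent key; the function name is illustrative.

```python
def pairwise_keys(parties: int) -> int:
    """Number of distinct keys if every pair of parties shares its own key."""
    return parties * (parties - 1) // 2


# Print the key count for a few group sizes.
for n in (2, 10, 100, 1000):
    print(n, pairwise_keys(n))
```

Going from 10 to 1,000 participants multiplies the participant count by 100 and the key count by roughly 10,000.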

Performance is the advantage. AES-256-GCM on modern hardware (with AES-NI instruction support) encrypts at 4-8 GB/s. This is fast enough that encryption overhead is negligible for virtually any application, from chat messages to video streams.

Asymmetric Encryption

In asymmetric encryption, a public key encrypts and a different, mathematically related private key decrypts. RSA and ECIES (Elliptic Curve Integrated Encryption Scheme) are the primary asymmetric encryption algorithms. The sender needs only the recipient’s public key, eliminating the shared-secret problem.

The cost is performance. RSA-2048 encryption is approximately 1,000 times slower than AES-256 for equivalent data sizes. Elliptic curve based schemes are faster than RSA but still orders of magnitude slower than symmetric ciphers. This is why the standard pattern is hybrid encryption: use asymmetric cryptography to exchange a symmetric key, then encrypt the actual data with AES.

When Encryption is the Right Choice

Use encryption when the goal is confidentiality – when the data must be unreadable to anyone except the intended recipient. Examples: message content in end-to-end encrypted chat, files at rest on an encrypted disk, session data transmitted over a network. Encryption answers the question: “Can an unauthorized party read this data?” The answer, with correct encryption, is no.

Encryption does not answer: “Has this data been modified?” (that requires integrity verification), “Who sent this data?” (that requires authentication), or “Can the sender deny having sent it?” (that requires non-repudiation). These are the domains of hashing and signing.

Hashing: Irreversible Fingerprinting

A cryptographic hash function takes input of any length and produces a fixed-length output – the hash, digest, or fingerprint. The purpose is integrity – detecting whether data has been modified.

The defining property: hashing is irreversible. Given a hash output, there is no feasible way to compute the input from it; an attacker can only guess candidate inputs and hash each one. This is not encryption with a lost key. It is a mathematically one-way function. The input space is effectively unbounded. The output space is finite (256 bits for SHA-256, producing 2^256 possible outputs). Information is irreversibly destroyed in the compression.
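
The fixed-length property can be seen directly with Python's standard hashlib module. The sketch below hashes an empty input, a three-byte input, and a one-megabyte input; the sample inputs are arbitrary choices for illustration.

```python
import hashlib

# Inputs of very different lengths all map to a 32-byte (64 hex character) digest.
for message in (b"", b"abc", b"x" * 1_000_000):
    digest = hashlib.sha256(message).hexdigest()
    print(len(message), len(digest), digest)

# Changing a single character yields an unrelated-looking digest.
print(hashlib.sha256(b"abd").hexdigest())
```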

Properties of Cryptographic Hash Functions

Preimage resistance. Given a hash h, it is computationally infeasible to find any input m such that H(m) = h. For SHA-256, this requires approximately 2^256 operations.

Second preimage resistance. Given an input m1, it is computationally infeasible to find a different input m2 such that H(m1) = H(m2). This prevents an attacker from substituting a different document with the same hash.

Collision resistance. It is computationally infeasible to find any two distinct inputs m1 and m2 such that H(m1) = H(m2). For SHA-256, the birthday attack requires approximately 2^128 operations – far beyond current computational feasibility.
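
The birthday bound can be observed on a deliberately weakened hash. The sketch below truncates SHA-256 to 24 bits (an assumption made purely so the search finishes instantly) and looks for two distinct inputs with the same truncated digest. A collision should typically appear after a few thousand attempts, on the order of 2^12, rather than 2^24.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, nbytes: int) -> bytes:
    """First nbytes of SHA-256: a deliberately weak hash for demonstration."""
    return hashlib.sha256(data).digest()[:nbytes]


def find_collision(nbytes: int):
    """Hash distinct counter strings until two share a truncated digest."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        tag = truncated_hash(msg, nbytes)
        if tag in seen:
            return seen[tag], msg, i + 1
        seen[tag] = msg


m1, m2, tries = find_collision(3)  # 3 bytes = 24-bit digest
print(m1, m2, tries)
```

The same search against the full 256-bit output would need around 2^128 attempts, which is why the untruncated function is considered collision resistant.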

SHA-256 (part of the SHA-2 family, standardized by NIST in 2001) and SHA-3 (standardized in 2015, based on the Keccak sponge construction) are the current standards. SHA-1 has been deprecated by NIST since 2011, and in 2022 NIST announced that it should be phased out entirely by the end of 2030. Google and CWI Amsterdam demonstrated a practical collision in 2017 (the SHAttered attack), requiring approximately 2^63 SHA-1 evaluations – a computation estimated to cost roughly $110,000 in cloud GPU time.

Password Storage: The Canonical Hash Application

Passwords must never be stored in plaintext. They must never be stored encrypted (because the decryption key would become a single point of compromise for all passwords). They must be hashed.

The storage process: when a user creates a password, the system generates a salt (a unique random value per user), computes hash(password + salt), and stores the hash and salt. When the user authenticates, the system recomputes the hash from the submitted password and the stored salt and compares the result with the stored hash. The server never stores the plaintext password; it handles it only transiently while computing the hash.

Critical distinction: general-purpose hash functions (SHA-256) are the wrong choice for password hashing. They are designed to be fast. An NVIDIA RTX 4090 computes approximately 22 billion SHA-256 hashes per second. At that rate, every possible 8-character alphanumeric password (62^8 = 218 trillion combinations) is exhausted in under three hours.
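
The arithmetic behind that estimate is short enough to reproduce. The hash rate is the approximate figure quoted above, not a measurement, and the variable names are illustrative.

```python
ALPHABET = 62               # a-z, A-Z, 0-9
LENGTH = 8                  # password length in characters
HASHES_PER_SECOND = 22e9    # approximate SHA-256 rate quoted above

keyspace = ALPHABET ** LENGTH
hours = keyspace / HASHES_PER_SECOND / 3600
print(f"{keyspace:,} candidates, about {hours:.2f} hours to try them all")
```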

Password-specific hash functions – bcrypt, scrypt, and Argon2id – are deliberately slow, and scrypt and Argon2id are also memory-hard. Argon2id with recommended parameters (19 MiB memory, 2 iterations) reduces the hashing rate to the order of tens of hashes per second per core, depending on hardware, making brute-force attacks on reasonable passwords computationally intractable. The OWASP Password Storage Cheat Sheet recommends Argon2id as the primary choice for password storage.
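
The store-and-verify flow can be sketched with Python's standard library. Argon2id is not in the standard library, so this example substitutes scrypt (hashlib.scrypt, which needs an OpenSSL build with scrypt support). The cost parameters are illustrative assumptions chosen to run quickly, not a recommendation, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

# Illustrative scrypt cost parameters; real deployments should follow current guidance.
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the plaintext is not kept."""
    salt = secrets.token_bytes(16)  # unique random salt per user
    digest = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The comparison uses hmac.compare_digest rather than == so that the time taken does not reveal how many leading bytes matched.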

When Hashing is the Right Choice

Use hashing when the goal is integrity verification or commitment: confirming that data has not been modified, generating a fixed-size fingerprint of arbitrary data, or storing secrets (like passwords) that must be verifiable but never recoverable. Hashing answers the question: “Is this data identical to the original?” It does not provide confidentiality (the original data is needed to verify), and it does not provide authentication (anyone can compute a hash).

Signing: Authenticated Non-Repudiation

A digital signature binds a message to an identity. The signer uses their private key to produce a signature over a message. Any party with the signer’s public key can verify the signature, confirming both that the message has not been modified (integrity) and that it was produced by the holder of the private key (authentication). The purpose is authentication and non-repudiation.

The defining property: signing provides verifiable attribution. A valid signature proves that the holder of a specific private key endorsed a specific message. The signer cannot later deny having signed (non-repudiation), because only they possess the private key.

How Digital Signatures Work

Digital signature schemes combine hashing and asymmetric cryptography. The standard process:

  1. Hash the message. Compute h = H(message) using a cryptographic hash function. This produces a fixed-size digest regardless of message length.

  2. Sign the hash. Apply the signing algorithm with the private key to produce a signature: sig = Sign(private_key, h). For ECDSA on secp256k1 (the curve Ethereum uses for signatures), this involves generating a nonce that is unpredictable and never reused, computing an elliptic curve point, and performing modular arithmetic.

  3. Verify. Any party computes h’ = H(message) and checks Verify(public_key, h’, sig). If the check passes, the message is authentic and unmodified.
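
The three steps can be sketched with nothing but the standard library, under heavy assumptions. Python ships no ECDSA or Ed25519 implementation, so this example swaps in textbook RSA with a tiny key built from two known Mersenne primes and no padding. It is a toy that shows the hash-then-sign structure only and must never be used for real signatures.

```python
import hashlib

# Toy key: far too small and unpadded. For illustration only.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent


def digest_as_int(message: bytes) -> int:
    # Step 1: hash the message, then reduce the digest into the toy modulus range.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    # Step 2: apply the private-key operation to the hash.
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    # Step 3: recompute the hash and check it against the public-key operation.
    return pow(signature, E, N) == digest_as_int(message)


sig = sign(b"transfer 1 ETH to Alice")
print(verify(b"transfer 1 ETH to Alice", sig))
print(verify(b"transfer 9 ETH to Alice", sig))
```

Only the holder of D can produce a value that verifies, while anyone with N and E can check it.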

The signature is typically much shorter than the message – 64 bytes for Ed25519, 65 bytes for Ethereum’s ECDSA (including the recovery parameter). This compactness is enabled by hashing the message first; the signature is over the fixed-size hash, not the variable-size message.

HMAC: Symmetric Authentication

Hash-based Message Authentication Code (HMAC) provides integrity and authentication using a shared symmetric key. HMAC-SHA256(key, message) produces a tag that can be verified by anyone who holds the same key. This provides authentication (the message came from someone with the key) and integrity (the message has not been modified), but not non-repudiation (either party could have generated the tag).
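
A minimal sketch of tagging and verifying with Python's standard hmac module follows. The message format and helper names are made up for illustration, and the key is assumed to have been shared securely beforehand.

```python
import hashlib
import hmac
import secrets

key = secrets.token_bytes(32)  # shared secret, assumed to be exchanged securely


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def check_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


message = b'{"event": "payment.succeeded", "amount": 100}'
tag = make_tag(key, message)
print(check_tag(key, message, tag))
print(check_tag(key, message.replace(b"100", b"900"), tag))
```

Because both sides hold the same key, either could have produced the tag, which is exactly why HMAC cannot provide non-repudiation.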

HMAC is used extensively in API authentication (AWS Signature V4, Stripe webhooks), cookie security (signed session cookies), and key derivation (HKDF uses HMAC internally). It is the correct choice when both parties share a secret and non-repudiation is not required.

When Signing is the Right Choice

Use signing when the goal is proving authorship and detecting any tampering with data from an identified signer. Ethereum transactions are signed to prove the sender authorized the transfer. Software packages are signed to prove the publisher released the binary. TLS certificates are signed to prove a certificate authority vouched for the domain. Signing answers: “Who produced this data, and has it been modified since?”

The Confusion Matrix: Common Mistakes

Encrypting Passwords Instead of Hashing Them

If passwords are encrypted instead of hashed, compromising the encryption key exposes every password in the database simultaneously. This is a systemic failure mode. Adobe’s 2013 breach exposed roughly 153 million account records whose passwords were encrypted (not hashed) with 3DES in ECB mode. The key itself was never published, but because unsalted ECB encryption maps identical passwords to identical ciphertexts, and plaintext password hints were stored alongside them, researchers recovered large numbers of passwords without it. Anyone who did obtain the key could decrypt all of them at once. Had the passwords been properly hashed with bcrypt, the breach would have exposed hash values requiring individual brute-force attacks, most of which would fail against strong passwords.

Hashing for Confidentiality

Hashing is not encryption. It does not hide data from an adversary who can guess the input. If a system “protects” credit card numbers by hashing them with SHA-256, an attacker can simply hash all valid credit card numbers (at most 10^16 possibilities, or about 53 bits of entropy) and build a lookup table. At 22 billion hashes per second, the entire space is exhausted in roughly five days on a single GPU, and in far less time once the issuer prefix, the check digit, or the commonly stored last four digits narrow the search. This is why hashing sensitive data with low entropy is not a confidentiality mechanism.
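
The attack is simple enough to sketch. The example below assumes the attacker already knows the first six and last four digits of a 16-digit number, which leaves one million candidates for the middle. The card number is made up and the helper names are illustrative.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()


def recover_middle_digits(target_hash: str, prefix: str, suffix: str, unknown: int):
    """Try every value of the unknown middle digits until the hash matches."""
    for guess in range(10 ** unknown):
        candidate = f"{prefix}{guess:0{unknown}d}{suffix}"
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


# Made-up 16-digit number; prefix and suffix are assumed known to the attacker.
secret_number = "4000001234560002"
leaked_hash = sha256_hex(secret_number)
print(recover_middle_digits(leaked_hash, "400000", "0002", 6))
```

Even in pure Python on a single CPU core this finishes in about a second, with no lookup table and no GPU.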

Encrypting Without Authenticating

Encryption without authentication is vulnerable to ciphertext manipulation. An attacker who cannot read the plaintext may still be able to modify the ciphertext in ways that produce a predictable change in the decrypted plaintext. This is why AES-256-GCM includes an authentication tag – the “authenticated” in AEAD (Authenticated Encryption with Associated Data). Encryption modes without authentication (AES-CBC without HMAC, for example) are vulnerable to padding oracle attacks, as demonstrated against ASP.NET in 2010. CBC constructions that authenticate the plaintext before encrypting it have been broken in similar ways, as in the POODLE attack on SSL 3.0 and the Lucky Thirteen attack on TLS.
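
Ciphertext malleability can be demonstrated with a toy stream cipher. The keystream below is SHA-256 over key || counter, a stand-in for a real CTR-mode cipher chosen so the example needs only the standard library. The keys are fixed demo values and no nonce is used, all of which would be unacceptable in practice.

```python
import hashlib
import hmac


def keystream(key: bytes, length: int) -> bytes:
    # Toy keystream: SHA-256 over key || counter. Not a real cipher.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    # XOR with the keystream; the same call encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


enc_key, mac_key = b"k" * 32, b"m" * 32  # fixed demo keys, not secret
ciphertext = xor_cipher(enc_key, b"PAY 100 TO BOB")
tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()

# The attacker flips bits at the known position of "1" without knowing any key.
tampered = bytearray(ciphertext)
tampered[4] ^= ord("1") ^ ord("9")
tampered = bytes(tampered)

# Decryption succeeds, but the amount has changed.
print(xor_cipher(enc_key, tampered))

# A MAC computed over the ciphertext (encrypt-then-MAC) exposes the change.
tag_ok = hmac.compare_digest(hmac.new(mac_key, tampered, hashlib.sha256).digest(), tag)
print(tag_ok)
```

A receiver that checks the tag before decrypting would reject the modified ciphertext. AEAD modes perform that check as part of decryption.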

Signing Without Encrypting

A signature proves who sent a message and that it has not been modified, but it does not hide the message content. A signed-but-unencrypted email is authenticated but not confidential: anyone who intercepts it can read the contents; they just cannot modify them without invalidating the signature. Systems that need both confidentiality and authentication must combine encryption with signing or, where a shared key suffices and non-repudiation is not needed, use an AEAD mode.

Combining the Primitives

Secure systems rarely use a single primitive in isolation. They compose them:

TLS 1.3 uses ECDHE key exchange (asymmetric) to establish a shared secret, derives symmetric keys via HKDF (hashing), encrypts data with an AEAD cipher such as AES-256-GCM (authenticated encryption), and authenticates the server via X.509 certificates (signing).

Ethereum transactions hash the transaction data (Keccak-256) and sign the hash with ECDSA (signing), and account addresses are derived from a hash of the public key (hashing). The transaction is not encrypted – Ethereum is a public ledger where transaction contents are visible to all nodes.

Signal Protocol uses X3DH key agreement (asymmetric), Double Ratchet (Diffie-Hellman ratcheting plus symmetric key derivation via hashing), AES-256-CBC + HMAC-SHA256 (encryption + authentication), and XEdDSA, an EdDSA-compatible scheme that signs prekeys with the identity key (signing). Each primitive serves a specific purpose in the protocol stack, and the end-to-end encryption guarantee emerges from their correct composition.

Zero-knowledge architectures layer these primitives hierarchically. In Stealth Cloud’s Ghost Chat, the wallet signature (signing) authenticates the user, the session key is derived via HKDF (hashing), messages are encrypted with AES-256-GCM (encryption), and the server verifies JWT tokens using HMAC (symmetric authentication). Each primitive is chosen for its specific security property, and the system’s overall guarantee is the composition of those properties.
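
HKDF, which appears in several of the compositions above, is itself built from HMAC. The sketch below follows the extract-then-expand structure of RFC 5869 with SHA-256. The salt, info labels, and placeholder shared secret are hypothetical and are not taken from TLS, Signal, or Stealth Cloud.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_extract(salt: bytes, input_key_material: bytes) -> bytes:
    """Concentrate the input secret into a fixed-size pseudorandom key."""
    if not salt:
        salt = bytes(HASH_LEN)  # RFC 5869 default: a string of zero bytes
    return hmac.new(salt, input_key_material, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Stretch the pseudorandom key into `length` bytes bound to `info`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


shared_secret = bytes(range(32))  # placeholder for a key-exchange output
prk = hkdf_extract(b"demo-salt", shared_secret)
encryption_key = hkdf_expand(prk, b"hypothetical-app encryption", 32)
mac_key = hkdf_expand(prk, b"hypothetical-app authentication", 32)
print(encryption_key.hex())
print(mac_key.hex())
```

Different info labels yield independent keys from one shared secret, which is how a single key exchange can feed separate encryption and authentication keys.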

The Stealth Cloud Perspective

The distinction between encryption, hashing, and signing is not a taxonomy for textbooks. It is a decision framework for engineering. Every piece of data in a system has a specific security requirement – confidentiality, integrity, authentication, non-repudiation, or some combination – and the correct primitive follows directly from that requirement.

In zero-knowledge systems, getting the primitive selection wrong is not recoverable. There is no server-side safety net to catch a hashed-but-not-encrypted message that should have been confidential. There is no administrator who can retroactively sign an unsigned session token. The primitives must be correct by construction, because they cannot be corrected by intervention.

Stealth Cloud applies each primitive to its appropriate function: wallet signatures for authentication, AES-256-GCM for confidentiality with integrity, HKDF for key derivation, and SHA-256 for commitment and fingerprinting. No primitive is asked to serve a purpose it was not designed for. No security property is assumed to emerge from the wrong operation.

The three primitives have existed for decades. The failure mode is equally old: using the right tool for the wrong job. Understanding the distinction is not merely educational. It is the minimum viable competence for building systems that handle private data.