What Is Cryptography?
Cryptography is the practice and study of techniques for secure communication in the presence of adversaries. The word originates from the Greek words kryptos (hidden) and graphein (to write): literally, "hidden writing."
At its core, cryptography transforms readable information (plaintext) into an unreadable form (ciphertext) using mathematical algorithms and secret keys. Only those possessing the correct key can reverse the transformation and recover the original message.
Why It Matters
Every time you browse a website with HTTPS, send an encrypted message, or authenticate with a password, you are relying on cryptography. It is the invisible shield protecting our digital lives.
History of Cryptography
The art of secret writing stretches back thousands of years. What began as simple letter substitutions has evolved into sophisticated mathematical systems that protect billions of transactions daily.
Ancient Times (~1900 BC)
Egyptian scribes used non-standard hieroglyphs in inscriptions, creating one of the earliest known examples of text transformation for secrecy.
500 BC: Sparta
The Spartans used the scytale, a cylindrical tool that wrapped a strip of parchment around a rod of specific diameter. The message was only readable when wrapped around a rod of the same size, making it an early transposition cipher.
58 BC: Caesar Cipher
Julius Caesar used a simple letter shift cipher (shift of 3) for military correspondence: A → D, B → E, and so on. While trivial by modern standards, it was effective against the largely illiterate opponents of the era.
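The shift described above can be sketched in a few lines of Python. This is a minimal illustration, assuming an uppercase A-Z alphabet; the function name `caesar` is chosen here for the example and is not from any library.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar(text: str, shift: int) -> str:
    """Shift each A-Z letter by `shift` positions; leave other characters alone."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

ciphertext = caesar("ATTACK AT DAWN", 3)
print(ciphertext)              # DWWDFN DW GDZQ
print(caesar(ciphertext, -3))  # ATTACK AT DAWN
```

Because there are only 25 useful shifts, an attacker can simply try them all, which is why the cipher is trivial today.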
1467: Alberti's Polyalphabetic Cipher
Leon Battista Alberti invented the polyalphabetic cipher, using multiple substitution alphabets to defeat frequency analysis. This was a quantum leap in cryptographic security.
1553: Vigenère Cipher
Giovan Battista Bellaso created what we now call the Vigenère cipher (often misattributed to Blaise de Vigenère). It uses a keyword to determine different shift values for each letter position.
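As a rough sketch of the keyword idea, the following Python applies a different Caesar-style shift at each letter position, cycling through the keyword. The function name `vigenere` and the assumption of a non-empty, letters-only keyword are illustrative choices.

```python
from itertools import cycle

def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Shift each letter by the amount given by the next keyword letter (A=0 ... Z=25)."""
    sign = -1 if decrypt else 1
    # Assumes `keyword` is non-empty and contains only letters.
    shifts = cycle(ord(k) - ord("A") for k in keyword.upper())
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = sign * next(shifts)
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # non-letters pass through and do not consume a key letter
    return "".join(out)

ciphertext = vigenere("ATTACKATDAWN", "LEMON")
print(ciphertext)                                   # LXFOPVEFRNHR
print(vigenere(ciphertext, "LEMON", decrypt=True))  # ATTACKATDAWN
```

Note how the repeated letters in the plaintext map to different ciphertext letters, which is what blunts simple frequency analysis.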
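1918: One-Time Pad
Gilbert Vernam's teleprinter cipher, combined with Joseph Mauborgne's insight that the key must be random and never repeat, became the one-time pad. Claude Shannon later proved it unbreakable when used correctly. It requires a truly random key as long as the message, used only once.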
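A minimal sketch of the one-time pad on bytes, assuming XOR as the combining operation and Python's `secrets` module as the source of key material; `xor_bytes` is a name made up for this example.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of exactly the same length."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"MEET AT NOON"
pad = secrets.token_bytes(len(message))   # fresh random key, used for this message only
ciphertext = xor_bytes(message, pad)
recovered = xor_bytes(ciphertext, pad)    # XOR with the same pad undoes the encryption
print(recovered)
```

The security argument holds only if the pad is never reused; XORing two ciphertexts made with the same pad cancels the key and leaks the XOR of the two plaintexts.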
1920s-1945: Enigma Machine
The Enigma machine, used by Nazi Germany, implemented a complex electro-mechanical rotor cipher. Its breaking by Alan Turing's team at Bletchley Park is considered one of the greatest intellectual achievements of the 20th century.
1976: Diffie-Hellman Key Exchange
Whitfield Diffie and Martin Hellman published the first public-key protocol, enabling two parties to establish a shared secret over an insecure channel, a foundational concept in modern cryptography.
1977: RSA Algorithm
Ron Rivest, Adi Shamir, and Leonard Adleman published the RSA algorithm, the first practical public-key encryption system. It remains widely used today for secure data transmission.
2001: AES Standard
The Advanced Encryption Standard (AES) was adopted by NIST after a public competition. The Rijndael algorithm, created by Vincent Rijmen and Joan Daemen, became the global standard for symmetric encryption.
Core Concepts
Understanding cryptography begins with these foundational terms:
Plaintext
The original, readable message or data before encryption. Also called cleartext.
Ciphertext
The scrambled, unreadable output of the encryption process. Appears as random characters.
Key
A piece of information (a number, password, or string) that controls the encryption/decryption process.
Encryption
The process of converting plaintext to ciphertext using an algorithm and a key.
Decryption
The reverse process: converting ciphertext back to plaintext using the correct key.
Algorithm
The mathematical procedure used for encryption/decryption. Also called a cipher.
Kerckhoffs's Principle
A cryptosystem should be secure even if everything about the system, except the key, is public knowledge. Security should depend solely on the secrecy of the key, not the algorithm.
Symmetric Encryption
Symmetric encryption uses the same key for both encryption and decryption. Both the sender and receiver must share this secret key before communication begins.
How It Works
- Alice and Bob agree on a shared secret key.
- Alice encrypts her message using the key and sends the ciphertext.
- Bob decrypts the ciphertext using the same key to read the message.
Advantages
- Fast: Symmetric algorithms are computationally efficient, suitable for encrypting large amounts of data.
- Proven: Well-established algorithms like AES have been extensively analyzed and are considered secure.
Disadvantages
- Key distribution problem: How do you securely share the key with the other party? If an attacker intercepts the key, all communication is compromised.
- Scalability: In a group of N people, you need N×(N-1)/2 unique keys for pairwise communication.
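To make the scalability point concrete, this short sketch evaluates the pairwise-key formula for a few group sizes. The function name is illustrative.

```python
def symmetric_keys_needed(n: int) -> int:
    """Number of unique pairwise keys for n people: n * (n - 1) / 2."""
    return n * (n - 1) // 2

for n in (2, 10, 100, 1000):
    print(n, symmetric_keys_needed(n))
# 2 1
# 10 45
# 100 4950
# 1000 499500
```

The count grows quadratically, so a group ten times larger needs roughly a hundred times as many keys.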
Common Symmetric Algorithms
- AES (Advanced Encryption Standard): The gold standard for symmetric encryption. Supports 128-, 192-, and 256-bit keys. Used by governments, banks, and most HTTPS connections.
- DES (Data Encryption Standard): The predecessor to AES. Uses a 56-bit key, now considered insecure due to its short key length.
- 3DES (Triple DES): Applies DES three times per block. More secure than DES but much slower than AES.
- ChaCha20: A modern stream cipher used in TLS and mobile platforms. Designed by Daniel Bernstein.
Asymmetric Encryption
Asymmetric encryption (also called public-key cryptography) uses two mathematically related keys: a public key and a private key. The public key can be freely shared, while the private key must be kept secret.
How It Works
- Bob generates a key pair: public key and private key.
- Bob shares his public key with Alice (and anyone else).
- Alice encrypts her message using Bob's public key.
- Only Bob's private key can decrypt the message.
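The four steps above can be sketched with "textbook" RSA on deliberately tiny numbers. This is an illustration only: the primes, the exponent, and the function names are assumptions chosen for readability, and real RSA uses primes of a thousand bits or more plus a padding scheme such as OAEP. The modular inverse via `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
# Bob generates a key pair (toy sizes, insecure by design).
p, q = 61, 53
n = p * q                    # 3233, part of both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: 2753

public_key = (e, n)          # Bob shares this freely
private_key = (d, n)         # Bob keeps this secret

def encrypt(m: int, key: tuple) -> int:
    exponent, modulus = key
    return pow(m, exponent, modulus)

def decrypt(c: int, key: tuple) -> int:
    exponent, modulus = key
    return pow(c, exponent, modulus)

message = 65                               # must be smaller than n
ciphertext = encrypt(message, public_key)  # Alice uses Bob's public key
print(ciphertext)                          # 2790
print(decrypt(ciphertext, private_key))    # 65
```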
Advantages
- No key distribution problem: Public keys can be freely shared. No secure channel is needed to exchange keys.
- Digital signatures: The private key can sign messages, and anyone can verify the signature with the public key.
- Scalable: Only N key pairs are needed for N users.
Disadvantages
- Slow: Asymmetric operations are typically 100 to 1000x slower than symmetric operations. Not suitable for large data.
- Key size: Requires much larger key sizes for equivalent security (RSA-2048 provides roughly 112 bits of security, less than AES-128).
Common Asymmetric Algorithms
- RSA: Based on the difficulty of factoring large numbers. The most widely used public-key algorithm. Common key sizes: 2048, 3072, 4096 bits.
- ECC (Elliptic Curve Cryptography): Based on the algebraic structure of elliptic curves. Provides equivalent security with much smaller keys (256-bit ECC ≈ 3072-bit RSA).
- Diffie-Hellman: Enables two parties to establish a shared secret over an insecure channel. The foundation of key exchange in TLS.
- DSA (Digital Signature Algorithm): Used specifically for digital signatures, not encryption.
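As an illustration of the Diffie-Hellman entry in the list above, here is a toy exchange using small, well-known textbook parameters (p = 23, g = 5). These values are assumptions for demonstration and are far too small for real use; production systems use standardized groups or elliptic curves.

```python
import secrets

# Public parameters, known to everyone including the attacker.
p, g = 23, 5

# Each party picks a private value in the range 1..p-2 and never sends it.
a = secrets.randbelow(p - 2) + 1    # Alice's secret
b = secrets.randbelow(p - 2) + 1    # Bob's secret

# Only these public values cross the insecure channel.
A = pow(g, a, p)                    # Alice -> Bob
B = pow(g, b, p)                    # Bob -> Alice

# Both sides arrive at the same value: g^(a*b) mod p.
alice_shared = pow(B, a, p)
bob_shared = pow(A, b, p)
print(alice_shared == bob_shared)   # True
```

An eavesdropper sees p, g, A, and B, and would have to solve a discrete logarithm to recover a or b, which is infeasible at realistic parameter sizes.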
Hybrid Encryption in Practice
Modern systems combine both approaches: asymmetric cryptography is used to establish a symmetric session key, and then symmetric encryption handles the actual data. This is how HTTPS/TLS works: you get the security benefits of asymmetric cryptography with the speed of symmetric cryptography.
Hash Functions
A cryptographic hash function takes input of any size and produces a fixed-size output (the hash or digest). Hash functions are one-way: you cannot recover the input from the output.
Key Properties
- Deterministic: The same input always produces the same hash.
- Quick computation: Computing a hash should be fast.
- Preimage resistance: Given a hash, it should be infeasible to find the original input.
- Avalanche effect: A tiny change in input produces a completely different hash.
- Collision resistance: It should be infeasible to find two different inputs that produce the same hash.
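Two of the properties listed above, determinism and the avalanche effect, are easy to observe with Python's standard `hashlib`. The helper names below are made up for this sketch.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Deterministic: hashing the same input twice gives the same digest.
print(sha256_hex(b"hello world") == sha256_hex(b"hello world"))  # True

# Avalanche effect: change one character and compare the raw digests.
h1 = hashlib.sha256(b"hello world").digest()
h2 = hashlib.sha256(b"hello worle").digest()
print(differing_bits(h1, h2), "of", len(h1) * 8, "bits differ")
```

For a well-behaved hash, roughly half of the 256 output bits are expected to change.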
Common Hash Algorithms
- SHA-256: Produces a 256-bit hash. Part of the SHA-2 family. The current standard for most applications.
- SHA-3: The latest SHA standard, based on the Keccak algorithm. Designed as a backup in case SHA-2 is compromised.
- MD5: Produces a 128-bit hash. Now considered cryptographically broken due to collision vulnerabilities, but still used for non-security purposes like checksums.
Use Cases
- Password storage (store a salted hash from a deliberately slow function such as Argon2, bcrypt, or PBKDF2, never the password itself)
- Data integrity verification (detect if a file has been modified)
- Digital signatures (sign the hash instead of the full document)
- Blockchain (cryptographic links between blocks)
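The password-storage use case can be sketched with the standard library's PBKDF2 implementation. The iteration count, salt length, and function names here are illustrative assumptions, not a recommendation; follow current guidance when choosing real parameters.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

ITERATIONS = 200_000  # assumed value for the demo; tune for your hardware and policy

def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both, never the password."""
    salt = secrets.token_bytes(16)  # unique per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```

A plain fast hash like SHA-256 is the wrong tool here: the slowness and the per-user salt are what make large-scale guessing expensive.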
Digital Signatures
A digital signature is the asymmetric equivalent of a handwritten signature. It provides three guarantees:
- Authentication: Proves who sent the message.
- Integrity: Proves the message has not been altered.
- Non-repudiation: The sender cannot deny having signed the message.
How Digital Signatures Work
- Alice hashes her message to create a digest.
- Alice signs the digest with her private key (in textbook RSA, this is the private-key operation applied to the digest). The result is the signature.
- Alice sends the message and the signature to Bob.
- Bob hashes the received message and uses Alice's public key to recover the digest from the signature.
- If the two hashes match, the signature is valid.
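The steps above can be sketched with textbook RSA and SHA-256. Everything about the key is a toy assumption (tiny primes, no padding), and the digest is reduced modulo n only so that it fits the toy modulus; real schemes such as RSA-PSS or ECDSA work differently in detail. Requires Python 3.8 or later for `pow(e, -1, ...)`.

```python
import hashlib

# Alice's toy key pair (insecure sizes, for illustration only).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))

def digest_as_int(message: bytes) -> int:
    # Reduce the SHA-256 digest modulo n so it fits the toy modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    return pow(digest_as_int(message), d, n)   # private-key operation

def verify(message: bytes, signature: int) -> bool:
    return pow(signature, e, n) == digest_as_int(message)  # public-key operation

message = b"Transfer 10 coins to Bob"
signature = sign(message)
print(verify(message, signature))             # True
print(verify(message, (signature + 1) % n))   # False: altered signature is rejected
```

With such a small modulus, two different messages can collide after reduction with probability of about 1/n, which is one reason real signatures use full-size keys.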
Key Management
Key management is often the weakest link in a cryptographic system. Even the strongest algorithm is useless if keys are handled improperly.
Best Practices
- Key generation: Use cryptographically secure random number generators (CSPRNG), not standard random functions.
- Key storage: Never hardcode keys in source code. Use hardware security modules (HSMs) or secure key stores.
- Key rotation: Regularly replace keys. Even if a key is compromised, the damage is limited to the rotation period.
- Key destruction: Securely delete old keys when they are no longer needed.
- Key length: Use appropriate key lengths. Longer keys provide more security but are slower.
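For the key-generation practice in the list above, Python's `secrets` module draws from the operating system's CSPRNG, unlike the general-purpose `random` module. The function name and default size below are illustrative.

```python
import secrets

def generate_key(bits: int = 256) -> bytes:
    """Return a fresh random key of the requested size from the OS CSPRNG."""
    if bits % 8 != 0:
        raise ValueError("key size must be a whole number of bytes")
    return secrets.token_bytes(bits // 8)

key = generate_key()                  # 32 random bytes, suitable as an AES-256 key
token = secrets.token_urlsafe(32)     # text-safe random token, e.g. for API secrets
print(len(key), len(token) > 0)
```

In a real system the key would go straight into an HSM or key store rather than being printed or written to disk.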
Remember
Security is only as strong as its weakest link. A perfectly implemented AES-256 encryption is worthless if the encryption key is stored in a plaintext file or sent via unencrypted email.
Block Cipher Modes of Operation
Block ciphers like AES operate on fixed-size blocks of data (128 bits for AES). To encrypt messages longer than one block, a mode of operation defines how to repeatedly apply the cipher. The choice of mode significantly affects both security and functionality.
Common Modes
ECB (Electronic Codebook)
Each block is encrypted independently. Insecure: identical plaintext blocks produce identical ciphertext blocks, revealing patterns. Never use for data with structure.
CBC (Cipher Block Chaining)
Each plaintext block is XORed with the previous ciphertext block before encryption. Requires an Initialization Vector (IV). Widely used but vulnerable to padding oracle attacks if not implemented carefully.
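The difference between ECB and CBC can be seen with a toy block cipher. The standard library has no AES, so this sketch assumes a small 4-round Feistel network built on HMAC-SHA256 as a stand-in; it is not AES and not secure, and all names are hypothetical. Inputs are assumed to be a multiple of 16 bytes, so padding is omitted.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # block size in bytes, matching AES

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:BLOCK // 2]

def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    return b"".join(encrypt_block(key, b) for b in split_blocks(plaintext))

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out, prev = [], iv
    for block in split_blocks(plaintext):
        prev = encrypt_block(key, _xor(block, prev))  # chain in the previous ciphertext
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for block in split_blocks(ciphertext):
        out.append(_xor(decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)

key, iv = secrets.token_bytes(16), secrets.token_bytes(BLOCK)
plaintext = b"SIXTEEN BYTE BLK" * 3   # three identical plaintext blocks

ecb = ecb_encrypt(key, plaintext)
cbc = cbc_encrypt(key, iv, plaintext)
print(len(set(split_blocks(ecb))))    # 1: ECB repeats the same ciphertext block
print(len(set(split_blocks(cbc))))    # 3: CBC hides the repetition
```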
CTR (Counter)
Turns a block cipher into a stream cipher by encrypting a counter value. Supports random access and parallel encryption. Secure when the counter is never reused with the same key.
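A rough sketch of the counter idea follows. Because the standard library has no AES, HMAC-SHA256 is assumed here as a stand-in for "encrypt the counter under the key", so the keystream blocks are 32 bytes rather than 16. The function names are illustrative.

```python
import hashlib
import hmac
import secrets

KS_BLOCK = 32  # bytes of keystream produced per counter value in this sketch

def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Stand-in for block-cipher encryption of (nonce || counter).
    return hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()

def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR data with the keystream (the operation is its own inverse)."""
    out = bytearray()
    for offset in range(0, len(data), KS_BLOCK):
        ks = keystream_block(key, nonce, offset // KS_BLOCK)
        chunk = data[offset:offset + KS_BLOCK]
        out.extend(c ^ k for c, k in zip(chunk, ks))
    return bytes(out)

key = secrets.token_bytes(32)
nonce = secrets.token_bytes(12)   # must never repeat under the same key
plaintext = b"Counter mode needs no padding and works on any length."
ciphertext = ctr_xor(key, nonce, plaintext)
print(ctr_xor(key, nonce, ciphertext) == plaintext)  # True
```

Each keystream block depends only on the key, nonce, and counter, which is why blocks can be processed in parallel or decrypted out of order.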
GCM (Galois/Counter)
Combines CTR mode encryption with authentication. Provides both confidentiality and integrity, making it the gold standard for modern applications. Used in TLS 1.3 and most current systems.
Best Practice
Always use authenticated encryption (AEAD) modes like GCM or ChaCha20-Poly1305. Encryption without authentication leaves you vulnerable to bit-flipping attacks where an attacker modifies ciphertext without detection.
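The bit-flipping risk can be demonstrated with a toy stream cipher, along with how an authentication tag catches it. This sketch assumes a SHAKE-256 keystream and an encrypt-then-MAC construction with HMAC-SHA256 as stand-ins; it is not GCM or ChaCha20-Poly1305, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    ks = hashlib.shake_256(key + nonce).digest(len(data))  # toy keystream
    return bytes(d ^ k for d, k in zip(data, ks))

def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes):
    nonce = secrets.token_bytes(12)
    ciphertext = stream_xor(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce, ciphertext, tag

def open_sealed(enc_key: bytes, mac_key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: ciphertext was modified")
    return stream_xor(enc_key, nonce, ciphertext)

enc_key, mac_key = secrets.token_bytes(32), secrets.token_bytes(32)
nonce, ciphertext, tag = seal(enc_key, mac_key, b"PAY 100 TO BOB")

# The attacker knows no key but flips bits so that '1' becomes '9' at index 4.
tampered = bytearray(ciphertext)
tampered[4] ^= ord("1") ^ ord("9")
tampered = bytes(tampered)

# Without authentication the change goes unnoticed:
print(stream_xor(enc_key, nonce, tampered))   # b'PAY 900 TO BOB'

# With the tag checked first, the modification is rejected:
try:
    open_sealed(enc_key, mac_key, nonce, tampered, tag)
except ValueError as err:
    print(err)
```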
Public Key Infrastructure (PKI)
Public Key Infrastructure is the framework that enables secure distribution and verification of public keys. Without PKI, there's no way to know if a public key truly belongs to the person or organization it claims to represent.
How PKI Works
- Certificate Authority (CA): A trusted third party that verifies identity and issues digital certificates binding a public key to an identity.
- Digital Certificates: Electronic documents containing a public key and identity information, signed by the CA. The X.509 standard defines the format.
- Certificate Chain: Your certificate is signed by an intermediate CA, which is signed by a root CA. Browsers trust root CAs and verify the entire chain.
- Revocation: Compromised certificates can be revoked via CRL (Certificate Revocation List) or OCSP (Online Certificate Status Protocol).
Real-World PKI
- HTTPS/TLS: When you visit a secure website, your browser verifies the server's certificate against its trusted CA store.
- Code signing: Software developers sign their applications with certificates, allowing users to verify the software hasn't been tampered with.
- Email encryption: S/MIME uses PKI for encrypting and signing email messages.
Post-Quantum Cryptography
Quantum computers, once sufficiently powerful, could break widely used public-key cryptosystems. Shor's algorithm can factor large numbers and compute discrete logarithms in polynomial time, rendering RSA, ECC, and Diffie-Hellman insecure. Post-quantum cryptography (PQC) develops algorithms resistant to both classical and quantum attacks.
NIST Post-Quantum Standards (2024)
After a multi-year evaluation process, NIST has standardized three post-quantum algorithms:
- ML-KEM (Module-Lattice-Based Key Encapsulation Mechanism): Based on the CRYSTALS-Kyber algorithm. Used for key exchange. The primary replacement for RSA and ECDH key exchange.
- ML-DSA (Module-Lattice-Based Digital Signature Algorithm): Based on CRYSTALS-Dilithium. Used for digital signatures. The primary replacement for RSA and ECDSA signatures.
- SLH-DSA (Stateless Hash-Based Digital Signature Algorithm): Based on SPHINCS+. A hash-based signature scheme providing a conservative alternative to ML-DSA.
The Migration Challenge
Migrating to post-quantum cryptography is a massive undertaking:
- Larger keys and signatures: ML-KEM-768 public keys are 1,184 bytes vs. 32 bytes for an X25519 ECDH public key. ML-DSA-44 signatures are 2,420 bytes vs. 64 bytes for ECDSA over P-256.
- Performance: PQC operations are generally slower than classical counterparts, though some lattice-based operations are competitive.
- Hybrid approaches: During the transition, systems use both classical and post-quantum algorithms simultaneously, maintaining security even if one is broken.
- "Harvest now, decrypt later": Adversaries may be collecting encrypted data today with the intent to decrypt it once quantum computers become available. This makes the migration urgent for long-lived secrets.
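One way to picture the hybrid approach from the list above: derive the session key from both a classical shared secret and a post-quantum one, so an attacker has to break both. The sketch below is an assumption-heavy illustration. The two input secrets are random placeholders standing in for real ECDH and ML-KEM outputs, and the simple hash-based combiner and its name are hypothetical, not the construction any particular protocol specifies.

```python
import hashlib
import secrets

def combine_secrets(classical: bytes, post_quantum: bytes, context: bytes = b"hybrid-demo") -> bytes:
    """Derive one session key that depends on both input secrets."""
    # Length-prefix each input so the concatenation is unambiguous.
    material = b"".join(len(x).to_bytes(2, "big") + x for x in (classical, post_quantum))
    return hashlib.sha256(context + material).digest()

# Placeholders: in a real system these come from ECDH and ML-KEM respectively.
ecdh_secret = secrets.token_bytes(32)
mlkem_secret = secrets.token_bytes(32)

session_key = combine_secrets(ecdh_secret, mlkem_secret)
print(len(session_key))   # 32
```

If either input stays secret, the derived key remains unpredictable to the attacker, which is the property the transition period relies on.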
Timeline
Current estimates suggest cryptographically relevant quantum computers could emerge within 10-15 years. Organizations handling data that must remain confidential for decades should begin PQC migration now.
Steganography
Steganography is the art of hiding information within other, seemingly innocuous data. Unlike cryptography, which makes data unreadable, steganography makes data invisible: an observer doesn't even know a secret message exists.
Steganography vs. Cryptography
- Cryptography: Protects the content of a message. Everyone can see that a secret message exists, but cannot read it.
- Steganography: Protects the existence of a message. No one knows a secret message is present.
- Combined: Encrypt first, then hide. Even if the steganography is detected, the message remains encrypted.
Common Techniques
- Image steganography: Modify the least significant bits (LSB) of pixel values in an image. The visual change is imperceptible, but the bits encode a hidden message.
- Audio steganography: Embed data in audio files by modifying inaudible frequencies or using echo hiding.
- Text steganography: Use formatting, whitespace, or linguistic patterns to encode information.
- Network steganography: Hide data in network protocol headers, timing patterns, or packet ordering.
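The LSB technique from the list above can be sketched on a plain byte sequence standing in for pixel values. Real tools parse an image format first; here the cover data, the function names, and the choice to pass the message length separately are all simplifying assumptions.

```python
def embed(pixels: bytes, message: bytes) -> bytes:
    """Hide message bits in the least significant bit of each cover byte."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover data too small for message")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit   # clear the lowest bit, then set it to the message bit
    return bytes(out)

def extract(pixels: bytes, length: int) -> bytes:
    """Read back `length` bytes from the least significant bits."""
    out = bytearray()
    for i in range(length):
        byte = 0
        for p in pixels[i * 8:(i + 1) * 8]:
            byte = (byte << 1) | (p & 1)
        out.append(byte)
    return bytes(out)

cover = bytes(range(256))           # stand-in for 256 pixel values
stego = embed(cover, b"hi")
print(extract(stego, 2))            # b'hi'
print(max(abs(a - b) for a, b in zip(cover, stego)))  # at most 1 per byte
```

Each cover byte changes by at most one, which is why the alteration is visually imperceptible in an image.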
Historical Examples
- Ancient Greeks shaved a slave's head, tattooed a message, and sent them after the hair grew back (documented by Herodotus).
- During WWII, German spies used microdots, photographs shrunk to the size of a printed period, to hide messages in seemingly ordinary letters.
- Modern watermarking techniques use steganographic principles to embed ownership information in digital media.