Matrix Multiplication Calculator
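Multiply matrices of general dimensions, from 2×2 up to 4×4. Choose row and column counts independently for A and B; the product A×B is defined only when the number of columns of A equals the number of rows of B. Entry (i, j) of the result is the dot product of row i of A with column j of B.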
Matrix Dimensions
| A dimensions | B dimensions | A×B valid? | Result size |
|---|---|---|---|
| 2×3 | 3×4 | ✅ Yes (cols A = rows B = 3) | 2×4 |
| 3×3 | 3×3 | ✅ Yes (square) | 3×3 |
| 1×n | n×1 | ✅ Yes (dot product!) | 1×1 (scalar) |
| n×1 | 1×n | ✅ Yes (outer product) | n×n |
| 2×3 | 2×3 | ❌ No (cols A = 3 ≠ rows B = 2) | Undefined |
| m×n | n×p | ✅ General case | m×p |
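The dimension rule and the dot-product definition can be sketched in a few lines of plain Python. This is a minimal illustration, not the calculator's own implementation; the function name `matmul` and the list-of-rows representation are assumptions made for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    # A×B is defined only when cols(A) == rows(B).
    if cols_a != rows_b:
        raise ValueError(
            f"cannot multiply {rows_a}x{cols_a} by {rows_b}x{cols_b}"
        )
    # Entry (i, j) is the dot product of row i of a with column j of b.
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


A = [[1, 2, 3],
     [4, 5, 6]]      # 2x3
B = [[7, 8],
     [9, 10],
     [11, 12]]       # 3x2

C = matmul(A, B)     # 2x2 result
print(C)             # [[58, 64], [139, 154]]
```

Calling `matmul(A, A)` raises `ValueError`, which matches the 2×3 by 2×3 row in the table.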
Why Matrix Multiplication Matters
A neural network forward pass is largely a sequence of matrix multiplications interleaved with nonlinear functions. Google's PageRank algorithm ranks pages using repeated matrix-vector products. Computer graphics transformations (rotate, scale, translate) are matrix multiplications chained together. Understanding matrix multiplication is central to modern computing.
Real-World Examples
- Neural Networks: A layer with n inputs and m outputs applies an m×n weight matrix to an n-dimensional input vector (n×1), giving an m×1 output.
- 3D Transformations: Rotation, scaling, and translation are all 4×4 matrices in homogeneous coordinates. Composing transformations = multiplying matrices.
- Markov Chains: The transition matrix raised to the power n gives the transition probabilities after n steps; Google's PageRank is built on this idea.
- Cryptography: The Hill cipher uses matrix multiplication modulo 26 to encrypt text.
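Two of the examples above are small enough to sketch directly. The code below is an illustration under stated assumptions: the helper names (`matmul`, `matrix_power`, `hill_encrypt`), the transition matrix `P`, and the key `KEY` are all made up for the example, and the cipher sketch assumes uppercase A-Z text whose length is a multiple of the key size.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def matrix_power(p, n):
    """Raise a square matrix to the n-th power by repeated multiplication."""
    size = len(p)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, p)
    return result


def hill_encrypt(text, key):
    """Hill cipher: multiply each block of letters by the key, modulo 26.

    Assumes uppercase A-Z input whose length is a multiple of len(key).
    """
    size = len(key)
    nums = [ord(ch) - ord("A") for ch in text]
    out = []
    for start in range(0, len(nums), size):
        block = [[x] for x in nums[start:start + size]]  # column vector
        out.extend(row[0] % 26 for row in matmul(key, block))
    return "".join(chr(x + ord("A")) for x in out)


# Markov chain: row i holds the probabilities of moving from state i.
P = [[0.9, 0.1],
     [0.5, 0.5]]
two_step = matrix_power(P, 2)   # two-step transition probabilities

# Hill cipher with a 2x2 key that is invertible modulo 26.
KEY = [[3, 3],
       [2, 5]]
print(hill_encrypt("HELP", KEY))  # HIAT
```

Each row of `two_step` still sums to 1, as expected for a transition matrix.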
Frequently Asked Questions
What is the computational complexity of matrix multiplication?
Naive multiplication of two n×n matrices takes O(n³) scalar operations. Strassen's algorithm (1969) reduces this to about O(n^2.807). The best known theoretical bound is roughly O(n^2.371), but the algorithms that achieve it are not used in practice. Libraries such as NumPy (which delegates to a BLAS implementation) and cuBLAS instead rely on carefully tuned kernels with cache-friendly memory access patterns and SIMD instructions.
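Strassen's exponent comes from log₂ 7 ≈ 2.807: a 2×2 product can be formed with 7 multiplications instead of 8, and the full algorithm applies that trick recursively to matrix blocks. The sketch below shows only the single 2×2 step on plain numbers; the function name `strassen_2x2` is an assumption for the example.

```python
def strassen_2x2(a, b):
    """Multiply two 2x2 matrices using 7 multiplications (one Strassen step)."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # Seven products instead of the eight used by the naive method.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine the products using only additions and subtractions.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
print(strassen_2x2(A, B))  # [[19, 22], [43, 50]]
```

The saving matters only when the entries are themselves large blocks; for small matrices the extra additions outweigh the one multiplication saved.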
How does matrix multiplication relate to linear transformations?
Multiplying two matrices corresponds to composing the linear transformations they represent. With the usual column-vector convention, if T₁ is represented by A and T₂ by B, then applying T₁ first and T₂ second corresponds to computing B×A, not A×B: the matrix applied first sits closest to the vector. This is why order matters: "first rotate, then scale" generally gives a different result than "first scale, then rotate."
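A small numeric check makes the order reversal concrete. The matrices `ROTATE` and `SCALE` and the helper `matmul` below are chosen for illustration, and vectors are treated as columns.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


ROTATE = [[0, -1],
          [1, 0]]    # rotate 90 degrees counterclockwise
SCALE = [[2, 0],
         [0, 1]]     # stretch the x axis by a factor of 2

v = [[1], [0]]       # column vector pointing along x

# "Rotate, then scale": the rotation is applied first, so it sits on the right.
rotate_then_scale = matmul(SCALE, ROTATE)
# "Scale, then rotate": the scaling is applied first.
scale_then_rotate = matmul(ROTATE, SCALE)

print(matmul(rotate_then_scale, v))  # [[0], [1]]
print(matmul(scale_then_rotate, v))  # [[0], [2]]
```

Applying the two steps one at a time, `matmul(SCALE, matmul(ROTATE, v))`, gives the same vector as `matmul(rotate_then_scale, v)`, which confirms that the composed matrix is SCALE×ROTATE.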