The Analogy
Two people can be different heights but have the same taste in music. Cosine similarity ignores “how big” the vectors are and only cares about direction. It’s like comparing the angle between two arrows, where the similarity is the cosine of that angle: angle 0° = identical direction (similarity = 1), angle 90° = perpendicular, i.e. unrelated (similarity = 0), angle 180° = opposite (similarity = −1).
Key insight: This is the basic idea a music recommender such as Spotify can use to find songs similar to your taste. Treat your listening history as a vector and each song as a vector, then rank songs by cosine similarity: the top-ranked songs are the ones that “point in the same direction” as your preferences.
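The ranking idea can be sketched in a few lines. This is a toy illustration, not a description of any real recommender: the taste vector, the song names, and the three-number feature vectors below are all made up for the example.

```python
import numpy as np

# Hypothetical data: a listener's taste vector and three song vectors.
user = np.array([0.9, 0.1, 0.4])
songs = {
    "song_a": np.array([1.8, 0.2, 0.8]),     # same direction as user, 2x longer
    "song_b": np.array([0.1, 0.9, 0.2]),     # mostly a different direction
    "song_c": np.array([-0.9, -0.1, -0.4]),  # exactly the opposite direction
}

def cosine_similarity(a, b):
    """Return dot(a, b) / (||a|| * ||b||) as a plain float."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Score every song against the listener, then sort from best to worst match.
scores = {name: cosine_similarity(user, vec) for name, vec in songs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.3f}")
```

Note that song_a ranks first even though its vector is twice as long as the listener's: only the direction affects the score.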
Worked Example
# Cosine similarity = dot(a, b) / (‖a‖ × ‖b‖)
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 4, 6])  # same direction, 2× longer
cos_sim = np.dot(a, b) / (
    np.linalg.norm(a) * np.linalg.norm(b)
)  # ≈ 1.0 (up to floating-point rounding): perfectly aligned

# Perpendicular vectors
c = np.array([1, 0])
d = np.array([0, 1])
cos_sim_cd = np.dot(c, d) / (
    np.linalg.norm(c) * np.linalg.norm(d)
)  # 0 / (1 × 1) = 0.0: unrelated
Formula: cos(θ) = (a · b) / (‖a‖ × ‖b‖), range [−1, 1]
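To connect the formula back to the angle picture, here is a minimal sketch of a reusable helper. The function names `cosine_similarity` and `angle_degrees` are illustrative choices, not part of NumPy. It assumes two things the formula leaves implicit: a zero vector has no direction, so the similarity is undefined and the helper raises an error; and floating-point rounding can push the ratio slightly outside [−1, 1], so the result is clipped before taking the arccosine.

```python
import numpy as np

def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (||a|| * ||b||), always within [-1, 1]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm_product = np.linalg.norm(a) * np.linalg.norm(b)
    if norm_product == 0.0:
        # A zero vector has no direction, so the angle is undefined.
        raise ValueError("cosine similarity is undefined for a zero vector")
    # Clip to guard against tiny floating-point overshoot outside [-1, 1].
    return float(np.clip(np.dot(a, b) / norm_product, -1.0, 1.0))

def angle_degrees(a, b):
    """Angle between a and b in degrees, recovered from the cosine."""
    return float(np.degrees(np.arccos(cosine_similarity(a, b))))

same = angle_degrees([1, 2, 3], [2, 4, 6])    # close to 0 degrees
perpendicular = angle_degrees([1, 0], [0, 1])  # close to 90 degrees
opposite = angle_degrees([1, 2], [-1, -2])     # close to 180 degrees
print(same, perpendicular, opposite)
```

The three calls cover the three landmark cases from the analogy: same direction, perpendicular, and opposite.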