Part I.b: EASE - Embarrassingly Shallow Autoencoders

A Simple Yet Powerful Baseline for Collaborative Filtering

From Matrix Factorization to EASE

We just learned: Matrix Factorization approximates the rating matrix as \(R \approx U \times V^T\)

But what if we could skip the factorization?

EASE (Steck, 2019): Learn item-item similarities directly

  • No iterations
  • No neural networks
  • Closed-form solution!

The EASE Idea

Classic autoencoder: Learn to reconstruct input through bottleneck

\[\text{Input} \rightarrow \text{Encoder} \rightarrow \text{Bottleneck} \rightarrow \text{Decoder} \rightarrow \text{Output}\]

EASE: Skip the bottleneck! Learn reconstruction weights directly

\[X \approx X \cdot B\]

where \(B\) is an item-item similarity matrix

Constraint: \(\text{diag}(B) = 0\) (an item may not predict itself; without this, the trivial solution \(B = I\) would be optimal)
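
Concretely, EASE solves a ridge regression with a zero-diagonal constraint (the objective from Steck, 2019):

\[\min_B \; \|X - X B\|_F^2 + \lambda \|B\|_F^2 \quad \text{subject to} \quad \text{diag}(B) = 0\]

Via Lagrange multipliers, this has a closed-form solution: with \(P = (X^T X + \lambda I)^{-1}\),

\[B_{ij} = \begin{cases} 0 & \text{if } i = j \\ -P_{ij} / P_{jj} & \text{otherwise} \end{cases}\]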

Intuition: Item-Item Similarities

How does \(\hat{X} = X \cdot B\) make recommendations?

Users: Alice watched [Toy Story], Bob watched [Godfather, Die Hard]

Item-item similarity matrix B (learned from all users):

             Toy Story  Godfather  Die Hard  Inside Out   Heat
Toy Story      0.00       0.01       0.02       0.18      0.01
Godfather      0.01       0.00       0.08       0.02      0.15
Die Hard       0.02       0.08       0.00       0.01      0.12

A user's predicted score for an unseen item is the sum of the \(B\) entries for every item they watched:

Alice (watched Toy Story): Inside Out = 0.18 ✅ | Heat = 0.01 ❌

Bob (watched Godfather + Die Hard): Heat = 0.15 + 0.12 = 0.27 ✅ | Inside Out = 0.02 + 0.01 = 0.03 ❌
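
As a sanity check, here is a small NumPy sketch reproducing those scores. The table above only shows the rows of \(B\) for the three watched items, so the remaining rows are left as zeros; they never contribute here, since nobody watched Inside Out or Heat:

```python
import numpy as np

items = ["Toy Story", "Godfather", "Die Hard", "Inside Out", "Heat"]

# Rows of B for the watched items, copied from the table above.
B = np.zeros((5, 5))
B[0] = [0.00, 0.01, 0.02, 0.18, 0.01]  # Toy Story
B[1] = [0.01, 0.00, 0.08, 0.02, 0.15]  # Godfather
B[2] = [0.02, 0.08, 0.00, 0.01, 0.12]  # Die Hard

X = np.array([[1, 0, 0, 0, 0],   # Alice: Toy Story
              [0, 1, 1, 0, 0]])  # Bob:   Godfather, Die Hard

scores = X @ B
print(dict(zip(items, scores[0])))  # Alice: Inside Out 0.18, Heat 0.01
print(dict(zip(items, scores[1])))  # Bob:   Heat 0.27, Inside Out 0.03
```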

Let’s Build EASE!
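
Here is a minimal NumPy sketch of the closed-form fit, plus a top-k scorer. The names `fit_ease` and `recommend` and the default for `lam` are my own; `lam` is the L2 strength \(\lambda\) from the objective above and is tuned on held-out data:

```python
import numpy as np

def fit_ease(X: np.ndarray, lam: float = 100.0) -> np.ndarray:
    """Closed-form EASE fit (Steck, 2019).

    X   : (num_users, num_items) implicit-feedback matrix (0/1 entries).
    lam : L2 regularization strength (tune on a validation set).
    Returns the item-item weight matrix B with diag(B) = 0.
    """
    G = X.T @ X                       # item-item Gram matrix
    G += lam * np.eye(G.shape[0])     # ridge term on the diagonal
    P = np.linalg.inv(G)              # one matrix inversion -- no iterations
    B = P / (-np.diag(P))             # B_ij = -P_ij / P_jj (column-wise divide)
    np.fill_diagonal(B, 0.0)          # enforce the zero-diagonal constraint
    return B

def recommend(X: np.ndarray, B: np.ndarray, k: int = 10) -> np.ndarray:
    """Top-k unseen items per user: score with X @ B, mask seen items."""
    scores = X @ B
    scores[X > 0] = -np.inf           # never re-recommend watched items
    return np.argsort(-scores, axis=1)[:, :k]
```

The inversion dominates the training cost at O(n³) in the number of items; the number of users only enters through the single Gram product \(X^T X\).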

EASE Strengths & Weaknesses

Strengths

  • Simple: Closed-form solution
  • Fast: No iterations needed
  • Strong baseline: matches or beats far more complex (including neural) models on standard benchmarks (Steck, 2019)
  • Interpretable: Item-item similarities

Weaknesses

  • Memory: Dense \(B\) matrix (O(n²) storage for n items, plus an O(n³) inversion to train)
  • Cold start: Can’t handle new items without retraining
  • No sequences: Doesn’t model temporal patterns

Bottom line: Excellent baseline, but not the final answer!

EASE in Practice

When to use EASE:

  • As a strong baseline to compare against
  • For medium-scale item catalogs (< 100K items)
  • When you need fast training and simple deployment
  • For implicit feedback data (clicks, views, purchases)

Real-world impact:

Many production systems use EASE as:

  1. Initial baseline before investing in complex models
  2. Fallback when neural models fail
  3. Component in ensemble systems

References

Steck, H. (2019). Embarrassingly shallow autoencoders for sparse data. The World Wide Web Conference, 3251–3257. https://doi.org/10.1145/3308558.3313710