From Slates to Complete Recommendation Experiences
Previous section: Optimized individual slates
Slate = [item₁, item₂, item₃, item₄, item₅]
This section: Optimize entire pages with multiple carousels
Page = [
Carousel₁: "Comedies" → [items],
Carousel₂: "Based on a book" → [items],
Carousel₃: "Recently added" → [items],
...
]
Real-world recommendation interfaces have multiple carousels:
Homepage:
┌──────────────────────
│ Continue Watching
│ [🎬 🎬 🎬 🎬 🎬 🎬]
├──────────────────────
│ Trending Now
│ [🎬 🎬 🎬 🎬 🎬 🎬]
├──────────────────────
│ Comedies
│ [🎬 🎬 🎬 🎬 🎬 🎬]
├──────────────────────
│ Based on a Book
│ [🎬 🎬 🎬 🎬 🎬 🎬]
└──────────────────────
Key Properties:
Challenge: Optimize the entire page jointly, not each carousel independently
Why? Carousels compete for:
Goal: Maximize total page utility
U[page | user] = relevance + diversity + coverage + coherence
Ding et al. (2019) formalized the whole-page optimization problem:
Problem formulation:
Maximize: Σᵢ reward(carouselᵢ | user, page₋ᵢ)
Subject to:
- No item duplication across carousels
- Global impression constraints
- Budget constraints
- Diversity requirements
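To make the "no item duplication across carousels" constraint concrete, here is a minimal greedy sketch. It is not the primal-dual method of Ding et al.: the function name `build_page`, the per-item scores, and the sample data are hypothetical, and filling carousels top to bottom is an assumption made purely for illustration.

```python
from typing import Dict, List


def build_page(candidates: Dict[str, List[str]],
               scores: Dict[str, float],
               slate_size: int) -> Dict[str, List[str]]:
    """Fill carousels top to bottom, never reusing an item on the page."""
    used = set()
    page = {}
    for title, items in candidates.items():
        # Rank this carousel's candidates by score, best first.
        ranked = sorted(items, key=lambda i: scores[i], reverse=True)
        slate = []
        for item in ranked:
            if item in used:
                continue  # enforce "no duplication across carousels"
            slate.append(item)
            used.add(item)
            if len(slate) == slate_size:
                break
        page[title] = slate
    return page


# Hypothetical candidates and relevance scores.
candidates = {
    "Comedies": ["a", "b", "c"],
    "Based on a book": ["b", "c", "d"],
}
scores = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.3}
page = build_page(candidates, scores, slate_size=2)
print(page)
```

Here item "b" is the best candidate for both carousels, so the second carousel has to fall back to lower-scored items. A greedy pass like this favors whichever carousel comes first, which is one reason to optimize the page jointly.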
Solution: Primal-dual algorithm that:
Kislinger (2025) presented a modular framework for multi-carousel recommendations:
Key insight:
This allows optimizing different objectives at different stages!
How to assign items to topics?
Trade-off: Rigid rules vs. flexible semantic grouping
Visual layout: Page is a 2D grid
Row 1: [Item₁ Item₂ Item₃ Item₄ Item₅ Item₆ ]
Row 2: [Item₇ Item₈ Item₉ Item₁₀ Item₁₁ Item₁₂]
Row 3: [Item₁₃ Item₁₄ Item₁₅ Item₁₆ Item₁₇ Item₁₈]
Transformer input: Need a 1D sequence!
[Item₁, Item₂, Item₃, ..., Item₁₈]
Problem: Flattening loses spatial information. Which item is in which row/column?
Solution: Add 2D position embeddings to preserve grid structure
For each item at position (row, col):
item_embedding = item_emb + row_emb + col_emb
This is similar to the Image Transformer (Parmar et al., 2018).
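The sum item_emb + row_emb + col_emb can be sketched in NumPy as follows. This is an illustration under stated assumptions: the 3x6 grid matches the layout above, `d_model = 8` is arbitrary, and the random arrays stand in for parameters that would be learned in a real model.

```python
import numpy as np

n_rows, n_cols, d_model = 3, 6, 8  # 3x6 grid; d_model is an arbitrary choice
rng = np.random.default_rng(0)

# Random stand-ins for learned parameters.
item_emb = rng.normal(size=(n_rows * n_cols, d_model))  # one vector per grid slot
row_emb = rng.normal(size=(n_rows, d_model))            # one vector per row
col_emb = rng.normal(size=(n_cols, d_model))            # one vector per column

# Row-major flattening: sequence index k sits at (k // n_cols, k % n_cols).
idx = np.arange(n_rows * n_cols)
rows, cols = np.divmod(idx, n_cols)

# Each item's Transformer input carries both its row and its column.
inputs = item_emb + row_emb[rows] + col_emb[cols]
print(inputs.shape)
```

With this indexing, Item₇ (sequence index 6) maps to row 1, column 0 (zero-based), so the flattened sequence still encodes where each item sits in the grid.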
1D sequence attention (traditional):
Item₁ → Item₂ → Item₃ → ... → Item₁₈
↓ ↓ ↓ ↓
Every item can attend to every other item, but position is known only along the sequence; row and column membership is not
2D grid attention (with position embeddings):
Row 1: Item₁ → Item₂ → Item₃ → ... (strong within-row attention)
↓ ↓ ↓
Row 2: Item₇ → Item₈ → Item₉ → ... (weaker across-row attention)
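One way to see why row and column embeddings make grid-aware attention possible is to look at the position-only part of the attention logits. The sketch below uses hand-built one-hot embeddings, which is an assumption for illustration only; learned embeddings would differ, and how strongly the model weights rows versus columns is something it learns from data.

```python
import numpy as np

n_rows, n_cols = 3, 6
d = n_rows + n_cols

# Hypothetical hand-built embeddings: rows are one-hot in the first n_rows
# dimensions, columns are one-hot in the remaining n_cols dimensions.
row_emb = np.zeros((n_rows, d))
row_emb[np.arange(n_rows), np.arange(n_rows)] = 1.0
col_emb = np.zeros((n_cols, d))
col_emb[np.arange(n_cols), n_rows + np.arange(n_cols)] = 1.0

# Position vector for each slot of the row-major flattened 3x6 grid.
rows, cols = np.divmod(np.arange(n_rows * n_cols), n_cols)
pos = row_emb[rows] + col_emb[cols]

# Position-only dot products: sim[i, j] = [same row] + [same column].
sim = pos @ pos.T
print(sim[0, 1], sim[0, 6], sim[0, 7])
```

In this toy setup, Item₁ scores higher with Item₂ (same row) and Item₇ (same column) than with Item₈ (neither), so the attention logits can distinguish grid neighbors from unrelated positions.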