SERUM: State Extraction and Refinement for User Modeling

SERUM live: generalized runner

A live runner meant to help you understand the paper by watching SERUM work on a video of your choice. Pick a preloaded video or paste a YouTube URL, then watch the pass-by-pass labels stream in as the model produces them. To keep latency reasonable, the live runner skips the post-hoc analysis stages (normalization, Markov modeling, chart generation); those are available in the dashboard below.

1. Choose a video

Preloaded library …or paste a YouTube URL

Some restricted YouTube videos may fail to fetch (anti-scraping protections).

Inference takes several minutes per video: Pass 1 results appear after ~90 seconds, and a 10-minute video reaches full convergence in ~18 minutes. Leave the tab open to watch.

2. Live status

Submit a video to see live inference.

SERUM dashboard: full pipeline output, 400+ videos

The dashboard lets you explore the complete pipeline output (including the post-hoc analyses skipped by the live runner above) across 400+ videos: per-pass refinement chains, FSM graphs synchronized to the video, normalized Markov transition matrices, and confidence / perplexity diagnostics.
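As a rough illustration of what a normalized Markov transition matrix over state labels can look like, here is a minimal Python sketch. It assumes that "normalized" means each row is scaled to sum to 1 (first-order transition probabilities); the actual SERUM pipeline may define normalization differently, for example as matrices built over normalized labels. The function name `transition_matrix` and the example labels are hypothetical and do not come from the paper or the dashboard.

```python
from collections import Counter


def transition_matrix(labels):
    """Build a row-normalized first-order transition matrix.

    Assumption: `labels` is a time-ordered list of state names, one per
    video segment. This is an illustrative sketch, not SERUM's code.
    """
    states = sorted(set(labels))
    # Count each consecutive (source, destination) pair in the sequence.
    counts = Counter(zip(labels, labels[1:]))
    matrix = {}
    for src in states:
        row_total = sum(counts[(src, dst)] for dst in states)
        # Divide by the row total so each row with outgoing
        # transitions sums to 1; rows with none stay all zeros.
        matrix[src] = {
            dst: (counts[(src, dst)] / row_total if row_total else 0.0)
            for dst in states
        }
    return states, matrix


# Hypothetical state labels for six consecutive segments.
labels = ["idle", "reach", "grasp", "reach", "grasp", "idle"]
states, P = transition_matrix(labels)

for src in states:
    print(src, P[src])
```

Each entry `P[src][dst]` is then the empirical probability of moving from state `src` to state `dst` in the next segment, which is the kind of quantity an FSM graph edge could be weighted by.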

Open Dashboard Mirror

If the first link is offline, try the mirror. Some ISPs may block ngrok. Visit the HTTP version once to accept the certificate; after that, the HTTPS link should load reliably.

BibTeX

@inproceedings{serum2026,
  title     = {SERUM: State Extraction and Refinement for User Modeling},
  author    = {Phu, Andy J. and Mooney, James and de Langis, Karin and Le, Khanh Chi and Kang, Dongyeop},
  booktitle = {Conference on Language Modeling (CoLM)},
  year      = {2026},
  note      = {Under review}
}

State Extraction and Refinement for User Modeling from Egocentric Video
