LLM Media Server — Architecture Design
Patterns derived from Jellyfin's architecture, implemented in Rust.
1. Cargo Workspace Layout
lms/ ← workspace root
├── Cargo.toml ← workspace members
├── crates/
│ ├── lms-server/ ← binary entry, Axum router, DI wiring
│ ├── lms-core/ ← shared traits, domain types, error types
│ ├── lms-library/ ← LibraryManager, FileWatcher, MediaResolver
│ ├── lms-metadata/ ← MetadataRouter + provider implementations
│ ├── lms-llm/ ← LlmRouter + LLM provider implementations
│ ├── lms-media/ ← ffmpeg wrapper, MediaProbe, thumbnail extractor
│ ├── lms-stream/ ← StreamBuilder, direct play, HLS handler
│ └── lms-db/ ← SQLite/sqlx, migrations, repositories
├── docs/
└── docker/
    ├── Dockerfile
    └── docker-compose.yml
Dependency direction (one-way, no cycles):
lms-server
├── lms-library → lms-core, lms-db
├── lms-metadata → lms-core, lms-llm
├── lms-llm → lms-core
├── lms-media → lms-core
├── lms-stream → lms-core, lms-media, lms-db
└── lms-db → lms-core
lms-core has zero internal dependencies. All traits are defined here.
2. Core Traits (lms-core)
Mirroring Jellyfin's IMetadataProvider hierarchy using Rust trait objects.
2.1 Metadata Provider
// crates/lms-core/src/providers/metadata.rs
#[async_trait]
pub trait MetadataProvider: Send + Sync {
fn name(&self) -> &str;
/// Lower number = higher priority (mirrors Jellyfin IHasOrder, default 50)
fn priority(&self) -> u8 { 50 }
fn supports(&self, item_type: ItemType) -> bool;
/// Returns None if this provider has no result for the query
async fn fetch(&self, query: &MetadataQuery) -> Result<Option<MetadataResult>>;
}
pub struct MetadataQuery {
pub title: String,
pub year: Option<u16>,
pub item_type: ItemType,
pub external_ids: HashMap<String, String>, // "tmdb" -> "12345"
}
pub struct MetadataResult {
pub source: String, // "tmdb" | "tvdb" | "llm"
pub external_id: Option<String>,
pub title: String,
pub overview: Option<String>,
pub genres: Vec<String>,
pub year: Option<u16>,
pub rating: Option<f32>,
pub poster_url: Option<String>,
pub backdrop_url: Option<String>,
pub cast: Vec<PersonInfo>,
pub llm_generated: bool, // transparency flag for LLM-generated fields
pub raw_json: Option<String>, // cached raw API response
}
2.2 LLM Provider
// crates/lms-core/src/providers/llm.rs
#[async_trait]
pub trait LlmProvider: Send + Sync {
fn name(&self) -> &str;
async fn is_available(&self) -> bool;
async fn complete(&self, prompt: &str, opts: &LlmOptions) -> Result<String>;
}
pub struct LlmOptions {
pub model: Option<String>,
pub max_tokens: u32,
pub temperature: f32,
}
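LlmRouter::classify in section 6 calls LlmOptions::default(), so the struct needs a Default implementation. The following is a minimal sketch of one; the concrete values (512 tokens, temperature 0.0) and the meaning given to `model: None` are assumptions for illustration, not values taken from this design.

```rust
/// Mirror of the `LlmOptions` struct above, with derives added for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct LlmOptions {
    pub model: Option<String>,
    pub max_tokens: u32,
    pub temperature: f32,
}

impl Default for LlmOptions {
    fn default() -> Self {
        Self {
            // Assumption: None means "use the model configured for the provider".
            model: None,
            // Assumption: a small cap is enough for short classification answers.
            max_tokens: 512,
            // Assumption: deterministic output is preferred for classification.
            temperature: 0.0,
        }
    }
}
```

Callers that need one field changed can use struct update syntax, for example `LlmOptions { max_tokens: 64, ..Default::default() }`.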
2.3 Domain Model
Inspired by Jellyfin's BaseItem, but using a Rust enum hierarchy instead of class inheritance.
// crates/lms-core/src/domain/item.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct MediaItem {
pub id: Uuid,
pub item_type: ItemType,
pub title: String,
pub sort_title: String,
pub file_path: PathBuf,
pub file_hash: String, // SHA-256, deduplication
pub duration_secs: Option<u32>,
pub created_at: DateTime<Utc>,
pub updated_at: DateTime<Utc>,
}
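// Copy is derived because every field is Copy and MetadataRouter::fetch passes
// `query.item_type` by value to `supports()`.
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq)]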
pub enum ItemType {
Movie,
Series,
Episode { season: u8, episode: u16, series_id: Uuid },
HomeVideo,
}
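// Deserialize is required because LlmRouter::classify parses this type with serde_json.
#[derive(Debug, Clone, Deserialize)]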
pub struct ClassificationResult {
pub item_type: ItemType,
pub confidence: f32, // 0.0–1.0; < 0.85 flags for manual review
pub llm_used: bool,
pub model: Option<String>,
}
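The schema in section 8 stores the item type as a text tag ('movie', 'series', 'episode', 'home_video'), while the enum carries episode data inline. One way to bridge the two is sketched below. The method name `db_tag` is hypothetical, and `Uuid` is replaced by a `u128` alias so the example compiles with the standard library alone.

```rust
/// Stand-in for `uuid::Uuid` (assumption, keeps the sketch std-only).
type Uuid = u128;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ItemType {
    Movie,
    Series,
    Episode { season: u8, episode: u16, series_id: Uuid },
    HomeVideo,
}

impl ItemType {
    /// Text tag written to `media_items.item_type` (hypothetical helper).
    pub fn db_tag(&self) -> &'static str {
        match self {
            ItemType::Movie => "movie",
            ItemType::Series => "series",
            // Season, episode and series id would go to `episode_info`, not here.
            ItemType::Episode { .. } => "episode",
            ItemType::HomeVideo => "home_video",
        }
    }
}
```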
2.4 Client Capabilities (mirrors Jellyfin DeviceProfile)
pub struct ClientCapabilities {
pub supported_containers: Vec<String>,
pub supported_video_codecs: Vec<String>,
pub supported_audio_codecs: Vec<String>,
pub max_bitrate_bps: u64,
}
3. Processing Pipeline
Inspired by Jellyfin's LibraryManager + TaskManager. Key improvement over Jellyfin: jobs are persisted to SQLite (Jellyfin uses a pure in-memory ConcurrentQueue), enabling crash recovery.
notify::Watcher (inotify)
│ IngestEvent { path, event_type }
▼
mpsc::channel<IngestEvent>
│
▼
┌──────────────────────────────────────────┐
│ IngestWorker │
│ 1. Validate file extension │
│ 2. SHA-256 fingerprint → check dup │
│ 3. Insert media_items (status: pending) │
│ 4. Insert processing_jobs record │
└──────────────────────┬───────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ ClassificationWorker │
│ 1. MediaResolver: parse filename │
│ - Regex match S##E## → Episode │
│ - Directory structure → Series/Movie │
│ 2. confidence >= 0.85 → classify │
│ 3. confidence < 0.85 → LlmRouter │
│ 4. Update media_items.item_type │
└──────────────────────┬───────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ MetadataWorker │
│ 1. MetadataRouter.fetch() │
│ Iterate providers by priority: │
│ TmdbProvider → TvdbProvider → ... │
│ 2. First Some() wins; stop iteration │
│ 3. All None → LlmFallbackProvider │
│ 4. Write to metadata table │
└──────────────────────┬───────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ ThumbnailWorker │
│ 1. metadata.poster_url exists → download│
│ 2. No poster → ffmpeg keyframe extract │
│ 3. Save to {config_dir}/thumbs/{id}.jpg │
└──────────────────────┬───────────────────┘
│
▼
processing_jobs.status = done
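The ClassificationWorker step in the diagram (accept the heuristic result at confidence >= 0.85, otherwise ask the LLM) can be sketched as a small function. All names here are hypothetical. The closure `ask_llm` stands in for LlmRouter::classify, and keeping the heuristic guess when the LLM returns nothing is an assumption, since the design does not say what happens in that case.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind { Movie, Series, Episode, HomeVideo }

#[derive(Debug, PartialEq)]
pub struct Classified {
    pub kind: Kind,
    pub confidence: f32,
    pub llm_used: bool,
}

/// Threshold taken from the pipeline diagram.
pub const LLM_THRESHOLD: f32 = 0.85;

/// `heuristic` is the MediaResolver guess; `ask_llm` is only invoked below the threshold.
pub fn classify(
    heuristic: (Kind, f32),
    ask_llm: impl FnOnce() -> Option<(Kind, f32)>,
) -> Classified {
    let (kind, confidence) = heuristic;
    if confidence >= LLM_THRESHOLD {
        return Classified { kind, confidence, llm_used: false };
    }
    match ask_llm() {
        Some((kind, confidence)) => Classified { kind, confidence, llm_used: true },
        // Assumption: with no LLM answer, fall back to the low-confidence guess.
        None => Classified { kind, confidence, llm_used: false },
    }
}
```

Passing the LLM call as a closure keeps it lazy, so a confident filename match never triggers a model request.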
Crash recovery: On startup, scan processing_jobs for rows with status = 'running' (jobs that were interrupted mid-flight), reset them to 'pending', and re-enqueue them.
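The recovery rule can be illustrated with an in-memory model of the job table. This is a sketch only: the `Job` struct, the `u32` ids, and the `recover` function are hypothetical, and a real implementation would run against SQLite, roughly as `UPDATE processing_jobs SET status = 'pending' WHERE status = 'running'`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JobStatus { Pending, Running, Done, Failed }

#[derive(Debug, Clone, PartialEq)]
pub struct Job {
    pub id: u32, // the schema uses TEXT ids; u32 keeps the sketch short
    pub status: JobStatus,
    pub attempts: u32,
}

/// Reset every job that a crash left in `Running` back to `Pending`
/// and return the ids that must be pushed onto the queue again.
pub fn recover(jobs: &mut [Job]) -> Vec<u32> {
    let mut requeue = Vec::new();
    for job in jobs.iter_mut() {
        if job.status == JobStatus::Running {
            job.status = JobStatus::Pending;
            requeue.push(job.id);
        }
    }
    requeue
}
```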
4. MediaResolver — Filename Parsing (ref: Emby.Naming)
// crates/lms-library/src/resolver.rs
pub struct MediaResolver {
episode_patterns: Vec<Regex>, // compiled once at startup
}
impl MediaResolver {
pub fn resolve(&self, path: &Path) -> ParsedMedia {
// 1. Directory structure analysis (ref: Jellyfin BaseVideoResolver)
if self.has_season_dirs(path) {
return ParsedMedia::Series { ... };
}
// 2. Filename regex match
if let Some(ep) = self.try_parse_episode(path) {
return ParsedMedia::Episode(ep);
}
// 3. Default: Movie
ParsedMedia::Movie { title: self.clean_title(path) }
}
}
// Regex patterns (ref: Jellyfin NamingOptions, 20+ common formats)
const EPISODE_PATTERNS: &[&str] = &[
r"[Ss](?P<s>\d{1,2})[\s._-]*[Ee](?P<e>\d{1,3})", // S01E01
r"Season\s*(?P<s>\d+)\s*Episode\s*(?P<e>\d+)", // Season 1 Episode 2
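r"(?P<y>\d{4})[\.\-](?P<m>\d{2})[\.\-](?P<d>\d{2})", // Date-based; listed before the absolute pattern, which would otherwise match "-03-" inside 2021-03-04
r"[\s_-](?P<e>\d{2,4})[\s_-]", // Absolute (anime); also matches a bare year such as " 2019 ", so it is tried last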
];
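To show what the first pattern accepts, here is a hand-written parser for the `S##E##` form. It deliberately swaps the regex for plain byte scanning so the example has no dependencies; it is not how MediaResolver is specified to work, and the function name `parse_sxxeyy` is hypothetical. It only recognises ASCII whitespace as a separator, whereas `\s` in the regex crate is Unicode-aware by default.

```rust
/// Find the first `S<1-2 digits>[separators]E<1-3 digits>` marker in a file name
/// and return `(season, episode)`.
pub fn parse_sxxeyy(name: &str) -> Option<(u8, u16)> {
    let b = name.as_bytes();
    for i in 0..b.len() {
        if b[i] != b'S' && b[i] != b's' {
            continue;
        }
        // Season: one or two digits directly after the S.
        let s_start = i + 1;
        let mut j = s_start;
        while j < b.len() && j - s_start < 2 && b[j].is_ascii_digit() {
            j += 1;
        }
        if j == s_start {
            continue;
        }
        let s_end = j;
        // Optional separators: whitespace, '.', '_' or '-'.
        while j < b.len() && (b[j].is_ascii_whitespace() || matches!(b[j], b'.' | b'_' | b'-')) {
            j += 1;
        }
        if j >= b.len() || (b[j] != b'E' && b[j] != b'e') {
            continue;
        }
        // Episode: one to three digits directly after the E.
        let e_start = j + 1;
        j = e_start;
        while j < b.len() && j - e_start < 3 && b[j].is_ascii_digit() {
            j += 1;
        }
        if j == e_start {
            continue;
        }
        let season: u8 = name[s_start..s_end].parse().ok()?;
        let episode: u16 = name[e_start..j].parse().ok()?;
        return Some((season, episode));
    }
    None
}
```

For example, `parse_sxxeyy("Show.Name.S01E02.1080p.mkv")` yields season 1, episode 2, while a movie name without such a marker yields `None` and falls through to the later resolver steps.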
5. MetadataRouter — Priority Routing (ref: Jellyfin ProviderManager)
// crates/lms-metadata/src/router.rs
pub struct MetadataRouter {
providers: Vec<Box<dyn MetadataProvider>>, // sorted by priority()
}
impl MetadataRouter {
pub fn new(mut providers: Vec<Box<dyn MetadataProvider>>) -> Self {
providers.sort_by_key(|p| p.priority());
Self { providers }
}
pub async fn fetch(&self, query: &MetadataQuery) -> Result<Option<MetadataResult>> {
for provider in &self.providers {
if !provider.supports(query.item_type) { continue; }
match provider.fetch(query).await {
Ok(Some(result)) => return Ok(Some(result)), // first hit wins
Ok(None) => continue,
Err(e) => { tracing::warn!("{} failed: {e}", provider.name()); continue; }
}
}
Ok(None)
}
}
Provider Priority Table
| Provider | priority | Item Types |
|---|---|---|
| TmdbProvider | 10 | Movie, Series, Episode |
| TvdbProvider | 20 | Series, Episode |
| AniDbProvider | 30 | Series (anime) |
| LlmFallbackProvider | 90 | All (last resort) |
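The effect of the priority table can be checked with a synchronous stand-in for the router. This is a sketch under simplifying assumptions: the real MetadataProvider trait is async and returns MetadataResult, whereas the `Provider` trait, the `Fixed` test provider, and `SyncRouter` below are hypothetical and return plain strings.

```rust
/// Synchronous stand-in for `MetadataProvider`.
pub trait Provider {
    fn name(&self) -> &str;
    /// Lower number = higher priority.
    fn priority(&self) -> u8 { 50 }
    fn fetch(&self, title: &str) -> Option<String>;
}

/// Hypothetical provider that only knows a fixed list of titles.
pub struct Fixed {
    pub name: &'static str,
    pub priority: u8,
    pub known: &'static [&'static str],
}

impl Provider for Fixed {
    fn name(&self) -> &str { self.name }
    fn priority(&self) -> u8 { self.priority }
    fn fetch(&self, title: &str) -> Option<String> {
        self.known.contains(&title).then(|| format!("{}:{}", self.name, title))
    }
}

pub struct SyncRouter {
    providers: Vec<Box<dyn Provider>>,
}

impl SyncRouter {
    /// Sort once at construction so lookups walk providers in priority order.
    pub fn new(mut providers: Vec<Box<dyn Provider>>) -> Self {
        providers.sort_by_key(|p| p.priority());
        Self { providers }
    }

    /// First provider that returns a result wins; later ones are not consulted.
    pub fn fetch(&self, title: &str) -> Option<String> {
        self.providers.iter().find_map(|p| p.fetch(title))
    }
}
```

Registering the LLM provider at priority 90 means it is reached only after every external provider has returned nothing, regardless of the order in which providers were passed to `new`.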
6. LlmRouter — Multi-Provider with Fallback
// crates/lms-llm/src/router.rs
pub struct LlmRouter {
primary: Box<dyn LlmProvider>,
fallback: Option<Box<dyn LlmProvider>>,
}
impl LlmRouter {
pub async fn complete(&self, prompt: &str, opts: &LlmOptions) -> Result<String> {
if self.primary.is_available().await {
return self.primary.complete(prompt, opts).await;
}
if let Some(fb) = &self.fallback {
tracing::warn!("Primary LLM unavailable, falling back to {}", fb.name());
return fb.complete(prompt, opts).await;
}
Err(LlmError::NoAvailableProvider)
}
pub async fn classify(&self, file_name: &str, context: &str) -> Result<ClassificationResult> {
let prompt = prompts::classification(file_name, context);
let raw = self.complete(&prompt, &LlmOptions::default()).await?;
serde_json::from_str(&raw).map_err(LlmError::ParseError)
}
}
// Provider implementations:
// crates/lms-llm/src/providers/
// ollama.rs → POST http://localhost:11434/api/generate
// claude.rs → POST https://api.anthropic.com/v1/messages
// openai.rs → POST https://api.openai.com/v1/chat/completions
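The fallback rule in LlmRouter::complete can be exercised with a synchronous model. The trait `Llm`, the `Stub` provider, and `SyncLlmRouter` are hypothetical simplifications (no async, errors as plain strings), intended only to make the control flow testable.

```rust
pub trait Llm {
    fn name(&self) -> &str;
    fn is_available(&self) -> bool;
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Hypothetical provider whose availability is fixed at construction.
pub struct Stub {
    pub name: &'static str,
    pub up: bool,
}

impl Llm for Stub {
    fn name(&self) -> &str { self.name }
    fn is_available(&self) -> bool { self.up }
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("[{}] {}", self.name, prompt))
    }
}

pub struct SyncLlmRouter {
    pub primary: Box<dyn Llm>,
    pub fallback: Option<Box<dyn Llm>>,
}

impl SyncLlmRouter {
    pub fn complete(&self, prompt: &str) -> Result<String, String> {
        // The primary is used whenever it reports itself available.
        if self.primary.is_available() {
            return self.primary.complete(prompt);
        }
        // Otherwise the fallback is tried without a second availability check,
        // matching the router code above.
        match &self.fallback {
            Some(fb) => fb.complete(prompt),
            None => Err("no available provider".to_string()),
        }
    }
}
```

As in the router above, the fallback is consulted only when `is_available()` is false; an error returned by an available primary is passed straight to the caller.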
7. StreamBuilder — Decision Tree (ref: Jellyfin StreamBuilder.cs)
// crates/lms-stream/src/builder.rs
pub enum StreamPlan {
DirectPlay { path: PathBuf },
Remux { path: PathBuf, target_container: String },
Transcode { path: PathBuf, video_codec: String, audio_codec: String, bitrate: u64 },
}
impl StreamBuilder {
pub async fn build(item: &MediaItem, probe: &MediaProbe, caps: &ClientCapabilities) -> StreamPlan {
if Self::can_direct_play(probe, caps) {
return StreamPlan::DirectPlay { path: item.file_path.clone() };
}
if Self::can_remux(probe, caps) {
return StreamPlan::Remux { path: item.file_path.clone(), target_container: "mp4".into() };
}
StreamPlan::Transcode {
path: item.file_path.clone(),
video_codec: "h264".into(),
audio_codec: "aac".into(),
bitrate: caps.max_bitrate_bps.min(8_000_000),
}
}
fn can_direct_play(probe: &MediaProbe, caps: &ClientCapabilities) -> bool {
caps.supported_containers.contains(&probe.container)
&& caps.supported_video_codecs.contains(&probe.video_codec)
&& caps.supported_audio_codecs.contains(&probe.audio_codec)
&& probe.bitrate_bps <= caps.max_bitrate_bps
}
fn can_remux(probe: &MediaProbe, caps: &ClientCapabilities) -> bool {
caps.supported_video_codecs.contains(&probe.video_codec)
&& caps.supported_audio_codecs.contains(&probe.audio_codec)
}
}
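A compact, dependency-free restatement of the decision tree makes its outcomes easy to check. The types `Probe`, `Caps`, and `Plan` below are trimmed stand-ins for MediaProbe, ClientCapabilities, and StreamPlan (paths and codec names omitted), so this is a sketch of the same rules rather than the crate's actual code.

```rust
pub struct Probe {
    pub container: String,
    pub video: String,
    pub audio: String,
    pub bitrate_bps: u64,
}

pub struct Caps {
    pub containers: Vec<String>,
    pub video: Vec<String>,
    pub audio: Vec<String>,
    pub max_bitrate_bps: u64,
}

#[derive(Debug, PartialEq)]
pub enum Plan {
    DirectPlay,
    Remux,
    Transcode { bitrate: u64 },
}

pub fn plan(p: &Probe, c: &Caps) -> Plan {
    let codecs_ok = c.video.contains(&p.video) && c.audio.contains(&p.audio);
    if codecs_ok && c.containers.contains(&p.container) && p.bitrate_bps <= c.max_bitrate_bps {
        // Container, codecs and bitrate all fit: serve the file untouched.
        Plan::DirectPlay
    } else if codecs_ok {
        // Codecs fit: repackage the streams without re-encoding.
        Plan::Remux
    } else {
        // Re-encode, capping the bitrate at 8 Mbit/s as in the code above.
        Plan::Transcode { bitrate: c.max_bitrate_bps.min(8_000_000) }
    }
}
```

One consequence worth noting: a file whose container and codecs are supported but whose bitrate exceeds the client limit falls through to Remux, which does not lower the bitrate. The design does not state whether that is intended.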
8. Database Schema (lms-db)
CREATE TABLE media_items (
id TEXT PRIMARY KEY,
item_type TEXT NOT NULL, -- 'movie'|'series'|'episode'|'home_video'
title TEXT NOT NULL,
sort_title TEXT NOT NULL,
file_path TEXT NOT NULL UNIQUE,
file_hash TEXT NOT NULL, -- SHA-256 deduplication
duration_s INTEGER,
status TEXT NOT NULL DEFAULT 'pending',
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL
);
CREATE TABLE episode_info (
item_id TEXT PRIMARY KEY REFERENCES media_items(id) ON DELETE CASCADE,
series_id TEXT REFERENCES media_items(id),
season_num INTEGER,
episode_num INTEGER NOT NULL
);
CREATE TABLE metadata (
item_id TEXT PRIMARY KEY REFERENCES media_items(id) ON DELETE CASCADE,
source TEXT NOT NULL, -- 'tmdb'|'tvdb'|'llm'
external_id TEXT,
overview TEXT,
genres TEXT, -- JSON array
cast_crew TEXT, -- JSON array
rating REAL,
poster_url TEXT,
backdrop_url TEXT,
year INTEGER,
llm_generated INTEGER NOT NULL DEFAULT 0, -- transparency flag
raw_json TEXT, -- cached API response
fetched_at TEXT NOT NULL
);
CREATE TABLE tags (
item_id TEXT REFERENCES media_items(id) ON DELETE CASCADE,
tag TEXT NOT NULL,
confidence REAL NOT NULL,
llm_model TEXT NOT NULL,
PRIMARY KEY (item_id, tag)
);
CREATE TABLE processing_jobs (
id TEXT PRIMARY KEY,
item_id TEXT REFERENCES media_items(id) ON DELETE CASCADE,
job_type TEXT NOT NULL, -- 'classify'|'metadata'|'thumbnail'
status TEXT NOT NULL DEFAULT 'pending',
attempts INTEGER NOT NULL DEFAULT 0,
error TEXT,
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL
);
CREATE INDEX idx_items_type ON media_items(item_type);
CREATE INDEX idx_items_hash ON media_items(file_hash);
CREATE INDEX idx_jobs_status ON processing_jobs(status);
CREATE INDEX idx_episode_series ON episode_info(series_id);
Repository split (ref: Jellyfin's ongoing service decomposition):
| Service | Responsibility |
|---|---|
| ItemRepository | CRUD on media_items |
| MetadataRepository | CRUD on metadata + tags |
| JobRepository | Job queue, crash recovery queries |
| SearchService | Full-text search (SQLite FTS5) |
| EpisodeRepository | Series/episode relationship queries |
9. REST API
All responses use Content-Type: application/json, except the streaming and thumbnail endpoints, which return media bytes. Errors have this shape:
{ "error": "NOT_FOUND", "message": "Item 123 not found" }
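One way to produce this error shape consistently is a single error enum that knows its HTTP status and its JSON body. The sketch below is an assumption-laden illustration: only the NOT_FOUND code appears in this document, so the other variants and codes are hypothetical, and the hand-written escaping covers only quotes and backslashes (a real handler would serialize with serde_json).

```rust
#[derive(Debug)]
pub enum ApiError {
    NotFound(String),
    BadRequest(String), // hypothetical variant
    Internal(String),   // hypothetical variant
}

impl ApiError {
    /// HTTP status code to send with the body.
    pub fn status(&self) -> u16 {
        match self {
            ApiError::NotFound(_) => 404,
            ApiError::BadRequest(_) => 400,
            ApiError::Internal(_) => 500,
        }
    }

    /// Machine-readable value for the "error" field.
    pub fn code(&self) -> &'static str {
        match self {
            ApiError::NotFound(_) => "NOT_FOUND",
            ApiError::BadRequest(_) => "BAD_REQUEST",
            ApiError::Internal(_) => "INTERNAL",
        }
    }

    /// Render the `{ "error": ..., "message": ... }` body.
    pub fn to_json(&self) -> String {
        let msg = match self {
            ApiError::NotFound(m) | ApiError::BadRequest(m) | ApiError::Internal(m) => m,
        };
        format!("{{\"error\":\"{}\",\"message\":\"{}\"}}", self.code(), escape(msg))
    }
}

/// Minimal escaping for the sketch: backslashes and double quotes only.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}
```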
# Library
GET /api/library list items (paginated, type/genre filter)
GET /api/library/:id item detail with metadata + tags
POST /api/library/scan trigger full rescan
DELETE /api/library/:id remove from library (does not delete file)
# Streaming
GET /api/stream/:id video stream (Range request support)
GET /api/stream/:id/thumbnail thumbnail image (JPEG)
# Search
GET /api/search?q=&type=&genre=&year=
# Jobs
GET /api/jobs list all jobs + status
GET /api/jobs/:id
POST /api/jobs/:id/retry retry failed job
# Classification
POST /api/classify/:id force LLM reclassification
# Config
GET /api/config current config (API keys redacted)
PATCH /api/config partial update
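Range request support on GET /api/stream/:id means interpreting the `Range` header before reading the file. The sketch below parses the single-range `bytes=` forms defined by HTTP; the function name `parse_range` is hypothetical, multi-range requests are rejected for brevity, and surrounding whitespace is not trimmed.

```rust
/// Parse a `Range` header value against a file of `len` bytes.
/// Returns inclusive `(start, end)` byte offsets, or `None` when the
/// header is unsupported or cannot be satisfied.
pub fn parse_range(header: &str, len: u64) -> Option<(u64, u64)> {
    let spec = header.strip_prefix("bytes=")?;
    if spec.contains(',') || len == 0 {
        return None; // multi-range and empty files are not handled here
    }
    let (first, last) = spec.split_once('-')?;
    let (start, end) = if first.is_empty() {
        // Suffix form "-N": the final N bytes of the file.
        let n: u64 = last.parse().ok()?;
        if n == 0 {
            return None;
        }
        (len.saturating_sub(n), len - 1)
    } else {
        let start: u64 = first.parse().ok()?;
        // Open-ended "N-" runs to the end; an oversized end is clamped.
        let end = if last.is_empty() {
            len - 1
        } else {
            last.parse::<u64>().ok()?.min(len - 1)
        };
        (start, end)
    };
    (start <= end).then_some((start, end))
}
```

A satisfiable range is answered with status 206 and a `Content-Range: bytes start-end/len` header; an unsatisfiable one with status 416.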
10. Configuration (TOML)
[server]
host = "0.0.0.0"
port = 3000
[library]
paths = ["/media/movies", "/media/tv"]
scan_interval_secs = 3600
[metadata]
tmdb_api_key = ""
tvdb_api_key = ""
provider_order = ["tmdb", "tvdb", "llm"]
[llm]
default_provider = "ollama"
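fallback_provider = "ollama" # same as default_provider here, so there is no real failover; point this at a different provider (e.g. "claude") to get one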
[llm.ollama]
base_url = "http://localhost:11434"
model = "llama3.2"
[llm.claude]
api_key = ""
model = "claude-sonnet-4-6"
[llm.openai]
api_key = ""
base_url = "https://api.openai.com/v1"
model = "gpt-4o"
[streaming]
transcode_dir = "/tmp/lms-transcode"
max_concurrent_jobs = 2
[db]
path = "/data/lms.db"
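The compose file in section 11 sets LMS_DB_PATH=/config/lms.db while this TOML sets path = "/data/lms.db", so the two sources need a precedence rule. The design does not state one; the sketch below assumes that a non-empty environment variable overrides the TOML value. The function name `resolve_db_path` is hypothetical.

```rust
use std::path::PathBuf;

/// Choose the database path. Assumption: a non-empty `LMS_DB_PATH`
/// takes precedence over `[db].path` from the TOML file.
/// The environment value is passed in so the function stays easy to test.
pub fn resolve_db_path(env_value: Option<&str>, toml_path: &str) -> PathBuf {
    match env_value {
        Some(v) if !v.trim().is_empty() => PathBuf::from(v),
        _ => PathBuf::from(toml_path),
    }
}
```

At startup this could be called as `resolve_db_path(std::env::var("LMS_DB_PATH").ok().as_deref(), &config_db_path)`, where `config_db_path` is whatever was read from the TOML file.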
11. Docker Deployment
# docker-compose.yml
services:
  lms:
    build:
      context: ..                     # workspace root, so COPY . . sees Cargo.toml and crates/
      dockerfile: docker/Dockerfile
    ports:
      - "3000:3000"
    volumes:
      - ./config:/config              # config + SQLite DB
      - /your/media:/media:ro         # media library (read-only)
      - /tmp/lms-transcode:/transcode
    environment:
      - LMS_DB_PATH=/config/lms.db
      - LMS_CONFIG=/config/lms.toml
    restart: unless-stopped
  ollama:
    image: ollama/ollama
    volumes:
      - ollama-data:/root/.ollama
    ports:
      - "11434:11434"
volumes:
  ollama-data:
# Dockerfile
FROM rust:1.82-slim AS builder
WORKDIR /app
COPY . .
RUN cargo build --release --bin lms-server
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y ffmpeg ca-certificates && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/lms-server /usr/local/bin/lms-server
EXPOSE 3000
ENTRYPOINT ["lms-server"]
12. Key Crate Dependencies
# lms-server
axum = "0.7"
tokio = { version = "1", features = ["full"] }
tower-http = { version = "0.5", features = ["cors", "trace"] }
tracing = "0.1"
tracing-subscriber = "0.3"
# lms-core
serde = { version = "1", features = ["derive"] }
serde_json = "1"
uuid = { version = "1", features = ["v4"] }
chrono = { version = "0.4", features = ["serde"] }
async-trait = "0.1"
thiserror = "1"
# lms-library
notify = "6"
regex = "1"
# lms-db
sqlx = { version = "0.7", features = ["sqlite", "runtime-tokio", "migrate", "uuid", "chrono"] }
# lms-llm / lms-metadata
reqwest = { version = "0.12", features = ["json"] }
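# lms-media
tokio = { version = "1", features = ["process"] } # tokio::process; the standalone tokio-process crate is deprecated and has no 1.x release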
13. Key Differences from Jellyfin
| Concern | Jellyfin | This Project |
|---|---|---|
| Language | C# / .NET 9 | Rust |
| Metadata | External providers primary | External providers + LLM fallback |
| Job queue | In-memory ConcurrentQueue | SQLite-persisted, crash-recoverable |
| Plugin system | Dynamic assembly loading | Compiled-in, no dynamic loading in MVP |
| Frontend | Built-in web client | TBD / separate, pure REST API |
| Subtitles | Supported | Out of scope |
| Config format | XML (JSON for logging) | TOML |