express.js · yt-dlp · node.js · cookie-based-auth

How I Built a YouTube Downloader API with Quality Selection

How I built a REST API that downloads YouTube videos at selectable quality with authentication — architecture decisions, key challenges, and what I'd do differently.


Suhail Roushan

May 8, 2026 · 4 min read

I built a YouTube downloader API with quality selection to solve a specific problem: downloading educational content for offline review. The API lets authenticated users fetch videos in their preferred resolution through a simple REST endpoint, using yt-dlp for reliable extraction and Express.js for the web layer.

Architecture Overview

The system follows a straightforward three-layer design. A client sends an authenticated request, the Express server validates it and queues a download job, and yt-dlp handles the platform-specific heavy lifting. I chose cookie-based authentication for simplicity in a controlled environment.

graph TD
    A[Client Request] --> B[Auth Middleware]
    B --> C[Express Router]
    C --> D[Download Controller]
    D --> E[yt-dlp Wrapper]
    E --> F[File System]
    F --> G[Stream Response]

This flow ensures the download logic is isolated. The wrapper around yt-dlp manages child processes and parses its JSON output, which is crucial for quality selection.

Key Technical Decisions

Using yt-dlp over other libraries was the first major decision. Its active maintenance and format selection capabilities were essential. I wrapped its command-line interface in a Promise-based function to keep the API asynchronous.

import { exec } from 'child_process';
import { promisify } from 'util';

const execAsync = promisify(exec);

interface DownloadResult {
  filePath: string;
  title: string;
  selectedQuality: string;
}

async function downloadVideo(
  url: string,
  qualityCode: string = 'best'
): Promise<DownloadResult> {
  // -f merges best video+audio or selects a specific format;
  // quoting the URL keeps shell metacharacters from being interpreted
  const command = `yt-dlp -f "${qualityCode}" --print-json -o "downloads/%(title)s.%(ext)s" "${url}"`;

  try {
    const { stdout } = await execAsync(command);
    const metadata = JSON.parse(stdout);

    return {
      filePath: metadata._filename,
      title: metadata.title,
      selectedQuality: metadata.format_id,
    };
  } catch (error) {
    // catch variables are typed `unknown` in TypeScript, so narrow first
    const message = error instanceof Error ? error.message : String(error);
    throw new Error(`Download failed: ${message}`);
  }
}

The second decision was implementing cookie-based auth instead of JWT. For this internal tool, sessions were simpler. I used Express middleware to protect routes.

// authMiddleware.js
import { verifySession } from '../auth/session-store.js';

export function requireAuth(req, res, next) {
  const sessionId = req.cookies.sessionId;
  // look up the session once and reuse the result
  const session = sessionId ? verifySession(sessionId) : null;

  if (!session) {
    return res.status(401).json({ error: 'Authentication required' });
  }

  req.userId = session.userId;
  next();
}
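The session-store module itself isn't shown in the post; a minimal in-memory sketch that matches the imported names (`createSession` and `verifySession` — the token size and TTL are my assumptions) could look like this:

```typescript
import { randomBytes } from 'crypto';

interface Session {
  userId: string;
  expiresAt: number;
}

// Hypothetical in-memory store; a real deployment would persist sessions.
const sessions = new Map<string, Session>();

export function createSession(userId: string, ttlMs = 24 * 60 * 60 * 1000): string {
  const sessionId = randomBytes(32).toString('hex');
  sessions.set(sessionId, { userId, expiresAt: Date.now() + ttlMs });
  return sessionId;
}

export function verifySession(sessionId: string): Session | null {
  const session = sessions.get(sessionId);
  if (!session || session.expiresAt < Date.now()) {
    sessions.delete(sessionId); // drop expired entries lazily
    return null;
  }
  return session;
}
```

An in-memory Map loses all sessions on restart, which is acceptable for an internal tool but not for anything user-facing.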

What Broke and How I Fixed It

The first major breakage was memory exhaustion during concurrent downloads. yt-dlp buffers video data, and multiple requests crashed the server. I implemented a simple queue with a maximum of two concurrent downloads.

class DownloadQueue {
  private queue: Array<() => Promise<any>> = [];
  private active = 0;
  private maxConcurrent = 2;

  async add<T>(job: () => Promise<T>): Promise<T> {
    return new Promise((resolve, reject) => {
      const wrappedJob = async () => {
        try {
          const result = await job();
          resolve(result);
        } catch (error) {
          reject(error);
        } finally {
          this.active--;
          this.next();
        }
      };

      this.queue.push(wrappedJob);
      this.next();
    });
  }

  private next() {
    if (this.active >= this.maxConcurrent || this.queue.length === 0) return;

    this.active++;
    const job = this.queue.shift()!; // non-null: length checked above
    job();
  }
}

export const globalDownloadQueue = new DownloadQueue();
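To sanity-check the pattern, here is a function-form equivalent of the same limiter (a sketch, not the production class), exercised with timer-based fake jobs instead of real yt-dlp calls:

```typescript
// Function-form sketch of the two-at-a-time limiter.
function createLimiter(maxConcurrent: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  const next = () => {
    if (active >= maxConcurrent || waiting.length === 0) return;
    active++;
    waiting.shift()!();
  };

  return <T>(job: () => Promise<T>): Promise<T> =>
    new Promise<T>((resolve, reject) => {
      waiting.push(() => {
        job()
          .then(resolve, reject)
          .finally(() => {
            active--;
            next();
          });
      });
      next();
    });
}

// Fake "downloads" that record how many jobs ran at once.
const run = createLimiter(2);
let current = 0;
let peak = 0;
const fakeJob = () =>
  run(async () => {
    current++;
    peak = Math.max(peak, current);
    await new Promise((r) => setTimeout(r, 10));
    current--;
  });
```

Firing five `fakeJob()` calls at once should leave `peak` at exactly 2: the queue never lets a burst of requests spawn more than two concurrent workers.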

The second issue was filename sanitization. YouTube titles can contain special characters that break filesystem paths. I now clean titles before passing them to yt-dlp's output template.

function sanitizeFilename(title) {
  return title.replace(/[<>:"/\\|?*]/g, '_').substring(0, 100);
}

// Used in command: -o "downloads/${sanitizeFilename(title)}.%(ext)s"

How to Build Something Similar

Start by testing yt-dlp commands manually in your terminal. Understand its format selection syntax—yt-dlp -F [url] lists all available formats. Once you can reliably download what you need, wrap it in a Node.js script.

Your Express server needs just two main endpoints: /api/formats to list qualities and /api/download to fetch the video. Use the express-rate-limit middleware immediately to prevent abuse. Store downloads in a temporary directory with cron job cleanup.
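The cleanup step can be a small script invoked from cron. This is a hedged sketch; the directory layout, return value, and age threshold are my assumptions, not the post's code:

```typescript
import { readdir, stat, unlink } from 'fs/promises';
import { join } from 'path';

// Delete downloaded files older than maxAgeMs and return their names.
// Intended to run periodically, e.g. hourly from cron.
async function cleanupDownloads(dir: string, maxAgeMs: number): Promise<string[]> {
  const removed: string[] = [];
  const cutoff = Date.now() - maxAgeMs;

  for (const name of await readdir(dir)) {
    const filePath = join(dir, name);
    const info = await stat(filePath);
    if (info.isFile() && info.mtimeMs < cutoff) {
      await unlink(filePath);
      removed.push(name);
    }
  }
  return removed;
}
```

Returning the removed names makes the job easy to log, so you can tell at a glance whether cleanup is keeping up with download volume.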

For authentication, begin with a simple hardcoded API key if it's for personal use. You can evolve to a database later. The core complexity lies in streaming large files back efficiently; use res.download() or create read streams with proper error handling.
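For the streaming piece, one option is to pipe a read stream into the response; an Express `res` is a writable stream, and Node's `pipeline` rejects on errors from either side instead of leaking file descriptors. A minimal sketch (the helper name is mine):

```typescript
import { createReadStream } from 'fs';
import { pipeline } from 'stream/promises';
import type { Writable } from 'stream';

// Stream a completed download to any writable sink without buffering
// the whole file in memory; rejects on read or write errors.
async function streamDownload(filePath: string, sink: Writable): Promise<void> {
  await pipeline(createReadStream(filePath), sink);
}
```

In a route handler you would set `Content-Disposition` before calling this, and translate a rejection into a 404 or 500 depending on the error code.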

Would I Build It the Same Way Again?

For a personal tool, yes. yt-dlp remains the most reliable YouTube extraction library, and Express is perfect for simple APIs. However, for a public service, I'd make three changes: use object storage instead of local files, switch to JWT for stateless auth, and implement a proper task queue like Bull for job management.

The cookie-based auth works well when the client is a browser you control. For mobile or third-party clients, tokens are better. I'd also consider adding a progress WebSocket endpoint for long downloads, since videos can take minutes to process.

Always rate-limit by user and implement format validation—never pass raw user input to yt-dlp's -f flag without checking against a list of allowed quality codes. You can fetch this list dynamically from yt-dlp's JSON output for each video.
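A minimal sketch of that validation, assuming a hardcoded allowlist (in practice you would build the set from the format_id values in yt-dlp's JSON output for the requested video):

```typescript
// Hypothetical allowlist; real entries come from yt-dlp's per-video output.
const ALLOWED_QUALITIES = new Set(['best', 'bestvideo', 'bestaudio', '137', '136', '135']);

function validateQuality(code: string): string {
  if (!ALLOWED_QUALITIES.has(code)) {
    throw new Error(`Unsupported quality code: ${code}`);
  }
  return code; // safe to interpolate into the -f flag
}
```

Rejecting unknown codes up front means a crafted value like `137; rm -rf /` never reaches the shell command.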

Before you start, know that YouTube's terms of service prohibit automated downloading without permission—build this only for content you own or have explicit rights to access.


Written by Suhail Roushan — Full-stack developer. More posts on AI, Next.js, and building products at suhailroushan.com/blog.
