next.js · github-api · typescript · mongodb

Building a GitHub Developer Search Tool: How I Built It and What I Learned

How I built a tool that searches GitHub profiles by tech stack and keywords to help HR find developers — architecture decisions, key challenges, and what I'd do differently.


Suhail Roushan

May 6, 2026 · 5 min read

I built a GitHub developer search tool to help recruiters find developers by tech stack, not just keywords. It uses the GitHub API to search profiles, analyzes repositories for technologies, and stores enriched data for faster searches. The goal was to solve a real pain point: matching candidates by what they actually build with, not just what’s in their bio.

Architecture Overview

The system has three main layers: a Next.js frontend, a backend API layer that talks to GitHub, and a MongoDB database for caching enriched profiles. The frontend sends search queries (like “React + TypeScript + Hyderabad”). The backend fetches basic user data from GitHub’s search API, then enriches each profile by fetching their repositories to detect technologies from package.json or similar files. Enriched profiles are stored to avoid hitting GitHub rate limits on repeat searches.

graph TD
    A[Next.js Frontend] -->|Search Query| B[API Route Handler]
    B -->|Basic Search| C[GitHub Search API]
    C -->|User List| D[Profile Enricher]
    D -->|Fetch Repos| E[GitHub Repo API]
    D -->|Parse package.json| F[Tech Detection]
    F -->|Save| G[MongoDB Cache]
    G -->|Return Results| A

This flow means the first search for a query might take a few seconds, but subsequent searches are fast because profiles are cached with their detected tech stack.
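The cache-first step can be sketched as a split between profiles that are already enriched and logins that still need work. This is a minimal illustration, not the production code: the EnrichedProfile shape, the partitionByCache name, and the use of a Map in place of the MongoDB collection are all assumptions made so the example runs on its own.

```typescript
// Hypothetical shape of a cached, enriched profile (field names are assumptions).
interface EnrichedProfile {
  login: string;
  technologies: string[];
  enrichedAt: number; // epoch milliseconds
}

// Split search results into cache hits and logins that still need enrichment.
// A Map stands in for the MongoDB collection so the sketch is self-contained.
function partitionByCache(
  logins: string[],
  cache: Map<string, EnrichedProfile>
): { hits: EnrichedProfile[]; misses: string[] } {
  const hits: EnrichedProfile[] = [];
  const misses: string[] = [];
  for (const login of logins) {
    // GitHub logins are case-insensitive, so keys are stored lower-cased.
    const cached = cache.get(login.toLowerCase());
    if (cached) {
      hits.push(cached);
    } else {
      misses.push(login);
    }
  }
  return { hits, misses };
}

const profileCache = new Map<string, EnrichedProfile>([
  ['octocat', { login: 'octocat', technologies: ['react', 'typescript'], enrichedAt: Date.now() }],
]);
const lookup = partitionByCache(['octocat', 'new-user'], profileCache);
console.log(lookup.hits.length, lookup.misses); // one hit, one miss
```

Only the misses need to go through the slow GitHub path, which is why repeat searches come back quickly.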

Key Technical Decisions

I chose to use GitHub’s REST API over GraphQL for the main search because the search endpoints are more mature and the query syntax for location and keywords is well-documented. However, I used GraphQL for fetching repository details because you can request exactly the fields you need in a single call, which is critical when analyzing up to 100 repos per user.
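To make the GraphQL side concrete, here is a rough sketch of a request body that asks for up to 100 repositories in one call. The buildRepoQuery name and the exact field selection are assumptions for illustration; this version only reads a package.json at the repository root, and the body would be sent as a POST to https://api.github.com/graphql with the same bearer token.

```typescript
// Body for a POST to GitHub's GraphQL endpoint.
// buildRepoQuery is a hypothetical helper; the field selection is one plausible choice.
interface GraphQLRequestBody {
  query: string;
  variables: { login: string };
}

const REPO_QUERY = `
  query ($login: String!) {
    user(login: $login) {
      repositories(first: 100, ownerAffiliations: OWNER, isFork: false) {
        nodes {
          name
          pushedAt
          primaryLanguage { name }
          manifest: object(expression: "HEAD:package.json") {
            ... on Blob { text }
          }
        }
      }
    }
  }
`;

// Pairs the static query with the login variable for a single user.
function buildRepoQuery(login: string): GraphQLRequestBody {
  return { query: REPO_QUERY, variables: { login } };
}

const requestBody = buildRepoQuery('octocat');
console.log(JSON.stringify(requestBody.variables));
```

Passing the login as a variable rather than interpolating it into the query string avoids escaping problems with unusual usernames.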

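// Using GitHub's REST API for the initial user search
const searchUsers = async (techStack: string[], location?: string) => {
  // Quote each term so multi-word values stay together in the query
  const queryParts = [
    ...techStack.map(tech => `"${tech}"`),
    location ? `location:"${location}"` : null
  ].filter((part): part is string => part !== null);

  const response = await fetch(
    `https://api.github.com/search/users?q=${encodeURIComponent(queryParts.join(' '))}&per_page=50`,
    {
      headers: {
        Accept: 'application/vnd.github+json',
        Authorization: `Bearer ${process.env.GITHUB_TOKEN}`
      }
    }
  );
  // Surface rate-limit and auth failures instead of treating an error body as results
  if (!response.ok) {
    throw new Error(`GitHub user search failed with status ${response.status}`);
  }
  return response.json();
};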

Another key decision was implementing a background enrichment queue. When search results return basic user data, I immediately respond to the client but queue a job to fetch and analyze their repositories. This keeps the UI responsive while ensuring data improves over time.
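A minimal sketch of the queue idea, assuming an in-memory structure (the EnrichmentQueue class and its methods are hypothetical names, not the code from the project). It de-duplicates logins so the same user is not enriched twice at once. A real deployment would likely want a durable queue, since on serverless hosts work started after the response is sent may not get to finish.

```typescript
// In-memory stand-in for a background job queue (names are hypothetical).
class EnrichmentQueue {
  private pending: string[] = [];
  private seen = new Set<string>();

  // Adds logins that are not already queued; returns how many were added.
  enqueue(logins: string[]): number {
    let added = 0;
    for (const login of logins) {
      const key = login.toLowerCase();
      if (!this.seen.has(key)) {
        this.seen.add(key);
        this.pending.push(key);
        added++;
      }
    }
    return added;
  }

  get size(): number {
    return this.pending.length;
  }

  // Processes queued logins one at a time; a failed job does not stop the rest.
  async drain(worker: (login: string) => Promise<void>): Promise<number> {
    let processed = 0;
    while (this.pending.length > 0) {
      const login = this.pending.shift() as string;
      try {
        await worker(login);
        processed++;
      } catch (error) {
        console.error(`Enrichment failed for ${login}`, error);
      } finally {
        // Allow the login to be queued again later, e.g. for re-enrichment.
        this.seen.delete(login);
      }
    }
    return processed;
  }
}

const enrichmentQueue = new EnrichmentQueue();
enrichmentQueue.enqueue(['octocat', 'Octocat', 'hubot']);
console.log(enrichmentQueue.size); // 2
```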

What Broke and How I Fixed It

The first major issue was GitHub’s rate limiting on the Search API. Search has its own, much lower quota than the rest of the REST API, and bursts of requests can also trigger secondary rate limits. Even with authentication, I was hitting limits when users experimented with multiple quick searches. The fix was twofold: implement a Redis-based request queue that spaced out API calls, and add aggressive caching of search results in MongoDB with a 24-hour TTL.
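The spacing logic itself is small. Below is a sketch of it as a single-process class, standing in for the Redis-based queue (RequestSpacer is a hypothetical name, and the 2000 ms interval assumes roughly 30 search requests per minute, which is worth checking against GitHub's current documentation). The clock is passed in so the behaviour is easy to verify.

```typescript
// Computes how long each request should wait so calls are at least
// minIntervalMs apart. Hypothetical, in-process stand-in for a shared queue.
class RequestSpacer {
  private nextSlot = 0;
  private readonly minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Reserves the next free slot and returns the delay (ms) the caller should wait.
  reserve(now: number = Date.now()): number {
    const slot = Math.max(now, this.nextSlot);
    this.nextSlot = slot + this.minIntervalMs;
    return slot - now;
  }
}

// Assumption: about 30 search requests per minute, so one every 2000 ms.
const spacer = new RequestSpacer(2000);
const startMs = 1000000;
console.log(spacer.reserve(startMs), spacer.reserve(startMs), spacer.reserve(startMs + 5000));
// 0 2000 0
```

A caller would sleep for the returned delay before each GitHub request, for example with a setTimeout wrapped in a Promise. Across multiple server instances the slot counter has to live somewhere shared, which is where Redis comes in.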

The second problem was tech detection from repositories. Many developers don’t have package.json in their root directory, or they use monorepos. My initial approach only checked the root, which missed technologies in about 40% of cases. I fixed it by scanning up to three directory levels deep and checking for common config files.

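// Improved tech detection that searches deeper, up to three directory levels
interface RepoItem {
  name: string;
  type: string;
  url: string;
  download_url: string | null;
}

const MAX_DEPTH = 3;

const findPackageJson = async (repoContent: RepoItem[], depth = 1): Promise<string | null> => {
  // Check files at this level first so a root package.json wins over nested ones
  const manifest = repoContent.find(item => item.type === 'file' && item.name === 'package.json');
  if (manifest) return manifest.download_url;
  if (depth >= MAX_DEPTH) return null;

  for (const item of repoContent) {
    if (item.type === 'dir') {
      // fetchRepoContent (not shown) lists a directory's contents
      const subContent = await fetchRepoContent(item.url);
      const found = await findPackageJson(subContent, depth + 1);
      if (found) return found;
    }
  }
  return null;
};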
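Once a package.json has been located and downloaded, turning it into a tech stack is mostly a lookup. A minimal sketch, assuming a small hand-maintained table of package names (the table contents and the detectTechFromPackageJson name are illustrative, not the project's actual list):

```typescript
// Maps npm package names to technology labels. Illustrative, not exhaustive.
const KNOWN_PACKAGES = new Map<string, string>([
  ['react', 'React'],
  ['next', 'Next.js'],
  ['vue', 'Vue'],
  ['express', 'Express'],
  ['typescript', 'TypeScript'],
  ['mongoose', 'MongoDB'],
  ['mongodb', 'MongoDB'],
]);

// Returns the sorted, de-duplicated technologies found in a package.json string.
// Invalid or non-object JSON yields an empty list rather than throwing.
function detectTechFromPackageJson(text: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return [];
  }
  if (parsed === null || typeof parsed !== 'object') return [];
  const manifest = parsed as {
    dependencies?: Record<string, string>;
    devDependencies?: Record<string, string>;
  };
  const names = [
    ...Object.keys(manifest.dependencies ?? {}),
    ...Object.keys(manifest.devDependencies ?? {}),
  ];
  const found = new Set<string>();
  for (const name of names) {
    const label = KNOWN_PACKAGES.get(name);
    if (label) found.add(label);
  }
  return Array.from(found).sort();
}

const samplePackageJson = JSON.stringify({
  dependencies: { react: '^18.0.0', next: '14.0.0' },
  devDependencies: { typescript: '^5.0.0' },
});
console.log(detectTechFromPackageJson(samplePackageJson));
// [ 'Next.js', 'React', 'TypeScript' ]
```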

How to Build Something Similar

Start with a simple Next.js API route that searches GitHub users. Use the /search/users endpoint, passing location and keywords as qualifiers inside the q parameter. Authenticate with a personal access token stored in environment variables; authenticated requests get much higher rate limits than anonymous ones.

Your MVP should return basic profiles first. Then add the enrichment step: for each user, fetch their public repositories and look for package.json, requirements.txt, or Cargo.toml to detect technologies. Store the results in any database (I used MongoDB for its flexible schema) to avoid re-fetching the same data.
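For the non-JavaScript manifests, the same idea applies with a different parser per file. Here is a rough sketch for requirements.txt, assuming the simple one-package-per-line form (the MANIFEST_ECOSYSTEMS table and the parseRequirements name are hypothetical, and editable installs or URL-based requirements are skipped rather than handled):

```typescript
// Manifest file names mapped to the ecosystem they indicate (illustrative list).
const MANIFEST_ECOSYSTEMS = new Map<string, string>([
  ['package.json', 'JavaScript/TypeScript'],
  ['requirements.txt', 'Python'],
  ['Cargo.toml', 'Rust'],
]);

// Extracts package names from requirements.txt text, ignoring comments,
// blank lines, option lines such as "-r other.txt", and bare URLs.
function parseRequirements(text: string): string[] {
  const names: string[] = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const hash = rawLine.indexOf('#');
    const line = (hash === -1 ? rawLine : rawLine.slice(0, hash)).trim();
    if (line === '' || line.startsWith('-')) continue;
    // Skip lines that begin with a URL scheme, e.g. git+https://...
    if (/^[A-Za-z][A-Za-z0-9+.-]*:\/\//.test(line)) continue;
    // The package name is everything before a version specifier or extras.
    const match = /^[A-Za-z0-9][A-Za-z0-9._-]*/.exec(line);
    if (match) names.push(match[0].toLowerCase());
  }
  return names;
}

console.log(parseRequirements('flask==2.3.0\n# dev tools\nrequests>=2.0  # http client\nNumPy'));
// [ 'flask', 'requests', 'numpy' ]
```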

// Basic starting point in a Next.js API route
export async function GET(request: Request) {
  const { searchParams } = new URL(request.url);
  const query = (searchParams.get('q') || '').trim();

  // Reject empty queries before spending any GitHub search quota
  if (!query) {
    return Response.json({ error: 'Missing search query "q"' }, { status: 400 });
  }

  const users = await fetchGitHubUsers(query);
  // Return immediately, enrich in background
  queueEnrichmentJob(users.items);

  return Response.json({ users: users.items });
}

Focus on making one search type work well before adding filters. I started with just tech stack and location, then added availability flags and repository activity filters later.

Would I Build It the Same Way Again?

For the core architecture, yes. Next.js API routes were perfect for this—they handle serverless deployment easily and keep the frontend and backend in one project. MongoDB was the right choice because the profile schema evolved constantly as I added more detection logic.

I would change the tech detection approach. Instead of only parsing config files, I’d use GitHub’s language statistics combined with file heuristics. This gives a more accurate picture of what technologies a developer uses most frequently. I’d also add continuous re-enrichment: profiles should update automatically when users push new code, not just when someone searches for them.
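As a sketch of what the language-statistics half could look like: GitHub's REST endpoint for repository languages returns a map of language name to bytes of code, so per-repo results can be summed into a profile-level breakdown. The aggregateLanguages name and the plain byte weighting are assumptions; weighting by bytes favours verbose languages, so a real version might also factor in recency or repository count.

```typescript
// Each entry mirrors the response of GitHub's "list repository languages"
// endpoint: language name -> bytes of code in that language.
type LanguageBytes = Record<string, number>;

interface LanguageShare {
  language: string;
  share: number; // fraction of all bytes, between 0 and 1
}

// Sums bytes per language across repositories and returns shares, largest first.
function aggregateLanguages(repos: LanguageBytes[]): LanguageShare[] {
  const totals = new Map<string, number>();
  let allBytes = 0;
  for (const repo of repos) {
    for (const [language, bytes] of Object.entries(repo)) {
      totals.set(language, (totals.get(language) ?? 0) + bytes);
      allBytes += bytes;
    }
  }
  if (allBytes === 0) return [];
  return Array.from(totals, ([language, bytes]) => ({ language, share: bytes / allBytes }))
    .sort((a, b) => b.share - a.share);
}

const languageShares = aggregateLanguages([
  { TypeScript: 6000, CSS: 1000 },
  { TypeScript: 2000, Python: 1000 },
]);
console.log(languageShares[0]); // { language: 'TypeScript', share: 0.8 }
```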

The one thing you should know before starting: GitHub’s user search doesn’t look inside repositories, so you must fetch and analyze repositories separately. This is the most time-consuming part but also where the real value is.

Written by Suhail Roushan — Full-stack developer. More posts on AI, Next.js, and building products at suhailroushan.com/blog.
