Recruit AI is a two-sided marketplace matching platform implementing a retrieve-and-rerank pipeline for recruitment. ANN vector search retrieves candidate matches with pre-filtered hard constraints, then a multi-factor scoring model re-ranks results — the same pattern used by Netflix, Spotify, and modern search engines.
The system is bidirectional: the same scoring function ranks talents for jobs and jobs for talents. Built on an Effect.ts hexagonal architecture (ports & adapters) where the core domain has zero infrastructure dependencies — vector DB, LLM, and Postgres are all swappable adapters with compile-time dependency guarantees.
Retrieve-and-Rerank Pipeline
The matching engine implements the classic two-phase retrieval pattern: ANN retrieves top-50 candidates, then a multi-factor scoring model re-ranks to top-10. Current scoring weights are a baseline — the architecture is designed for eval-driven optimization with recruiter-labeled ground truth data.
Phase 1: Retrieval with Pre-Filtering
- Pre-Filtered ANN Search: Qdrant filters hard constraints (work mode, location, relocation) before the approximate nearest neighbor search — not after. This eliminates wasted similarity comparisons on ineligible candidates (see the query sketch after this list)
- HNSW + Cosine Similarity: 3072-dimensional Gemini embeddings indexed with Qdrant's HNSW algorithm for sub-linear search over the candidate space
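A minimal Phase 1 sketch, assuming the official Qdrant JS client (@qdrant/js-client-rest), a `talents` collection, and illustrative payload field names for the hard constraints:

```typescript
import { QdrantClient } from "@qdrant/js-client-rest";

const qdrant = new QdrantClient({ url: "http://localhost:6333" });

async function retrieveCandidates(jobEmbedding: number[]) {
  return qdrant.search("talents", {
    vector: jobEmbedding, // 3072-dim Gemini embedding of the job
    limit: 50,            // top-50 handed to the re-ranking phase
    filter: {
      // Hard constraints run as payload filters *before* the HNSW traversal,
      // so ineligible profiles never enter the similarity computation.
      must: [
        { key: "work_mode", match: { any: ["remote", "hybrid"] } },
        { key: "country", match: { value: "DE" } },
      ],
    },
    with_payload: true,
  });
}
```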
Phase 2: Multi-Factor Re-Ranking
- Semantic Similarity: Cosine similarity carried over from the vector retrieval phase
- Keyword Recall: Overlap between extracted skills/requirements and candidate profiles
- Experience Fit: Years of experience and seniority level alignment
- Constraint Satisfaction: Soft constraint matching (salary range, start date, preferences)
- Bidirectional Scoring: Same scoring function works both directions — job→talent and talent→job — enabling true two-sided marketplace matching (a minimal scoring sketch follows this list)
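A minimal sketch of how the weighted combination might look, with illustrative factor names and placeholder weights rather than the production values:

```typescript
// Each factor is normalized to 0..1 before weighting.
interface MatchFactors {
  semanticSimilarity: number;     // cosine similarity from Phase 1
  keywordRecall: number;          // fraction of required skills found in the profile
  experienceFit: number;          // years / seniority alignment
  constraintSatisfaction: number; // soft constraints (salary range, start date, preferences)
}

// Placeholder baseline weights; these are exactly what the eval framework tunes.
const weights: MatchFactors = {
  semanticSimilarity: 0.4,
  keywordRecall: 0.25,
  experienceFit: 0.2,
  constraintSatisfaction: 0.15,
};

// The same pure function ranks talents for a job and jobs for a talent; only
// the direction in which the factors are computed changes, not the combination.
export function score(factors: MatchFactors): number {
  return (Object.keys(weights) as (keyof MatchFactors)[]).reduce(
    (sum, key) => sum + weights[key] * factors[key],
    0,
  );
}
```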
Eval-Driven Optimization
The scoring weights are intentionally a rough starting point. The architecture targets NDCG@10 (Normalized Discounted Cumulative Gain) as the primary ranking quality metric, optimized through a systematic evaluation framework once recruiter-labeled ground truth is available. Five evaluation groups progressively validate and optimize each layer:
Component Evals
- LLM Extraction Evals: Field-level accuracy and keyword F1 scores for resume and job description parsing across multiple LLM providers
- Embedding Model Comparison: Recall@10, MRR, and separation gap between good/poor matches across embedding providers (see the metric sketch after this list)
- Retrieval Quality: Filter correctness and eligible candidate recall with latency benchmarks across vector DB options
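For concreteness, a sketch of the two retrieval metrics named above, assuming each eval case pairs the system's ranked ids with recruiter-labeled relevant ids:

```typescript
// Recall@K: share of the labeled relevant items that appear in the top K results.
function recallAtK(retrieved: string[], relevant: Set<string>, k = 10): number {
  const hits = retrieved.slice(0, k).filter((id) => relevant.has(id)).length;
  return relevant.size === 0 ? 0 : hits / relevant.size;
}

// MRR: average of 1 / (rank of the first relevant result) across eval cases.
function mrr(cases: { retrieved: string[]; relevant: Set<string> }[]): number {
  const reciprocal = cases.map(({ retrieved, relevant }) => {
    const rank = retrieved.findIndex((id) => relevant.has(id));
    return rank === -1 ? 0 : 1 / (rank + 1);
  });
  return reciprocal.reduce((a, b) => a + b, 0) / cases.length;
}
```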
Scoring & End-to-End Evals
- Weight Optimization: Grid and Bayesian search across weight combinations, measured by rank correlation with recruiter preferences
- Factor Ablation: NDCG@10 impact analysis when removing individual scoring factors
- Agent-Driven Discovery: Agentic coding loop that iteratively modifies the scoring function and optimizes toward target metrics
- Full Pipeline Eval: End-to-end comparison of system rankings against recruiter-preferred orderings
The eval framework follows Anthropic's methodology — tasks, trials, and graders — with strict separation between dev sets (60%) for iteration and held-out test sets (40%) for final comparison. Both deterministic code-based graders and LLM rubric graders are used to capture objective accuracy and subjective quality.
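A sketch of NDCG@10 and the weight grid search it drives; rankDevSet is a hypothetical helper standing in for "re-rank the 60% dev set with these weights and return recruiter-assigned gains in ranked order":

```typescript
// Discounted cumulative gain over the top K positions.
function dcgAtK(gains: number[], k = 10): number {
  return gains.slice(0, k).reduce((sum, gain, i) => sum + gain / Math.log2(i + 2), 0);
}

// NDCG@K: DCG normalized by the DCG of the ideal (descending) ordering.
function ndcgAtK(gains: number[], k = 10): number {
  const ideal = dcgAtK([...gains].sort((a, b) => b - a), k);
  return ideal === 0 ? 0 : dcgAtK(gains, k) / ideal;
}

function meanNdcg(perQueryGains: number[][]): number {
  return perQueryGains.reduce((s, g) => s + ndcgAtK(g), 0) / perQueryGains.length;
}

// Hypothetical: re-rank the dev set with the given weights and return, per
// query, the recruiter-assigned gains (e.g. 0 = irrelevant .. 3 = ideal match)
// in the system's ranked order.
declare function rankDevSet(weights: { semantic: number; keyword: number; rest: number }): number[][];

// Grid values are illustrative; the held-out test set is never used here.
const grid = [0.3, 0.4, 0.5].flatMap((semantic) =>
  [0.2, 0.3].map((keyword) => ({ semantic, keyword, rest: 1 - semantic - keyword })),
);
const best = grid
  .map((weights) => ({ weights, ndcg: meanNdcg(rankDevSet(weights)) }))
  .sort((a, b) => b.ndcg - a.ndcg)[0];
```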
AI-Powered Data Processing
The platform uses LLMs for structured field extraction from resumes and job descriptions, enriching raw text into queryable structured data.
Structured Extraction
- Resume Parsing: LLM extracts skills, experience, preferences, and constraints from uploaded resumes (a minimal extraction sketch follows this list)
- Job Description Analysis: Automatic extraction of requirements, responsibilities, and qualifications
- Interactive Clarification: AI asks follow-up questions to fill gaps in submitted data
- Embedding Generation: Automatic vector indexing for all processed entities
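A minimal extraction sketch using the Vercel AI SDK's generateObject with a Zod schema; the schema fields and model id are illustrative, not the production schema:

```typescript
import { generateObject } from "ai";
import { google } from "@ai-sdk/google";
import { z } from "zod";

// Illustrative subset of the fields pulled out of a resume.
const ResumeExtraction = z.object({
  skills: z.array(z.string()),
  yearsOfExperience: z.number(),
  seniority: z.enum(["junior", "mid", "senior", "lead"]),
  workModePreference: z.enum(["remote", "hybrid", "onsite"]).optional(),
});

export async function extractResume(resumeText: string) {
  const { object } = await generateObject({
    model: google("gemini-2.0-flash"),
    schema: ResumeExtraction,
    prompt: `Extract structured fields from this resume:\n\n${resumeText}`,
  });
  return object; // typed as z.infer<typeof ResumeExtraction>
}
```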
Status-Gated Visibility
- Processing Pipeline: Entities marked "extracting" remain invisible until fully processed
- Idempotent Upserts: Failed vector writes are safely retryable without data corruption (see the sketch after this list)
- Consistency Model: No cross-system transactions — status fields prevent partial data visibility
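A sketch of the visibility gate and the idempotent write, assuming the Postgres row id (a UUID) doubles as the Qdrant point id:

```typescript
import { QdrantClient } from "@qdrant/js-client-rest";

type ProcessingStatus = "extracting" | "embedding" | "ready" | "failed";

// Only fully processed entities are eligible for matching queries.
const isVisible = (status: ProcessingStatus) => status === "ready";

// Idempotent vector write: the point id equals the Postgres row id, so a
// retried upsert after a failure overwrites the same point instead of
// creating a duplicate.
async function indexTalent(qdrant: QdrantClient, rowId: string, vector: number[]) {
  await qdrant.upsert("talents", {
    wait: true,
    points: [{ id: rowId, vector, payload: { status: "ready" } }],
  });
}
```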
Hexagonal Architecture
The system follows a ports & adapters pattern with zero infrastructure dependencies in the core domain layer.
Core Domain
- Pure Scoring Functions: Matching logic isolated as testable pure functions with no Effect dependencies
- Domain Models: Type-safe entity definitions for talents, jobs, and matches
- Port Interfaces: Abstract service boundaries defined as Effect Context Tags (sketched below)
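A sketch of one port as an Effect Context Tag; the VectorSearchPort name and method shape are illustrative, but the pattern is what the core domain exposes instead of any concrete client:

```typescript
import { Context, Effect } from "effect";

// Failure type surfaced in the port's error channel.
export class VectorSearchError {
  readonly _tag = "VectorSearchError";
  constructor(readonly cause: unknown) {}
}

export interface CandidateHit {
  readonly id: string;
  readonly similarity: number;
}

// The port: core code depends on this tag, never on Qdrant directly.
export class VectorSearchPort extends Context.Tag("VectorSearchPort")<
  VectorSearchPort,
  {
    readonly searchTalents: (
      embedding: ReadonlyArray<number>,
      limit: number,
    ) => Effect.Effect<ReadonlyArray<CandidateHit>, VectorSearchError>;
  }
>() {}
```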
Adapters & Infrastructure
- PostgreSQL Adapter: Drizzle ORM for relational data storage with type-safe migrations
- Qdrant Adapter: Vector search with payload pre-filtering on hard constraints (adapter sketch after this list)
- AI Adapter: Gemini 2.0 via Vercel AI SDK for embeddings and structured extraction
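A sketch of the Qdrant adapter satisfying that port through an Effect Layer; the @recruit/core import path is hypothetical:

```typescript
import { Effect, Layer } from "effect";
import { QdrantClient } from "@qdrant/js-client-rest";
// Port and error type from the core-domain sketch above; package name is hypothetical.
import { VectorSearchPort, VectorSearchError } from "@recruit/core";

const client = new QdrantClient({ url: process.env.QDRANT_URL ?? "http://localhost:6333" });

// Swapping vector databases means swapping this Layer; core code is untouched.
export const QdrantVectorSearchLive = Layer.succeed(
  VectorSearchPort,
  VectorSearchPort.of({
    searchTalents: (embedding, limit) =>
      Effect.tryPromise({
        try: () => client.search("talents", { vector: [...embedding], limit }),
        catch: (cause) => new VectorSearchError(cause),
      }).pipe(
        Effect.map((hits) => hits.map((h) => ({ id: String(h.id), similarity: h.score }))),
      ),
  }),
);
```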
Why Qdrant Over pgvector
- Pre-Filtering vs Post-Filtering: Qdrant applies payload index filters before the ANN search, meaning only eligible candidates enter the similarity computation. Naive post-filtering (retrieve → filter) wastes compute on ineligible results and can return fewer than K results
- High-Dimensional Support: 3072-dimensional Gemini embeddings exceed the 2000-dimension limit on pgvector's HNSW and IVFFlat indexes
- HNSW Tuning: Qdrant exposes m and ef_construct parameters for precision/recall tradeoff control at index time
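A collection-setup sketch with illustrative (untuned) HNSW values and a payload index for one hard-constraint field:

```typescript
import { QdrantClient } from "@qdrant/js-client-rest";

const qdrant = new QdrantClient({ url: "http://localhost:6333" });

await qdrant.createCollection("talents", {
  vectors: { size: 3072, distance: "Cosine" },
  hnsw_config: {
    m: 32,             // graph connectivity: higher = better recall, more memory
    ef_construct: 256, // build-time beam width: higher = better index quality, slower build
  },
});

// Hard-constraint fields get payload indexes so filters run before the ANN search.
await qdrant.createPayloadIndex("talents", {
  field_name: "work_mode",
  field_schema: "keyword",
});
```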
Effect.ts Throughout
The entire backend is built with Effect.ts, providing typed dependency injection, errors as values, and streaming support.
- Compile-Time Dependency Guarantees: Services defined as Context.Tag — the type system enforces that every dependency is provided before the program will compile, unlike runtime-only DI containers
- Typed Errors: All failure modes represented as discriminated unions in the type system — no unchecked exceptions (see the error sketch after this list)
- Port-Based Abstraction: Swap vector DB, LLM provider, or database without touching core business logic
- @effect/rpc: Type-safe RPC layer between frontend and backend with schema validation at the boundary
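A sketch of errors as values with Data.TaggedError; loadProfileEmbedding is a hypothetical operation used only to show the typed error channel:

```typescript
import { Data, Effect } from "effect";

// Each failure mode is a tagged class that appears in the Effect error channel.
class EmbeddingFailed extends Data.TaggedError("EmbeddingFailed")<{ cause: unknown }> {}
class ProfileNotFound extends Data.TaggedError("ProfileNotFound")<{ id: string }> {}

// Hypothetical operation whose type exposes both failure modes to every caller:
// Effect.Effect<number[], EmbeddingFailed | ProfileNotFound>
declare const loadProfileEmbedding: (
  id: string,
) => Effect.Effect<number[], EmbeddingFailed | ProfileNotFound>;

const program = loadProfileEmbedding("talent-123").pipe(
  // Recover from one failure mode by tag; the other stays visible in the type.
  Effect.catchTag("ProfileNotFound", () => Effect.succeed<number[]>([])),
);
```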
Monorepo Structure
Built as a Turborepo monorepo with Bun, the codebase is split into focused packages:
- apps/web — Next.js frontend with shadcn/ui
- packages/core — Domain models, business logic ports, scoring algorithms
- packages/db — Drizzle schema, migrations, PostgreSQL adapters
- packages/vector — Qdrant integration layer
- packages/ai — LLM and embedding providers via Vercel AI SDK
- packages/api — Effect HTTP API layer
- packages/ui — Shared component library
- packages/env — Environment validation
Technology Stack
Frontend
- Next.js — App Router with React Server Components
- shadcn/ui — Component library with Tailwind CSS
- TypeScript — Strict mode across all packages
Backend & Data
- Effect.ts — Typed dependency injection and error handling
- Drizzle ORM — Type-safe PostgreSQL operations with migrations
- Qdrant — Vector database with payload filtering
- PostgreSQL — Relational storage for structured entity data
AI & Processing
- Gemini 2.0 Flash — LLM for structured extraction and clarification
- Gemini 2.0 Embedding — 3072-dimensional vector embeddings
- Vercel AI SDK — Unified AI provider interface
Tooling
- Turborepo — Build caching and task parallelization
- Bun — Package management and runtime
- Ultracite — Biome-based linting configuration