Recruit AI is a two-sided marketplace matching platform implementing a retrieve-and-rerank pipeline for recruitment. ANN vector search retrieves candidate matches with pre-filtered hard constraints, then a multi-factor scoring model re-ranks results — the same pattern used by Netflix, Spotify, and modern search engines.
The system is bidirectional: the same scoring function ranks talents for jobs and jobs for talents. Built on an Effect.ts hexagonal architecture (ports & adapters) where the core domain has zero infrastructure dependencies — vector DB, LLM, and Postgres are all swappable adapters with compile-time dependency guarantees.
Retrieve-and-Rerank Pipeline
The matching engine implements the classic two-phase retrieval pattern: ANN retrieves top-50 candidates, then a multi-factor scoring model re-ranks to top-10. Current scoring weights are a baseline — the architecture is designed for eval-driven optimization with recruiter-labeled ground truth data.
Phase 1: Retrieval with Pre-Filtering
- Pre-Filtered ANN Search: Qdrant filters hard constraints (work mode, location, relocation) before the approximate nearest neighbor search — not after. This eliminates wasted similarity comparisons on ineligible candidates (see the query sketch after this list)
- HNSW + Cosine Similarity: 3072-dimensional Gemini embeddings indexed with Qdrant's HNSW algorithm for sub-linear search over the candidate space
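A minimal Phase 1 sketch, assuming the official Qdrant JS client (@qdrant/js-client-rest), a `talents` collection, and illustrative payload field names for the hard constraints:

```typescript
import { QdrantClient } from "@qdrant/js-client-rest";

const qdrant = new QdrantClient({ url: "http://localhost:6333" });

async function retrieveCandidates(jobEmbedding: number[]) {
  return qdrant.search("talents", {
    vector: jobEmbedding, // 3072-dim Gemini embedding of the job
    limit: 50,            // top-50 handed to the re-ranking phase
    filter: {
      // Hard constraints run as payload filters *before* the HNSW traversal,
      // so ineligible profiles never enter the similarity computation.
      must: [
        { key: "work_mode", match: { any: ["remote", "hybrid"] } },
        { key: "country", match: { value: "DE" } },
      ],
    },
    with_payload: true,
  });
}
```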
Phase 2: Multi-Factor Re-Ranking
- Semantic Similarity: Cosine similarity carried over from the vector retrieval phase
- Keyword Recall: Overlap between extracted skills/requirements and candidate profiles
- Experience Fit: Years of experience and seniority level alignment
- Constraint Satisfaction: Soft constraint matching (salary range, start date, preferences)
- Bidirectional Scoring: Same scoring function works both directions — job→talent and talent→job — enabling true two-sided marketplace matching (a minimal scoring sketch follows this list)
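A minimal sketch of how the weighted combination might look, with illustrative factor names and placeholder weights rather than the production values:

```typescript
// Each factor is normalized to 0..1 before weighting.
interface MatchFactors {
  semanticSimilarity: number;     // cosine similarity from Phase 1
  keywordRecall: number;          // fraction of required skills found in the profile
  experienceFit: number;          // years / seniority alignment
  constraintSatisfaction: number; // soft constraints (salary range, start date, preferences)
}

// Placeholder baseline weights; these are exactly what the eval framework tunes.
const weights: MatchFactors = {
  semanticSimilarity: 0.4,
  keywordRecall: 0.25,
  experienceFit: 0.2,
  constraintSatisfaction: 0.15,
};

// The same pure function ranks talents for a job and jobs for a talent; only
// the direction in which the factors are computed changes, not the combination.
export function score(factors: MatchFactors): number {
  return (Object.keys(weights) as (keyof MatchFactors)[]).reduce(
    (sum, key) => sum + weights[key] * factors[key],
    0,
  );
}
```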
Eval-Driven Optimization
The scoring weights are intentionally a rough starting point. The architecture targets NDCG@10 (Normalized Discounted Cumulative Gain) as the primary ranking quality metric, optimized through a systematic evaluation framework once recruiter-labeled ground truth is available. Five evaluation groups progressively validate and optimize each layer:
Component Evals
- LLM Extraction Evals: Field-level accuracy and keyword F1 scores for resume and job description parsing across multiple LLM providers
- Embedding Model Comparison: Recall@10, MRR, and separation gap between good/poor matches across embedding providers (see the metric sketch after this list)
- Retrieval Quality: Filter correctness and eligible candidate recall with latency benchmarks across vector DB options
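For concreteness, a sketch of the two retrieval metrics named above, assuming each eval case pairs the system's ranked ids with recruiter-labeled relevant ids:

```typescript
// Recall@K: share of the labeled relevant items that appear in the top K results.
function recallAtK(retrieved: string[], relevant: Set<string>, k = 10): number {
  const hits = retrieved.slice(0, k).filter((id) => relevant.has(id)).length;
  return relevant.size === 0 ? 0 : hits / relevant.size;
}

// MRR: average of 1 / (rank of the first relevant result) across eval cases.
function mrr(cases: { retrieved: string[]; relevant: Set<string> }[]): number {
  const reciprocal = cases.map(({ retrieved, relevant }) => {
    const rank = retrieved.findIndex((id) => relevant.has(id));
    return rank === -1 ? 0 : 1 / (rank + 1);
  });
  return reciprocal.reduce((a, b) => a + b, 0) / cases.length;
}
```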
Scoring & End-to-End Evals
- Weight Optimization: Grid and Bayesian search across weight combinations, measured by rank correlation with recruiter preferences
- Factor Ablation: NDCG@10 impact analysis when removing individual scoring factors
- Agent-Driven Discovery: Agentic coding loop that iteratively modifies the scoring function and optimizes toward target metrics
- Full Pipeline Eval: End-to-end comparison of system rankings against recruiter-preferred orderings
The eval framework follows Anthropic's methodology — tasks, trials, and graders — with strict separation between dev sets (60%) for iteration and held-out test sets (40%) for final comparison. Both deterministic code-based graders and LLM rubric graders are used to capture objective accuracy and subjective quality.
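A sketch of NDCG@10 and the weight grid search it drives; rankDevSet is a hypothetical helper standing in for "re-rank the 60% dev set with these weights and return recruiter-assigned gains in ranked order":

```typescript
// Discounted cumulative gain over the top K positions.
function dcgAtK(gains: number[], k = 10): number {
  return gains.slice(0, k).reduce((sum, gain, i) => sum + gain / Math.log2(i + 2), 0);
}

// NDCG@K: DCG normalized by the DCG of the ideal (descending) ordering.
function ndcgAtK(gains: number[], k = 10): number {
  const ideal = dcgAtK([...gains].sort((a, b) => b - a), k);
  return ideal === 0 ? 0 : dcgAtK(gains, k) / ideal;
}

function meanNdcg(perQueryGains: number[][]): number {
  return perQueryGains.reduce((s, g) => s + ndcgAtK(g), 0) / perQueryGains.length;
}

// Hypothetical: re-rank the dev set with the given weights and return, per
// query, the recruiter-assigned gains (e.g. 0 = irrelevant .. 3 = ideal match)
// in the system's ranked order.
declare function rankDevSet(weights: { semantic: number; keyword: number; rest: number }): number[][];

// Grid values are illustrative; the held-out test set is never used here.
const grid = [0.3, 0.4, 0.5].flatMap((semantic) =>
  [0.2, 0.3].map((keyword) => ({ semantic, keyword, rest: 1 - semantic - keyword })),
);
const best = grid
  .map((weights) => ({ weights, ndcg: meanNdcg(rankDevSet(weights)) }))
  .sort((a, b) => b.ndcg - a.ndcg)[0];
```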
AI-Powered Data Processing
The platform uses LLMs for structured field extraction from resumes and job descriptions, enriching raw text into queryable structured data.
Structured Extraction
- Resume Parsing: LLM extracts skills, experience, preferences, and constraints from uploaded resumes (a minimal extraction sketch follows this list)
- Job Description Analysis: Automatic extraction of requirements, responsibilities, and qualifications
- Interactive Clarification: AI asks follow-up questions to fill gaps in submitted data
- Embedding Generation: Automatic vector indexing for all processed entities
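A minimal extraction sketch using the Vercel AI SDK's generateObject with a Zod schema; the schema fields and model id are illustrative, not the production schema:

```typescript
import { generateObject } from "ai";
import { google } from "@ai-sdk/google";
import { z } from "zod";

// Illustrative subset of the fields pulled out of a resume.
const ResumeExtraction = z.object({
  skills: z.array(z.string()),
  yearsOfExperience: z.number(),
  seniority: z.enum(["junior", "mid", "senior", "lead"]),
  workModePreference: z.enum(["remote", "hybrid", "onsite"]).optional(),
});

export async function extractResume(resumeText: string) {
  const { object } = await generateObject({
    model: google("gemini-2.0-flash"),
    schema: ResumeExtraction,
    prompt: `Extract structured fields from this resume:\n\n${resumeText}`,
  });
  return object; // typed as z.infer<typeof ResumeExtraction>
}
```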
Status-Gated Visibility
- Processing Pipeline: Entities marked "extracting" remain invisible until fully processed
- Idempotent Upserts: Failed vector writes are safely retryable without data corruption (see the sketch after this list)
- Consistency Model: No cross-system transactions — status fields prevent partial data visibility
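A sketch of the visibility gate and the idempotent write, assuming the Postgres row id (a UUID) doubles as the Qdrant point id:

```typescript
import { QdrantClient } from "@qdrant/js-client-rest";

type ProcessingStatus = "extracting" | "embedding" | "ready" | "failed";

// Only fully processed entities are eligible for matching queries.
const isVisible = (status: ProcessingStatus) => status === "ready";

// Idempotent vector write: the point id equals the Postgres row id, so a
// retried upsert after a failure overwrites the same point instead of
// creating a duplicate.
async function indexTalent(qdrant: QdrantClient, rowId: string, vector: number[]) {
  await qdrant.upsert("talents", {
    wait: true,
    points: [{ id: rowId, vector, payload: { status: "ready" } }],
  });
}
```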
Hexagonal Architecture
The system follows a ports & adapters pattern with zero infrastructure dependencies in the core domain layer.
Core Domain
- Pure Scoring Functions: Matching logic isolated as testable pure functions with no Effect dependencies
- Domain Models: Type-safe entity definitions for talents, jobs, and matches
- Port Interfaces: Abstract service boundaries defined as Effect Context Tags (sketched below)
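A sketch of one port as an Effect Context Tag; the VectorSearchPort name and method shape are illustrative, but the pattern is what the core domain exposes instead of any concrete client:

```typescript
import { Context, Effect } from "effect";

// Failure type surfaced in the port's error channel.
export class VectorSearchError {
  readonly _tag = "VectorSearchError";
  constructor(readonly cause: unknown) {}
}

export interface CandidateHit {
  readonly id: string;
  readonly similarity: number;
}

// The port: core code depends on this tag, never on Qdrant directly.
export class VectorSearchPort extends Context.Tag("VectorSearchPort")<
  VectorSearchPort,
  {
    readonly searchTalents: (
      embedding: ReadonlyArray<number>,
      limit: number,
    ) => Effect.Effect<ReadonlyArray<CandidateHit>, VectorSearchError>;
  }
>() {}
```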
Adapters & Infrastructure
- PostgreSQL Adapter: Drizzle ORM for relational data storage with type-safe migrations
- Qdrant Adapter: Vector search with payload pre-filtering on hard constraints (adapter sketch after this list)
- AI Adapter: Gemini 2.0 via Vercel AI SDK for embeddings and structured extraction
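A sketch of the Qdrant adapter satisfying that port through an Effect Layer; the @recruit/core import path is hypothetical:

```typescript
import { Effect, Layer } from "effect";
import { QdrantClient } from "@qdrant/js-client-rest";
// Port and error type from the core-domain sketch above; package name is hypothetical.
import { VectorSearchPort, VectorSearchError } from "@recruit/core";

const client = new QdrantClient({ url: process.env.QDRANT_URL ?? "http://localhost:6333" });

// Swapping vector databases means swapping this Layer; core code is untouched.
export const QdrantVectorSearchLive = Layer.succeed(
  VectorSearchPort,
  VectorSearchPort.of({
    searchTalents: (embedding, limit) =>
      Effect.tryPromise({
        try: () => client.search("talents", { vector: [...embedding], limit }),
        catch: (cause) => new VectorSearchError(cause),
      }).pipe(
        Effect.map((hits) => hits.map((h) => ({ id: String(h.id), similarity: h.score }))),
      ),
  }),
);
```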
Why Qdrant Over pgvector
- Pre-Filtering vs Post-Filtering: Qdrant applies payload index filters before the ANN search, meaning only eligible candidates enter the similarity computation. Naive post-filtering (retrieve → filter) wastes compute on ineligible results and can return fewer than K results
- High-Dimensional Support: 3072-dimensional Gemini embeddings exceed the 2000-dimension limit on pgvector's HNSW and IVFFlat indexes
- HNSW Tuning: Qdrant exposes m and ef_construct parameters for precision/recall tradeoff control at index time
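A collection-setup sketch with illustrative (untuned) HNSW values and a payload index for one hard-constraint field:

```typescript
import { QdrantClient } from "@qdrant/js-client-rest";

const qdrant = new QdrantClient({ url: "http://localhost:6333" });

await qdrant.createCollection("talents", {
  vectors: { size: 3072, distance: "Cosine" },
  hnsw_config: {
    m: 32,             // graph connectivity: higher = better recall, more memory
    ef_construct: 256, // build-time beam width: higher = better index quality, slower build
  },
});

// Hard-constraint fields get payload indexes so filters run before the ANN search.
await qdrant.createPayloadIndex("talents", {
  field_name: "work_mode",
  field_schema: "keyword",
});
```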
Effect.ts Throughout
The entire backend is built with Effect.ts, providing typed dependency injection, errors as values, and streaming support.
- Compile-Time Dependency Guarantees: Services defined as Context.Tag — the type system enforces that every dependency is provided before the program will compile, unlike runtime-only DI containers
- Typed Errors: All failure modes represented as discriminated unions in the type system — no unchecked exceptions (see the error sketch after this list)
- Port-Based Abstraction: Swap vector DB, LLM provider, or database without touching core business logic
- @effect/rpc: Type-safe RPC layer between frontend and backend with schema validation at the boundary
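A sketch of errors as values with Data.TaggedError; loadProfileEmbedding is a hypothetical operation used only to show the typed error channel:

```typescript
import { Data, Effect } from "effect";

// Each failure mode is a tagged class that appears in the Effect error channel.
class EmbeddingFailed extends Data.TaggedError("EmbeddingFailed")<{ cause: unknown }> {}
class ProfileNotFound extends Data.TaggedError("ProfileNotFound")<{ id: string }> {}

// Hypothetical operation whose type exposes both failure modes to every caller:
// Effect.Effect<number[], EmbeddingFailed | ProfileNotFound>
declare const loadProfileEmbedding: (
  id: string,
) => Effect.Effect<number[], EmbeddingFailed | ProfileNotFound>;

const program = loadProfileEmbedding("talent-123").pipe(
  // Recover from one failure mode by tag; the other stays visible in the type.
  Effect.catchTag("ProfileNotFound", () => Effect.succeed<number[]>([])),
);
```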
Monorepo Structure
Built as a Turborepo monorepo with Bun, the codebase is split into focused packages:
- apps/web — Next.js frontend with shadcn/ui
- packages/core — Domain models, business logic ports, scoring algorithms
- packages/db — Drizzle schema, migrations, PostgreSQL adapters
- packages/vector — Qdrant integration layer
- packages/ai — LLM and embedding providers via Vercel AI SDK
- packages/api — Effect HTTP API layer
- packages/ui — Shared component library
- packages/env — Environment validation
Technology Stack
Frontend
- Next.js — App Router with React Server Components
- shadcn/ui — Component library with Tailwind CSS
- TypeScript — Strict mode across all packages
Backend & Data
- Effect.ts — Typed dependency injection and error handling
- Drizzle ORM — Type-safe PostgreSQL operations with migrations
- Qdrant — Vector database with payload filtering
- PostgreSQL — Relational storage for structured entity data
AI & Processing
- Gemini 2.0 Flash — LLM for structured extraction and clarification
- Gemini 2.0 Embedding — 3072-dimensional vector embeddings
- Vercel AI SDK — Unified AI provider interface
Tooling
- Turborepo — Build caching and task parallelization
- Bun — Package management and runtime
- Ultracite — Biome-based linting configuration