Andrey Markin
Recruit AI: Intelligent Recruitment Matching

Next.js · Effect.ts · AI · Vercel AI SDK · Drizzle · Qdrant · Turborepo · PostgreSQL

AI-powered recruitment platform that connects job descriptions with qualified talent using vector embeddings, multi-factor scoring, and bidirectional matching. Built on an Effect.ts hexagonal architecture.

View on GitHub

Recruit AI is an AI-driven recruitment platform that performs bidirectional matching between job descriptions and talent profiles. The system uses vector embeddings and multi-factor scoring to identify the best fits — recruiters find candidates for positions, and talent profiles get matched to relevant opportunities.

Built with a hexagonal architecture powered by Effect.ts, the platform separates core business logic from infrastructure concerns, making it fully testable and adapter-swappable.

Intelligent Matching System

The matching engine operates in two phases: semantic vector retrieval followed by multi-factor scoring. The current formula is a starting baseline — the real strategy is eval-driven optimization with recruiter-labeled ground truth data.

Matching Pipeline

  • Hard Constraint Filtering: Work mode, location, and relocation preferences applied before vector search
  • Vector Retrieval: Semantic similarity search via Qdrant with 3072-dimensional Gemini embeddings
  • Multi-Factor Scoring: Weighted combination of similarity, keywords, experience, and constraints
  • Bidirectional Matching: Jobs to talents and talents to jobs with the same scoring engine
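The multi-factor scoring step can be sketched as a weighted sum over normalized factors. The factor names and weights below are illustrative placeholders, not the project's actual formula, which the article describes as a baseline to be tuned through evals:

```typescript
// Hypothetical factor names and weights -- the real formula is a rough
// baseline refined through recruiter-labeled evals.
type MatchFactors = {
  similarity: number;  // cosine similarity from vector retrieval, 0..1
  keywords: number;    // keyword overlap score, 0..1
  experience: number;  // experience-fit score, 0..1
  constraints: number; // soft-constraint satisfaction, 0..1
};

const WEIGHTS: MatchFactors = {
  similarity: 0.5,
  keywords: 0.2,
  experience: 0.2,
  constraints: 0.1,
};

// Weighted sum of normalized factors. Hard constraints never reach this
// function: they are filtered out before vector retrieval.
export function scoreMatch(f: MatchFactors, w: MatchFactors = WEIGHTS): number {
  return (
    f.similarity * w.similarity +
    f.keywords * w.keywords +
    f.experience * w.experience +
    f.constraints * w.constraints
  );
}
```

Because the same function scores a talent against a job and a job against a talent, one engine serves both matching directions.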

Evaluation-Driven Development

The scoring formula weights are intentionally a rough starting point. The architecture is designed so that the formula can be systematically improved through evaluations once a proper labeled dataset is available. The project defines five evaluation groups that progressively validate and optimize each layer of the system:

Component Evals

  • LLM Extraction Evals: Field-level accuracy and keyword F1 scores for resume and job description parsing across multiple LLM providers
  • Embedding Model Comparison: Recall@10, MRR, and separation gap between good/poor matches across embedding providers
  • Retrieval Quality: Filter correctness and eligible candidate recall with latency benchmarks across vector DB options

Scoring & End-to-End Evals

  • Weight Optimization: Grid and Bayesian search across weight combinations, measured by rank correlation with recruiter preferences
  • Factor Ablation: NDCG@10 impact analysis when removing individual scoring factors
  • Agent-Driven Discovery: Agentic coding loop that iteratively modifies the scoring function and optimizes toward target metrics
  • Full Pipeline Eval: End-to-end comparison of system rankings against recruiter-preferred orderings
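The NDCG@10 metric used by the factor-ablation eval can be computed in a few lines. This is a minimal textbook sketch, not the project's grader:

```typescript
// Minimal NDCG@k: relevance[i] is the labeled relevance of the item the
// system ranked at position i; ideal holds the same labels, which are
// sorted descending to form the best achievable ranking.
export function ndcgAtK(relevance: number[], ideal: number[], k = 10): number {
  const dcg = (rels: number[]) =>
    rels.slice(0, k).reduce((s, r, i) => s + (2 ** r - 1) / Math.log2(i + 2), 0);
  const idcg = dcg([...ideal].sort((a, b) => b - a));
  return idcg === 0 ? 0 : dcg(relevance) / idcg;
}
```

Ablation then amounts to re-ranking with one factor's weight zeroed and comparing the resulting NDCG@10 against the full formula's.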

The eval framework follows Anthropic's methodology — tasks, trials, and graders — with strict separation between dev sets (60%) for iteration and held-out test sets (40%) for final comparison. Both deterministic code-based graders and LLM rubric graders are used to capture objective accuracy and subjective quality.
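The 60/40 dev/test separation only works if membership is stable across runs; one way to sketch that is a seeded shuffle (the split helper and `LabeledPair` shape below are illustrative, not the project's code):

```typescript
// Hypothetical shape of a recruiter-labeled example.
type LabeledPair = { jobId: string; talentId: string; recruiterScore: number };

// Seeded 60/40 split: a deterministic LCG shuffle keeps dev/test
// membership identical across eval runs, so iteration on the dev set
// never leaks into the held-out test set.
export function splitDevTest<T>(
  items: T[],
  devFraction = 0.6,
  seed = 42
): { dev: T[]; test: T[] } {
  let state = seed;
  const rand = () => (state = (state * 1664525 + 1013904223) % 2 ** 32) / 2 ** 32;
  const shuffled = [...items]
    .map((item) => ({ item, key: rand() }))
    .sort((a, b) => a.key - b.key)
    .map(({ item }) => item);
  const cut = Math.floor(shuffled.length * devFraction);
  return { dev: shuffled.slice(0, cut), test: shuffled.slice(cut) };
}
```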

AI-Powered Data Processing

The platform uses LLMs for structured field extraction from resumes and job descriptions, enriching raw text into queryable structured data.

Structured Extraction

  • Resume Parsing: LLM extracts skills, experience, preferences, and constraints from uploaded resumes
  • Job Description Analysis: Automatic extraction of requirements, responsibilities, and qualifications
  • Interactive Clarification: AI asks follow-up questions to fill gaps in submitted data
  • Embedding Generation: Automatic vector indexing for all processed entities
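The interactive-clarification step boils down to gap detection over the extracted fields. The schema and question text below are invented for illustration; the project's actual field names are not documented here:

```typescript
// Hypothetical extracted-resume shape -- field names are illustrative.
type ExtractedResume = {
  skills: string[];
  yearsOfExperience?: number;
  workMode?: "remote" | "hybrid" | "onsite";
  location?: string;
};

// Each optional field the LLM failed to extract becomes a follow-up
// question posed back to the candidate.
const QUESTIONS: Record<string, string> = {
  yearsOfExperience: "How many years of professional experience do you have?",
  workMode: "Do you prefer remote, hybrid, or onsite work?",
  location: "Where are you currently based?",
};

export function clarificationQuestions(resume: ExtractedResume): string[] {
  return Object.entries(QUESTIONS)
    .filter(([field]) => resume[field as keyof ExtractedResume] === undefined)
    .map(([, question]) => question);
}
```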

Status-Gated Visibility

  • Processing Pipeline: Entities marked "extracting" remain invisible until fully processed
  • Idempotent Upserts: Failed vector writes are safely retryable without data corruption
  • Consistency Model: No cross-system transactions — status fields prevent partial data visibility
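The combination of status gating and idempotent upserts can be sketched in a few lines. Names and the write callback are illustrative, assuming statuses along the lines of the "extracting" state mentioned above:

```typescript
// Illustrative status-gating model -- names are assumptions, not the
// project's actual schema.
type Status = "extracting" | "ready" | "failed";
type Talent = { id: string; status: Status; embedded: boolean };

// Only fully processed entities are visible to search.
export const visibleTalents = (all: Talent[]): Talent[] =>
  all.filter((t) => t.status === "ready");

// Idempotent vector upsert: the write is keyed on the entity id, so a
// retry after a failed attempt overwrites rather than duplicates, and
// the status flips to "ready" only after the write succeeds.
export function upsertVector(
  t: Talent,
  writeVector: (id: string) => boolean
): Talent {
  const ok = writeVector(t.id); // may be a retry of a failed write
  return ok
    ? { ...t, embedded: true, status: "ready" }
    : { ...t, status: "failed" };
}
```

No cross-system transaction is needed: until the status flips, the entity simply never appears in query results, so partially written data cannot leak.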

Hexagonal Architecture

The system follows a ports & adapters pattern with zero infrastructure dependencies in the core domain layer.

Core Domain

  • Pure Scoring Functions: Matching logic isolated as testable pure functions with no Effect dependencies
  • Domain Models: Type-safe entity definitions for talents, jobs, and matches
  • Port Interfaces: Abstract service boundaries defined as Effect Context Tags
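The port/adapter shape can be shown without any framework. In the project these boundaries are Effect Context Tags; the dependency-free sketch below uses a plain interface (synchronous for brevity), with all names invented for illustration:

```typescript
// A port: the core domain depends only on this interface.
interface VectorSearchPort {
  search(queryEmbedding: number[], limit: number): string[];
}

// An in-memory adapter, e.g. for unit tests; the production adapter
// talks to Qdrant behind the same interface.
class InMemoryVectorSearch implements VectorSearchPort {
  constructor(private readonly vectors: Map<string, number[]>) {}

  search(query: number[], limit: number): string[] {
    const dot = (a: number[], b: number[]) =>
      a.reduce((s, x, i) => s + x * (b[i] ?? 0), 0);
    // Rank stored vectors by similarity to the query, highest first.
    return [...this.vectors.entries()]
      .sort((a, b) => dot(query, b[1]) - dot(query, a[1]))
      .slice(0, limit)
      .map(([id]) => id);
  }
}

const adapter: VectorSearchPort = new InMemoryVectorSearch(
  new Map([
    ["talent-a", [1, 0]],
    ["talent-b", [0, 1]],
  ])
);
export const bestMatch = adapter.search([1, 0], 1);
```

Swapping the in-memory adapter for the Qdrant one changes only the wiring, never the scoring code that consumes the port.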

Adapters & Infrastructure

  • PostgreSQL Adapter: Drizzle ORM for relational data storage with type-safe migrations
  • Qdrant Adapter: Vector search with payload pre-filtering on hard constraints
  • AI Adapter: Gemini 2.0 via Vercel AI SDK for embeddings and structured extraction

Why Qdrant Over pgvector

  • Pre-Filtering: Native payload index filtering before ANN search eliminates wasted similarity comparisons
  • High-Dimensional Support: 3072-dimensional Gemini embeddings exceed pgvector's 2000-dimension cap
  • Batch Constraint Filtering: Efficient hard constraint application before expensive vector operations
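A Qdrant payload pre-filter for the hard constraints above might look like the object below. The payload field names are assumptions; the `must`/`should` and `match` structure follows Qdrant's filter DSL:

```typescript
// Illustrative Qdrant filter: only points whose payload satisfies every
// "must" clause enter the ANN similarity search at all.
export const hardConstraintFilter = {
  must: [
    { key: "work_mode", match: { any: ["remote", "hybrid"] } },
    { key: "location", match: { value: "EU" } },
  ],
  should: [{ key: "open_to_relocation", match: { value: true } }],
};
```

Passed as the `filter` of a search request, this is what eliminates wasted similarity comparisons: ineligible candidates are excluded by the payload index before any vectors are compared.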

Effect.ts Throughout

The entire backend is built with Effect.ts, providing typed dependency injection, errors as values, and streaming support.

  • Dependency Injection: Services defined as Context.Tag for compile-time verified wiring
  • Typed Errors: All failure modes represented as discriminated unions in the type system
  • Port-Based Abstraction: Swap adapter implementations without touching core business logic
  • Effect HTTP API: Backend API layer built with @effect/platform
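The errors-as-values style amounts to modeling every failure mode as a tagged variant. In the project these are Effect typed errors; the sketch below shows the same idea as a plain discriminated union, with invented error names:

```typescript
// Each failure mode is a variant in the type system, so callers must
// handle all of them -- the switch below is exhaustively checked.
type MatchError =
  | { _tag: "EmbeddingFailed"; provider: string }
  | { _tag: "VectorStoreUnavailable"; retryAfterMs: number }
  | { _tag: "EntityNotFound"; id: string };

export function describeError(e: MatchError): string {
  switch (e._tag) {
    case "EmbeddingFailed":
      return `embedding provider ${e.provider} failed`;
    case "VectorStoreUnavailable":
      return `vector store down, retry in ${e.retryAfterMs}ms`;
    case "EntityNotFound":
      return `no entity with id ${e.id}`;
  }
}
```

Adding a new variant is a compile error at every unhandled call site, which is what makes the failure modes part of the API contract rather than runtime surprises.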

Monorepo Structure

Built as a Turborepo monorepo with Bun, the codebase is split into focused packages:

  • apps/web — Next.js frontend with shadcn/ui
  • packages/core — Domain models, business logic ports, scoring algorithms
  • packages/db — Drizzle schema, migrations, PostgreSQL adapters
  • packages/vector — Qdrant integration layer
  • packages/ai — LLM and embedding providers via Vercel AI SDK
  • packages/api — Effect HTTP API layer
  • packages/ui — Shared component library
  • packages/env — Environment validation

Technology Stack


Frontend

  • Next.js — App Router with React Server Components
  • shadcn/ui — Component library with Tailwind CSS
  • TypeScript — Strict mode across all packages

Backend & Data

  • Effect.ts — Typed dependency injection and error handling
  • Drizzle ORM — Type-safe PostgreSQL operations with migrations
  • Qdrant — Vector database with payload filtering
  • PostgreSQL — Relational storage for structured entity data

AI & Processing

  • Gemini 2.0 Flash — LLM for structured extraction and clarification
  • Gemini 2.0 Embedding — 3072-dimensional vector embeddings
  • Vercel AI SDK — Unified AI provider interface

Tooling

  • Turborepo — Build caching and task parallelization
  • Bun — Package management and runtime
  • Ultracite — Biome-based linting configuration

Ready to build an AI-powered platform?

Let's Discuss Your Project