Architecture

                 ┌────────────────────┐
                 │    Ingestion       │  LinkedIn API session + JobSpy adapters
                 │  (core/sources/*)  │  -> normalized Job rows -> Postgres
                 └─────────┬──────────┘
                           v
                 ┌────────────────────┐
                 │    Pre-filter      │  cheap LLM bucket (prompts/filter_job.md)
                 │  (flows/filter)    │  -> drops obvious non-matches
                 └─────────┬──────────┘
                           v
                 ┌────────────────────┐
                 │     Scoring        │  strong LLM (prompts/score_job.md)
                 │  (flows/score)     │  -> structured scores + reasoning
                 └─────────┬──────────┘
                           v
     ┌────────────────┐    │    ┌────────────────────┐
     │     API        │<───┴───>│      Web           │
     │   FastAPI      │  HTTP   │  Next.js App Router│
     │   (api/*)      │         │   (web/src/*)      │
     └────────┬───────┘         └──────────┬─────────┘
              │                            │
              v                            v
         Postgres                     Browser
         Redis                        (triage UI)
         Qdrant

Components

Ingestion — core/sources/

  • LinkedIn (core/sources/linkedin/) — session-authenticated HTTP client wrapping the LinkedIn private API. Resumable via an on-disk checkpoint (.hireex/ per-workdir), so interrupted batches don't re-fetch.
  • JobSpy adapters (core/sources/jobspy/) — bridges to the JobSpy library for Indeed, Glassdoor, ZipRecruiter, etc.
  • Normalized Job rows land in Postgres with deduplication (rapidfuzz-based) and provenance tracking (which source, which query).
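The dedup check can be sketched like this. It is a toy illustration only: the real pipeline uses rapidfuzz, but stdlib difflib stands in here so the sketch stays dependency-free, and the 0.9 threshold and field names are assumptions, not the project's values.

```python
from difflib import SequenceMatcher

def is_duplicate(new_job: dict, existing: dict, threshold: float = 0.9) -> bool:
    """Treat two rows as duplicates when title AND company both fuzzy-match."""
    def ratio(a: str, b: str) -> float:
        # Case-insensitive similarity in [0, 1]; rapidfuzz would be faster here.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return (
        ratio(new_job["title"], existing["title"]) >= threshold
        and ratio(new_job["company"], existing["company"]) >= threshold
    )

a = {"title": "Senior Backend Engineer", "company": "Acme Corp"}
b = {"title": "Senior Backend Engineer ", "company": "ACME Corp"}
print(is_duplicate(a, b))  # True: near-identical title + company
```

Requiring both fields to match keeps same-title postings from different companies as distinct rows.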

Pre-filter — flows/filter.py

Reads unscored jobs, feeds title + description to the cheap LLM via prompts/filter_job.md, records a match_score + confidence + reasoning. Below a configurable threshold, the job skips the scorer and gets marked prefilter_skip.
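The threshold decision boils down to a single branch. A hypothetical sketch — the return values other than prefilter_skip, and the 0.3 default, are assumptions for illustration:

```python
def prefilter_decision(match_score: float, threshold: float = 0.3) -> str:
    """Below the configurable threshold the job never reaches the scorer."""
    # 'queued_for_scoring' is a placeholder name; 'prefilter_skip' is the
    # status the doc describes for dropped jobs.
    return "prefilter_skip" if match_score < threshold else "queued_for_scoring"
```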

Scoring — flows/score.py

The meat. For each job that passes the pre-filter:

  1. Load candidate profile (get_candidate_profile() from core/config.py).
  2. Render prompts/score_job.md with {{ job_description }} and {{ candidate_profile }}.
  3. Call LLM (OpenRouter or whatever backend is wired in core/llm/backends/).
  4. Parse strict JSON response, validate via core/llm/schemas.py (pydantic).
  5. Persist scoring axes + reasoning + tailoring hint to the jobs table.

Scoring runs as a Prefect flow for batch ops, and as a one-shot coroutine for single-job re-scoring from the API.
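Steps 2–5 can be sketched roughly as below. This is a minimal stand-in, not the project's code: the real pipeline validates with pydantic models from core/llm/schemas.py (a dataclass substitutes here), and the template-rendering and key names are assumptions.

```python
import json
from dataclasses import dataclass

@dataclass
class ScoringResult:
    # Stand-in for the pydantic LLMScoringResult; only two fields shown.
    match_score: float
    reasoning: str

def render_prompt(template: str, job_description: str, candidate_profile: str) -> str:
    """Fill the {{ job_description }} / {{ candidate_profile }} slots."""
    return (template
            .replace("{{ job_description }}", job_description)
            .replace("{{ candidate_profile }}", candidate_profile))

def parse_strict(raw: str) -> ScoringResult:
    """Parse the LLM reply as strict JSON; raise on anything malformed."""
    data = json.loads(raw)
    return ScoringResult(match_score=float(data["match_score"]),
                         reasoning=str(data["reasoning"]))

prompt = render_prompt("Score {{ job_description }} for {{ candidate_profile }}",
                       "Backend role, Python/Postgres", "Senior Python dev")
result = parse_strict('{"match_score": 0.82, "reasoning": "strong stack overlap"}')
```

Failing loudly in parse_strict is the point: a reply that isn't valid JSON with the expected keys should error out rather than persist garbage axes.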

API — api/

FastAPI app. Routes:

  • /jobs — list, filter, archive, re-score.
  • /profile — read / write config/candidate_profile.toml.
  • /prompts — read / write score_job.md and filter_job.md.
  • /stats — score buckets, stack distributions, per-source counts.
  • /rescore — trigger a re-scoring run for a subset.

Middleware: rate limiting (api/middleware/rate_limiter.py), CORS, structured logging.
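The rate limiter's core idea is a token bucket. A hypothetical in-process sketch of what api/middleware/rate_limiter.py might do — the real middleware keeps its buckets in Redis, and all numbers here are assumptions:

```python
import time

class TokenBucket:
    """One bucket per client; requests spend tokens, time refills them."""
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=2, refill_per_sec=0.0)  # no refill: easy to see the cutoff
print([bucket.allow() for _ in range(3)])  # [True, True, False]
```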

Web — web/

Next.js App Router app (React 19, Tailwind 4). Key pages:

  • /jobs — triage table (filters, score columns, archive bulk actions).
  • /archived — separate view for archived rows.
  • /score — per-job detail + re-score controls.
  • /settings — edit candidate profile and prompts live.

TanStack Query for server state, TanStack Table 8 for the job grid. Dark mode is default; see web/src/lib/theme.ts.

Storage

  • Postgres — primary store. Schema in core/db/models.py; Alembic migrations under core/db/migrations/versions/. Supabase-compatible (runs on the same Postgres image).
  • Redis — rate-limiter token buckets, LLM response cache, Prefect flow state.
  • Qdrant — vector index over job embeddings for similarity search (used for the "more like this" feature).
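Conceptually, "more like this" is nearest-neighbour search over job embeddings. A toy in-process version with cosine similarity — in the app this is a Qdrant query against the vector index, not Python math, and the tiny 2-d vectors are purely illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pretend embeddings; real ones are high-dimensional and come from an embedder.
jobs = {"job-1": [1.0, 0.0], "job-2": [0.9, 0.1], "job-3": [0.0, 1.0]}

def more_like_this(job_id: str) -> list[str]:
    query = jobs[job_id]
    return sorted((j for j in jobs if j != job_id),
                  key=lambda j: cosine(query, jobs[j]), reverse=True)

print(more_like_this("job-1"))  # ['job-2', 'job-3']
```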

Data flow for a single job

  1. Ingest writes a raw row to jobs with status='ingested'.
  2. Dedup pass marks duplicates against existing rows (rapidfuzz title + company match).
  3. Pre-filter updates prefilter_score and status.
  4. Scorer writes structured axes (match_score, penalty_score, risk_score, friction_score, role_fit, stack_match, ai_depth, salary_potential, growth), reasoning, tailoring_hint.
  5. API serves the joined row; the dashboard applies client-side filters and sorts.
  6. User actions (archive, shortlist, apply) write back via PATCH /jobs/{id} and get persisted with timestamps for telemetry.
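The steps above imply a status lifecycle for each row. A hypothetical sketch of the transition table — only 'ingested' and 'prefilter_skip' are named in this doc; the other status strings and the exact edges are assumptions:

```python
# Each key maps a status to the statuses a row may move to next.
TRANSITIONS: dict[str, set[str]] = {
    "ingested": {"duplicate", "prefiltered"},
    "prefiltered": {"prefilter_skip", "scored"},
    "scored": {"archived", "shortlisted", "applied"},
}

def can_transition(src: str, dst: str) -> bool:
    """True if the lifecycle permits moving a job from src to dst."""
    return dst in TRANSITIONS.get(src, set())

print(can_transition("ingested", "prefiltered"))   # True
print(can_transition("prefilter_skip", "scored"))  # False: skipped jobs stay skipped
```

Guarding writes with a table like this keeps out-of-order updates (e.g. a late scorer result landing on an archived row) from corrupting state.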

Extension points

  • New source: implement a class under core/sources/ that yields normalized Job dicts; register it in the ingest flow.
  • New LLM backend: add a backend module in core/llm/backends/ implementing the protocol in core/llm/base.py.
  • New scoring axis: add the field to LLMScoringResult in core/llm/schemas.py, add an Alembic migration for the column, and update prompts/score_job.md [OUTPUT] to emit it. No other code changes needed — the UI auto-picks up new numeric axes.
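A new source from the first bullet might look like this. The class shape and how the ingest flow consumes it are assumptions — the real contract lives in core/sources/ — but the essential obligation is just "yield normalized Job dicts":

```python
from typing import Iterator

class DummySource:
    """Hypothetical adapter; only the yielded dict shape matters."""
    name = "dummy"  # used for provenance tracking

    def fetch(self) -> Iterator[dict]:
        # A real adapter would page through an external API here.
        yield {
            "title": "Platform Engineer",
            "company": "ExampleCo",
            "description": "Build and run ingestion pipelines.",
            "source": self.name,
        }

rows = list(DummySource().fetch())
print(rows[0]["source"])  # dummy
```

Registration in the ingest flow then just means adding an instance to whatever list of sources the flow iterates over.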