Architecture

                 ┌────────────────────┐
                 │    Ingestion       │  LinkedIn API session + JobSpy adapters
                 │  (core/sources/*)  │  -> normalized Job rows -> Postgres
                 └─────────┬──────────┘
                           v
                 ┌────────────────────┐
                 │    Pre-filter      │  cheap LLM bucket (prompts/filter_job.md)
                 │  (flows/filter)    │  -> drops obvious non-matches
                 └─────────┬──────────┘
                           v
                 ┌────────────────────┐
                 │     Scoring        │  strong LLM (prompts/score_job.md)
                 │  (flows/score)     │  -> structured scores + reasoning
                 └─────────┬──────────┘
                           v
     ┌────────────────┐    │    ┌────────────────────┐
     │     API        │<───┴───>│      Web           │
     │   FastAPI      │  HTTP   │  Next.js App Router│
     │   (api/*)      │         │   (web/src/*)      │
     └────────┬───────┘         └──────────┬─────────┘
              │                            │
              v                            v
         Postgres                     Browser
         Redis                        (triage UI)
         Qdrant

Components

Ingestion — core/sources/

  • LinkedIn (core/sources/linkedin/) — session-authenticated HTTP client wrapping the LinkedIn private API. Resumable via an on-disk checkpoint (.hireex/ per-workdir), so interrupted batches don't re-fetch.
  • JobSpy adapters (core/sources/jobspy/) — bridges to the JobSpy library for Indeed, Glassdoor, ZipRecruiter, etc.
  • Normalized Job rows land in Postgres with deduplication (rapidfuzz-based) and provenance tracking (which source, which query).
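The dedup check can be sketched like this. It is a toy illustration only: the real pipeline uses rapidfuzz, but stdlib difflib stands in here so the sketch stays dependency-free, and the 0.9 threshold and field names are assumptions, not the project's values.

```python
from difflib import SequenceMatcher

def is_duplicate(new_job: dict, existing: dict, threshold: float = 0.9) -> bool:
    """Treat two rows as duplicates when title AND company both fuzzy-match."""
    def ratio(a: str, b: str) -> float:
        # Case-insensitive similarity in [0, 1]; rapidfuzz would be faster here.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return (
        ratio(new_job["title"], existing["title"]) >= threshold
        and ratio(new_job["company"], existing["company"]) >= threshold
    )

a = {"title": "Senior Backend Engineer", "company": "Acme Corp"}
b = {"title": "Senior Backend Engineer ", "company": "ACME Corp"}
print(is_duplicate(a, b))  # True: near-identical title + company
```

Requiring both fields to match keeps same-title postings from different companies as distinct rows.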

Pre-filter — flows/filter.py

Reads unscored jobs, feeds title + description to the cheap LLM via prompts/filter_job.md, records a match_score + confidence + reasoning. Below a configurable threshold, the job skips the scorer and gets marked prefilter_skip.
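The threshold decision boils down to a single branch. A hypothetical sketch — the return values other than prefilter_skip, and the 0.3 default, are assumptions for illustration:

```python
def prefilter_decision(match_score: float, threshold: float = 0.3) -> str:
    """Below the configurable threshold the job never reaches the scorer."""
    # 'queued_for_scoring' is a placeholder name; 'prefilter_skip' is the
    # status the doc describes for dropped jobs.
    return "prefilter_skip" if match_score < threshold else "queued_for_scoring"
```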

Scoring — flows/score.py

The meat. For each job that passes the pre-filter:

  1. Load candidate profile (get_candidate_profile() from core/config.py).
  2. Render prompts/score_job.md with {{ job_description }} and {{ candidate_profile }}.
  3. Call LLM (OpenRouter or whatever backend is wired in core/llm/backends/).
  4. Parse strict JSON response, validate via core/llm/schemas.py (pydantic).
  5. Persist scoring axes + reasoning + tailoring hint to the jobs table.

Scoring runs as a Prefect flow for batch ops, and as a one-shot coroutine for single-job re-scoring from the API.
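Steps 2–5 can be sketched roughly as below. This is a minimal stand-in, not the project's code: the real pipeline validates with pydantic models from core/llm/schemas.py (a dataclass substitutes here), and the template-rendering and key names are assumptions.

```python
import json
from dataclasses import dataclass

@dataclass
class ScoringResult:
    # Stand-in for the pydantic LLMScoringResult; only two fields shown.
    match_score: float
    reasoning: str

def render_prompt(template: str, job_description: str, candidate_profile: str) -> str:
    """Fill the {{ job_description }} / {{ candidate_profile }} slots."""
    return (template
            .replace("{{ job_description }}", job_description)
            .replace("{{ candidate_profile }}", candidate_profile))

def parse_strict(raw: str) -> ScoringResult:
    """Parse the LLM reply as strict JSON; raise on anything malformed."""
    data = json.loads(raw)
    return ScoringResult(match_score=float(data["match_score"]),
                         reasoning=str(data["reasoning"]))

prompt = render_prompt("Score {{ job_description }} for {{ candidate_profile }}",
                       "Backend role, Python/Postgres", "Senior Python dev")
result = parse_strict('{"match_score": 0.82, "reasoning": "strong stack overlap"}')
```

Failing loudly in parse_strict is the point: a reply that isn't valid JSON with the expected keys should error out rather than persist garbage axes.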

API — api/

FastAPI app. Routes:

  • /jobs — list, filter, archive, re-score.
  • /profile — read / write config/candidate_profile.toml.
  • /prompts — read / write score_job.md and filter_job.md.
  • /stats — score buckets, stack distributions, per-source counts.
  • /rescore — trigger a re-scoring run for a subset.

Middleware: rate limiting (api/middleware/rate_limiter.py), CORS, structured logging.
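The rate limiter's core idea is a token bucket. A hypothetical in-process sketch of what api/middleware/rate_limiter.py might do — the real middleware keeps its buckets in Redis, and all numbers here are assumptions:

```python
import time

class TokenBucket:
    """One bucket per client; requests spend tokens, time refills them."""
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=2, refill_per_sec=0.0)  # no refill: easy to see the cutoff
print([bucket.allow() for _ in range(3)])  # [True, True, False]
```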

Web — web/

Next.js App Router app (React 19, Tailwind 4). Key pages:

  • /jobs — triage table (filters, score columns, archive bulk actions).
  • /archived — separate view for archived rows.
  • /score — per-job detail + re-score controls.
  • /settings — edit candidate profile and prompts live.

TanStack Query for server state, TanStack Table 8 for the job grid. Dark mode is default; see web/src/lib/theme.ts.

Storage

  • Postgres — primary store. Schema in core/db/models.py; Alembic migrations under core/db/migrations/versions/. Supabase-compatible (runs on the same Postgres image).
  • Redis — rate-limiter token buckets, LLM response cache, Prefect flow state.
  • Qdrant — vector index over job embeddings for similarity search (used for the "more like this" feature).
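Conceptually, "more like this" is nearest-neighbour search over job embeddings. A toy in-process version with cosine similarity — in the app this is a Qdrant query against the vector index, not Python math, and the tiny 2-d vectors are purely illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pretend embeddings; real ones are high-dimensional and come from an embedder.
jobs = {"job-1": [1.0, 0.0], "job-2": [0.9, 0.1], "job-3": [0.0, 1.0]}

def more_like_this(job_id: str) -> list[str]:
    query = jobs[job_id]
    return sorted((j for j in jobs if j != job_id),
                  key=lambda j: cosine(query, jobs[j]), reverse=True)

print(more_like_this("job-1"))  # ['job-2', 'job-3']
```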

Data flow for a single job

  1. Ingest writes a raw row to jobs with status='ingested'.
  2. Dedup pass marks duplicates against existing rows (rapidfuzz title + company match).
  3. Pre-filter updates prefilter_score and status.
  4. Scorer writes structured axes (match_score, penalty_score, risk_score, friction_score, role_fit, stack_match, ai_depth, salary_potential, growth), reasoning, tailoring_hint.
  5. API serves the joined row; the dashboard applies client-side filters and sorts.
  6. User actions (archive, shortlist, apply) write back via PATCH /jobs/{id} and get persisted with timestamps for telemetry.
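The steps above imply a status lifecycle for each row. A hypothetical sketch of the transition table — only 'ingested' and 'prefilter_skip' are named in this doc; the other status strings and the exact edges are assumptions:

```python
# Each key maps a status to the statuses a row may move to next.
TRANSITIONS: dict[str, set[str]] = {
    "ingested": {"duplicate", "prefiltered"},
    "prefiltered": {"prefilter_skip", "scored"},
    "scored": {"archived", "shortlisted", "applied"},
}

def can_transition(src: str, dst: str) -> bool:
    """True if the lifecycle permits moving a job from src to dst."""
    return dst in TRANSITIONS.get(src, set())

print(can_transition("ingested", "prefiltered"))   # True
print(can_transition("prefilter_skip", "scored"))  # False: skipped jobs stay skipped
```

Guarding writes with a table like this keeps out-of-order updates (e.g. a late scorer result landing on an archived row) from corrupting state.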

Extension points

  • New source: implement a class under core/sources/ that yields normalized Job dicts; register it in the ingest flow.
  • New LLM backend: add a backend module in core/llm/backends/ implementing the protocol in core/llm/base.py.
  • New scoring axis: add the field to LLMScoringResult in core/llm/schemas.py, add an Alembic migration for the column, and update prompts/score_job.md [OUTPUT] to emit it. No other code changes needed — the UI auto-picks up new numeric axes.
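A new source from the first bullet might look like this. The class shape and how the ingest flow consumes it are assumptions — the real contract lives in core/sources/ — but the essential obligation is just "yield normalized Job dicts":

```python
from typing import Iterator

class DummySource:
    """Hypothetical adapter; only the yielded dict shape matters."""
    name = "dummy"  # used for provenance tracking

    def fetch(self) -> Iterator[dict]:
        # A real adapter would page through an external API here.
        yield {
            "title": "Platform Engineer",
            "company": "ExampleCo",
            "description": "Build and run ingestion pipelines.",
            "source": self.name,
        }

rows = list(DummySource().fetch())
print(rows[0]["source"])  # dummy
```

Registration in the ingest flow then just means adding an instance to whatever list of sources the flow iterates over.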