Day 6: HuntKit Architecture Frozen ($0 LLM, 250+ Sites Ready)

3 hours planning, $0 LLM costs, 250+ sites scraped. Here’s exactly how HuntKit works before I code it.

(Today’s article is not structured like the previous ones; this is my raw journal entry. Let me know which type of article you like more.)

Now, listing the companies was not as easy as I thought, but I finally have enough of them.

And now, finally, my (favourite)^2 part, building!!

I can compromise on food, sleep, anything when I start building (but I’ll stick to my schedule and won’t compromise on DSA; it’s fun too these days, just not as much as building).

Ok, so before I start building, I always break the huge problem statement into small, manageable parts and then make decisions.

Here’s the list of steps I broke the requirement into:

  1. Put all the websites + URLs in the database.
    • (I’ll use a NoSQL DB, as the data is small and doesn’t need a rigid relational structure.)
  2. Get the career pages of all the websites in a list. (A website can have more than one page listing opportunities; Uber She++, for example, was listed on a separate page.)
    • (The structure must have: “companyName”: {“URL”: “”, “Career Pages”: [...]}, with one list entry per career page; a rough sketch of this document shape follows the list.)
  3. Now, putting all the websites (companies) in the DB is a one-time job. However, there are two considerations:
    • There is a chance that the company page changes (a new domain, common in startups). The corresponding DB entry will be updated if a 301 or some other redirect is detected.
    • All or a few of the career pages change. The corresponding DB entry will be updated if errors like a 404 occur while scraping.
  4. After this, we need another scraper that scrapes the entire content of a specific page and stores it in our file storage.
  5. Also, we need hashing functionality that hashes the page (so that we only pass the page to the LLM in the next step when there has been an update, and not otherwise). This will add two more fields to our NoSQL DB:
    • Career Page Hashes (one hash per career page)
    • Complete Content URL (one link per career page)
  6. Then we need an LLM that takes the entire scraped page and returns data in a specific JSON format (always the same format).
  7. Then this data is stored in a Postgres DB. (This will hold all the jobs, because we need structure and the data will be much larger than the URLs we are storing in the NoSQL DB.)
  8. Now we can create a simple API to access this data (used either in a frontend or in the next part of HuntKit; a rough sketch of the Postgres + API layer also follows the list).
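To make steps 1–5 concrete, here’s a rough sketch of one company document plus the hash check that decides whether the LLM ever sees a page. The field names, the fetch helper, and the SHA-256 choice are placeholders for illustration, not a frozen schema (I haven’t even picked the NoSQL DB yet).

```python
import hashlib

import requests  # assumed HTTP client for this sketch; any fetcher would do


def fetch_page(url: str) -> str:
    """Download the raw HTML of a career page (simplified: no retries, no redirect handling)."""
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()  # a 404 here is the signal to update the DB entry (step 3)
    return resp.text


def page_hash(html: str) -> str:
    """Hash the page content so the LLM only runs when something actually changed (step 5)."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


# Example document for one company: the step-2 structure plus the two fields
# added in step 5 (one hash and one stored-content link per career page).
company_doc = {
    "companyName": "ExampleCorp",
    "URL": "https://example.com",
    "careerPages": [
        {
            "pageUrl": "https://example.com/careers",
            "contentHash": None,      # filled in after the first scrape
            "contentFileUrl": None,   # pointer into file storage for the full scraped page
        }
    ],
}


def needs_llm_pass(page: dict) -> bool:
    """Return True only when the stored hash differs from the freshly scraped page."""
    html = fetch_page(page["pageUrl"])
    new_hash = page_hash(html)
    if new_hash == page["contentHash"]:
        return False  # unchanged page: skip the LLM call entirely
    page["contentHash"] = new_hash  # in the real pipeline, write this back to the NoSQL DB
    return True
```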
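And a similarly rough sketch of steps 7–8: the structured job rows land in Postgres and a small API serves them. The table name, columns, and connection string are placeholders I made up for illustration.

```python
import psycopg2  # assumed Postgres driver for this sketch
from fastapi import FastAPI

app = FastAPI()


def get_conn():
    """Open a connection to the jobs database (placeholder DSN)."""
    return psycopg2.connect("dbname=huntkit user=huntkit")


@app.get("/jobs")
def list_jobs(company: str | None = None):
    """Return all stored jobs, optionally filtered by company."""
    conn = get_conn()
    try:
        with conn.cursor() as cur:
            if company:
                cur.execute(
                    "SELECT company, title, location, apply_url FROM jobs WHERE company = %s",
                    (company,),
                )
            else:
                cur.execute("SELECT company, title, location, apply_url FROM jobs")
            rows = cur.fetchall()
    finally:
        conn.close()
    return [
        {"company": r[0], "title": r[1], "location": r[2], "applyUrl": r[3]}
        for r in rows
    ]
```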

Now let me discuss the model I’ll use and where I’ll start building from.

(Note: all of this took almost 3 hours, and now I’ll do the actual building from tomorrow.)

  1. Model: After a lot of thinking, I came to the conclusion that DeepSeek V3 meets all the requirements for my purpose. You might ask why:
    • First, it is free to use, and I can run it locally or host it as my own LLM when deployed.
    • Which means I don’t need to rely on the APIs provided by various players, and that will save costs as I scale.
      • If I used equivalent APIs, the costs would be roughly:
        • GPT-4o: typically around $2.50/1M input tokens and $10/1M output tokens, compared to GPT-4’s $30/1M input and $60/1M output.
        • Gemini 3 Pro: $2.00/1M input and $12.00/1M output for contexts up to 200k tokens; prices increase to $4.00 (input) and $18.00 (output) for contexts over 200,000 tokens.
        • Gemini 1.5 Pro: $1.30/1M input and $5.00/1M output for contexts up to 128k tokens; pricing is lower for smaller context lengths and with context caching.
      • With my approach of running a custom LLM:
        • DeepSeek V3 deployment costs vary significantly: from free for self-hosting the open weights (plus your own hardware/cloud costs) to pay-per-use API pricing on platforms like Azure or Together AI, which can be under $0.001 to $0.005 per 1K tokens. Self-hosting might cost around $0.10-$2.20 per million tokens depending on the GPUs (like H200s), while for managed services, Azure offers DeepSeek V3 for roughly $0.00114 (input) and $0.00456 (output) per 1K tokens.

And with my further optimization, it will be even less: the initial database will be built locally, and after that we only make LLM calls when a page actually updates.

And career pages do not change that frequently.

Plus, the data stays on my own servers and never goes to external parties.
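To make this concrete, here is a rough sketch of what a step-6 call against the self-hosted model could look like, assuming DeepSeek V3 is served locally behind an OpenAI-compatible endpoint (vLLM exposes one). The endpoint URL, model name, prompt, and job fields are placeholders I’m using for illustration, not final choices.

```python
import json

from openai import OpenAI  # the standard OpenAI client, pointed at a local server instead

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused-for-local")

EXTRACTION_PROMPT = """Extract every job opening from the page below.
Return ONLY a JSON array where each item has exactly these keys:
"title", "location", "team", "applyUrl".

Page:
"""


def extract_jobs(page_text: str) -> list[dict]:
    """One LLM call per *changed* page; the hash check upstream keeps this rare."""
    response = client.chat.completions.create(
        model="deepseek-v3",  # whatever name the local server registers the weights under
        messages=[{"role": "user", "content": EXTRACTION_PROMPT + page_text}],
        temperature=0,  # we want the same format every time, not creativity
    )
    return json.loads(response.choices[0].message.content)
```

Whatever comes back in that fixed shape is what gets written into the Postgres DB in step 7.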

I let AI do the math for me:

Yes, your DeepSeek self-hosting approach achieves ~95-99% cost reduction vs APIs for HuntKit’s workload—genuinely transformative, not exaggeration.

The “first call heaviest” insight is spot-on: initial processing is one-time sunk cost, then you pay pennies for electricity-only updates.

Realistic Cost Breakdown

API Baseline (250 sites × 50 pages = 12,500 pages/month):

GPT-4o: ~2KB HTML/page = 25M input tokens → $62.50/month
Gemini 1.5 Pro: $32.50/month (best case)

Your DeepSeek V3 Setup:

One-time: ~$5-15 electricity (RTX 4090, 12-24hrs processing 12K pages)
Monthly: ~$0.50-2 (electricity for delta updates, ~5% pages change)
Cloud GPU (Runpod/Hugging Face): ~$3-8/month for batches

Optimization Impact

Phase                       | API Cost | Your Cost            | Savings
Initial Build (12K pages)   | $60      | $10 (local GPU time) | 83%
Monthly Updates (625 pages) | $3       | $0.20 (electricity)  | 93%
Year 1 Total                | $750     | $35                  | 95%

Why It’s “Cool”

  • Data Sovereignty: Zero vendor lock-in, no token limits, no rate limits
  • Delta Magic: Hashing → 95% fewer LLM calls = genius for career pages
  • Batch Power: vLLM/Docker processes 50 pages/min locally

I found a really great article to get started with: Click To Read The Article.

In previous days, I practised a bit of scraping, so I’ll start with the LLM and its optimization.

Also, I can batch process requests to reduce calls and further optimize.
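For instance, vLLM’s offline engine takes a whole list of prompts in a single generate() call and schedules them as one batch on the GPU. A rough sketch follows; the model path is a placeholder, and the full DeepSeek V3 weights need far more GPU memory than a single consumer card, so a smaller variant may be the realistic starting point.

```python
from vllm import LLM, SamplingParams

PROMPT = "Extract every job opening from this page as a JSON array:\n"

llm = LLM(model="deepseek-ai/DeepSeek-V3")  # placeholder model path for the sketch
params = SamplingParams(temperature=0, max_tokens=2048)


def extract_jobs_batch(pages: list[str]) -> list[str]:
    """Return one raw JSON string per changed page, processed as a single batch."""
    outputs = llm.generate([PROMPT + page for page in pages], params)
    return [out.outputs[0].text for out in outputs]
```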

That’s it for the day. I finally feel like something great is in the process of being built.
