Day 3: Why I’m Building 4 Services Instead of One Big App

Day 3 done.

No code written yet—spent the hour planning HuntKit’s architecture.

And honestly? I almost made a huge mistake.

The Temptation: Build It All At Once

My first instinct was to build one FastAPI app that does everything:

  • Scrapes jobs
  • Analyzes profiles
  • Matches candidates
  • Finds emails

Ship it all together, deploy, done.

Then I re-read Chapter 1.

What Chapter 1 Actually Taught Me

“Trying to scale everything together is how you end up with a monolith that can’t grow.”

The example that clicked: Web servers and databases scale differently. That’s why we separate them with load balancers and database replication.

For HuntKit, I realized each piece has wildly different needs:

Job Aggregator (scraping 1000+ sites):

  • I/O intensive—constantly making HTTP requests
  • Needs distributed crawlers across regions
  • Rate limiting per domain to avoid bans (quick sketch after this list)
  • Biggest scaling challenge
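
To make the per-domain rate limiting concrete, here’s a minimal asyncio sketch. The class name, the 2-second default, and the lock-per-domain approach are my assumptions for illustration, not a finished design:

```python
import asyncio
import time
from urllib.parse import urlparse

class DomainRateLimiter:
    """Enforce a minimum delay between requests to the same domain."""

    def __init__(self, min_delay: float = 2.0):
        self.min_delay = min_delay                    # seconds between hits per domain (placeholder)
        self._last_hit: dict[str, float] = {}
        self._locks: dict[str, asyncio.Lock] = {}

    async def wait(self, url: str) -> None:
        domain = urlparse(url).netloc
        lock = self._locks.setdefault(domain, asyncio.Lock())
        async with lock:                              # serialize waiters per domain
            elapsed = time.monotonic() - self._last_hit.get(domain, 0.0)
            if elapsed < self.min_delay:
                await asyncio.sleep(self.min_delay - elapsed)
            self._last_hit[domain] = time.monotonic()

# Usage inside a crawler coroutine (hypothetical):
#   limiter = DomainRateLimiter()
#   await limiter.wait(url)   # then fire the HTTP request
```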

Profile Analyzer (parsing GitHub/LinkedIn):

  • CPU intensive—processing repos, analyzing code
  • Caching-heavy (same profiles checked repeatedly)
  • Predictable load based on user signups

Matching Engine (ranking jobs by fit):

  • Read-heavy (90% reads, 10% writes)
  • Benefits massively from Redis caching (cache-aside sketch below)
  • Needs fast response times (<100ms)
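
Since the matcher is 90% reads, cache-aside is the natural shape. A rough redis-py sketch; the key scheme, the 5-minute TTL, and the stub scorer are assumptions, not decisions:

```python
import json

import redis  # assumes the redis-py package

r = redis.Redis(decode_responses=True)

def compute_match_scores(user_id: str) -> dict:
    # Placeholder: the real engine would rank jobs against the profile here.
    return {"job_123": 0.87}

def get_match_scores(user_id: str) -> dict:
    """Cache-aside: serve most reads from Redis, recompute only on a miss."""
    key = f"matches:{user_id}"                 # hypothetical key scheme
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    scores = compute_match_scores(user_id)
    r.set(key, json.dumps(scores), ex=300)     # 5-minute TTL; tune later
    return scores
```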

Outreach Assistant (email finder + drafts):

  • API-dependent (Hunter.io, OpenAI)
  • Queuing required (async processing; worker sketch below)
  • Cost-sensitive (API calls = $$)
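
For the queuing piece, even a stdlib asyncio.Queue gets the idea across: a couple of workers drain tasks at a controlled pace so API spend stays predictable. The worker count, task shape, and sleep are stand-ins:

```python
import asyncio

async def outreach_worker(name: str, queue: asyncio.Queue) -> None:
    """Process queued outreach tasks one at a time to cap API costs."""
    while True:
        task = await queue.get()
        try:
            # Placeholder: real code would call the email-finder and LLM APIs here.
            print(f"{name}: drafting outreach for {task['candidate']}")
            await asyncio.sleep(1)             # stand-in for the slow API call
        finally:
            queue.task_done()

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    workers = [asyncio.create_task(outreach_worker(f"w{i}", queue)) for i in range(2)]
    for email in ["ada@example.com", "alan@example.com"]:
        await queue.put({"candidate": email})
    await queue.join()                         # block until every task is done
    for w in workers:
        w.cancel()

asyncio.run(main())
```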

If I built one monolith, I’d have to scale EVERYTHING when job scraping hits limits—even though matching and outreach aren’t stressed.

The Microservices Decision

Breaking into 4 independent services means:

  • Scale each based on actual need (crawler needs 10 instances, matcher needs 2)
  • Deploy updates without breaking everything
  • Test one piece at a time (ship faster, iterate publicly)
  • Different tech choices per service (Scrapy for crawling, FastAPI for matching)

This is literally applying horizontal scaling at the service level.

Why Start With The Hardest Part?

Service 1: Job Aggregator is my Days 8-20 focus because:

  1. It’s the riskiest unknown – I’ve never scraped at scale, I don’t know the legal boundaries, and anti-bot measures keep evolving
  2. It’s the foundation – without jobs, nothing else matters
  3. It’s where I’ll learn the most – I’ve done GitHub parsing, matching algorithms, and LLM integration before. Distributed scraping? New territory.

De-risk the hard stuff early.

My Struggle: Database Decisions

Here’s where I’m stuck: Do I choose databases now or later?

Current thinking:

  • Job listings = probably Postgres (relational, ACID for job data integrity)
  • User profiles/cache = Redis (fast reads for matching)
  • Analytics = maybe ClickHouse later for logs/metrics

But honestly? I don’t know the exact data shape yet. Choosing now feels like guessing.

Is it okay to say “TBD until Day 10 when I see actual scraped data”? Or does that look like I don’t understand databases matter?

Genuinely curious: How do you decide tech stack timing?

Tomorrow’s Plan (Day 4)

Not building yet. Exploring:

  • Research Scrapy vs Playwright vs custom async crawlers
  • Study 10 company career pages (structure, anti-bot measures)
  • Test robots.txt compliance frameworks (stdlib sketch after this list)
  • Read up on legal considerations (I don’t want to accidentally break any laws)
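
For the robots.txt piece, Python’s stdlib already covers the basic check via urllib.robotparser. A minimal sketch; the HuntKitBot user-agent string is made up, and passing this check is necessary but not sufficient for staying polite (or legal):

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser  # stdlib, nothing to install

def can_scrape(url: str, user_agent: str = "HuntKitBot") -> bool:
    """Return True only if the site's robots.txt allows this fetch."""
    parts = urlparse(url)
    parser = RobotFileParser()
    parser.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    parser.read()                              # fetches and parses robots.txt
    return parser.can_fetch(user_agent, url)

# Skip any URL the site owner has disallowed.
if can_scrape("https://example.com/careers"):
    print("allowed; fetch away")
else:
    print("disallowed; move on")
```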

Day 5: Pick an approach based on findings
Day 6+: Start coding Service 1

I’m nervous about the legal gray area of scraping. Has anyone dealt with this before?

What I’m Learning

System design isn’t just “pick the right database.” It’s about:

  • Understanding different scaling needs
  • Building for iteration, not perfection
  • De-risking unknowns early
  • Separating concerns so failures don’t cascade

Chapter 1 made this click. Now applying it to something real.

Tech Stack (So Far):

  • Backend: FastAPI (async is a good fit for I/O-heavy scraping; toy endpoint below)
  • Frontend: React for web (Flutter mobile later if this works)
  • Services: 4 independent microservices
  • Infrastructure: TBD per service needs
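
If you’re wondering what the async choice buys: while one request awaits the network, the event loop keeps serving others. A toy endpoint; the route and upstream URL are invented for illustration:

```python
import httpx                     # assumed async HTTP client; any would work
from fastapi import FastAPI

app = FastAPI()

@app.get("/jobs/{source}")
async def fetch_jobs(source: str) -> dict:
    """While this handler awaits I/O, the event loop handles other requests."""
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"https://{source}/jobs")  # hypothetical upstream
    return {"source": source, "status": resp.status_code}
```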

Progress: 3/100 days. Still planning, not rushing.

Drop your thoughts below—especially if you’ve built scrapers or microservices before. I’m learning as I go.
