CASE STUDIES

Case Studies & Results

Real projects, not testimonials. Each one is drawn from my own production work — problem, approach, measurable outcome, and the exact stack. No invented client names, no inflated numbers.

AI agents • local & cloud inference • DevOps & hosting • full-stack platforms

Book a free 30-min scoping call • See services

LOCAL AI INFERENCE

Near-zero-cost captcha solving with a local vision model

Banking-portal monitor — personal project

Problem: A portal monitor kept hitting hCaptcha challenges. Solving every one through a cloud vision API would have meant an unbounded, metered bill for a task that runs on a schedule, indefinitely.
Approach: I built a local-first solver: a quantized Qwen3-VL vision model running on-GPU via exllamav3, wrapped in a hierarchical cache so repeat challenges never re-infer. Only the rare cases the local model is unsure about escalate to Claude vision. I benchmarked engines head-to-head (exllamav3 vs llama.cpp vs vLLM) at equal accuracy to pick the fastest.
Outcome: The winning engine solved a challenge in ~335 ms on local hardware — roughly 3x faster than the alternatives at the same accuracy — and the cache plus local inference cut the recurring cloud-API cost for captcha solving to near zero.
Stack: Qwen3-VL-8B • exllamav3 (EXL3 4bpw) • llama.cpp • Claude vision escalation • Python • hierarchical cache • RTX 5090
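The cache-then-escalate flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the two cache tiers (an in-memory map in front of a slower persistent map), the `TieredSolver` name, the `(answer, confidence)` return shape, and the 0.8 confidence threshold are all hypothetical, and the model calls are passed in as plain functions.

```python
import hashlib
from typing import Callable, MutableMapping, Tuple

# A solver takes raw image bytes and returns (answer, confidence in [0, 1]).
Solver = Callable[[bytes], Tuple[str, float]]


class TieredSolver:
    """Hypothetical sketch: memory cache -> persistent cache -> local model -> cloud."""

    def __init__(self, local: Solver, cloud: Solver,
                 memory: MutableMapping[str, str],
                 disk: MutableMapping[str, str],
                 threshold: float = 0.8):  # assumed cut-off, not from the source
        self.local, self.cloud = local, cloud
        self.memory, self.disk = memory, disk
        self.threshold = threshold
        self.counts = {"memory": 0, "disk": 0, "local": 0, "cloud": 0}

    def solve(self, image: bytes) -> str:
        key = hashlib.sha256(image).hexdigest()  # identical challenges share a key
        if key in self.memory:                   # tier 1: fastest lookup
            self.counts["memory"] += 1
            return self.memory[key]
        if key in self.disk:                     # tier 2: promote to tier 1 on a hit
            self.counts["disk"] += 1
            self.memory[key] = self.disk[key]
            return self.memory[key]
        answer, confidence = self.local(image)   # cache miss: run the local model
        source = "local"
        if confidence < self.threshold:          # unsure: escalate to the cloud model
            answer, _ = self.cloud(image)
            source = "cloud"
        self.counts[source] += 1
        self.memory[key] = self.disk[key] = answer
        return answer
```

Any dict-like object works for the two tiers, so the persistent tier could be backed by a file or database without changing `solve`. The `counts` dictionary gives a quick read on how often the cloud path is actually taken.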

AI AGENT / GATEWAY

A 24/7 chat AI assistant on a flat subscription, not metered API

OpenClaw gateway on Telegram & WhatsApp — personal project

Problem: I wanted an always-available AI assistant reachable from my phone via chat, but running a capable model 24/7 on a per-token API would make costs unpredictable.
Approach: I deployed the OpenClaw gateway in Docker and routed it through a billing-proxy that authenticates with a long-lived subscription token, so the assistant runs on an existing Claude Max plan instead of metered API billing. It exposes Telegram and WhatsApp front-ends with skills, cron jobs, and sub-agents.
Outcome: A production assistant reachable from chat around the clock, at a predictable flat monthly cost instead of a variable API bill — running unattended on Docker with a documented token-rotation procedure.
Stack: OpenClaw gateway • Docker Compose • billing-proxy • Claude Code OAuth token • Telegram • WhatsApp

HOSTING / DEVOPS

Stopping server-wide restarts across a 300-site host

Shared LiteSpeed / cPanel server — managed hosting

Problem: A CloudLinux + LiteSpeed server hosting ~300 cPanel sites kept sending "503 — server restarted automatically" alerts. Every restart briefly bounced all 300 sites, and the cause was being misattributed to the wrong subsystem.
Approach: I traced it to its real source: LiteSpeed's own autoFix503 was graceful-restarting the entire server whenever a single account hit its per-account memory (LVE) cap and OOM-looped — usually driven by a crawler flood or a bot herd. I disabled the server-wide over-restart, then blocked the abusive traffic at the layer that actually sees the real client IP.
Outcome: The false server-wide restarts stopped. A misbehaving account is now throttled exactly as designed without taking the other ~300 sites down with it, and genuine outages are still caught by the standard service monitor.
Stack: CloudLinux • LiteSpeed Enterprise • cPanel / WHM • CSF • Imunify360 • LVE • .htaccess
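Spotting a crawler flood or bot herd usually starts with counting requests per client. The sketch below is one generic way to do that, not the tooling used in this case: it assumes access-log lines in the common "combined" format, that the logged address is already the real client IP, and a hypothetical `min_hits` threshold.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Combined Log Format: ip ident user [time] "request" status bytes "referer" "user-agent"
LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def top_clients(lines: Iterable[str], min_hits: int = 3) -> List[Tuple[Tuple[str, str], int]]:
    """Return ((ip, user_agent), hits) pairs with at least min_hits requests, busiest first."""
    hits: Counter = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:  # silently skip lines that are not in the expected format
            hits[(match["ip"], match["ua"])] += 1
    return [(client, n) for client, n in hits.most_common() if n >= min_hits]
```

Grouping by IP and user agent together helps separate one noisy crawler from many ordinary visitors behind a shared address. If the server sits behind a proxy or CDN, the log has to record the forwarded client address for these counts to mean anything.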

FULL-STACK PLATFORM

A 200K-LOC bilingual content platform, maintained by one person

josenobile.co — this site

Problem: Publish and maintain a large, fast, bilingual (EN/ES) content site — dozens of technical guides plus a health-article library — without a CMS, a team, or an ongoing hosting bill.
Approach: I built it as a static platform with an AI-augmented build pipeline: a partial-injection system for shared headers/footers, and scripts that auto-generate the Spanish mirror pages, the sitemap with hreflang, and the search index on every build. It ships to Cloudflare Pages through CI/CD.
Outcome: A ~200K-LOC platform with 69 technical guides and 82 bilingual health articles that one person can keep current — every push rebuilds mirrors, sitemap, and search automatically and deploys to a global edge at near-zero hosting cost.
Stack: Static HTML/CSS/JS • Node build scripts • Cloudflare Pages • GitLab CI/CD • Claude Code
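To illustrate the sitemap-with-hreflang step, here is a minimal sketch. The real build uses Node scripts; this version is in Python purely for illustration, and the `build_sitemap` name, the `example.com` base URL, and the `/es/` mirror path layout are assumptions. It follows the standard sitemap convention of listing every language alternate (including the page itself) under each URL entry.

```python
from typing import Iterable, Tuple
from xml.sax.saxutils import escape, quoteattr

HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"\n'
    '        xmlns:xhtml="http://www.w3.org/1999/xhtml">'
)


def build_sitemap(base_url: str, pages: Iterable[Tuple[str, str]]) -> str:
    """pages holds (english_path, spanish_path) pairs; returns the sitemap XML as a string."""
    out = [HEADER]
    for en_path, es_path in pages:
        alternates = {"en": base_url + en_path, "es": base_url + es_path}
        for loc in alternates.values():  # one <url> entry per language version
            out.append("  <url>")
            out.append(f"    <loc>{escape(loc)}</loc>")
            for lang, href in alternates.items():  # each entry lists all alternates
                out.append(
                    f'    <xhtml:link rel="alternate" hreflang="{lang}" href={quoteattr(href)}/>'
                )
            out.append("  </url>")
    out.append("</urlset>")
    return "\n".join(out) + "\n"
```

Calling `build_sitemap("https://example.com", [("/guides/a.html", "/es/guides/a.html")])` yields two `<url>` entries, each carrying both the `en` and `es` alternate links, so the English and Spanish pages reference each other symmetrically.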

DATA / ANALYTICS

A repeatable weekly KPI dashboard from a production database

SaaS analytics dashboard — fitness-tech platform

Problem: Leadership needed a consistent weekly view of product and traffic KPIs, but the numbers lived in a production database and were being pulled together by hand each week.
Approach: I defined each metric against the source-of-truth database with the stakeholders, then built a dashboard that queries those definitions directly and renders the week's product and traffic KPIs in one place, on a repeatable schedule.
Outcome: A single, repeatable weekly dashboard replaced the manual spreadsheet pull — consistent metric definitions, less hand-work, and a faster read on how the product and traffic are trending. (Specific figures are confidential.)
Stack: MySQL • Python • scheduled generation • chart rendering
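The idea of defining each metric once and querying those definitions on a schedule can be sketched as below. Everything specific here is hypothetical: the table and column names, the two example metrics, and the Monday-to-Monday week are illustrative, and `sqlite3` stands in for the MySQL connection so the example runs on its own. Dates are assumed to be stored as ISO `YYYY-MM-DD` strings, which compare correctly as text.

```python
import sqlite3
from datetime import date, timedelta
from typing import Dict, Tuple

# Hypothetical metric definitions: one named SQL query per KPI,
# each parameterised by the half-open week range [start, end).
METRICS: Dict[str, str] = {
    "signups": "SELECT COUNT(*) FROM users WHERE created_at >= ? AND created_at < ?",
    "sessions": "SELECT COUNT(*) FROM sessions WHERE started_at >= ? AND started_at < ?",
}


def week_bounds(day: date) -> Tuple[str, str]:
    """Return ISO dates for the Monday on or before `day` and the following Monday."""
    monday = day - timedelta(days=day.weekday())
    return monday.isoformat(), (monday + timedelta(days=7)).isoformat()


def weekly_kpis(conn: sqlite3.Connection, day: date) -> Dict[str, int]:
    """Run every metric definition for the week containing `day`."""
    start, end = week_bounds(day)
    return {
        name: conn.execute(sql, (start, end)).fetchone()[0]
        for name, sql in METRICS.items()
    }
```

Keeping the definitions in one mapping means the dashboard and any ad hoc report compute a KPI the same way. A scheduler such as cron would then call `weekly_kpis` once per week and hand the result to the chart renderer.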

The Common Thread

Every one of these is production work I own end to end — from picking the model or the fix, to shipping it, to keeping it running. That is what you get when you hire me: a senior engineer who has already done the hard version of your problem.

19+ years in software engineering

microservices in production

200K+ lines of code maintained

2,200+ Claude Code sessions

Have a Similar Problem?

If any of these look like your situation, book a free 30-minute scoping call. I’ll tell you honestly whether it’s a fit and what the smallest effective first step is. See pricing or how to hire me.

Book a free 30-min scoping call • WhatsApp • Write me via the contact form