Software & Cloud Engineer. I help businesses cut cloud costs and latency while improving the performance and stability of production systems.
Services
Performance & Cost Fix Sprint
What I Offer
- Hands-on analysis of production performance and cloud costs
- Identification of the single highest-impact bottleneck
- Direct implementation of targeted fixes
- Improvements to observability where needed to validate impact
- Clear before/after metrics and next steps
How I Work
I start with real data and production visibility, not assumptions.
I isolate the few issues that create most of the latency, instability, or cost, then fix one of them end-to-end. Scope is intentionally narrow to avoid risky or unnecessary changes.
The goal is a measurable improvement within a short, defined engagement.
I implement changes directly and verify results with metrics.
€1,500
14-day engagement
Implementation included
Consulting Call
Have a specific performance or cloud question? I'll review your situation and explain what's likely going wrong, what matters, and what doesn't.
Custom Engagements
Need something different? I also offer follow-up implementation help, deeper reviews, or ongoing advisory work depending on your needs and constraints.
Work Experience
Contributions
Results that speak for themselves
From AI platforms to mobile apps, I've helped teams achieve measurable improvements in performance, scalability, and cost efficiency.

Artifimo
Built a complete AI automation platform from scratch. Achieved sub-200ms response times on LLM orchestration and 99.9% uptime across all client deployments.

Actiko
Implemented intelligent caching and RAG optimization that reduced API costs by 65% while improving response quality scores by 40%.

VOWCE
Optimized speech-to-text pipeline achieving real-time transcription with 95% accuracy. Reduced app bundle size by 35% through code splitting.

JobCue
Architected scalable interview processing system handling 1,000+ concurrent sessions. Reduced infrastructure costs by 50% through smart resource allocation.

Postmate
Built high-throughput content generation pipeline. Implemented queue workers that process 10,000+ posts daily with zero downtime.

Sentimenty
Delivered enterprise-grade feedback system with real-time analytics. Achieved 60ms average page load through edge caching and optimization.

CloseUp.Pics
Engineered GPU inference pipeline with 3x faster image generation. Built monitoring stack that reduced debugging time by 80%.

IrreglY
Developed scalable mobile reporting system serving thousands of users. Implemented efficient geospatial queries with sub-100ms response times.

PromptFern
Built AI recommendation monitoring across ChatGPT, Claude, Perplexity, and Gemini with real-time alerts and a 60s refresh loop. Implemented audit trails and multilingual tracking to surface brand mentions worldwide, reducing analysis time by 70%.

Professional Certifications
I hold the following certifications, demonstrating my expertise and commitment to continuous learning.
- Building AI, University of Helsinki
- Career Essentials in Generative AI, Microsoft
Recent Posts
FastAPI-MCP: A Guide to Using MCP With FastAPI
Learn how to integrate FastAPI with Model Context Protocol (MCP) to instantly turn your API endpoints into agent-ready tools and workflows.
Reassessing "Zero to One" in the Age of Advanced AI
AI is advancing faster than the previous decade expected.
Ready to improve your cloud?
Get a comprehensive infrastructure review and actionable recommendations to reduce costs, improve performance, and scale with confidence.
Need to contact me via e-mail? Write to: m [at] martinkostov [dot] me