AI agents in practice: self-learning, knowledge bases, and why fewer agents is better

Building AI agents sounds fun until you actually build one. Then a different set of problems shows up, ones nobody has written a blog post about yet. This is a summary of a conversation between developers actively running agent systems in production or near-production. The topics: self-improvement conflicts with git, what to use for a knowledge base, Andrej Karpathy’s Obsidian approach, and why adding more agents rarely helps. The self-improvement problem: One of the selling points of agents like Hermes is that they can self-reflect and improve, updating their own rules based on experience....

May 15, 2026 · 6 min · Oleksandr Kulbida
Infrastructure as Code

Terraform at scale: GitOps tools and the long apply problem

If you’ve been using Terraform Cloud for a while, you’ve probably hit at least one of these: the pricing model changed and suddenly it’s expensive, applies take 10+ minutes, or the state files have grown into something nobody wants to touch. You’re not alone: this is a recurring topic across DevOps communities right now. This post covers the main tools people are using to solve these problems in 2025–2026, with a focus on two separate issues that often get conflated: GitOps orchestration (who triggers plans, who approves applies) and state management at scale (why applies are slow and what to do about it)....

May 8, 2026 · 6 min · Oleksandr Kulbida

How GitHub engineers tackle platform problems

GitHub’s platform engineering team shares their approach to tackling infrastructure challenges at scale. Key strategies include: Understanding your domain: talk to neighboring teams with more experience; investigate old issues to understand system limitations; read documentation to build foundational knowledge. Platform-specific skills: network fundamentals (TCP, UDP, L4 load balancing, debugging tools); operating systems and hardware selection for scalability and cost; Infrastructure as Code (Terraform, Ansible, Consul); distributed systems understanding (failures are inevitable, need failover/recovery). Impact radius considerations:...

December 27, 2025 · 1 min · Oleksandr Kulbida