How GitHub Engineers Tackle Platform Problems

GitHub’s platform engineering team shares their approach to tackling infrastructure challenges at scale. Key strategies include: Understanding your domain: talk to neighboring teams with more experience; investigate old issues to understand system limitations; read documentation to build foundational knowledge. Platform-specific skills: network fundamentals (TCP, UDP, L4 load balancing, debugging tools); operating systems and hardware selection for scalability and cost; Infrastructure as Code (Terraform, Ansible, Consul); distributed systems understanding (failures are inevitable, need failover/recovery). Impact radius considerations:...

December 27, 2025 · 1 min · Oleksandr Kulbida

ACM Finally Automates Certificate Management for Kubernetes

AWS Certificate Manager (ACM) now supports automated certificate management for Kubernetes workloads through AWS Controllers for Kubernetes (ACK). Previously, using ACM certificates in Kubernetes required manual steps: exporting certificates via API, creating Kubernetes Secrets, and manually updating them at renewal. With ACK, you can define certificates as Kubernetes resources, and the controller automates the complete lifecycle - requesting, exporting, creating Secrets, and auto-updating at renewal. This works for both public certificates (ACM exportable certificates) and private certificates (AWS Private CA), enabling automated certificate management for:...

December 27, 2025 · 1 min · Oleksandr Kulbida

How Cloudflare Secures Terraform State at Scale

Managing Terraform state securely is one of those things that seems simple until you’re dealing with hundreds of accounts and thousands of resources. Cloudflare, being their own Customer Zero, had to solve this problem at enterprise scale. The interesting part? They built a custom solution called tfstate-butler - a Go program that acts as an HTTP backend for Terraform state storage. The Security Problem When you’re managing infrastructure at Cloudflare’s scale, a single compromised state file could be catastrophic....

December 23, 2025 · 2 min · Oleksandr Kulbida

AWS in 2025: Things You Think You Know That Are Actually Wrong

You know what’s wild? AWS is almost twenty years old now. That’s both cool and kind of terrifying at the same time. I’ve been working with AWS for a while, and honestly, I still catch myself thinking about things the way they used to be, not how they actually work today. The problem is that AWS changes constantly, but a lot of the foundational stuff has evolved in ways that aren’t super obvious....

December 23, 2025 · 8 min · Oleksandr Kulbida

AWS re:Invent 2025 Recap: Key Announcements for Cloud Practitioners

Another year, another AWS re:Invent has come and gone. I’ve been following the announcements closely, and there’s a lot to cover. The Big Picture This year’s re:Invent felt a bit different. The pre:Invent announcements started later than usual (mid-November instead of early October), and the keynote felt more focused on GenAI than infrastructure improvements. That said, there are still plenty of practical enhancements that can make our lives easier. Security Features That Matter AWS Security Agent (Preview) This one caught my attention....

December 23, 2025 · 7 min · Oleksandr Kulbida

AWS Finally Built a Browser for S3 (And Why It Took 20 Years)

It’s 2024, and AWS finally built a browser-based S3 viewer. Twenty years after S3 launched, you can now browse your buckets directly in the browser. It’s still in alpha, but hey, better late than never, right? You might be thinking - wait, weren’t there options before? Like S3Fox? Yeah, there were some third-party tools, but AWS itself never had an official browser interface. So what took so long? Real Problem: Access Control Turns out, building a simple file browser wasn’t the hard part....

November 20, 2024 · 3 min · Oleksandr Kulbida

12 Factors vs Kubernetes World

Disclaimer: you might not find anything new here if you already know the 12-factor app. The 12-factor app methodology (12factor.net) is a set of principles designed to enable applications to be built with portability and resilience when deployed to the web. These principles focus on declarative formats for automation, clean contracts with the operating system, and suitability for deployment on modern cloud platforms, thus minimizing divergence between development and production and enabling continuous deployment for maximum agility....

May 8, 2024 · 5 min · Oleksandr Kulbida

AWS vs Elasticsearch Licensing

A recent Elasticsearch licensing change ensures that the Beats modules are sending data to officially supported versions of Elasticsearch and Kibana, where Elastic can attest to the quality and scale of the products. Does AWS have any plans to fork a version of Filebeat? https://www.elastic.co/guide/en/beats/libbeat/current/breaking-changes-7.13.html https://www.reddit.com/r/aws/comments/nn95aq/elastic_has_broken_filebeat_as_of_713_it_no/ What are the alternatives? Host Elasticsearch on EC2 instances, why not? CloudWatch, slow… https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_ES_Stream.html Kubernetes options like ECK or Helm chart deployments… tricky for production usage. Loki?...

April 21, 2024 · 1 min · Oleksandr Kulbida

k8s 1.30 version

Reasons to upgrade to k8s 1.30. Container resource based autoscaling: this feature is now promoted to stable (https://github.com/kubernetes/enhancements/issues/1610). By default, the Horizontal Pod Autoscaler examines the total resource usage of the entire pod, i.e. the sum of all containers, and scales pods based on average CPU or memory usage. The container resource based autoscaling feature allows the HPA to scale workloads based on the resource usage of individual containers within a pod, instead of the aggregated usage of all containers in the pod...

April 21, 2024 · 1 min · Oleksandr Kulbida
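
To make the container-level autoscaling described in the k8s 1.30 entry above more concrete, here is a minimal sketch that builds an autoscaling/v2 HorizontalPodAutoscaler manifest using the ContainerResource metric type. The names used here (web-hpa, web, app) and the 60% CPU target are hypothetical and chosen only for illustration; they do not come from the original post.

```python
import json


def container_resource_hpa(name, deployment, container, cpu_utilization,
                           min_replicas=1, max_replicas=5):
    """Build an HPA manifest that scales on a single container's CPU usage.

    All argument values are illustrative assumptions, not from the post.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    # ContainerResource looks at one named container,
                    # whereas the Resource type aggregates the whole pod.
                    "type": "ContainerResource",
                    "containerResource": {
                        "name": "cpu",
                        "container": container,
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_utilization,
                        },
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names: HPA "web-hpa" targeting Deployment "web",
    # scaling on the CPU usage of its "app" container only.
    print(json.dumps(container_resource_hpa("web-hpa", "web", "app", 60), indent=2))
```

The printed JSON can be passed to kubectl apply -f - as-is, since kubectl accepts JSON manifests. Scaling on one container is typically useful when a sidecar's usage would otherwise skew the pod-wide average.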

k8s InPlacePodVerticalScaling

Kubernetes InPlacePodVerticalScaling feature. Kubernetes v1.27 introduces InPlacePodVerticalScaling, allowing seamless pod resource resizing without restarts. Enhanced continuity: eliminates the downtime and potential data loss caused by pod restarts. Cost savings: avoid overprovisioning and optimize resource usage; InPlacePodVerticalScaling lets you allocate resources precisely as needed. In this example of a pod resource configuration, the resizePolicy indicates that changes to the memory allocation require a restart of the container, while CPU resources can be resized without a restart....

April 20, 2024 · 1 min · Oleksandr Kulbida
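
The example from the full post is not reproduced in this listing. As a rough sketch of the kind of configuration the InPlacePodVerticalScaling entry above describes, the following builds a Pod manifest whose resizePolicy allows CPU to be resized in place while a memory change restarts the container. The pod name, image, and resource values are hypothetical, and the sketch assumes a cluster where the InPlacePodVerticalScaling feature is enabled.

```python
import json


def resizable_pod(name, image):
    """Build a Pod manifest with a per-resource resizePolicy.

    The name, image, and resource amounts are illustrative assumptions.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "app",
                    "image": image,
                    "resizePolicy": [
                        # CPU can be changed without restarting the container.
                        {"resourceName": "cpu", "restartPolicy": "NotRequired"},
                        # A memory change restarts the container.
                        {"resourceName": "memory", "restartPolicy": "RestartContainer"},
                    ],
                    "resources": {
                        "requests": {"cpu": "250m", "memory": "128Mi"},
                        "limits": {"cpu": "500m", "memory": "256Mi"},
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    # Hypothetical pod name and image, for illustration only.
    print(json.dumps(resizable_pod("resize-demo", "nginx:1.25"), indent=2))
```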