Stop Treating YAML Like a String

A new approach to Kubernetes configuration management

Koreo is a data structure orchestration engine. Although it's primarily designed for Kubernetes resource orchestration, Koreo's core functionality can orchestrate and manage virtually any structured data. What Koreo provides today, however, is a new approach to Kubernetes configuration management that empowers developers and platform teams through programmable workflows. This approach draws upon the strengths of existing tools like Helm, Kustomize, and Crossplane while addressing some of their limitations. A foundational part of this lies in how Koreo handles configuration itself, specifically the "interpolation vs. overlay" distinction.

String-interpolated templates

String-interpolation-based templating like Helm has its roots in the rudimentary templating approaches of early web technologies such as CGI, PHP, and ASP.NET. This pattern persists in many popular templating solutions (like Jinja and Go templates) because HTML's complex structure makes it notoriously difficult to parse and manipulate programmatically. While some solutions, such as Elm UI or JSX-based approaches, offer cleaner, more structured HTML generation, most languages lack robust built-in mechanisms for this. And, no language lets you actually manipulate HTML with any confidence or precision. HTML is so painful that it is easier to treat as a string and, thus, we have string-interpolation-based templates.

Really, this is fine for many text-encoded data structures. When you have a single value to set within a YAML document, it is visually clear why interpolation makes sense.

Setting one value in YAML

Perhaps it even makes sense when you have a number of values you'd like to set:

Setting multiple values in YAML

For simple use cases, that isn't too bad. But when you start having conditionals where you're swapping sets of values, where you're toggling stuff on and off, and interspersing those needs… it gets hairy:

Toggling values in YAML

When using a string-interpolation-based template like Helm, each of those becomes something like this, at best:

Helm toggling

It gets worse with complexity, such as a nested conditional like this:

Complex YAML conditional

These become very unwieldy to reason about and manage. Really, at this point someone should have stopped and asked "what am I doing here?" If not, then certainly when you find yourself typing "indent 2" you should be asking some questions. We're dealing with well-defined, structured data formats after all. Hell, often they have an actual schema. Why would you possibly treat something with a defined, trivially manipulatable structure as an unstructured string? If you opened code in any language and saw someone building a dictionary or an array via string interpolation, you would almost certainly ask why they were doing that.
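To make that comparison concrete, here is a minimal Python sketch that builds the same small document two ways. The function names and fields are hypothetical, and JSON stands in for YAML only so the example stays within the standard library. It illustrates the general problem with building structured data as text, not Helm's actual behavior.

```python
import json


def render_by_interpolation(name: str, replicas: int, enable_probe: bool) -> str:
    """Build the document as text, the way a string template would."""
    text = '{"name": "%s", "replicas": %d' % (name, replicas)
    if enable_probe:
        # The conditional has to emit correctly punctuated text by hand.
        text += ', "probe": {"path": "/healthz"}'
    return text + "}"


def build_as_data(name: str, replicas: int, enable_probe: bool) -> dict:
    """Build the same document as a data structure."""
    doc = {"name": name, "replicas": replicas}
    if enable_probe:
        doc["probe"] = {"path": "/healthz"}
    return doc


def is_valid_json(text: str) -> bool:
    """Report whether the text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

With an ordinary name, both approaches produce the same document. A name that contains a double quote breaks the interpolated text, while serializing the dictionary with json.dumps escapes the value and round-trips cleanly.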

The Kubernetes team explained that they chose YAML because it is reasonably easily readable and editable by both humans and machines. They didn't mean for humans and machines to just treat it as unstructured character streams. It is easy to see why tools have evolved and opted to use string-interpolated templates though. For simple cases, it makes sense enough—it's not that bad to set a value via templating. Even when passing in multiple values, it's probably still ok. But, once you're at control-flow constructs like conditionals, this is not the right solution.

Templates contain business logic

With Helm, organizations often implement what I call a "god chart" or "one chart to rule them all" in order to provide developers with a common application deployment model. It usually starts simple but then evolves organically over time into a tangled ball of yarn that's difficult to maintain and nearly impossible to test. The road to hell is paved with good intentions.

The situation quickly gets really nasty because there are actually multiple distinct things happening. One common need is to swap out a lot of "static" stuff (values, structure, or both). This can cause a lot of noise within a string-interpolated template, but you for sure don't want to copy/paste your template because it also contains your business logic.

Take a short pause and say this out loud: my string-interpolated template contains my business logic.

You layer value injection throughout your data structure, which you're treating like a stream of characters, and now you need to add more logic, like conditionals, to handle toggling values on or off or deciding whether to include blocks. If you include them, you need to inject their values as well. Now we have a real situation. We have a lot of logic nestled inside a string, and that string is actually a well-defined data structure.

Interpolation vs. Overlay

Koreo lets you approach these needs with different, purpose-built constructs. First, to swap big-block structural things or static values, you can swap the entire "base":

YAML base swapping

Using a tool like kpt, you can keep those in sync. If you don't want kpt, then just copy/paste or use whatever diff tool you like. These are just the static structures and static or default values.

Static values? That's limiting. If I just had static values, I'd use Kustomize—and you should. It's great for this case. It is the inspiration for Koreo's approach to overlays.

Remember, we've got structured data here. To address value or structural updates based on computed logic, you can apply overlays. Koreo overlays are really just a set of atomic updates, applied to real data structures, that behave the way the same updates would if you wrote them in code:

Koreo overlays

The overlay lets you specify an atomic set of updates to be applied. No JSON path stuff needed. You write the overlay more or less as you would in code, using a simple but powerful expression language to specify your logic. Koreo evaluates the expressions, converts them into an update, and applies it. The example above is just like setting a dictionary value. It converts to something similar to this:

resource.spec.sub.toggler = "orange"

You specify the path you'd like updated, provide the value expression, and then the value will be updated. You can update multiple fields and sub-structures within a single overlay. You can even do things like map over values from configuration or the base resource specification in order to filter or change the values within the list or map.

In fact, that is what Koreo does. It takes your overlay spec and builds an "update index" which tells Koreo the specific properties in your data structure you want updated. Then it evaluates your expressions and uses the update index to set the computed values. In practice, it looks just like what you'd write in "real" code.
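The idea of an update index can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Koreo's implementation or syntax: build_update_index and apply_overlay are hypothetical names, and Python callables stand in for Koreo's expression language.

```python
from copy import deepcopy


def build_update_index(overlay: dict, prefix: tuple = ()) -> dict:
    """Flatten a nested overlay spec into {path tuple: expression or literal}."""
    index = {}
    for key, value in overlay.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Nested mappings describe structure, so keep descending.
            index.update(build_update_index(value, path))
        else:
            index[path] = value
    return index


def apply_overlay(resource: dict, overlay: dict, inputs: dict) -> dict:
    """Return a copy of the resource with every indexed path set."""
    result = deepcopy(resource)
    for path, expression in build_update_index(overlay).items():
        # Evaluate the expression (a callable here) or use the literal as-is.
        value = expression(inputs) if callable(expression) else expression
        target = result
        for key in path[:-1]:
            target = target.setdefault(key, {})
        target[path[-1]] = value
    return result


# Hypothetical base resource and overlay, mirroring the toggler example.
base = {"spec": {"replicas": 1, "sub": {"toggler": "apple", "keep": True}}}
overlay = {
    "spec": {
        "sub": {
            "toggler": lambda inputs: "orange" if inputs["env"] == "prod" else "apple"
        }
    }
}
updated = apply_overlay(base, overlay, {"env": "prod"})
```

Only the indexed path changes: the toggler becomes "orange", the sibling field and the replica count are untouched, and the base itself is never mutated.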

This approach gives a really clean model that leverages the fact we're dealing with well-defined data structures. Having a big, nasty overlay is better than the logic being within a string-interpolated template, but it isn't a lot better. Instead, Koreo lets you have sane logic and "point" updates because you can (optionally) layer the overlays themselves. What's even better with this model is that they're ideally small and they are always testable. They're always testable because they are pure functions and Koreo contains a first-class testing framework that is built into the language itself. You can test the entire set of overlays to ensure the resource is correct:

A Koreo ResourceFunction and accompanying FunctionTest for building an S3 bucket with overlays

But you can also test each overlay in isolation via the same testing framework. That means that rather than trying to test one insanely huge template, you write something like unit tests for the data structure updates. This lets you ensure that your overlays function correctly individually and that they update the correct fields with the correct values. Then you can write a few tests to ensure the correct overlays are applied and that they work together correctly. It is like some form of black magic, and no sacrifices were even required. It just takes treating structured data like structured data.
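As an analogy for what testing an overlay in isolation means, here is a small Python sketch. It does not use Koreo's FunctionTest; the overlay function, its field names, and the tests are all hypothetical. The point is only that a pure function over a data structure can be checked with plain assertions and no test harness.

```python
from copy import deepcopy


def encryption_overlay(bucket: dict, kms_key: str) -> dict:
    """Pure function: return an updated copy and leave the input untouched."""
    updated = deepcopy(bucket)
    spec = updated.setdefault("spec", {})
    spec["encryption"] = {"algorithm": "aws:kms", "keyId": kms_key}
    return updated


def test_sets_encryption_block():
    result = encryption_overlay({"spec": {"name": "logs"}}, "key-123")
    assert result["spec"]["encryption"] == {
        "algorithm": "aws:kms",
        "keyId": "key-123",
    }


def test_leaves_other_fields_alone():
    base = {"spec": {"name": "logs", "region": "us-east-1"}}
    result = encryption_overlay(base, "key-123")
    assert result["spec"]["name"] == "logs"
    assert result["spec"]["region"] == "us-east-1"


def test_does_not_mutate_input():
    base = {"spec": {"name": "logs"}}
    encryption_overlay(base, "key-123")
    assert base == {"spec": {"name": "logs"}}
```

Each test covers one concern: the right field is set, unrelated fields survive, and the input is not modified. A test runner such as pytest would collect these functions as written.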

This layered approach to resource materialization provides a means for "factoring" different configuration concerns into reusable, testable building blocks. For instance, the security team wants to ensure specific encryption configuration is enabled, the compliance team wants to ensure data-retention policies are set, and the SRE team wants to ensure data replication is configured appropriately.

A Koreo ValueFunction and accompanying FunctionTest used as an overlay by the S3 ResourceFunction for enabling lifecycle rules
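The layering described above, where each team owns one small overlay, can be sketched in Python as function composition. Every name and field here is hypothetical and chosen only for illustration; Koreo expresses this with its own resources rather than Python functions.

```python
from functools import reduce


def with_spec(resource: dict, **fields) -> dict:
    """Return a copy of the resource with the given spec fields set."""
    return {**resource, "spec": {**resource.get("spec", {}), **fields}}


def security_overlay(resource: dict) -> dict:
    """Security team: require encryption."""
    return with_spec(resource, encryption="aws:kms")


def compliance_overlay(resource: dict) -> dict:
    """Compliance team: set the data-retention policy."""
    return with_spec(resource, retentionDays=365)


def sre_overlay(resource: dict) -> dict:
    """SRE team: configure replication."""
    return with_spec(resource, replication="cross-region")


def materialize(base: dict, overlays: list) -> dict:
    """Apply overlays left to right; later overlays win on conflicts."""
    return reduce(lambda resource, overlay: overlay(resource), overlays, base)
```

Calling materialize with a base bucket and the three overlays yields one resource that carries all three concerns, while each overlay stays small enough to own and test separately.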

Treating configuration as code

Of course, string templating isn't the only approach that has been used to manage Kubernetes configurations. Configuration languages like Jsonnet, Cue, and Dhall attempt to solve some of these challenges by treating configuration as structured data rather than text. These languages introduce programming constructs like variables, conditionals, and functions while maintaining a declarative model. However, they often introduce their own complexity. Jsonnet, for example, provides a powerful way to generate JSON/YAML but can quickly become difficult to debug due to its evaluation model. Cue enforces strong validation but requires a different way of thinking about constraints and configurations.

Each of these tools attempts to move beyond naive string interpolation and offer a step in the right direction, but they still operate largely as external DSLs rather than being deeply integrated into the Kubernetes Resource Model. There's power in being able to leverage a "real" programming language for configuration, but sometimes being overly expressive is a drawback. What we want is a nice balance of expressiveness that is still deterministic.

Koreo strikes that balance by embedding structured configuration directly into Kubernetes in conjunction with resource orchestration. Instead of treating configuration as text or an external DSL, Koreo provides a native, programmatic approach that integrates seamlessly with the Kubernetes Resource Model. This structured foundation enables features that go beyond simple templating—such as its built-in test framework, which is designed for testing async, event-driven control loops without requiring a tremendous amount of boilerplate or test harness setup. It makes modeling complex scenarios easy and supports validating happy paths, error handling, and template rendering.

Additionally, the Koreo language server integrates with your IDE, providing real-time feedback, autocomplete, and introspection. This makes creating and manipulating data structures feel like working in a real programming language rather than twiddling YAML or editing string-interpolated templates.

This represents a fundamental shift in how we approach Kubernetes configuration management and structured data orchestration. Koreo simplifies the management of complex configurations. It achieves this by moving away from string-interpolated templates and adopting a structured, programmatic approach that's native to Kubernetes. Its overlay system allows for precise, testable updates, eliminating the fragility and complexity of traditional templating. With a built-in testing framework and IDE integration, Koreo makes working with Kubernetes configuration feel more like actual programming. Ultimately, it provides a platform engineering toolkit that allows you to build powerful abstractions on top of Kubernetes.

Configuration management + resource orchestration

But configuration is only one part of the story. Managing infrastructure effectively requires not only better configuration management, but also a way to orchestrate resources and reconcile changes over time. This is where Koreo extends beyond just managing structured data—it provides a controller-driven model that ensures configuration changes are continuously reconciled, just like Kubernetes itself does for workloads.

Rather than treating Kubernetes resources as static manifests to be generated and applied, Koreo embraces a dynamic, event-driven model where configurations are continuously managed and updated based on changing conditions. This allows us to do more than simply treat configuration as structured data. It enables a truly controller-driven approach to infrastructure management by providing a way to program and compose control loops.

Controller-Driven Infrastructure as Code

Harnessing the Kubernetes Resource Model for modern infrastructure management

Infrastructure as Code (IaC) revolutionized how we manage infrastructure, enabling developers to define resources declaratively and automate their deployment. However, tools like Terraform and CloudFormation, despite their declarative configuration, rely on an operation-centric model, where resources are created or updated through one-shot commands.

The evolution of IaC: From operations to controllers

In contrast, Kubernetes introduced a new paradigm with its controller pattern and the Kubernetes Resource Model (KRM). This resource-centric approach to APIs redefines infrastructure management by focusing on desired state rather than discrete operations. Kubernetes controllers continuously monitor resources, ensuring they conform to their declarative configurations by performing actions to move the actual state closer to the desired state, much like a human operator would. This is known as a control loop.
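A control loop can be reduced to a few lines of Python. This is a deliberately simplified, in-memory sketch: the State type and both functions are hypothetical, and a real Kubernetes controller reacts to events from the API server rather than iterating over a local value.

```python
from dataclasses import dataclass


@dataclass
class State:
    replicas: int


def reconcile(desired: State, actual: State) -> State:
    """One pass of the loop: take a single step toward the desired state."""
    if actual.replicas < desired.replicas:
        return State(actual.replicas + 1)
    if actual.replicas > desired.replicas:
        return State(actual.replicas - 1)
    return actual  # Already converged, so there is nothing to do.


def run_control_loop(desired: State, actual: State, max_passes: int = 100) -> State:
    """Reconcile repeatedly until the actual state stops changing."""
    for _ in range(max_passes):
        next_state = reconcile(desired, actual)
        if next_state == actual:
            break
        actual = next_state
    return actual
```

The loop never issues a one-shot "create three replicas" operation. It compares desired and actual state on every pass and corrects whatever difference it finds, which is also why it recovers from drift introduced outside the loop.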

Kubernetes also demonstrated the value of providing architectural building blocks that encapsulate standard patterns, such as a Deployment. These can then be composed and combined to provide impressive capabilities with little effort—HorizontalPodAutoscaler is an example of this. Through extensibility, Kubernetes allows developers to define new resource types and controllers, making it a natural fit for managing not just application workloads but infrastructure of any kind. This enables you to actually provide a clean API for common architectural needs that encapsulates a lot of routine business logic. Extending this model to IaC is something we call Controller-Driven IaC.

Building on the Kubernetes controller model

Controller-Driven IaC builds upon the Kubernetes foundation, leveraging its controllers to reconcile cloud resources and maintain continuous alignment between desired and actual states. By extending Kubernetes' principles of declarative configuration and control loops to IaC, this approach offers a resilient and scalable way to manage modern infrastructure. Integrating cloud and external system APIs into Kubernetes controllers enables continuous state reconciliation beyond Kubernetes itself, ensuring consistency, eliminating configuration drift, and reducing operational complexity. It results in an IaC solution that is capable of working correctly with modern, dynamic infrastructure. Additionally, it brings many of the other benefits of Kubernetes—such as RBAC, policy enforcement, and observability—to infrastructure and systems outside the cluster, creating a unified and flexible management framework. In essence, Kubernetes becomes the control plane for your entire developer platform. That means you can offer developers a self-service experience within defined bounds, and this can further be scoped to specific application domains.

This concept isn't entirely new. Kubernetes introduced Custom Resource Definitions (CRDs) in 2017, enabling the creation of Operators, or custom controllers, to extend its functionality. Today, countless Operators exist to manage diverse applications and infrastructure, both within and outside of Kubernetes, including those from major cloud providers. For instance, GCP's Config Connector, AWS's ACK, and Azure's ASO offer controllers for managing their respective platforms' infrastructure. However, just as operationalizing Kubernetes requires tooling and investment to build an effective platform, so too does implementing Controller-Driven IaC. Integrating these various controllers into a cohesive platform requires its own kind of orchestration. We need a way to program control loops, whether they are built-in Kubernetes controllers (like Deployments or Jobs), off-the-shelf controllers (like ACK or Config Connector), or custom controllers we've built ourselves.

Introducing Koreo: Programming control loops for modern platforms

There are tools such as Crossplane that take a controller-oriented approach to infrastructure, but they have their own challenges and limitations. In particular, we really need the ability to compose arbitrary Kubernetes resources and controllers, not just specific provider APIs. What if we could treat anything in Kubernetes as a referenceable object capable of acting as the input or output to an automated workflow, and without the need for building tons of CRDs or custom Operators? Additionally, it's critical that resources can be namespaced rather than cluster-scoped to support multi-tenant environments and that the corresponding infrastructure can live in cloud projects or accounts separate from where the control plane itself lives.

To address these needs and deliver the full potential of Controller-Driven IaC, we've developed and open-sourced Koreo, a platform engineering toolkit for Kubernetes. Koreo is a new approach to Kubernetes configuration management and resource orchestration empowering developers through programmable workflows and structured data. It enables seamless integration and automation around the Kubernetes Resource Model, supporting a wide range of use cases centered on Controller-Driven IaC. Koreo serves as a meta-controller programming language and runtime that allows you to compose control loops into powerful abstractions.

The Koreo UI showing a workflow for a custom AWS workload abstraction

Koreo is specifically built to empower platform engineering teams and DevOps engineers by allowing them to provide Architecture-as-Code building blocks to the teams they support. With Koreo, you can easily leverage existing Kubernetes Operators or create your own specialized Operators, then expose them through powerful, high-level abstractions aligned with your organization's needs. For example, you can develop a "StatelessCrudApp" that allows development teams to enable company-standard databases and caches with minimal effort. Similarly, you can build flexible automations that combine and orchestrate various Kubernetes primitives.

An instance of the custom AWS workload abstraction

Where Koreo really shines, however, is making it fast and safe to add new capabilities to your internal developer platform. Existing configuration management tools like Helm and Kustomize, while useful for simpler configurations, become unwieldy when dealing with the intricacies of modern Kubernetes deployments. They ultimately treat configuration as static data, and this becomes problematic as configuration evolves in complexity.

Koreo instead embraces configuration as code by providing a programming language and runtime with robust developer tooling. This allows platform engineers to define and manage Kubernetes configurations and resource orchestration in a way that is better suited to modern infrastructure challenges. It offers a solution that scales with complexity. A built-in testing framework makes it easy to quickly validate configuration and iterate on infrastructure, and IDE integration gives developers a familiar programming-like experience.

The future of infrastructure management is controller-driven

By harnessing the power of Kubernetes controllers for Infrastructure as Code, Koreo bridges the gap between declarative configuration and dynamic infrastructure management. It moves beyond the limitations of traditional IaC, offering a truly Kubernetes-native approach that brings the benefits of control loops, composability, and continuous reconciliation to your entire platform. With Koreo, you're not just managing resources; you're composing Kubernetes controllers to do powerful things like building internal developer platforms, managing multi-cloud infrastructure, or orchestrating application deployments and other complex workflows.

See what you can build with Koreo.
