OpenTofu currently uses its own domain-specific language, built in terms of HashiCorp's HCL, as the means for module authors to describe the infrastructure that should exist in their target environments.
At various points over the years people have asked whether OpenTofu or its predecessor could directly support some other similarly-designed language as an alternative way to write a module. There are lots of existing attempts at this which work as preprocessors that then generate the main OpenTofu language, but these questions (and this article) are about having another language be directly supported by OpenTofu, so that there isn't any additional indirection where an author is using one language to generate another.
(Note that this is not about supporting general-purpose languages such as TypeScript or Python. That's also a common question with some interesting details, but if that's what you're looking for then I'd recommend you consider using Pulumi instead of OpenTofu, since that product already addresses that problem well.)
Because of these common requests -- and also, to be honest, because I just find the question interesting in itself! -- I've researched a number of different potential alternative languages in the past, and unfortunately have always found some drawback that makes it not a good fit for OpenTofu.
One of the earliest alternative languages I considered was CUE, but it was very young when I first looked at it, and so the language was not yet complete enough for practical use and its library API (for using CUE as an embedded language in other Go programs) was aimed at too high a level of abstraction for OpenTofu's needs.
CUE has come quite a long way in the meantime though, so I thought it would be interesting to revisit it and see if the situation has changed. This is also a pretty good time to consider alternative languages for OpenTofu, since we're getting started on a redesign of OpenTofu's internals that should, among other benefits, hopefully better isolate OpenTofu's semantic layer from its surface language(s).
A CUE Primer
The CUE website has a number of guides on the concepts behind its language and so I won't spend a lot of words on that here, but there are some top-level ideas that the rest of this article relies on, so I'll attempt a short summary.
For my purposes here, the most important part is the letter "U" in "CUE", which stands for "unify". I think this is by far the most significant idea that any CUE author needs to understand.
"Unification" is a fancy word for the idea of taking two definitions of the
same item and producing a new value that somehow represents the meaning of both
inputs at once. The notation for unification is the & operator, which has
the values to be unified as its operands.
For example, the type constraint string can unify with the concrete
value "hello" to produce the concrete value "hello", because that value
is a string: string & "hello" produces "hello".
A more complicated example is unifying two "struct" values (CUE's closest
equivalent to OpenTofu's object types) by combining their fields together:
{ a: "foo" } and { b: "bar" } can unify to { a: "foo", b: "bar" }.
Conversely, a failure to unify is the primary way that CUE can be used for
validation-related use-cases: bool and "hello" cannot unify at all, and
so successfully unifying a concrete value with a value that describes schema
affirms that the concrete value conforms to the schema. CUE represents such
failures as a special value called "bottom", spelled _|_, and so the
error gets recorded as part of the data structure in the place where it occurred
rather than being returned through a side-channel as in many other languages.
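The three cases above (a constraint meeting a concrete value, two structs merging, and a conflict producing bottom) can be sketched as a toy model in Go. This is only an illustration of the idea, not how CUE is implemented and not its Go API: the Value type, the Bottom variable, and the Unify function are all hypothetical names invented for this sketch, and the model ignores closedness, defaults, disjunctions, and everything else that makes real CUE interesting.

```go
package main

import "fmt"

// Value is a toy stand-in for a CUE value. It is NOT the real cue.Value API.
type Value struct {
	Kind     string           // "string", "bool", "struct", or "_|_"
	Concrete any              // scalar kinds only; nil means "any value of Kind"
	Fields   map[string]Value // struct kind only
}

// Bottom marks a failed unification, loosely mirroring CUE's _|_.
var Bottom = Value{Kind: "_|_"}

// IsBottom reports whether v records a unification failure.
func (v Value) IsBottom() bool { return v.Kind == "_|_" }

// Unify combines two toy values. Scalars are assumed to hold only comparable
// Go values such as strings and bools.
func Unify(a, b Value) Value {
	if a.IsBottom() || b.IsBottom() || a.Kind != b.Kind {
		return Bottom
	}
	if a.Kind == "struct" {
		// Structs unify by combining fields; shared fields unify recursively.
		out := Value{Kind: "struct", Fields: map[string]Value{}}
		for name, v := range a.Fields {
			out.Fields[name] = v
		}
		for name, v := range b.Fields {
			if prev, ok := out.Fields[name]; ok {
				v = Unify(prev, v) // a conflict is recorded on this field
			}
			out.Fields[name] = v
		}
		return out
	}
	switch {
	case a.Concrete == nil: // a is only a type constraint
		return b
	case b.Concrete == nil: // b is only a type constraint
		return a
	case a.Concrete == b.Concrete: // same concrete value on both sides
		return a
	}
	return Bottom
}

func main() {
	str := Value{Kind: "string"}
	hello := Value{Kind: "string", Concrete: "hello"}
	fmt.Println(Unify(str, hello).Concrete)                   // hello
	fmt.Println(Unify(Value{Kind: "bool"}, hello).IsBottom()) // true

	a := Value{Kind: "struct", Fields: map[string]Value{"a": {Kind: "string", Concrete: "foo"}}}
	b := Value{Kind: "struct", Fields: map[string]Value{"b": {Kind: "string", Concrete: "bar"}}}
	merged := Unify(a, b)
	fmt.Println(len(merged.Fields), merged.Fields["a"].Concrete, merged.Fields["b"].Concrete) // 2 foo bar
}
```

Note that a conflict inside a struct is stored on the conflicting field rather than returned as a separate error, which is the property the paragraph above describes.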
OpenTofu actually has some similar theoretical foundations itself, though they are considered to be implementation details rather than part of the programming model, and are implemented directly as logic in Go rather than abstractly in a domain-specific language as in CUE.
For example, a subset of the schema for
an older, simpler version of aws_vpc in the hashicorp/aws provider
could be written in CUE syntax something like this:
{
    // Arguments that the configuration author is allowed to set
    cidr_block: string
    instance_tenancy: string | *null
    enable_dns_support: bool | *true
    enable_dns_hostnames: bool | *false
    enable_classiclink: bool | *false
    enable_classiclink_dns_support: bool | *false
    assign_generated_ipv6_cidr_block: bool | *false
    tags: { [string]: string } | *{}
    // Arguments whose values are decided by the provider
    arn: string
    id: string
    main_route_table_id: string
    ipv6_association_id: string | null
    ipv6_cidr_block: string | null
}
That (non-concrete) value can be unified with a concrete aws_vpc object
from OpenTofu state to determine whether it conforms to the schema, and also
to substitute default values (using the syntax like *true above) where
the given struct value does not include those fields.
For example, given this very minimal input...
{
    cidr_block: "192.168.0.0/16"
}
...unifying that with the schema above would produce:
{
    cidr_block: "192.168.0.0/16"
    instance_tenancy: null
    enable_dns_support: true
    enable_dns_hostnames: false
    enable_classiclink: false
    enable_classiclink_dns_support: false
    assign_generated_ipv6_cidr_block: false
    tags: {}
    arn: string
    id: string
    main_route_table_id: string
    ipv6_association_id: string | null
    ipv6_cidr_block: string | null
}
Notice that some of the fields still contain non-concrete constraints like
string instead of concrete values like "hello". This has a similar meaning
to the idea of "unknown values" in the OpenTofu language: it's a placeholder
for a value that we don't fully know yet, describing whatever we do know about
it. The unification operation I described above is quite similar to the work
OpenTofu does to produce the "proposed new value" to send to a provider for
planning purposes.
The other very important idea about CUE for the sake of this example is that, unlike in the OpenTofu language, the values described in the input are also the values you can refer to from other expressions. You can refer to anything you've defined elsewhere.
Here's a simple example demonstrating that:
greeting: "Hello"
name: "Martin"
message: "\(greeting), \(name)!"
message can refer directly to the greeting and name fields defined
at the same level. In the OpenTofu language the symbols you can refer to are
only indirectly related to the input configuration, mediated using rules decided
by OpenTofu itself, so something like the above would typically involve defining
some Local Values --
an OpenTofu-level concept rather than an HCL-level concept -- with references
between them.
What it means to use CUE for OpenTofu
There are lots of different ways to think about using a different source language for defining an OpenTofu module. I already mentioned above that preprocessors are not what I'm talking about here, but there's also another possibility I want to rule out before we start:
In principle one could try to map HCL's syntax-agnostic information model onto CUE's concepts, so that applications like OpenTofu would continue thinking in terms of HCL's concepts even though the source syntax is CUE, similar to how HCL already defines a mapping to JSON's concepts.
However, HCL's abstraction here was designed either for languages that have an evaluation model very similar to that of HCL's own native syntax (of which there are currently no interesting examples) or for languages that have no expression evaluation concept of their own at all and so HCL can impose its own, like with JSON where HCL says that JSON strings correspond to HCL string templates.
CUE is not a viable target for this abstraction because it has its own separate evaluation model that is not compatible with HCL's. The best we could do is effectively the same as preprocessing: evaluate the CUE program to obtain a concrete data structure, and then use that data structure as the input to HCL. Most notably, that would mean that CUE expressions could not refer to any dynamic data from HCL's scope, because CUE evaluation would happen before HCL's expression evaluation.
Therefore I'm only really interested in a model where OpenTofu interacts directly with CUE in a similar way to how it interacts with HCL today, where the CUE program is the definition of the module and OpenTofu expects the CUE program to produce a final value to send to the provider, rather than producing another level of expressions to evaluate through HCL.
Making CUE meet OpenTofu's expectations
Although the OpenTofu language and CUE have some similar theoretical foundations, the fact that CUE treats the input program as the data that's available in references, rather than having a separate symbol table as HCL does, means that OpenTofu would need to approach CUE evaluation a little differently than HCL evaluation.
Specifically, OpenTofu must introduce additional data into the CUE program by actually modifying that program to include the additional data.
For the sake of my experiment here, I decided to slightly extend CUE using its "attribute" syntax, which was intended for this very purpose of describing how data in CUE relates to data in some other language/format/system. For example, let's consider this relatively-simple but realistic description of some AWS EC2 network resources:
_base_cidr_block: string @input(base_cidr_block)
_subnets: {
    [string]: close({
        number: int
        tag_name: string
    })
} @input(subnets)
_tags: { [string]: string } @input(tags)
vpc: {
    cidr_block: _base_cidr_block
    tags: _tags
} @resource(aws_vpc.main)
subnets: {
    for n, s in _subnets {
        (n): {
            cidr_block: cidrsubnet(vpc.cidr_block, 4, s.number)
            vpc_id: vpc.id
            tags: {
                _tags
                Name: s.tag_name
            }
        }
    }
} @resource(aws_subnet.main[*])
vpc_id: vpc.id @output(vpc_id)
subnet_ids: {
    for k, s in subnets {
        (k): s.id
    }
} @output(subnet_ids)
Notice that some of the fields are annotated with attributes starting
with @input, @resource, and @output. I intend for these to have a
similar meaning to variable, resource/data/ephemeral, and output
blocks respectively in the current OpenTofu language, but here I'm leaning
into CUE's "data-first" design instead of HCL's "structure-first" design.
These extra attributes allow OpenTofu to identify parts of the data structure that it should interact with, while everything else is just arbitrary data that's private to the CUE program.
Instead of passing in a symbol table with symbols like var.subnets, OpenTofu
would handle input variables by behaving as if it was modifying this program
with additional "unify" operations on every field that's annotated with
an @input attribute, like this:
_base_cidr_block: string & "192.168.0.0/16" @input(base_cidr_block)
_subnets: {
    [string]: close({
        number: int
        tag_name: string
    })
} & {
    foo: {
        number: 1
        tag_name: "Foo"
    }
    bar: {
        number: 2
        tag_name: "Bar"
    }
} @input(subnets)
_tags: { [string]: string } & {
    Environment: "PROD"
} @input(tags)
After the unification operator has been applied, _base_cidr_block refers
to the concrete value "192.168.0.0/16", and so on. Therefore CUE can
propagate that value into the definition of vpc, producing the same
value as this program:
vpc: {
    cidr_block: "192.168.0.0/16"
    tags: {
        Environment: "PROD"
    }
} @resource(aws_vpc.main)
The other helpful modification OpenTofu can make is to find each field
annotated with a @resource and modify its expression to be unified with
a value derived from the provider schema. I'm going to use an even more
simplified subset of the real provider schema here just to keep these
examples relatively terse:
vpc: {
    cidr_block: _base_cidr_block
    tags: _tags
} & close({
    id: string
    cidr_block: string
    tags: { [string]: string } | *{}
}) @resource(aws_vpc.main)
subnets: {
    for n, s in _subnets {
        (n): {
            cidr_block: cidrsubnet(vpc.cidr_block, 4, s.number)
            vpc_id: vpc.id
            tags: {
                _tags
                Name: s.tag_name
            }
        }
    }
} & {
    [string]: close({
        id: string
        vpc_id: string
        cidr_block: string
        tags: { [string]: string } | null
    })
} @resource(aws_subnet.main[*])
One somewhat-arbitrary decision I made here is that using @resource with
an address that ends in [*] has a similar effect to for_each in the current
OpenTofu language, and so whatever it's annotating is expected to be a map
from instance keys to configuration objects rather than just a single
configuration object. Therefore the @resource(aws_vpc.main) field gets
annotated with just the aws_vpc schema directly, while
@resource(aws_subnet.main[*]) gets annotated with a slightly more elaborate
schema that represents the map-of-objects structure.
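As a rough illustration of that decision, here is a minimal Go sketch of how the argument of a @resource attribute could be split into a resource type, a name, and a flag saying whether the annotated field is a map of instances. The resourceAddr type and parseResourceAttr function are hypothetical names made up for this sketch, not part of OpenTofu or of my prototype, and the sketch assumes only the two address shapes used in this article.

```go
package main

import (
	"fmt"
	"strings"
)

// resourceAddr is a hypothetical parsed form of a @resource(...) argument.
type resourceAddr struct {
	Type  string // e.g. "aws_subnet"
	Name  string // e.g. "main"
	Multi bool   // true when the address ended in [*]
}

// parseResourceAttr parses strings such as "aws_vpc.main" or
// "aws_subnet.main[*]". Anything else is rejected.
func parseResourceAttr(body string) (resourceAddr, error) {
	trimmed := strings.TrimSuffix(body, "[*]")
	multi := trimmed != body
	typ, name, ok := strings.Cut(trimmed, ".")
	if !ok || typ == "" || name == "" || strings.ContainsAny(name, ".[]*") {
		return resourceAddr{}, fmt.Errorf("invalid resource address %q", body)
	}
	return resourceAddr{Type: typ, Name: name, Multi: multi}, nil
}

func main() {
	for _, body := range []string{"aws_vpc.main", "aws_subnet.main[*]"} {
		addr, err := parseResourceAttr(body)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// Multi decides whether to unify with the schema directly or with
		// a map-of-objects wrapper around it.
		fmt.Printf("%s %s multi=%t\n", addr.Type, addr.Name, addr.Multi)
	}
}
```

When Multi is true the caller would wrap the resource type schema in the [string]: pattern shown above; otherwise it would unify with the schema directly.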
(I'm writing out the modified CUE programs as source code here just for exposition purposes, but in my prototype implementation of this they really exist only in memory as a modified abstract syntax tree, so the end-user never needs to see these somewhat-ugly expressions.)
Asking the CUE runtime to evaluate this modified program, annotated with both input values and provider schemas, produces the following value:
{
    // (the underscore-prefixed names are removed in CUE's default
    // value presentation because by convention they are unexported
    // fields.)
    vpc: {
        id: string
        cidr_block: "192.168.0.0/16"
        tags: {
            Environment: "PROD"
        }
    }
    subnets: {
        bar: {
            id: string
            cidr_block: "192.168.32.0/20"
            vpc_id: string
            tags: {
                Environment: "PROD"
                Name: "Bar"
            }
        }
        foo: {
            id: string
            cidr_block: "192.168.16.0/20"
            vpc_id: string
            tags: {
                Environment: "PROD"
                Name: "Foo"
            }
        }
    }
    vpc_id: string
    subnet_ids: {
        bar: string
        foo: string
    }
}
As a final postprocessing step we can use the OpenTofu-specific attributes again to extract the sub-trees of this data structure that are relevant to OpenTofu's goals:
// (this notation is just some debug output from my prototype, showing
// the evaluated value associated with each declaration.)
@resource(aws_vpc.main) is {
    id: string
    cidr_block: "192.168.0.0/16"
    tags: {
        Environment: "PROD"
    }
}
@resource(aws_subnet.main) is {
    bar: {
        id: string
        cidr_block: "192.168.32.0/20"
        vpc_id: string
        tags: {
            Environment: "PROD"
            Name: "Bar"
        }
    }
    foo: {
        id: string
        cidr_block: "192.168.16.0/20"
        vpc_id: string
        tags: {
            Environment: "PROD"
            Name: "Foo"
        }
    }
}
@output(vpc_id) is string
@output(subnet_ids) is {
    bar: string
    foo: string
}
What we have here is essentially the same as what the current OpenTofu language produces during the validation phase: resource instance objects that conform to the resource type schemas, with unknown values as placeholders for values that won't be known until either the plan phase or apply phase, and similar placeholder values for outputs, derived from those resource values.
A real implementation could therefore send each of these three resource instance
objects (aws_vpc.main, aws_subnet.main["bar"], and aws_subnet.main["foo"])
to the provider's ValidateManagedResourceConfig function to make sure that
the value also respects any additional validation rules that can't be expressed
in CUE, similar to how providers normally catch validation problems that can't
be caught by HCL/OpenTofu alone.
So far so good! But next we need to deal with an additional problem: side-effects and the dependencies between them.
Gradual Evaluation with Dependencies
During the planning phase, the OpenTofu language runtime visits and evaluates each resource instance configuration in a dependency-respecting order, typically inferring dependencies automatically based on the references between resource instances.
For each resource instance, it calls the provider's PlanManagedResourceChange
operation to allow the provider to run some arbitrary logic to decide how to
merge the prior state with the desired state implied by the configuration,
producing the planned new value. When one resource instance refers to another,
it's the planned new value that actually gets populated into HCL's symbol table,
so that downstream resource instances can incorporate values that the provider
added to the upstream resource instance's object.
The apply phase is essentially the same except that it also uses
ApplyManagedResourceChange to cause changes to the real infrastructure, and
then that function's value propagates downstream to other resource instances
instead.
Because CUE does not have a separate symbol table from the source program, again we need a slightly different strategy for CUE. This step is the main place where things fell apart in my first experiment with incorporating CUE, but this is also an area where some things have changed in our favor in the meantime.
In particular, CUE actually has its own subsystem that gradually performs
side-effects in a dependency-respecting fashion and propagates data between
them, called flow.
Unfortunately, the current form of this relies on being embedded in the CUE
codebase, and so an external caller like OpenTofu cannot follow this strategy
today. Someone from the CUE team has indicated that
they intend to expose the underlying building-block eventually
though, so the remainder of this is a hypothetical design that I've not been
able to verify using a prototype, but the approach in the "flow" package is
similar enough that I'm optimistic that it should work.
Recalling how I previously incorporated input values and resource type schemas into the program, you might already have guessed what comes next: OpenTofu must continue gradually modifying the program with additional "unify" operations, one resource instance at a time until they've all been evaluated. Each modification adds information needed to evaluate downstream resource instances.
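The overall shape of that loop can be sketched in Go. This is a deliberately simplified illustration under several assumptions: the instance type and applyOrder function are hypothetical names, the dependency lists are written by hand (in the design described here they would come from CUE's dependency analysis, which is not currently public), and the provider call and the re-evaluation of the CUE program are reduced to a comment.

```go
package main

import (
	"errors"
	"fmt"
)

// instance is a hypothetical stand-in for one resource instance.
type instance struct {
	addr string
	deps []string // addresses whose final values this instance refers to
}

// applyOrder visits instances so that each one is handled only after
// everything it depends on, returning the order in which they were visited.
func applyOrder(insts []instance) ([]string, error) {
	done := make(map[string]bool, len(insts))
	order := make([]string, 0, len(insts))
	for len(order) < len(insts) {
		progressed := false
		for _, in := range insts {
			if done[in.addr] || !ready(in, done) {
				continue
			}
			// A real implementation would call the provider here, unify
			// the returned object into the program, and re-evaluate so
			// that downstream instances see the new concrete values.
			done[in.addr] = true
			order = append(order, in.addr)
			progressed = true
		}
		if !progressed {
			return nil, errors.New("dependency cycle or reference to an undeclared instance")
		}
	}
	return order, nil
}

// ready reports whether all of the instance's dependencies are complete.
func ready(in instance, done map[string]bool) bool {
	for _, d := range in.deps {
		if !done[d] {
			return false
		}
	}
	return true
}

func main() {
	order, err := applyOrder([]instance{
		{addr: `aws_subnet.main["bar"]`, deps: []string{"aws_vpc.main"}},
		{addr: `aws_subnet.main["foo"]`, deps: []string{"aws_vpc.main"}},
		{addr: "aws_vpc.main"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, addr := range order {
		fmt.Println(addr) // the VPC is printed first, then the two subnets
	}
}
```

Each pass over the list corresponds to one round of "modify the program, then re-evaluate", which is also why the cost of re-evaluation matters for this design.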
For example, after applying the changes for aws_vpc.main, OpenTofu could
modify the expression for the vpc field to include another unify operation,
this time with the final object returned by the provider:
vpc: {
    cidr_block: _base_cidr_block
    tags: _tags
} & close({
    id: string
    cidr_block: string
    tags: { [string]: string } | *{}
}) & {
    id: "vpc-a1b2c3d4"
    cidr_block: "192.168.0.0/16"
    tags: {
        Environment: "PROD"
    }
} @resource(aws_vpc.main)
If OpenTofu then asked CUE to evaluate this modified program, it would incorporate the VPC ID that was now returned by the provider and propagate it downstream, allowing the resources and output values to update to the following:
@resource(aws_vpc.main) is {
    id: "vpc-a1b2c3d4"
    cidr_block: "192.168.0.0/16"
    tags: {
        Environment: "PROD"
    }
}
@resource(aws_subnet.main) is {
    bar: {
        id: string
        cidr_block: "192.168.32.0/20"
        vpc_id: "vpc-a1b2c3d4"
        tags: {
            Environment: "PROD"
            Name: "Bar"
        }
    }
    foo: {
        id: string
        cidr_block: "192.168.16.0/20"
        vpc_id: "vpc-a1b2c3d4"
        tags: {
            Environment: "PROD"
            Name: "Foo"
        }
    }
}
@output(vpc_id) is "vpc-a1b2c3d4"
@output(subnet_ids) is {
    bar: string
    foo: string
}
After this, OpenTofu has enough information to create and apply the final
plan for both aws_subnet.main["bar"] and aws_subnet.main["foo"],
and can likewise modify their expressions with an additional unification
operation:
subnets: {
    for n, s in _subnets {
        (n): {
            cidr_block: cidrsubnet(vpc.cidr_block, 4, s.number)
            vpc_id: vpc.id
            tags: {
                _tags
                Name: s.tag_name
            }
        }
    }
} & {
    [string]: close({
        id: string
        vpc_id: string
        cidr_block: string
        tags: { [string]: string } | null
    })
} & {
    // Note that this time we have two separate objects to return,
    // because this is a multi-instance resource.
    bar: {
        id: "subnet-abc123"
        cidr_block: "192.168.32.0/20"
        vpc_id: "vpc-a1b2c3d4"
        tags: {
            Environment: "PROD"
            Name: "Bar"
        }
    }
    foo: {
        id: "subnet-def789"
        cidr_block: "192.168.16.0/20"
        vpc_id: "vpc-a1b2c3d4"
        tags: {
            Environment: "PROD"
            Name: "Foo"
        }
    }
} @resource(aws_subnet.main[*])
...and then that's all of the resource instances dealt with and one final evaluation should leave us with concrete state for each resource instance and so concrete output values derived from those:
@resource(aws_vpc.main) is {
    id: "vpc-a1b2c3d4"
    cidr_block: "192.168.0.0/16"
    tags: {
        Environment: "PROD"
    }
}
@resource(aws_subnet.main) is {
    bar: {
        id: "subnet-abc123"
        cidr_block: "192.168.32.0/20"
        vpc_id: "vpc-a1b2c3d4"
        tags: {
            Environment: "PROD"
            Name: "Bar"
        }
    }
    foo: {
        id: "subnet-def789"
        cidr_block: "192.168.16.0/20"
        vpc_id: "vpc-a1b2c3d4"
        tags: {
            Environment: "PROD"
            Name: "Foo"
        }
    }
}
@output(vpc_id) is "vpc-a1b2c3d4"
@output(subnet_ids) is {
    bar: "subnet-abc123"
    foo: "subnet-def789"
}
The execution engine can then return these output values to the caller, and its work is complete!
Comparisons with the current OpenTofu Language implementation
Aside from the fact that not all of the needed functionality is currently exposed to outside callers, it now seems like CUE has sufficient features to be used as part of an evaluation and execution model like OpenTofu's.
The main "trick" to this is that whereas OpenTofu's runtime maintains several separate data structures -- a table of input variable values, a table of provider schemas, the "state" containing the results for each resource instance -- for CUE we effectively store all of that data inside the CUE program itself, by modifying it in-place and repeatedly re-evaluating it.
With that comes a performance concern, though: HCL is intentionally designed to allow its calling application to break the input program into small parts that can be evaluated in isolation, whereas for CUE the programming model effectively requires re-evaluating the entire program over and over as new information is added to it. I don't have the practical experience to gauge how much this actually hurts, but incremental evaluation is already recorded as a concern for CUE's own "flow" engine, which follows a very similar implementation strategy to what I sketched for OpenTofu above.
On the other hand, re-evaluating the entire program every time (or, ideally in future, re-evaluating a subset that's affected by a change) gives authors a lot more flexibility in how they can structure things, vs. OpenTofu's highly prescriptive structure. In most of my sketching here I used a relatively flat structure that resembles a current typical OpenTofu module with all of the declaration attributes on top-level fields, but in principle those declaration attributes could appear at arbitrary points in the program, and be nested inside one another:
vpc: {
    cidr_block: _base_cidr_block
    tags: _tags
    // NOTE: Directly annotating the `id` field also being the definition
    // of the `vpc_id` output, instead of declaring a separate field for
    // the output somewhere else.
    id: string @output(vpc_id)
} @resource(aws_vpc.main)
When I tried this in practice I had some troubles with the attributes not being
propagated consistently during unification, and so in certain shapes of
configuration the @output(vpc_id) attribute got silently dropped when unifying
with the final value for the resource instance. I'm not sure if that's a bug
in the CUE evaluator or if I just don't understand well enough the rules for
attributes under unification.
What's missing?
So far I've mainly focused on the commonalities between the current OpenTofu language and this hypothetical CUE-based alternative, but there are still a few remaining details that CUE doesn't seem to have an answer to yet:
OpenTofu uses a concept called "marks" from the cty type system that HCL uses to represent various details about the provenance of a value, such as whether it was derived from a sensitive value or from an ephemeral value.
As far as I can tell, CUE doesn't have any mechanism quite like that. I don't know how the concepts of sensitivity and ephemerality would be implemented for modules written in CUE.
OpenTofu is also beginning to make use of another cty concept called "capsule types" in the new evaluator prototype, as a way to pass references to OpenTofu-specific objects like providers through expressions as a special kind of value.
As far as I can tell, CUE's type system is closed and so doesn't offer any similar way for the calling application to pass opaque references to its own objects through CUE evaluation. I can certainly understand why that would be omitted though, since it would need to be possible to define where each of these new types appears in the overall type lattice and define custom unification rules for them. The equivalent mechanisms in cty are quite complex and awkward to use.
OpenTofu has a set of built-in functions that modules can rely on. Many of these are general-purpose enough that CUE already has its own equivalents, such as JSON parsing/encoding, but OpenTofu also has a few that are a little more specific to OpenTofu's domain or that directly expose information from OpenTofu's language runtime, so we'd probably want some way to introduce custom functions written in Go.
There does not appear to currently be any way to do that. All of the Go-implemented functions available in CUE live inside the CUE codebase and are implemented in terms of unexported APIs, so OpenTofu cannot currently define its own functions.
(Observant readers might've noticed that my earlier examples used a cidrsubnet function that doesn't actually exist in CUE. I cheated with that: the real program I was using for prototyping uses "192.168.\(s.number*16).0/20" instead as a placeholder, just like folks used to do in Terraform before it had CIDR calculation functions!)
Finally, as noted earlier, the facilities for detecting dependencies between values are currently not exposed in the public API, though I expect that will change eventually. Dependency detection is crucial for OpenTofu's behavior because it needs to propagate the results of side-effects, so this is a show-stopper for now.
The "flow" engine also works at the ADT level rather than the AST level as I used in my prototype. The AST API serves as a good enough substitute for experimentation, but it would be a lot less clunky to work at the semantic layer, so hopefully more of that functionality will be exposed through the public cue.Value abstraction eventually.
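To make the cidrsubnet placeholder mentioned above a little more concrete, here is a minimal Go sketch of the arithmetic such a custom function would need to perform for the examples in this article. It is an assumption-laden illustration rather than OpenTofu's actual implementation: cidrSubnetV4 is a hypothetical name, it handles IPv4 only, and it uses nothing beyond Go's standard library.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
)

// cidrSubnetV4 extends the base prefix by newBits bits and writes netNum
// into those new bits, returning the resulting prefix in CIDR notation.
func cidrSubnetV4(base string, newBits, netNum int) (string, error) {
	p, err := netip.ParsePrefix(base)
	if err != nil {
		return "", err
	}
	p = p.Masked() // clear any host bits in the base address
	if !p.Addr().Is4() {
		return "", fmt.Errorf("only IPv4 prefixes are supported in this sketch")
	}
	newLen := p.Bits() + newBits
	if newBits < 0 || newLen > 32 {
		return "", fmt.Errorf("prefix length %d is out of range", newLen)
	}
	if netNum < 0 || netNum >= 1<<newBits {
		return "", fmt.Errorf("network number %d does not fit in %d bits", netNum, newBits)
	}
	raw := p.Addr().As4()
	n := binary.BigEndian.Uint32(raw[:])
	// Shift the network number so that it sits just after the base prefix.
	n |= uint32(netNum) << (32 - newLen)
	var out [4]byte
	binary.BigEndian.PutUint32(out[:], n)
	return netip.PrefixFrom(netip.AddrFrom4(out), newLen).String(), nil
}

func main() {
	for _, num := range []int{1, 2} {
		subnet, err := cidrSubnetV4("192.168.0.0/16", 4, num)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(subnet) // 192.168.16.0/20, then 192.168.32.0/20
	}
}
```

For a /16 base with four new bits, each network number advances the third octet by 16, which is exactly what the string-interpolation placeholder computes by hand.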
Overall though, I think the path to this hypothetical future is considerably clearer than when I first investigated this several years ago.
What's next?
I'm not intending to pursue this any further at least until the dependency
API and a more complete representation of the ADT are exposed in CUE's public
API, since the dependency detection is crucial and while AST-based processing
can work in simple cases it's unlikely to be robust in more complicated programs,
such as those where the predefined symbols like string are shadowed by local
declarations.
The proposed new internal architecture for OpenTofu will hopefully make it more plausible to support different source languages in future, and possibly even allow mixing them in the same program, but I expect we won't prioritize that for now since we've got plenty of work to do just to get HCL-based modules working to the same extent as they work in today's runtime.
With that said then: this article is mainly just some notes for my own future reference so I can hopefully pick up where I left off as the CUE team continues to expose more functionality in the public API. We'll see how it goes!