What is YAML?

We look at the pros and cons of the language developed in 2001

The growing popularity of Kubernetes means you've probably heard of YAML: it's the format for Kubernetes configuration files, so almost every developer is likely to need some familiarity with it.

But while it's not as ubiquitous as JSON, YAML goes far beyond Kubernetes; first released in 2001, it's used in tools from OpenStack to Ansible playbooks.

Originally YAML stood for Yet Another Markup Language; it was renamed YAML Ain't Markup Language to make it clear that, unlike SGML and HTML, which are languages for documents, it's designed for data.

It's a text-based format for declarative configuration information or specifications, and for data serialisation (where you convert complex, structured data into a flat file format you can store or transmit, but still be able to get back to the original structure).

Those are the same kinds of things you'd do with XML but unlike XML or JSON, it's designed to be a format that humans can read and write easily, which is why projects like Ansible picked it over other options. The YAML website is easy to read and it's also valid YAML code.

How does YAML work?

YAML borrows features and patterns from a host of other languages to simplify the process of reading and writing it.

You use indentation and new lines to structure a YAML file, so the way it's laid out on screen reflects how it's interpreted, as in Python. You can choose how many spaces to indent by, so pick whatever you find most readable, as long as you stay consistent. You cannot, however, use the tab character for indentation, which avoids the problem of different systems and editors handling tabs differently - and sidesteps the ongoing spaces versus tabs debate.
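
As a rough illustration of that freedom, the sketch below holds the same data twice, once indented by two spaces and once by four. The key names (service, name, ports) are made up for the example; both documents should load to the same structure.

```yaml
# Two spaces per level.
service:
  name: web
  ports:
    - 80
    - 443
---
# Four spaces per level: same data, same meaning.
service:
    name: web
    ports:
        - 80
        - 443
```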

You can also adopt a more compact format in which the two main data types, lists and associative arrays (also known as maps), are denoted by [] and {} respectively. This makes YAML effectively a superset of JSON, although JSON is designed primarily for machines, not humans, to read. YAML also has features that are absent from JSON, including comments, which JSON wasn't designed to support (although there are workarounds).
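
A minimal sketch of the two styles side by side, using invented keys (ports, labels): the first document uses the indented block style, the second the compact bracketed style, and the third is plain JSON, which a YAML 1.2 parser should also accept. The comments are something JSON itself has no syntax for.

```yaml
# Block style: one item per line.
ports:
  - 80
  - 443
labels:
  app: web
  tier: frontend
---
# Compact (flow) style: the same data in brackets and braces.
ports: [80, 443]
labels: {app: web, tier: frontend}
---
# Plain JSON, read here as YAML.
{"ports": [80, 443], "labels": {"app": "web", "tier": "frontend"}}
```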

These data types can be nested to represent more complicated structures, an approach based on the data structures in Perl. Other features are lifted from C, HTML and MIME, as well as from mail headers, with a colon used to separate each key: value pair.

The use of white space means you don't have to put quotation marks around strings and numbers. Simple types such as integers, floats and Booleans are detected by default, and there's built-in support for ISO-formatted dates and times, although you can also declare your own data types.
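
The sketch below shows how that detection typically plays out. All keys and values are invented for illustration, and the exact results depend on the parser and the YAML version it implements (date handling in particular varies), so treat the comments as what many parsers do rather than a guarantee.

```yaml
name: web-frontend      # plain string, no quotes needed
replicas: 3             # integer
ratio: 0.75             # float
enabled: true           # Boolean
released: 2020-07-09    # ISO-formatted date; many parsers load this as a date
version: "1.10"         # quoted, so it stays a string rather than a float
port: !!str 8080        # explicit tag: force a string
origin: !point {x: 0, y: 0}   # hypothetical custom tag; the application must define !point
```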

YAML also lets you store multiple documents in a single file, or refer to content in one part of a document from elsewhere using an anchor (which also lets you duplicate or inherit properties).
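
Here's a small sketch of both ideas, with made-up names (retry_policy, jobs and so on). A line of three dashes starts a new document, &retry_policy marks an anchor, and *retry_policy is an alias that refers back to it. Aliases only work within the document that defines the anchor.

```yaml
# First document: define a block once, reuse it twice.
defaults: &retry_policy
  attempts: 3
  delay_seconds: 5
jobs:
  backup:
    retry: *retry_policy
  report:
    retry: *retry_policy
---
# Second document in the same file.
name: second-document
```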

That makes it much more flexible than JSON, where the hierarchy is fixed and each child node has only one parent node. There's a similar option in XML, but the YAML parser expands the references automatically. That way you get a file that's easier to read and you avoid the errors that creep in when copying and pasting parameters where only a handful of things change between instances, while external systems don't need to be told about the structure of the YAML file.
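
One common way to handle instances that differ in only a few settings is the merge key (<<), sketched below with invented names and values. Note the assumption: the merge key comes from the YAML 1.1 type repository rather than the core YAML 1.2 specification, so while many parsers support it, not all do.

```yaml
base: &base
  image: example/app:1.0   # hypothetical image name
  replicas: 2
  log_level: info

staging:
  <<: *base                # pull in everything from base...
  log_level: debug         # ...then override one setting

production:
  <<: *base
  replicas: 6
```

A parser that supports the merge key hands the application a staging map with all three settings filled in, so the consuming system never sees the anchors.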

What are the benefits of YAML?

Because the formatting is straightforward and you don't have to worry about closing tags, brackets or quote marks, you can edit YAML in simple text editing tools, and subsections of YAML files are often valid YAML. But there are also plugins that add YAML support to common editors and IDEs like Visual Studio Code and Atom; these can use the YAML Language Server to provide autocomplete and IntelliSense, and there are several YAML linters to check code for correctness.

You can't write YAML that validates itself against a schema the way XML documents can, but if you need to define a schema for your YAML there are languages that let you do that. The combination of YAML and JSON Schema can be powerful: VS Code, the DocFX static website generator and even the schema for Microsoft's Q# Quantum Chemistry library use them together to achieve a more human-readable version of JSON.
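
Because a JSON Schema is itself just structured data, it can be written in YAML too. The sketch below is a hypothetical schema for a config file with a required name and an optional replicas count; the field names are invented, and it isn't taken from any of the projects mentioned above.

```yaml
# Hypothetical JSON Schema, expressed as YAML.
$schema: "https://json-schema.org/draft/2020-12/schema"
type: object
required: [name]
properties:
  name:
    type: string
  replicas:
    type: integer
    minimum: 1
additionalProperties: false
```

A validator that understands JSON Schema could then check a YAML config against this once both have been loaded into plain data structures.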

Using YAML files has advantages over typing in command line options: you can create much more complex structures in YAML and you don't have to deal with long and unwieldy strings of parameters. And because they're files, you can check them into source control systems and track versions and changes. Because YAML is line-oriented, with each item typically on its own line, it works better than JSON with git-based systems for tracking changes. That makes it easier to treat configuration as code that you manage, test and consume the same way you do all your other code.
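
To make that concrete, here is a minimal sketch of a Kubernetes Pod definition. The names (web-example, web) and the image tag are illustrative assumptions, not recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-example        # hypothetical name
  labels:
    app: web
spec:
  containers:
    - name: web
      image: nginx:1.25    # illustrative image tag
      ports:
        - containerPort: 80
```

Saved as a file (say pod.yaml, another assumed name), it can be committed to source control and applied with kubectl apply -f pod.yaml, and a later change to the image tag shows up as a one-line diff.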

Are there any downsides?

Like any language, YAML has its faults and its detractors. It was explicitly designed to be simple and straightforward to read and write, but because indentation is functional, it's easy to make a mistake and change what your YAML code does by adding or missing a space. Long YAML files quickly get complex and typos can be hard to find. If a typing error leaves your file syntactically valid but means it doesn't do what you want, a linter won't help; and because it's a declarative language, there's no concept of 'stepping through' code or setting breakpoints to debug it.
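
The sketch below, with invented keys, shows the kind of slip that's easy to miss. Both documents are valid YAML, so a syntax check passes either way, but losing two spaces moves debug out of the server map and up to the top level.

```yaml
# Intended: debug is a setting inside server.
server:
  port: 8080
  debug: true
---
# Two spaces missing: debug is now a top-level key alongside server.
server:
  port: 8080
debug: true
```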

But because it's much more readable than JSON or XML, you're more likely to be able to spot what's wrong by reading through a YAML file. And the problem of complexity is a sign of underlying changes in IT rather than flaws in YAML itself. As the kind of configuration you do in YAML becomes more central with the adoption of DevOps, the configurations you're specifying are going to become more complex and demand more expertise, whatever language you're writing them in. There are arguably better languages like TOML but they haven't been adopted widely, so YAML is what an increasing number of developers will be faced with.

Higher level tools are always going to be easier to work with than reading and writing YAML files by hand, and there's an ever-growing selection of those for Kubernetes. They range from the kubectl command line and tools like Helm, which streamline installing and managing Kubernetes apps, to managed cloud services like Azure Kubernetes Service and tools like Pulumi that use familiar programming languages like JavaScript or Python. But YAML is the configuration format for so many popular tools and projects that it's worth getting familiar with it and understanding what it's good at as well as its quirks.
