Day - 80 of DevOps

YAML vs JSON and Where is YAML Used

Welcome to Day 80 of the #100DaysOfDevOps Challenge! Today we will look at the differences between YAML and JSON and at where YAML is used.

YAML vs JSON

How is YAML different from JSON? Let’s try to figure it out.

Take a look at the Kubernetes configuration snippet below, written in JSON. Don’t worry about what it does; just observe how the file is written.

{
 "description": "APIService represents a server for a particular GroupVersion. Name must be \"version.group\".",
 "properties": {
   "apiVersion": {
     "description": "APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources",
     "type": [
       "string",
       "null"
     ]
   },
   "kind": {
     "description": "Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds",
     "type": [
       "string",
       "null"
     ],
     "enum": [
       "APIService"
     ]
   },
   "metadata": {
     "$ref": "https://kubernetesjsonschema.dev/master/_definitions.json#/definitions/io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta"
   },
   "spec": {
     "$ref": "https://kubernetesjsonschema.dev/master/_definitions.json#/definitions/io.k8s.kube-aggregator.pkg.apis.apiregistration.v1beta1.APIServiceSpec",
     "description": "Spec contains information for locating and communicating with a server"
   },
   "status": {
     "$ref": "https://kubernetesjsonschema.dev/master/_definitions.json#/definitions/io.k8s.kube-aggregator.pkg.apis.apiregistration.v1beta1.APIServiceStatus",
     "description": "Status contains derived information about an API server"
   }
 },
 "type": "object",
 "x-kubernetes-group-version-kind": [
   {
     "group": "apiregistration.k8s.io",
     "kind": "APIService",
     "version": "v1beta1"
   }
 ],
 "$schema": "http://json-schema.org/schema#"
}

Doesn’t it look like a pure JSON file? Let’s see if we can validate it in our YAML parser.

Interestingly, the YAML parser accepts the file and does not report it as invalid. Does this imply that JSON is also YAML?

YAML (as of version 1.2) is, in fact, designed as a superset of JSON. In practice, JSON files are valid YAML files, but not the other way around.

Can we mix JSON and YAML syntax in one file and still have valid YAML? Let’s put this hypothesis to the test by changing parts of the above snippet to look more like the YAML we are familiar with 😉

description: "APIService represents a server for a particular GroupVersion. Name must be \"version.group\"."
"properties": {
 "apiVersion": {
   "description": "APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources",
   "type": [
     "string",
     "null"
   ]
 },
 "kind": {
   "description": "Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds",
   "type": [
     "string",
     "null"
   ],
   "enum": [
     "APIService"
   ]
 },
 "metadata": {
   "$ref": "https://kubernetesjsonschema.dev/master/_definitions.json#/definitions/io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta"
 },
 "spec": {
   "$ref": "https://kubernetesjsonschema.dev/master/_definitions.json#/definitions/io.k8s.kube-aggregator.pkg.apis.apiregistration.v1beta1.APIServiceSpec",
   "description": "Spec contains information for locating and communicating with a server"
 },
 "status": {
   "$ref": "https://kubernetesjsonschema.dev/master/_definitions.json#/definitions/io.k8s.kube-aggregator.pkg.apis.apiregistration.v1beta1.APIServiceStatus",
   "description": "Status contains derived information about an API server"
 }
}
"type": "object"
"x-kubernetes-group-version-kind": [
 {
   "group": "apiregistration.k8s.io",
   "kind": "APIService",
   "version": "v1beta1"
 }
]
"$schema": "http://json-schema.org/schema#"

Notice that the root JSON wrapper {} is gone; there are just mappings at the root level, but most of the file is still JSON syntax. Validate the file once more in a YAML parser: it is still valid YAML. A JSON parser, however, reports it as invalid, because the file is no longer JSON, only YAML. This demonstrates that YAML is, in fact, a superset of JSON.
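The JSON side of this experiment can be reproduced with Python's standard-library `json` module. The sketch below uses two shortened, made-up strings rather than the full snippets above, and the helper name `is_valid_json` is only an illustration.

```python
import json

# Strict JSON: a single root object wrapped in braces.
json_text = '{"type": "object", "kind": "APIService"}'

# YAML-style variant: the same keys as root-level mappings, no outer braces.
yaml_style_text = '"type": "object"\n"kind": "APIService"\n'


def is_valid_json(text: str) -> bool:
    """Return True if the standard-library JSON parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(json_text))        # strict JSON is accepted
print(is_valid_json(yaml_style_text))  # the YAML-style variant is rejected
```

The second string fails because a JSON document must be a single value, while the brace-less version is a sequence of key/value lines that only a YAML parser understands.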

Where is YAML Used?

We learned a lot about YAML and saw that it works great as a configuration language. Let us see it in action with some of the most famous tools.

Ansible

Ansible playbooks automate repetitive tasks so that they do not have to be carried out by hand.

Playbooks are written in YAML and run the tasks defined in their plays.

Here is a simple Ansible playbook that installs Nginx, applies the specified template to replace the existing default Nginx landing page, and finally enables TCP access on port 80.

To learn more about Ansible playbooks, see our article: Working with Ansible Playbooks – Tips & Tricks with Examples.

---
- hosts: all
  become: yes
  vars:
    page_title: Spacelift
    page_description: Spacelift is a sophisticated CI/CD platform for Terraform, CloudFormation, Pulumi, and Kubernetes.
  tasks:
    - name: Install Nginx
      apt:
        name: nginx
        state: latest

    - name: Apply Page Template
      template:
        src: files/spacelift-intro.j2
        dest: /var/www/html/index.nginx-debian.html

    - name: Allow all access to tcp port 80
      ufw:
        rule: allow
        port: '80'
        proto: tcp
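To make the structure of the playbook easier to see, here is a rough sketch of the data it describes, written out by hand as Python lists and dictionaries rather than produced by a parser. The long `page_description` variable is left out for brevity, and the helper `describe_tasks` is a made-up name for this illustration, not part of Ansible.

```python
# Hand-written equivalent of the playbook's structure (abridged).
# A YAML 1.1 parser would typically read `become: yes` as a boolean
# and keep the quoted '80' as a string.
playbook = [
    {
        "hosts": "all",
        "become": True,
        "vars": {"page_title": "Spacelift"},
        "tasks": [
            {"name": "Install Nginx",
             "apt": {"name": "nginx", "state": "latest"}},
            {"name": "Apply Page Template",
             "template": {"src": "files/spacelift-intro.j2",
                          "dest": "/var/www/html/index.nginx-debian.html"}},
            {"name": "Allow all access to tcp port 80",
             "ufw": {"rule": "allow", "port": "80", "proto": "tcp"}},
        ],
    }
]


def describe_tasks(plays):
    """Return (task name, module name) pairs in the order they appear."""
    pairs = []
    for play in plays:
        for task in play["tasks"]:
            # Every key other than "name" is treated as the module here;
            # real tasks can carry extra keywords such as `when` or `tags`.
            module = next(key for key in task if key != "name")
            pairs.append((task["name"], module))
    return pairs


for name, module in describe_tasks(playbook):
    print(f"{module}: {name}")
```

A playbook is a list of plays, and each play holds a list of tasks, which is why the top level of the YAML file starts with a dash.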

Kubernetes

Kubernetes, also known as K8s, is an open-source system for automating the deployment, scaling, and management of containerized applications.

Kubernetes works on a declarative state model: you describe the desired state, and Kubernetes tries to move the current state toward it. Kubernetes objects are typically defined in YAML files, which are applied to the cluster to create resources like pods, services, and deployments.

Here is a YAML file that describes a deployment that runs Nginx.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
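The idea of moving from the current state to the desired state can be sketched as a tiny reconcile function. This is only a toy illustration of the concept behind `replicas: 2`, not how Kubernetes controllers are actually implemented; the function name `reconcile` and the pod names are made up.

```python
from typing import List


def reconcile(desired_replicas: int, running_pods: List[str]) -> List[str]:
    """Return the actions needed to move from the current to the desired state."""
    current = len(running_pods)
    if current < desired_replicas:
        # Too few pods: create the missing ones.
        return ["create pod"] * (desired_replicas - current)
    if current > desired_replicas:
        # Too many pods: remove the surplus from the end of the list.
        return [f"delete {name}" for name in running_pods[desired_replicas:]]
    return []  # already in the desired state, nothing to do


print(reconcile(2, []))                                 # nothing running yet
print(reconcile(2, ["nginx-a", "nginx-b"]))             # matches desired state
print(reconcile(2, ["nginx-a", "nginx-b", "nginx-c"]))  # one pod too many
```

The YAML file only states what should exist; working out which actions are needed is left to the system.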

Interesting Things About YAML

YAML works great as a configuration language, but it is important to be aware of certain challenges as well when using it.

The curious case of the Norway problem

Imagine listing the abbreviations of all the countries where it snows:

countries:
- GB # Great Britain
- IE # Ireland
- FR # France
- DE # Germany
- NO # Norway

All looks good, right? But when we read this YAML file in Python, NO is parsed as False instead of the string ‘NO’:

>>> from yaml import safe_load  # the PyYAML package is imported as "yaml"
>>> safe_load(the_configuration)
{'countries': ['GB', 'IE', 'FR', 'DE', False]}

So why does this happen?

Remember how the core schema interprets NULL | null the same way? In the same spirit, YAML 1.1 parsers such as PyYAML interpret unquoted words like FALSE | NO | OFF as the boolean false. So instead of parsing NO as a string, the parser reads it as a boolean. This can be easily solved by quoting NO.

countries:
- GB # Great Britain
- IE # Ireland
- FR # France
- DE # Germany
- 'NO' # Norway
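Why quoting helps can be shown with a small hand-rolled resolver. This is a simplified sketch of YAML 1.1-style boolean matching, not PyYAML's actual code; the names `BOOL_PATTERN`, `TRUE_WORDS`, and `resolve_scalar` are assumptions made for this illustration.

```python
import re

# Words that YAML 1.1-style resolvers treat as booleans (simplified).
BOOL_PATTERN = re.compile(
    r"^(?:yes|Yes|YES|no|No|NO|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
TRUE_WORDS = {"yes", "true", "on"}


def resolve_scalar(token: str):
    """Resolve one scalar token the way a YAML 1.1-style loader might."""
    # Quoted scalars are always strings, so the quotes are simply removed.
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return token[1:-1]
    # Plain (unquoted) scalars are checked against the boolean pattern.
    if BOOL_PATTERN.match(token):
        return token.lower() in TRUE_WORDS
    return token


plain = [resolve_scalar(t) for t in ["GB", "IE", "FR", "DE", "NO"]]
quoted = [resolve_scalar(t) for t in ["GB", "IE", "FR", "DE", "'NO'"]]
print(plain)   # the unquoted NO becomes a boolean
print(quoted)  # the quoted 'NO' stays a string
```

The quoted token never reaches the boolean check, which is why adding quotes is enough to keep Norway in the list.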

To avoid such surprises altogether, we can instead use StrictYAML, which parses every value as a string by default.