Edgecase 2022: Kubernetes at the Edge
Written by Tibo Beijen, Lead Developer / Agile Architect
Finally, after the pandemic abruptly changed every meetup and conference into a video conference, in-person events are possible again! I attended Edgecase 2022, an event organized by Fullstaq that focused on running Kubernetes at the edge. These are my main takeaways.
At Nu.nl, we don’t run Kubernetes at the edge, nor do we intend to. We use AWS. Nevertheless, underlying techniques might still be applicable. Furthermore, it never hurts to look at things from a different perspective, and it’s great to interact with peers that are equally enthusiastic about the technology they use.
Characteristics of running ‘at the edge’ include:
- Slow or flaky network connectivity with centralized or cloud-based systems
- Low compute/memory specs
- Need to avoid having to transfer large amounts of data generated at the edge
Rather than go over the individual talks, I’ll address some of the topics that stood out to me.
ArgoCD is a well-established name in the Kubernetes ecosystem. It’s a tool that enables managing applications via GitOps, and I have heard quite a few positive things about it.
Now GitOps is a bit of a buzzword that might mean different things to different people. So at the risk of cutting corners, my take is: “Adhering to best practices of Infra-as-Code, Config-as-Code, and automation, leveraging the power of Git for collaboration and context via pull requests, history, and commit messages”. Something like that.
Within GitOps, there’s the push-based variant: adding pipelines to Git that apply changes to the target environment. Another approach is pull-based GitOps: a component running inside the target environment that continuously monitors the Git repo and ensures the environment always reflects what’s in Git. This is what ArgoCD offers by running inside a cluster or multiple clusters.
Rephrased in the context of Kubernetes:
- Kubernetes components continuously keep the cluster’s state in line with the applied manifests.
- Similarly, ArgoCD continuously keeps the manifests applied in the cluster in line with what is in Git.
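To make the pull-based idea concrete, here is a minimal sketch of a reconcile step in Python. It is not how ArgoCD is implemented: the `plan` and `reconcile` functions, the dictionary representation of manifests, and the example resource names are all assumptions made for illustration. Deleting resources that are absent from Git is also a choice made here to keep the example simple.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: resource name -> spec.
Manifests = Dict[str, dict]


def plan(desired: Manifests, actual: Manifests) -> List[Tuple[str, str]]:
    """Return the (action, name) steps needed to make `actual` match `desired`."""
    steps: List[Tuple[str, str]] = []
    for name, spec in desired.items():
        if name not in actual:
            steps.append(("create", name))
        elif actual[name] != spec:
            steps.append(("update", name))
    for name in actual:
        if name not in desired:
            steps.append(("delete", name))
    return steps


def reconcile(desired: Manifests, actual: Manifests) -> Manifests:
    """Apply the plan to a copy of `actual` and return the resulting state."""
    state = dict(actual)
    for action, name in plan(desired, actual):
        if action == "delete":
            del state[name]
        else:  # create or update: take the spec from the desired state
            state[name] = desired[name]
    return state


# Desired state as it would be read from Git (made-up example data).
git_state = {"web": {"replicas": 3}, "worker": {"replicas": 1}}
# Actual state that has drifted from Git (made-up example data).
cluster_state = {"web": {"replicas": 2}, "legacy": {"replicas": 1}}

print(plan(git_state, cluster_state))
cluster_state = reconcile(git_state, cluster_state)
print(plan(git_state, cluster_state))  # nothing left to do once the states match
```

The point of the sketch is that the loop runs whether or not there was a new commit, so drift gets corrected even when nobody pushes anything.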
Some of the downsides of pipelines that ArgoCD remediates:
- Risk of actual state drifting from Git state. The “where is the truth” challenge. This depends on the scope covered by the pipeline, and the access engineers (or systems) have to the infrastructure. An aggravating factor: without a Git commit nothing gets applied, so nothing gets reconciled.
- Granting a CI/CD pipeline read/write access to the target environment is arguably harder than providing a system read-only access to a Git repository.
- Scalability. With ArgoCD, an increasing number of clusters or applications does not result in a similar increase in the number of deployment pipelines.
Now it’s good to acknowledge that pipelines won’t be redundant: there are still tests to do and artifacts to build. That stays the same. However, the deploys can change from issuing a batch of API calls to updating the desired state in Git.
Another observation is that it benefits Kubernetes deploys only. So if other platforms such as AWS Lambda get thrown in the mix, the advantage becomes less clear-cut.
Specific to Kubernetes at the edge, some exciting techniques were illustrated that showcase how ArgoCD is a perfect fit for such cases.
One can set up a primary cluster with good connectivity to the center of command (office, cloud). Then, in remote locations (spokes), a local ArgoCD installation can keep applications in sync using a local Git copy. This way, the remote ArgoCD instance always has a reliable Git repo available and can tolerate unreliable internet connectivity.
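The tolerance for unreliable connectivity can be sketched as follows. This is a toy model, not the setup shown at the event: `LocalMirror`, `refresh`, and the revision strings are hypothetical names, and a real spoke would use an actual Git mirror rather than a Python object.

```python
from typing import Callable, Optional


class LocalMirror:
    """Hypothetical stand-in for a Git mirror running at an edge site."""

    def __init__(self) -> None:
        self.revision: Optional[str] = None

    def refresh(self, fetch_upstream: Callable[[], str]) -> bool:
        """Try to pull the latest revision; keep the old one if the link is down."""
        try:
            self.revision = fetch_upstream()
            return True
        except ConnectionError:
            # The uplink is unavailable: the last known revision stays usable.
            return False


def flaky_fetch() -> str:
    """Simulates an upstream fetch while the connection is down."""
    raise ConnectionError("uplink unavailable")


mirror = LocalMirror()
mirror.refresh(lambda: "abc123")       # connection is up: mirror gets updated
refreshed = mirror.refresh(flaky_fetch)  # connection is down: mirror is kept as-is
print(refreshed, mirror.revision)
```

The local ArgoCD instance only ever talks to the mirror, so a failed refresh delays new changes but never leaves it without a repository to sync from.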
So, centralized Git commits get synced to remote locations, and ArgoCD takes it from there. Metrics and logs follow the reverse direction: Prometheus and Loki, running at the edge, store data in files that span a couple of hours. Once persisted, these files can be shipped to centralized locations where (using Thanos) dashboards can provide info about the state of all the edge locations, albeit with some delay.
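The "ship it once it is persisted" idea could look roughly like this. It is a simplified sketch under stated assumptions: the `Block` type, its fields, and `ready_to_ship` are made up for illustration and do not mirror the internals of Prometheus, Loki, or Thanos.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical metrics/log file covering a fixed time window."""
    name: str
    end_time: float          # epoch seconds at which the block's window closes
    uploaded: bool = False


def ready_to_ship(blocks: List[Block], now: float) -> List[Block]:
    """Select blocks whose window has closed and that were not uploaded yet."""
    return [b for b in blocks if b.end_time <= now and not b.uploaded]


two_hours = 2 * 60 * 60
blocks = [
    Block("block-0", end_time=1 * two_hours, uploaded=True),
    Block("block-1", end_time=2 * two_hours),
    Block("block-2", end_time=3 * two_hours),  # still being written at `now`
]

now = 2.5 * two_hours
pending = ready_to_ship(blocks, now)
print([b.name for b in pending])
```

Because only closed blocks are shipped, the central dashboards lag behind by up to one block window plus whatever time the connection is down.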
K3S, a lightweight Kubernetes distribution, got a lot of attention at Edgecase. It’s certified (meaning: fully compatible with its big brother Kubernetes) yet more straightforward, packing all of the moving parts of Kubernetes in a single binary.
As the docs put it:
Lightweight Kubernetes. Easy to install, half the memory, all in a binary of less than 100 MB. Great for:
* IoT
* CI
* Embedding K8s
* Situations where a PhD in K8s clusterology is infeasible
Some of these advantages are reduced when comparing K3S with Kubernetes managed by cloud vendors, such as EKS, AKS, and GKE. Regardless, being able to throw the yaml we love and loathe at a Raspberry Pi in pretty much the same way as at a big cluster in the cloud showcases the versatility of Kubernetes.
One of the cases presented illustrated K3S running on small servers right next to greenhouses. Data collected in the greenhouses is processed at the location, allowing a much smaller data set to be moved around. K3S provides the means for an easy-to-maintain cluster at the edge.
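As an illustration of why local processing shrinks the data set, consider reducing a batch of raw sensor readings to a small summary before sending anything upstream. The readings and the `summarize` function are made up for this example; the processing used in the greenhouse case was not described in this level of detail.

```python
from statistics import mean
from typing import Dict, List


def summarize(readings: List[float]) -> Dict[str, float]:
    """Reduce a batch of raw readings to a handful of aggregate values."""
    return {
        "count": float(len(readings)),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


# Hypothetical temperature readings collected at the edge (degrees Celsius).
temperatures = [21.0, 21.5, 22.0, 23.5]

summary = summarize(temperatures)
print(summary)
```

However many readings a batch contains, only four numbers leave the site, which matters when the uplink is slow or flaky.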
Leafcloud focuses on running compute in a more energy-efficient way. They do so by running compute in ‘Leaf sites’, where heat generated by the servers can be used to replace fossil fuels. Data is stored in a more traditional data center for security reasons, and fiber-optic lines connect that data to the Leaf sites.
Leafcloud is based on OpenStack, so it can be provisioned in much the same way as more traditional clouds, with tools like Terraform. It’s unlikely one would migrate all of the infrastructure using the 200+ services provided by AWS to run on Leafcloud. However, it could be relatively easy for specific cloud-agnostic workloads (if there is such a thing).
Regardless, it’s an interesting, innovative concept that addresses the important topic of sustainability.
But even if we’re not moving to Leafcloud, there are still plenty of things we can do not to waste energy, such as:
- Right-sizing, with serverless offerings like AWS Lambda being the prime example
- Turning off things we don’t use
- Using efficient programming languages
One of the takeaways is that the ‘pets vs. cattle’ paradigm applies to Kubernetes clusters just as it does to virtual machines: operating one or many clusters, short- or long-lived, should hardly make a difference. Another is the platform’s versatility: same API, totally different environments.
I really liked the scale of this event, which, in a lot of ways, reminded me of the September 2019 Kubernetes Community Days: small scale (~200 people), nice venue, single day, single track (no FOMO), not too expensive (in this case even free), and well organized.
Simply a lot to like. So thanks to everyone who put hard work into organizing it. Till the next one!
Originally published at https://www.tibobeijen.nl on June 1, 2022.