Kubernetes

Misc

  • An open-source container orchestration platform
    • Automates the deployment, scaling, and management of containerized applications. It works by simplifying the containerized workloads based on demand, across a cluster of machines to ensure increased availability, scalability, and efficient resource use.
  • Docs
  • Resources
  • K9s is a powerful CLI tool that makes managing your Kubernetes clusters easy
  • Datascience project components
    • Your code (python/R/julia/matlab)
    • A Dockerfile that packages up your code
    • A configuration file (deployment.yaml, job.yaml) (sometimes someone else will do this for you)
  • Process
    • Build your machine learning code.
    • Package it up into a docker container and push that image into a registry.
    • Using the configuration file, you tell the kubernetes cluster where to find that image and how to build your service out of it (“make two copies and give them lots of ram”).
      • Kubernetes has new instructions now, and makes sure the cluster state matches those instructions.
  • Data Scientist responsibilities
    • Make sure your container actually runs, test that extensively!
      • Write R tests that check if you can handle the expected inputs.
      • Check that you are logging errors when they occur.
      • Check how you handle unexpected inputs (a person with an age of 200, a car with no weight, etc)
      • Test the container, pass expected input to the container, pass unexpected input, test if the container fails and protests loudly when the required environmental variables are not found.
    • Arrange what your API should look like: when you use {plumber}: what endpoint will be called, what port should be reached for, and what will the data look like? Make sure you write some tests for that! When you use {shiny}: what port does it live on, what are the memory and CPU requirements for your application. In all cases: what secrets must be supplied to the container?
    • Arrange where logs should go and how they should look. My favorite R logging package is {logger} and it can do many many forms of logging. If something goes wrong you want the logs to tell you what happened and where you should investigate.
    • Use {renv} to install specific versions of r-packages, and to record that in a lockfile.
  • Write process script for each of these steps
    • Run all the tests
    • Build the docker image
    • Check the image
    • Push the image to the registry
    • Deploy the new version of your code to the production cluster
  • Cost Benefits
    • Autoscaling: Kubernetes consists of some open-source tools (like Horizontal Pod Autoscaler and Cluster Autoscale) that allow the users to dynamically adjust the number of containers to use which prevents the companies from overprovisioning, where they pay for resources that are not being used.
    • Improved developer efficiency: This also improves the developer efficiency by streamlining deployments, rollbacks, and scaling allowing them to focus on building apps rather than managing cloud infrastructure.
    • Horizontal scaling: It also gives the users leverage the horizontal scaling, allowing them to distribute workloads across multiple nodes, optimizing resource usage and reducing costs.

Terms

  • etcd - Consistent and highly-available key value store used as Kubernetes’ backing store for all cluster data. Docs
    • If your Kubernetes cluster uses etcd as its backing store, make sure you have a back up plan for those data.
  • Time-to-Live (TTL) - Kubernetes mechanism for shutting down pods and gc - “Kubernetes v1.23 [stable] TTL-after-finished controller provides a TTL (time to live) mechanism to limit the lifetime of resource objects that have finished execution. TTL controller only handles Jobs.” Docs