# GKE-ENKI-GitLab-agent

GKE-ENKI-GitLab-agent is a project to manage and configure GitLab Kubernetes Agent for the ENKI Google Cloud cluster and to configure all required cluster tools for production, monitoring, and backup.
## Contents
▪️ [Future roadmap](#future-roadmap)
▪️ [Installed cluster components (dependent order)](#installed-cluster-components-dependent-order)
▪️ [CI/CD for automated deployment and maintenance](#cicd-for-automated-deployment-and-maintenance)
▪️ [Some useful kubectl commands](#some-useful-kubectl-commands)
▪️ [Kubernetes cluster configuration](#kubernetes-cluster-configuration)
▪️ [GitLab Kubernetes agent installation](#gitlab-kubernetes-agent-installation)
▪️ [Tearing down and reinstalling the agent](#tearing-down-and-reinstalling-the-agent)
## Future roadmap
- Fully integrate the GitLab Kubernetes Agent for GitOps as an alternative to using GitLab Runner and Helm.
  The agent has to mature to handle sequenced YAML deploys, and the agent must operate with cluster-wide admin privileges to make this integration possible.
- Consider adding the following:
  - User billing and tracking (using Kubecost)
  - Runbooks for JupyterLab (notebook-based) GitOps using Rubix/Nurtch
  - Cloudwatch integration
  - Elastic Container Service
- Investigate the Google Cloud Run serverless platform.
  Port the Knative Geobarometer and MELTS web services to remove any dependence on the Kubernetes cluster.
## Installed cluster components (dependent order)
1. **GitLab Kubernetes Agent**
   Entity that attaches a GKE cluster to this repository (configuration notes below).
1. **GitLab Runner**
   GitLab Runner allows CI jobs to run on the cluster in privileged mode, which allows us to execute *kubectl* and *helm* commands to perform GitOps tasks using YAML files stored in this repository. Basically, the runner gives us the functionality of Google Cloud Shell or a desktop connection of gcloud/kubectl using GitLab CI.
1. **Kubernetes NGINX Ingress Controller**
   The ingress controller is utilized to expose endpoints of services to external ports. There are multiple ingress controllers operating on the cluster. This one is used to expose the Grafana and Kasten K10 endpoints. Another is built into JupyterHub to expose that endpoint.
1. **Cert Manager**
   Used by the ingress controller to acquire and attach TLS certificates to ingress external endpoints so that ports can support *https* and encrypted traffic.
1. **Prometheus** and **Grafana** (exposed at https://cluster.enki-portal.org/)
   The Kube Prometheus stack (with Grafana) monitors the cluster and exposes metrics at an external endpoint so that cluster performance can be assessed.
1. **Google Cloud Storage**
   Storage independent of the Kubernetes cluster that is utilized for backups of cluster resources. The backup service (Kasten K10) is capable of restoring and migrating the cluster using this independent storage.
1. **Kasten K10** (exposed at https://k10.enki-portal.org/k10/)
   Backup, restoration, and migration tool for Kubernetes.
1. **JupyterHub**
   Service that hosts the ENKI server. JupyterHub exposes single-user pods that host the ThermoEngine Docker container image with a JupyterLab user interface. It also allocates and maintains access to user-based persistent storage.
   1. Testing installation
      This installation is for testing options and configuring possible upgrades to the production server. For cost reasons, it is normally not running.
   1. Production installation
      This installation is the production server exposed at https://server.enki-portal.org/.
1. **Knative** web services
   Service to expose stateless, scalable web services. These services should probably be moved outside the cluster and exposed using the Google Cloud Run serverless platform. See *Future roadmap* above.
1. **MySQL** (exposed at http://mysql.enki-portal.org:3306/)
   Database server that currently holds the LEPR/TraceDs databases as well as some smaller databases (Stixrude, Berman, Inforex, etc.) that are used by cluster apps.
## CI/CD for automated deployment and maintenance
The *.gitlab-ci.yml* YAML file performs a number of functions:
- Deploys manifests using the GitLab Kubernetes Agent to perform GitOps tasks
- Runs *helm* and *kubectl* jobs on the cluster to perform GitOps tasks
- Functions as the downstream pipeline for related projects that generate content related to the cluster (see the GitLab project https://gitlab.com/ENKI-portal/jupyterhub_custom)
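The second bullet can be sketched as a `.gitlab-ci.yml` job that runs *kubectl* through the cluster-hosted runner. This is a hypothetical illustration only; the job name, stage, image, and tag below are assumptions, not taken from the actual pipeline:

```yaml
# Hypothetical sketch of a GitOps job executed by the privileged GitLab Runner.
apply-manifests:
  stage: deploy
  image: bitnami/kubectl:latest   # any image that bundles kubectl works
  tags:
    - cluster                     # assumed tag routing the job to the cluster runner
  script:
    # The runner pod's service account supplies the cluster credentials,
    # so kubectl needs no explicit kubeconfig here.
    - kubectl apply -f generated-manifests/
```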
## Some useful kubectl commands
- Commands for managing namespaces and their resources:
```
kubectl create ns gitlab-runner
kubectl delete all --all -n {namespace}
```
- Get GitLab usernames associated with persistent storage volumes:
```
kubectl --namespace jhub describe persistentvolumeclaims | grep "hub.jupyter.org/username"
```
- Restart the hub on the cluster using Google Cloud Shell in order to update ENKI-portal/jupyterhub_custom and amend the login page:
```
helm upgrade --cleanup-on-fail jhub jupyterhub/jupyterhub --version=1.1.3 --namespace jhub --reuse-values
```
## Kubernetes cluster configuration
The following Google Cloud setup instructions are from the **Zero to JupyterHub** document https://zero-to-jupyterhub.readthedocs.io/en/latest/kubernetes/google/step-zero-gcp.html, as found in October 2021.
1. Using Google Cloud Shell, install **kubectl** and **helm** using **gcloud** after enabling the Kubernetes Engine API.
1. Create a managed Kubernetes cluster with a default node pool:
```
gcloud container clusters create \
--machine-type n1-standard-2 \
--enable-autoscaling \
--max-nodes=6 \
--min-nodes=2 \
--zone <compute zone from the list linked below> \
--cluster-version latest \
<CLUSTERNAME>
```
- *\<CLUSTERNAME\>* is **enkiserver**
- *\<compute zone from the list linked below\>* is **us-west1-a**
1. Elevate the user Google Cloud account for administrative functions:
```
kubectl create clusterrolebinding cluster-admin-binding \
--clusterrole=cluster-admin \
--user=<GOOGLE-EMAIL-ACCOUNT>
```
- *\<GOOGLE-EMAIL-ACCOUNT\>* is the *email address* of the Google Cloud account owner
1. Create a node pool for users:
```
gcloud beta container node-pools create user-pool \
--machine-type n1-standard-2 \
--num-nodes 0 \
--enable-autoscaling \
--min-nodes 0 \
--max-nodes 6 \
--node-labels hub.jupyter.org/node-purpose=user \
--node-taints hub.jupyter.org_dedicated=user:NoSchedule \
--zone us-central1-b \
--cluster <CLUSTERNAME>
```
After you complete these steps, two node pools are up and running. The default node pool is used to run cluster-wide apps, while the tainted user node pool is used to launch nodes for single-user Jupyter pods. Six nodes in the user pool should be able to accommodate about 100 users doing small-scale ENKI-related modeling.
## GitLab Kubernetes Agent installation
The following instructions are from the GitLab document https://docs.gitlab.com/ee/user/clusters/agent/#set-up-the-kubernetes-agent-server, as found in October 2021.
1. Create a *config.yaml* file in the repository at *.gitlab/agents/primary-agent* with the contents:
```
gitops:
  manifest_projects:
    - id: "enki-portal/gke-enki-gitlab-agent"
      paths:
        - glob: 'generated-manifests/**/*.{yaml,yml,json}'
      inventory_policy: adopt_if_no_inventory
```
- The *id* is the repository name that contains the manifest files (this repository).
- The *glob* is altered from the default suggestion to look only at YAML files in the folder and subfolders of *generated-manifests*.
- The *inventory_policy* is changed from the default suggestion to allow the agent to inherit the management of applications that are already running on the cluster when their YAML manifests are added to the *generated-manifests* file hierarchy.
Multiple manifest projects can be defined; future plans will allow these to be *private* repositories.
Currently the agent repository must be public; future plans will allow the agent to be associated with a *group*.
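The glob's effect can be previewed locally. As a rough sketch (the directory and file names below are made up, and `find` only approximates the agent's doublestar matching), this selects any-depth `.yaml`/`.yml`/`.json` files under *generated-manifests* while skipping everything else:

```shell
# Build a throwaway tree mimicking the repository layout (hypothetical names)
mkdir -p /tmp/glob-demo/generated-manifests/app
touch /tmp/glob-demo/generated-manifests/app/deploy.yaml
touch /tmp/glob-demo/generated-manifests/app/notes.txt

# List the files the agent's glob would match: YAML/JSON only, at any depth
find /tmp/glob-demo/generated-manifests -type f \
  \( -name '*.yaml' -o -name '*.yml' -o -name '*.json' \)
```

Here only `deploy.yaml` is listed; `notes.txt` is ignored, just as the agent would ignore it.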
1. Create the agent in GitLab (*Infrastructure > Kubernetes clusters*) and generate a *secret token*. Assign this token to a pipeline environment variable (*Settings* > *CI/CD* > *Variables*) with the name *GITLAB_AGENT_TOKEN*. Make sure that the value is *protected* and *masked* in order to keep it hidden in pipeline logs.
1. In Google Cloud Shell, execute the following to create a namespace for the agent:
```
kubectl create ns gitlab-kubernetes-agent
```
Then install the agent, with the appropriate token value substituted for *$(GITLAB_AGENT_TOKEN)*:
```
docker run --pull=always --rm \
registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/cli:stable generate \
--agent-token=$(GITLAB_AGENT_TOKEN) \
--kas-address=wss://kas.gitlab.com \
--agent-version stable \
--namespace gitlab-kubernetes-agent | kubectl apply -f -
```
1. Upgrade the GitLab agent service account to have a cluster-admin role (so that it can create *secrets*, *pods*, *config maps*, etc. in arbitrary cluster *namespaces*) by executing first in Google Cloud Shell:
```
kubectl get rolebindings,clusterrolebindings --all-namespaces \
-o custom-columns='KIND:kind,NAMESPACE:metadata.namespace,NAME:metadata.name,SERVICE_ACCOUNTS:subjects[?(@.kind=="ServiceAccount")].name' | grep gitlab-agent
```
Note that this critical step is missing from the GitLab documentation. The command gives the output:
```
ClusterRoleBinding   <none>   cilium-alert-read               gitlab-agent
ClusterRoleBinding   <none>   gitlab-agent-gitops-read-all    gitlab-agent
ClusterRoleBinding   <none>   gitlab-agent-gitops-write-all   gitlab-agent
ClusterRoleBinding   <none>   gitlab-agent-read-binding       gitlab-agent
ClusterRoleBinding   <none>   gitlab-agent-write-binding      gitlab-agent
```
1. Apply the binding with the command:
```
kubectl create clusterrolebinding gitlab-agent-cluster-admin-binding --clusterrole=cluster-admin --serviceaccount=default:gitlab-agent
kubectl get clusterrolebinding | grep gitlab-agent
```
The command gives output such as the following:
```
gitlab-agent-cluster-admin-binding   ClusterRole/cluster-admin                  12s
gitlab-agent-gitops-read-all         ClusterRole/gitlab-agent-gitops-read-all   162d
gitlab-agent-gitops-write-all        ClusterRole/gitlab-agent-gitops-write-all  162d
gitlab-agent-read-binding            ClusterRole/gitlab-agent-read              162d
gitlab-agent-write-binding           ClusterRole/gitlab-agent-write             162d
```
The agent is now installed.
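With the agent installed, the GitOps loop can be exercised by committing a manifest under *generated-manifests/*. A minimal sketch (the file path and namespace name here are illustrative only, not part of the actual repository):

```yaml
# generated-manifests/demo/namespace.yaml (hypothetical path matching the configured glob)
apiVersion: v1
kind: Namespace
metadata:
  name: gitops-demo
```

On the next sync the agent should apply this manifest and record it in its inventory; removing the file later should cause the agent to prune the resource, subject to the configured *inventory_policy*.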
## Tearing down and reinstalling the agent
This process is tricky and not automated by GitLab. Occasionally, reinstalling the agent is necessary, as the agent does not tolerate errors in YAML manifests very well and can enter a condition in which it is unresponsive.
Follow this procedure in Google Cloud Shell:
1. Delete all resources associated with the agent in its namespace:
```
kubectl delete all --all -n gitlab-kubernetes-agent
```
1. Delete the namespace:
```
kubectl delete ns gitlab-kubernetes-agent
```
1. Delete the inventory file in the default namespace that the agent uses to track managed installations (this resource is not automatically removed with the agent's namespace resources):
   1. Go to the Google Cloud Platform console, and choose *Kubernetes Engine* > *Configuration* from the upper-left menu.
   1. In the *default* namespace, delete the *Config Map* named *inventory-nnn*, where *nnn* is a string of numbers and dashes.
   1. In the *default* namespace, delete the *secret* named *gitlab-agent-token-nnn*, where *nnn* is an arbitrary hexadecimal number.
1. Reinstall the agent following the above instructions, utilizing the same authorization token.