Kubernetes Integration Specs
March 22, 2024
Kubernetes Integration Specs and Architecture Overview #
The Kubernetes integration work can be broken down into three sections:
- Configuration required on the Kubernetes cluster.
- Deployment of a middleware server (Interlink) that will act as the interface between Kubernetes and NuNet.
- Changes we need to make to DMS in order to support the Kubernetes integration.
This blog post aims to give an overview of all of these items at a level where we can start to consider the technical changes we will need to make to the NuNet DMS, the Interlink server / Interlink virtual kubelet, and the NuNet Interlink plugin we will have to build.
Because there are three parts to the puzzle, it is important to recognise that there are dependencies across all three systems that we need to specify and agree on. This blog post does not seek to replace detailed specification documentation; it aims to provoke thought and discussion across the teams responsible for each of the components.
For the time being we are considering that the functionality required is for single pods to complete batch-processing type workloads, as opposed to hosting services that will be accessed from within the cluster. It is a non-trivial task to support and integrate with the networking requirements of any cluster, so we must limit the scope of jobs in this way.
Below is a diagram that was part of a presentation I made a few weeks ago and that serves as the basis for the system architecture described in this blog post.
General Kubernetes deployment requirements / specs #
Kubernetes-NuNet integration using Interlink schematics, developed by Sam around 2024-02-10 - 2024-03-23. This is a basis for the further specification and the sequence diagrams / types that are presented below. The boxes marked with red bold borders and bold black text are entities further depicted in the sequence diagrams, which participate in the Kubernetes - NuNet orchestration.
Kubernetes Cluster Configuration #
Before a job can be deployed via Kubernetes NuNet integration, certain configuration steps need to be completed on both the Kubernetes cluster and the Interlink gateway.
The cluster configuration will depend on the scale and type of jobs which are to be deployed on NuNet. It will also depend on the existing cluster configuration, so it will have to be specified on a case-by-case basis. I propose we build a simple testing environment that just covers the basics initially. The configuration steps can be expanded at a later date as we have actual use cases to support.
Network placement #
This diagram depicts the various servers that will be involved in the solution and the proposed placement of components on those servers.
sequenceDiagram
    participant KMN as Kubernetes Master Node
    participant KWN as Kubernetes Worker Node
    participant NIGW as Nunet Interlink Gateway
    participant NNW as Nunet Network
    Note over KMN: On Premise Standard Kubernetes (Scheduler, API Server)
    Note over KWN: On Premise Custom install (Virtual Kubelet)
    Note over NIGW: External server (Interlink Service, Nunet Interlink Plugin, Nunet DMS)
    Note over NNW: Nunet Network (DMS compute providers)
Service Components #
This diagram depicts the various services / protocols and their installation requirements.
sequenceDiagram
    participant KS as Kubernetes API on Master Node
    participant VK as Virtual Kubelet on Worker Node
    participant OS as OAuth Proxy on External Server
    participant IMS as Interlink Middleware Server on External Server
    participant NP as NuNet Interlink Plugin on External Server
    participant NDMSSP as NuNet DMS SP on External Server
    loop 0: Virtual Kubelet registers with the K8s API server
        KS-->>VK: The virtual kubelet needs to use a certificate issued by the cluster administrator to authenticate and join the cluster.
    end
    loop 1: Virtual Kubelet authenticates via OAuth proxy
        VK-->>OS: The virtual kubelet needs a certificate issued by the Interlink administrator / OAuth server
    end
    loop 2: Virtual Kubelet registers to the Interlink server API
        VK-->>IMS: Process to be defined
    end
    loop 3: Interlink accesses the local NuNet plugin via API
        IMS-->>NP: Process to be defined
    end
    loop 4: NuNet plugin accesses the local DMS API
        NP-->>NDMSSP: Process to be defined
    end
Kubernetes API #
The Kubernetes API is the core of Kubernetes; we will need to ensure the Interlink virtual kubelet is given the correct rights to be able to read and write to the API. We should consider it good practice that there will be a NuNet namespace configured on the cluster and that the virtual kubelet is given appropriate permissions within that namespace.
It is highly likely that we will want to specify custom resource definitions for some of the NuNet hardware types so that they can be explicitly used in the job specifications. The reasoning is that NuNet nodes will have a wide variety of hardware compared to a standard cluster, where people may be used to just specifying nvidia.com/GPU because all the GPUs on a node are the same, whereas the virtual kubelet that represents a node to the cluster will actually have a multitude of hardware behind it.
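To make this concrete, the sketch below shows how NuNet hardware types could be advertised as Kubernetes extended resources that job specs then request by name. The resource names follow the examples used later in this post and are not final; note that Kubernetes extended resources normally carry a vendor-domain prefix (e.g. something like nunet.io/…), which we will need to settle when we define them.

// Minimal sketch (Go): advertising NuNet hardware as extended resources.
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Hypothetical capacity a NuNet virtual node could advertise. The names
	// follow the examples in this post; the final naming convention may differ.
	capacity := v1.ResourceList{
		v1.ResourceCPU:                   resource.MustParse("17"),
		v1.ResourceMemory:                resource.MustParse("113855Mi"),
		"nunet.gpu.nvidia.v100.40gb":     resource.MustParse("1"),
		"nunet.gpu.nvidia.rtx.3090.24gb": resource.MustParse("2"),
	}

	// This list would be published as the virtual node's capacity/allocatable,
	// so the scheduler can match requests such as "nunet.gpu.nvidia.v100.40gb: 1".
	for name, qty := range capacity {
		fmt.Printf("%s = %s\n", name, qty.String())
	}
}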
Virtual kubelet (on a Kubernetes worker node) #
The Interlink virtual kubelet will need to be installed on a node in the cluster by a cluster administrator. Interlink provides a Helm chart to assist in this installation process.
External Server (Nunet Interlink Gateway) #
The following services are all installed on a single server instance referred to as the NuNet Interlink Gateway.
Oauth proxy #
The OAuth proxy is part of the example Interlink architecture. In testing we have just used GitHub to generate a key that grants access to the Interlink middleware. The proxy sits in front of Interlink in order to facilitate this and to abstract the authentication process from the Interlink software.
Interlink middleware server #
The Interlink middleware server is the endpoint that the virtual kubelet points to. The virtual kubelet will make requests to the Interlink server, which will in turn pass those requests to a custom NuNet “provider” or “plugin” that may also be referred to as a sidecar in the Interlink documentation.
NuNet Interlink plugin #
The NuNet Interlink plugin will receive requests from the Interlink middleware. It must either serve a response based on information it has cached or proxy a request to the local DMS that is running on the “Interlink gateway” (a server configured with the Interlink software, the NuNet plugin, a NuNet DMS and an OAuth proxy).
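To make the cache-or-proxy behaviour concrete, here is a minimal sketch of the plugin as a local HTTP service. The /status endpoint name, the job_id query parameter and the local DMS address are assumptions for illustration; the DMS status path is the one proposed later in this post.

// Minimal sketch (Go): serve from cache or proxy to the local SP DMS.
package main

import (
	"io"
	"log"
	"net/http"
	"sync"
)

const dmsBaseURL = "http://localhost:9999" // assumed local SP DMS API address

var (
	mu          sync.Mutex
	statusCache = map[string][]byte{} // NuNet job ID -> last known status payload
)

func statusHandler(w http.ResponseWriter, r *http.Request) {
	jobID := r.URL.Query().Get("job_id") // hypothetical parameter

	mu.Lock()
	cached, ok := statusCache[jobID]
	mu.Unlock()
	if ok {
		w.Write(cached) // serve the cached response
		return
	}

	// Otherwise proxy the request to the local DMS and cache the result.
	resp, err := http.Get(dmsBaseURL + "/api/v1/job/status/" + jobID)
	if err != nil {
		http.Error(w, "DMS unreachable", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)

	mu.Lock()
	statusCache[jobID] = body
	mu.Unlock()
	w.Write(body)
}

func main() {
	http.HandleFunc("/status", statusHandler)
	log.Fatal(http.ListenAndServe("127.0.0.1:8000", nil))
}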
Service provider DMS #
As stated in the section above, the Service Provider DMS is installed on the same machine as the rest of the Interlink software and is effectively the orchestrator for all the Kubernetes jobs that will be handled on NuNet.
Required functionality #
The current functionality we are looking to support is:
Dynamically report available resources to the cluster.
Create/start pods / jobs.
Delete pods / jobs.
Check the status of pods / jobs.
Check log files of pods / jobs.
Although dynamically reporting available resources is the first logical step, this is not currently implemented in Interlink and it is not the core of the functionality, so we will start with the creation of the job. The operations above are sketched as a plugin-side interface below.
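A minimal sketch of the functionality listed above as an interface the NuNet plugin will eventually have to expose. The names and signatures are assumptions for discussion, not an agreed API.

// Sketch (Go): plugin-side operations implied by the required functionality.
package plugin

import "context"

// JobStatus is a hypothetical summary of a NuNet job as reported back to Interlink.
type JobStatus struct {
	PodUID     string // Kubernetes pod UID the job maps to
	NuNetJobID string
	Phase      string // e.g. "Pending", "Running", "Succeeded", "Failed"
}

// Resources is a hypothetical aggregate of what the NuNet network can offer,
// keyed by an extended-resource style name.
type Resources map[string]int64

// Provider lists the operations required above; method names are illustrative only.
type Provider interface {
	// Resources dynamically reports what is currently available on NuNet.
	Resources(ctx context.Context) (Resources, error)
	// CreateJob turns a pod spec (raw JSON from Interlink) into a NuNet job.
	CreateJob(ctx context.Context, podJSON []byte) (nunetJobID string, err error)
	// DeleteJob cancels / removes the job backing a pod.
	DeleteJob(ctx context.Context, nunetJobID string) error
	// Status returns the current state of a job.
	Status(ctx context.Context, nunetJobID string) (JobStatus, error)
	// Logs fetches the job's log output.
	Logs(ctx context.Context, nunetJobID string) ([]byte, error)
}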
Create Job - Sequence diagram #
The following sequence diagram shows the steps required in order to create a job on NuNet that is then represented as a pod to the Kubernetes cluster.
sequenceDiagram
    participant KS as Kubernetes Scheduler
    participant VK as Virtual Kubelet
    participant OS as OAuth Proxy
    participant IMS as Interlink Middleware Server
    participant NP as NuNet Interlink Plugin
    participant NDMSSP as NuNet DMS SP
    participant NDMSCP as NuNet DMS CP
    loop 1: Kubernetes schedules a job on a virtual kubelet
        KS->>+VK: Decide placement & request resources
        VK->>+OS: Request authentication
    end
    loop 2: Virtual Kubelet forwards request to Interlink
        VK->>+IMS: Forward request w/ token
        IMS->+IMS: Interlink creates record for the job ID
    end
    loop 3: Interlink forwards request to NuNet Interlink plugin
        IMS->>+NP: Route/transform request
    end
    loop 4: NuNet Interlink plugin transforms data and creates a request to the local SP DMS
        NP->>+NDMSSP: Transformed job request
    end
    loop 5: SP DMS schedules job on remote DMS
        NDMSSP-->>+NDMSCP: Service Provider DMS requests job
        NDMSCP-->>+NDMSSP: Compute Provider DMS schedules job and returns job ID to service provider
    end
Loop 1: Kubernetes schedules a job on a virtual kubelet #
Input #
Kubernetes job description:
apiVersion: batch/v1
kind: Job
metadata:
  name: tensorflow-gpu-job
  namespace: interlink
spec:
  template:
    spec:
      containers:
        - name: tensorflow-container
          image: tensorflow/tensorflow:latest-gpu # Use the appropriate TensorFlow GPU image
          resources:
            limits:
              cpu: "8"
              memory: 128Gi
              nunet.gpu.nvidia.v100.40gb: 1 # Assuming this is a valid CRD
            requests:
              cpu: "8"
              memory: 128Gi
          command: ["python", "-u"]
          args: ["path/to/your/tensorflow/script.py"] # path to your TensorFlow script
      restartPolicy: Never
      nodeSelector:
        kubernetes.io/hostname: nunet # Ensure the pod is scheduled on the node called 'nunet'
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu-type # This label should be set on your 'nunet' node to match the GPU type
                    operator: In
                    values:
                      - nunet.gpu.nvidia.v100.40gb
  backoffLimit: 4
- The Kubernetes scheduler reads the job description and takes into account the NuNet node selector specified in the job. It creates the pod spec and assigns it to the virtual kubelet. The pod is given an ID by the cluster.
- The virtual kubelet watches the API for new jobs assigned to it.
- Place in the network:
- Kubernetes cluster: the Kubernetes API sends data to the virtual kubelet running on a Kubernetes worker node.
Output #
{
"kind": "Pod",
"apiVersion": "v1",
"metadata": {
"name": "tensorflow-gpu-job-xxxxx", // A unique generated name for the pod
"namespace": "interlink",
"ownerReferences": [
{
"apiVersion": "batch/v1",
"kind": "Job",
"name": "tensorflow-gpu-job",
"uid": "uid-of-the-job-object", // UID generated by Kubernetes for the Job
"controller": true
}
],
"labels": {
"job-name": "tensorflow-gpu-job"
}
},
"spec": {
"containers": [
{
"name": "tensorflow-container",
"image": "tensorflow/tensorflow:latest-gpu",
"command": ["python", "-u"],
"args": ["path/to/your/tensorflow/script.py"],
"resources": {
"limits": {
"cpu": "8",
"memory": "128Gi",
"nunet.gpu.nvidia.v100.40gb": "1"
},
"requests": {
"cpu": "8",
"memory": "128Gi"
}
}
}
],
"restartPolicy": "Never",
"nodeSelector": {
"kubernetes.io/hostname": "nunet"
},
"affinity": {
"nodeAffinity": {
"requiredDuringSchedulingIgnoredDuringExecution": {
"nodeSelectorTerms": [
{
"matchExpressions": [
{
"key": "gpu-type",
"operator": "In",
"values": [
"nunet.gpu.nvidia.v100.40gb"
]
}
]
}
]
}
}
}
}
}
- We need to consider access to custom container registries and the ability to pass secrets for access to the registry and/or other job data sources.
- There is no output at this time, as the virtual kubelet needs to pass the job request on to the Interlink server (see loop 2).
Loop 2: Virtual Kubelet forwards request to Interlink #
Input #
- Data:
- The virtual kubelet authenticates to the Interlink server, possibly via an OAuth server, to retrieve an access token.
- The virtual kubelet forwards the pod spec to the Interlink server in the following escaped JSON string format (note this is from the Interlink demo job and not the pod description above).
[{\"pod\":{\"metadata\":{\"name\":\"interlink-quickstart\",\"namespace\":\"default\",\"uid\":\"1df0a37e-a29b-426b-9b14-43f95dfc5a2f\",\"resourceVersion\":\"728753\",\"creationTimestamp\":\"2024-04-04T09:39:16Z\",\"annotations\":{\"kubectl.kubernetes.io/last-applied-configuration\":\"{\\\"apiVersion\\\":\\\"v1\\\",\\\"kind\\\":\\\"Pod\\\",\\\"metadata\\\":{\\\"annotations\\\":{},\\\"name\\\":\\\"interlink-quickstart\\\",\\\"namespace\\\":\\\"default\\\"},\\\"spec\\\":{\\\"automountServiceAccountToken\\\":false,\\\"containers\\\":[{\\\"args\\\":[\\\"-c\\\",\\\"sleep 600 \\\\u0026\\\\u0026 echo 'FINISHED!'\\\"],\\\"command\\\":[\\\"/bin/sh\\\"],\\\"image\\\":\\\"busybox\\\",\\\"imagePullPolicy\\\":\\\"Always\\\",\\\"name\\\":\\\"my-container\\\",\\\"resources\\\":{\\\"limits\\\":{\\\"cpu\\\":\\\"1\\\",\\\"memory\\\":\\\"1Gi\\\"},\\\"requests\\\":{\\\"cpu\\\":\\\"1\\\",\\\"memory\\\":\\\"1Gi\\\"}}}],\\\"nodeSelector\\\":{\\\"kubernetes.io/hostname\\\":\\\"nunet-node\\\"},\\\"tolerations\\\":[{\\\"key\\\":\\\"virtual-node.interlink/no-schedule\\\",\\\"operator\\\":\\\"Exists\\\"},{\\\"effect\\\":\\\"NoExecute\\\",\\\"key\\\":\\\"node.kubernetes.io/not-ready\\\",\\\"operator\\\":\\\"Exists\\\",\\\"tolerationSeconds\\\":300},{\\\"effect\\\":\\\"NoExecute\\\",\\\"key\\\":\\\"node.kubernetes.io/unreachable\\\",\\\"operator\\\":\\\"Exists\\\",\\\"tolerationSeconds\\\":300}]}}\\n\"},\"managedFields\":[{\"manager\":\"kubectl-client-side-apply\",\"operation\":\"Update\",\"apiVersion\":\"v1\",\"time\":\"2024-04-04T09:39:16Z\",\"fieldsType\":\"FieldsV1\",\"fieldsV1\":{\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:kubectl.kubernetes.io/last-applied-configuration\":{}}},\"f:spec\":{\"f:automountServiceAccountToken\":{},\"f:containers\":{\"k:{\\\"name\\\":\\\"my-container\\\"}\":{\".\":{},\"f:args\":{},\"f:command\":{},\"f:image\":{},\"f:imagePullPolicy\":{},\"f:name\":{},\"f:resources\":{\".\":{},\"f:limits\":{\".\":{},\"f:cpu\":{},\"f:memory\":{}},\"f:requests\":{\".\":{},\"f:cpu\":{},\"f:memory\":{}}},\"f:terminationMessagePath\":{},\"f:terminationMessagePolicy\":{}}},\"f:dnsPolicy\":{},\"f:enableServiceLinks\":{},\"f:nodeSelector\":{},\"f:restartPolicy\":{},\"f:schedulerName\":{},\"f:securityContext\":{},\"f:terminationGracePeriodSeconds\":{},\"f:tolerations\":{}}}}]},\"spec\":{\"containers\":[{\"name\":\"my-container\",\"image\":\"busybox\",\"command\":[\"/bin/sh\"],\"args\":[\"-c\",\"sleep 600 \\u0026\\u0026 echo 
'FINISHED!'\"],\"resources\":{\"limits\":{\"cpu\":\"1\",\"memory\":\"1Gi\"},\"requests\":{\"cpu\":\"1\",\"memory\":\"1Gi\"}},\"terminationMessagePath\":\"/dev/termination-log\",\"terminationMessagePolicy\":\"File\",\"imagePullPolicy\":\"Always\"}],\"restartPolicy\":\"Always\",\"terminationGracePeriodSeconds\":30,\"dnsPolicy\":\"ClusterFirst\",\"nodeSelector\":{\"kubernetes.io/hostname\":\"nunet-node\"},\"serviceAccountName\":\"default\",\"serviceAccount\":\"default\",\"automountServiceAccountToken\":false,\"nodeName\":\"nunet-node\",\"securityContext\":{},\"schedulerName\":\"default-scheduler\",\"tolerations\":[{\"key\":\"virtual-node.interlink/no-schedule\",\"operator\":\"Exists\"},{\"key\":\"node.kubernetes.io/not-ready\",\"operator\":\"Exists\",\"effect\":\"NoExecute\",\"tolerationSeconds\":300},{\"key\":\"node.kubernetes.io/unreachable\",\"operator\":\"Exists\",\"effect\":\"NoExecute\",\"tolerationSeconds\":300}],\"priority\":0,\"enableServiceLinks\":true,\"preemptionPolicy\":\"PreemptLowerPriority\"},\"status\":{\"phase\":\"Pending\",\"conditions\":[{\"type\":\"Initialized\",\"status\":\"True\",\"lastProbeTime\":null,\"lastTransitionTime\":null},{\"type\":\"Ready\",\"status\":\"True\",\"lastProbeTime\":null,\"lastTransitionTime\":null},{\"type\":\"PodScheduled\",\"status\":\"True\",\"lastProbeTime\":null,\"lastTransitionTime\":null}],\"hostIP\":\"10.244.0.18\",\"podIP\":\"10.244.0.18\",\"startTime\":\"2024-04-04T09:39:16Z\",\"containerStatuses\":[{\"name\":\"my-container\",\"state\":{\"running\":{\"startedAt\":\"2024-04-04T09:39:16Z\"}},\"lastState\":{},\"ready\":true,\"restartCount\":1,\"image\":\"busybox\",\"imageID\":\"\"}]}},\"container\":[{\"name\":\"\",\"configMaps\":null,\"secrets\":null,\"emptyDirs\":null}]}]
- Interlink server creates an entry in its database to store the pod data (including the pod ID)
- Place in the network: Nunet interlink gateway - interlink
Output #
- Interlink server sends a 200 response back to the virtual kubelet ?
Loop 3: Interlink forwards request to NuNet interlink plugin #
Input #
- Data: The pod spec received by Interlink is then passed to the NuNet Interlink plugin (running as a local REST API endpoint).
POST /create Request body:
[{'metadata': {'name': 'interlink-quickstart', 'namespace': 'default', 'uid': '1df0a37e-a29b-426b-9b14-43f95dfc5a2f', 'resourceVersion': '728753', 'creationTimestamp': '2024-04-04T09:39:16Z', 'annotations': {'kubectl.kubernetes.io/last-applied-configuration': '{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"interlink-quickstart","namespace":"default"},"spec":{"automountServiceAccountToken":false,"containers":[{"args":["-c","sleep 600 \\u0026\\u0026 echo \'FINISHED!\'"],"command":["/bin/sh"],"image":"busybox","imagePullPolicy":"Always","name":"my-container","resources":{"limits":{"cpu":"1","memory":"1Gi"},"requests":{"cpu":"1","memory":"1Gi"}}}],"nodeSelector":{"kubernetes.io/hostname":"nunet-node"},"tolerations":[{"key":"virtual-node.interlink/no-schedule","operator":"Exists"},{"effect":"NoExecute","key":"node.kubernetes.io/not-ready","operator":"Exists","tolerationSeconds":300},{"effect":"NoExecute","key":"node.kubernetes.io/unreachable","operator":"Exists","tolerationSeconds":300}]}}\n'}, 'managedFields': [{'manager': 'kubectl-client-side-apply', 'operation': 'Update', 'apiVersion': 'v1', 'time': '2024-04-04T09:39:16Z', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:metadata': {'f:annotations': {'.': {}, 'f:kubectl.kubernetes.io/last-applied-configuration': {}}}, 'f:spec': {'f:automountServiceAccountToken': {}, 'f:containers': {'k:{"name":"my-container"}': {'.': {}, 'f:args': {}, 'f:command': {}, 'f:image': {}, 'f:imagePullPolicy': {}, 'f:name': {}, 'f:resources': {'.': {}, 'f:limits': {'.': {}, 'f:cpu': {}, 'f:memory': {}}, 'f:requests': {'.': {}, 'f:cpu': {}, 'f:memory': {}}}, 'f:terminationMessagePath': {}, 'f:terminationMessagePolicy': {}}}, 'f:dnsPolicy': {}, 'f:enableServiceLinks': {}, 'f:nodeSelector': {}, 'f:restartPolicy': {}, 'f:schedulerName': {}, 'f:securityContext': {}, 'f:terminationGracePeriodSeconds': {}, 'f:tolerations': {}}}}]}, 'spec': {'containers': [{'name': 'my-container', 'image': 'busybox', 'command': ['/bin/sh'], 'args': ['-c', "sleep 600 && echo 'FINISHED!'"], 'resources': {'limits': {'cpu': '1', 'memory': '1Gi'}, 'requests': {'cpu': '1', 'memory': '1Gi'}}, 'terminationMessagePath': '/dev/termination-log', 'terminationMessagePolicy': 'File', 'imagePullPolicy': 'Always'}], 'restartPolicy': 'Always', 'terminationGracePeriodSeconds': 30, 'dnsPolicy': 'ClusterFirst', 'nodeSelector': {'kubernetes.io/hostname': 'nunet-node'}, 'serviceAccountName': 'default', 'serviceAccount': 'default', 'automountServiceAccountToken': False, 'nodeName': 'nunet-node', 'securityContext': {}, 'schedulerName': 'default-scheduler', 'tolerations': [{'key': 'virtual-node.interlink/no-schedule', 'operator': 'Exists'}, {'key': 'node.kubernetes.io/not-ready', 'operator': 'Exists', 'effect': 'NoExecute', 'tolerationSeconds': 300}, {'key': 'node.kubernetes.io/unreachable', 'operator': 'Exists', 'effect': 'NoExecute', 'tolerationSeconds': 300}], 'priority': 0, 'enableServiceLinks': True, 'preemptionPolicy': 'PreemptLowerPriority'}, 'status': {'phase': 'Pending', 'conditions': [{'type': 'Initialized', 'status': 'True', 'lastProbeTime': None, 'lastTransitionTime': None}, {'type': 'Ready', 'status': 'True', 'lastProbeTime': None, 'lastTransitionTime': None}, {'type': 'PodScheduled', 'status': 'True', 'lastProbeTime': None, 'lastTransitionTime': None}], 'hostIP': '10.244.0.18', 'podIP': '10.244.0.18', 'startTime': '2024-04-04T09:39:16Z', 'containerStatuses': [{'name': 'my-container', 'state': {'running': {'startedAt': '2024-04-04T09:39:16Z'}}, 'lastState': {}, 'ready': True, 
'restartCount': 1, 'image': 'busybox', 'imageID': ''}]}}]
- Place in the network: Interlink NuNet Gateway Server
Output #
- Data:
- The NuNet plugin sends a 200 response to interlink server
- Place in the network
- Interlink NuNet Gateway Server
Loop 4: NuNet interlink plugin transforms data and creates a request to the local SP DMS #
Input #
- Data:
- Pod data sent to the NuNet Interlink plugin is parsed to determine the required resources (a parsing sketch follows after this list).
- The peer list is parsed to determine a suitable target for the pod.
- Place in the network:
- Interlink NuNet Gateway Server
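A hedged sketch of that parsing step, assuming the /create body is a JSON array of standard Kubernetes pod objects as in the Loop 3 payload above, and that summing container resource requests is an acceptable first approximation for sizing the NuNet job.

// Sketch (Go): decode the /create payload and aggregate resource requests.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"

	v1 "k8s.io/api/core/v1"
)

func main() {
	// Read the /create request body (from a file here, for illustration only).
	raw, err := os.ReadFile("create-request.json")
	if err != nil {
		log.Fatal(err)
	}

	// The demo payload is a JSON array of pod objects.
	var pods []v1.Pod
	if err := json.Unmarshal(raw, &pods); err != nil {
		log.Fatal(err)
	}

	for _, pod := range pods {
		// Sum the resource requests of all containers to size the NuNet job.
		totals := v1.ResourceList{}
		for _, c := range pod.Spec.Containers {
			for name, qty := range c.Resources.Requests {
				sum := totals[name]
				sum.Add(qty)
				totals[name] = sum
			}
		}
		for name, qty := range totals {
			fmt.Printf("pod %s/%s requests %s = %s\n", pod.Namespace, pod.Name, name, qty.String())
		}
	}
}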
Output #
- Data:
- The NuNet Interlink plugin creates the payload for the job. I have added a few more sections for discussion:
{
  "job_id": 3456,
  "job": "example_docker_image", // virtual machines and wasm workloads are also possible
  "jobparams": "params go here",
  "jobenv": {
    "env1": "value1",
    "env2": "value2"
  },
  "secrets": {
    "secret1": "<encryptedblob?>"
  },
  "network": {
    "vpn??subnet": <uuid>,
    "vpn??maxlatency": 100,
    "ipaddr": "10.0.0.11",
    "dnsname": "job.subdomain.domain.com",
    "ingressproxyid": <peerid of proxy>,
    "egressproxyid":
    "fwrules": {
    }
  },
  "requiredCapability": {
    "executor": "docker", // or vm or wasm or other executors
    "type": {
      "description": "batch"
    },
    "resources": {
      "cpu": {
        "cores": 1,
        "frequency": 10000
      },
      "ram": "1024Gi",
      "disk": 0,
      "power": 0,
      "vcpu": 128,
      "gpu": {
        "brand": "nvidia",
        "model": "V100",
        "vram": "40Gi"
      }
    },
    "libraries": ["pytorch@1.0.0"],
    "locality": ["us-east-1", "us-east-2"],
    "storage": ["ramdisk", "localVolume", "ipfs", "aws-s3", "filecoin", "google-drive"],
    "network": {
      "open_ports": [80, 443, 8080],
      "vpn??": true
      // TODO: specify it more
    },
    "price": {
      "currency": "ntx",
      "max_per_hour": 500,
      "max_total": 10000,
      "preference": 1
    },
    "time": {
      "units": "seconds",
      "max_time": 6300,
      "preference": 2
    },
    "kyc": ["iamx"]
  }
}
- The NuNet Interlink plugin sends the payload to the local SP DMS via the NuNet API endpoint (a call sketch follows after this list); see the documentation here: https://app.gitbook.com/o/HmQiiAfFnBUd24KadDsO/s/nUIl2GGpV9Wq3xiFlif2/public-technical-documentation/device-management-service/proposed/orchestrator#id-1.-job-posting
/orchestrator/postJob
method: HTTP POST
- Local SP DMS returns a job ID to the nunet interlink plugin
- Interlink updates its database record for the pod to include the NuNet job ID (there is now a mapping between the Kubernetes pod ID and the NuNet job ID).
- Place in the network
- Interlink NuNet Gateway Server
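A minimal sketch of that call, assuming the local SP DMS listens on localhost and accepts the draft payload above at the documented /orchestrator/postJob path. The field names used here mirror a small part of the draft, and the response shape (a job ID being returned) is also an assumption for discussion.

// Sketch (Go): post the job payload to the local SP DMS.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// jobRequest mirrors a small part of the draft payload above; field names are not final.
type jobRequest struct {
	JobID     int               `json:"job_id"`
	Job       string            `json:"job"`
	JobParams string            `json:"jobparams"`
	JobEnv    map[string]string `json:"jobenv"`
}

func main() {
	payload := jobRequest{
		JobID:     3456,
		Job:       "example_docker_image",
		JobParams: "params go here",
		JobEnv:    map[string]string{"env1": "value1", "env2": "value2"},
	}
	body, _ := json.Marshal(payload)

	// Assumed local SP DMS address; the endpoint path is taken from the orchestrator docs.
	resp, err := http.Post("http://localhost:9999/orchestrator/postJob",
		"application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Assumed response shape: the SP DMS returns a NuNet job ID that the plugin
	// then stores against the Kubernetes pod ID.
	var out struct {
		NuNetJobID string `json:"job_id"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}
	fmt.Println("mapped pod -> nunet job:", out.NuNetJobID)
}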
Loop 5: SP DMS schedules job on remote DMS #
Input #
- Data:
- The Job is then scheduled on remote DMS
- Place in the network:
- NuNet compute Provider
Output #
- Data:
- The remote CP DMS returns the status of the job creation to the SP DMS.
- The SP DMS updates its record in its database with the latest status.
- Place in the network:
- Interlink NuNet Gateway Server
The job is scheduled successfully.
General sequence diagram (Get Job Status) #
sequenceDiagram
    participant KS as Kubernetes Scheduler
    participant VK as Virtual Kubelet
    participant OS as OAuth Proxy
    participant IMS as Interlink Middleware Server
    participant NP as NuNet Interlink Plugin
    participant NDMSSP as NuNet DMS SP
    participant NDMSCP as NuNet DMS CP
    loop 1: NuNet Service Provider requests job status
        NDMSSP->>+NDMSCP: SP requests status
        NDMSCP-->>-NDMSSP: CP returns status
    end
    loop 2: Kubernetes API requests status information for a pod
        KS->>+VK: K8s API polls the virtual kubelet for status information
        VK->>+IMS: Virtual kubelet passes request to Interlink
        IMS->>+NP: Interlink passes request to NuNet Interlink plugin
        NP->>+NDMSSP: NuNet Interlink plugin sends request to the service provider
        NDMSSP-->>-NP: SP returns response + 200 header
        NP-->>-IMS: NuNet plugin reformats response and forwards + 200
        IMS-->>-VK: Interlink forwards response + 200 header
        VK-->>-KS: Virtual kubelet forwards response + 200 header
    end
Loop 1: Nunet Service Provider requests job status #
Input #
This is standard DMS functionality. DMSs keep track of what other nodes are doing for them and store that data in their local database. NOTE: although the current DMS process is to keep a socket open between the two DMSs, it may be better to just allow a DMS to poll (ping) another DMS for status information as required, or to use a gossipsub topic to send and receive updates.
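If the gossipsub option were chosen, a rough sketch with go-libp2p could look like the following. The topic name and message format are assumptions for discussion, not an agreed design, and a real DMS would use its existing libp2p host rather than creating a new one.

// Sketch (Go): job status updates over a gossipsub topic.
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/libp2p/go-libp2p"
	pubsub "github.com/libp2p/go-libp2p-pubsub"
)

func main() {
	ctx := context.Background()

	// Start a libp2p host (the DMS already has one; this is just for the sketch).
	host, err := libp2p.New()
	if err != nil {
		log.Fatal(err)
	}

	ps, err := pubsub.NewGossipSub(ctx, host)
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical topic on which compute providers publish job status updates.
	topic, err := ps.Join("nunet-job-status")
	if err != nil {
		log.Fatal(err)
	}
	sub, err := topic.Subscribe()
	if err != nil {
		log.Fatal(err)
	}

	// A CP would publish; an SP would consume and update its local database.
	if err := topic.Publish(ctx, []byte(`{"jobid":"123","jobstatus":"running"}`)); err != nil {
		log.Fatal(err)
	}
	msg, err := sub.Next(ctx)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("status update from %s: %s\n", msg.ReceivedFrom, msg.Data)
}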
Loop 2: Kubernetes API requests status information for a pod #
Kubernetes API polls the virtual kubelet for status information #
We do not need to worry about the specifics of this, as it is handled by the Interlink virtual kubelet.
- Place in the network
- virtual kubelet (Kubernetes Worker node)
Virtual kubelet passes request to Interlink #
We do not need to worry about this as it is handled by the Interlink server.
- Place in the network: Interlink NuNet Gateway Server
Interlink passes request to Nunet Interlink plugin #
- Data:
- Place in the network:
- Interlink NuNet Gateway Server
Nunet Interlink plugin sends request to Service provider #
The NuNet Interlink plugin extracts the relevant data.
- Data:
- The NuNet Interlink plugin makes a request to the SP DMS for the job status:
/api/v1/job/status/{nunetjobid}
- Place in the network:
- Interlink NuNet Gateway Server
Output #
- Data:
- response payload
{
  jobid: DID?
  jobstatus: running
  starttime:
  duration:
  remainingtime:
}
- jobstatus is stored (somewhere); a sketch of the plugin-side status translation follows below.
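A small sketch of the translation the plugin would have to perform on this payload before answering Interlink. The DMS status strings and their mapping onto pod phases are assumptions for discussion.

// Sketch (Go): map a DMS job status onto a Kubernetes-style pod phase.
package main

import "fmt"

// dmsJobStatus mirrors the draft response payload above; field names are not final.
type dmsJobStatus struct {
	JobID         string `json:"jobid"`
	JobStatus     string `json:"jobstatus"`
	StartTime     string `json:"starttime"`
	Duration      string `json:"duration"`
	RemainingTime string `json:"remainingtime"`
}

// toPodPhase maps a NuNet job status onto the pod phases the virtual kubelet
// reports back to the cluster. Both sides of the mapping are illustrative only.
func toPodPhase(s dmsJobStatus) string {
	switch s.JobStatus {
	case "running":
		return "Running"
	case "finished", "completed": // hypothetical DMS terminal states
		return "Succeeded"
	case "failed":
		return "Failed"
	default:
		return "Pending"
	}
}

func main() {
	fmt.Println(toPodPhase(dmsJobStatus{JobID: "123", JobStatus: "running"}))
}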
Sequence diagram (Update Virtual Kubelet with Available Resources) #
As part of this solution it is fairly important for the resources available on NuNet to be evaluated and for suitable information to be passed back to the virtual kubelet, to allow the cluster to make scheduling decisions.
Interlink currently sets this information in a text file in a directory where the virtual kubelet is installed on the worker node.
I am proposing that we add functionality to Interlink to automatically update this resource information based on data returned from the plugins it uses. There are several things we need to consider here, as the logic for each plugin will differ based on the type of network / system the plugin is supporting.
In the case of NuNet, where we are likely to have many nodes of different specifications with different types of GPU available, we should try to group these nodes together into similar types and make these different groups available to different virtual kubelets.
We will need a method to filter the suitable node types that we want to advertise to the cluster. This can initially be done by using channels and onboarding specific hardware types to specific channels, but going forward we should be thinking about how a service provider is tasked with discovering only relevant peers. (We need to narrow down search parameters during the DMS discovery phase, e.g. geo location, hardware types, ping response times, etc.) Where should this functionality reside: the network package, the orchestrator, etc.?
sequenceDiagram
    participant NDMSCP as NuNet DMS CP
    participant NDMSSP as NuNet DMS SP
    participant NP as NuNet Interlink Plugin
    participant IMS as Interlink Middleware Server
    participant OS as OAuth Proxy
    participant VK as Virtual Kubelet
    participant KS as Kubernetes Scheduler
    loop 0: CP -> SP resource availability update
        NDMSSP->>+NDMSCP: Ping request
        NDMSCP-->>-NDMSSP: Return current resource availability
    end
    loop 1: NuNet plugin requests availability update from DMS
        NP->>+NDMSSP: Request resource availability of DHT (Handshake) peers
        NDMSSP-->>-NP: Return resource availability of DHT (Handshake) peers
    end
    loop 2: Interlink requests availability from NuNet plugin
        IMS->>+NP: Interlink requests resource availability of NuNet
        NP-->>-IMS: NuNet plugin returns latest resource availability
    end
    loop 3: Virtual Kubelet requests resource availability from Interlink and publishes to the K8s API
        VK->>+IMS: Interlink Virtual Kubelet requests latest resource availability
        IMS-->>-VK: IMS returns latest availability
        VK->>+KS: Interlink Virtual Kubelet publishes latest availability to the API
    end
Loop 0 - CP -> SP Resource Availability Update #
The DMS behind the virtual kubelet (service provider role) requests current resource availability from all the connected compute providers in its DHT (“Handshake”) peers list. It will then update its local database.
Request
New format libp2p ping ?
Response format
New format libp2p ping response
Loop 1 - Nunet plugin requests availability update from DMS #
The Nunet plugin periodically requests the latest availability from the local DMS.
Request:
/api/v1/network/peers/list/resources?
current (/api/v1/peers/dht/dump)
Response:
[
{
"peer_id": "QmdRxGHC4QZhdsXyUUMhpu1GgeBQoLpjEU663dgMrMAj9k",
"is_available": true,
"has_gpu": false,
"allow_cardano": false,
"gpu_info": null,
"tokenomics_addrs": "addr_test1qq0aq505npfft4t2ql7gd7w442dpmarnfthlzy73pczzccj9q695zk5p2eyx8gqz68s4d7s6q5fpa0953taavkmxhupq0gzhse",
"tokenomics_blockchain": "",
"available_resources": {
"id": 1,
"tot_cpu_hz": 4000,
"price_cpu": 0,
"ram": 4000,
"price_ram": 0,
"vcpu": 1,
"disk": 0,
"price_disk": 0,
"ntx_price": 0
},
"services": []
},
{
"peer_id": "QmQ5CXpo9QaJXn9L5yJatBphnitKPB5jsdDpHNiaNV4dZm",
"is_available": false,
"has_gpu": true,
"allow_cardano": false,
"gpu_info": [
{
"name": "Tesla P100-PCIE-16GB",
"tot_vram": 16384,
"free_vram": 15972
},
{
"name": "Tesla P100-PCIE-16GB",
"tot_vram": 16384,
"free_vram": 16237
},
{
"name": "Tesla P100-PCIE-16GB",
"tot_vram": 16384,
"free_vram": 16221
}
],
"tokenomics_addrs": "addr_test1qr2p9uxv7a9mv8vty4nzl93elecpwjculxrx37zhdmm7gcptr70ngj0235495mpdtq6jy8let52598fls6aslqplv3jq9qcdke",
"tokenomics_blockchain": "",
"available_resources": {
"id": 0,
"tot_cpu_hz": 0,
"price_cpu": 0,
"ram": 0,
"price_ram": 0,
"vcpu": 0,
"disk": 0,
"price_disk": 0,
"ntx_price": 0
},
"services": null
},
{
"peer_id": "Qmab2uhVmFQ2ZkmporYr9wztX8oCkJkWj55zrua78Le5dC",
"is_available": true,
"has_gpu": true,
"allow_cardano": false,
"gpu_info": [
{
"name": "NVIDIA GeForce RTX 3080",
"tot_vram": 10240,
"free_vram": 9777
},
{
"name": "NVIDIA GeForce RTX 3090",
"tot_vram": 24576,
"free_vram": 24048
},
{
"name": "NVIDIA GeForce RTX 3090",
"tot_vram": 24576,
"free_vram": 24245
},
{
"name": "NVIDIA GeForce RTX 3080",
"tot_vram": 10240,
"free_vram": 9995
},
{
"name": "NVIDIA GeForce RTX 3080",
"tot_vram": 10240,
"free_vram": 9995
},
{
"name": "NVIDIA GeForce RTX 3080",
"tot_vram": 10240,
"free_vram": 9995
}
],
"tokenomics_addrs": "0x87DA03a4C593FE69fe98440B6c3d37348c93A8FB",
"tokenomics_blockchain": "",
"available_resources": {
"id": 1,
"tot_cpu_hz": 114299,
"price_cpu": 0,
"ram": 113855,
"price_ram": 0,
"vcpu": 17,
"disk": 0,
"price_disk": 0,
"ntx_price": 0
},
"services": [
{
"ID": 2,
"CreatedAt": "2023-09-06T16:38:29.206826938-04:00",
"UpdatedAt": "2023-09-06T16:44:38.211393386-04:00",
"DeletedAt": null,
"TxHash": "19405c405240aeadace33e027573bbc02b1d0636801576d6eff9be54448585e2",
"TransactionType": "",
"JobStatus": "running",
"JobDuration": 5,
"EstimatedJobDuration": 10,
"ServiceName": "registry.gitlab.com/nunet/ml-on-gpu/ml-on-gpu-service/develop/tensorflow",
"ContainerID": "6b89f0ca28d6ff88f145977cfb9d7d0383210883caa32fb5a3b07da637a70ad5",
"ResourceRequirements": 2,
"ImageID": "registry.gitlab.com/nunet/ml-on-gpu/ml-on-gpu-service/develop/tensorflow",
"LogURL": "https://log.nunet.io/api/v1/logbin/8ff56588-c29e-4cee-9da4-702c9359a436/raw",
"LastLogFetch": "2023-09-06T20:44:38.211356996Z",
"ServiceProviderAddr": "",
"ComputeProviderAddr": "",
"MetadataHash": "",
"WithdrawHash": "",
"RefundHash": "",
"Distribute_50Hash": "",
"Distribute_75Hash": "",
"SignatureDatum": "",
"MessageHashDatum": "",
"Datum": "",
"SignatureAction": "",
"MessageHashAction": "",
"Action": ""
}
]
},
The NuNet plugin then consolidates this info (a consolidation sketch follows after this list):
- nunet.gpu.nvidia.tesla.p100.16gb x3
- nunet.gpu.nvidia.rtx.3080.10gb x5
- nunet.gpu.nvidia.rtx.3090.24gb x2
- nunet.cpu.cores.17
- nunet.cpu.mhz.114299
- nunet.ram.gb.113855
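A sketch of that consolidation step, assuming the peer-list response shown above. The generated resource identifiers follow the spirit of the list above, but the exact naming convention (and whether unavailable peers should be counted) is still to be agreed.

// Sketch (Go): consolidate the DMS peer list into extended-resource-style counts.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
	"strings"
)

// peer is a minimal subset of the peer-list response shown above.
type peer struct {
	IsAvailable bool `json:"is_available"`
	GPUInfo     []struct {
		Name    string `json:"name"`
		TotVRAM int    `json:"tot_vram"`
	} `json:"gpu_info"`
	AvailableResources struct {
		TotCPUHz int `json:"tot_cpu_hz"`
		RAM      int `json:"ram"`
		VCPU     int `json:"vcpu"`
	} `json:"available_resources"`
}

// resourceName turns a GPU description into an extended-resource style identifier,
// e.g. "Tesla P100-PCIE-16GB" -> "nunet.gpu.tesla.p100-pcie-16gb". Naming is not final.
func resourceName(gpuName string) string {
	return "nunet.gpu." + strings.ReplaceAll(strings.ToLower(gpuName), " ", ".")
}

func main() {
	raw, err := os.ReadFile("peers.json") // the peer-list response shown above
	if err != nil {
		log.Fatal(err)
	}
	var peers []peer
	if err := json.Unmarshal(raw, &peers); err != nil {
		log.Fatal(err)
	}

	gpus := map[string]int{}
	cores, mhz, ramMB := 0, 0, 0
	for _, p := range peers {
		if !p.IsAvailable {
			continue // skip unavailable peers; whether to count them at all is an open question
		}
		for _, g := range p.GPUInfo {
			gpus[resourceName(g.Name)]++
		}
		cores += p.AvailableResources.VCPU
		mhz += p.AvailableResources.TotCPUHz
		ramMB += p.AvailableResources.RAM
	}

	for name, count := range gpus {
		fmt.Printf("%s x%d\n", name, count)
	}
	fmt.Printf("nunet.cpu.cores %d, nunet.cpu.mhz %d, nunet.ram.mb %d\n", cores, mhz, ramMB)
}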
Careful consideration needs to be taken in order to ensure that what is reported makes sense to the cluster / scheduler. We may have to specify custom resource definitions for the GPUs, and maybe for node types, so that RAM and disk IO are taken into account.
Loop 2 - Interlink requests availability from nunet plugin #
Request:
/resources
Response:
{
"nunet.gpu.nvidia.tesla.p100.16gb": 3,
"nunet.gpu.nvidia.rtx.3080.10gb": 5,
"nunet.gpu.nvidia.rtx.3090.24gb": 2,
"nunet.cpu.cores": 17,
"nunet.cpu.mhz": 114299,
"nunet.ram.gb": 113855
}
Loop 3 - Virtual Kubelet requests resource availability from Interlink and publishes to the K8s API #
Request:
????
Response from Interlink
????
Put to K8’s API
PUT /api/v1/nodes/{nunet-node-1}/status
{
"kind": "Node",
"apiVersion": "v1",
"metadata": {
"name": "nunet-node-1",
"labels": {
"type": "virtual-kubelet"
}
},
"status": {
"capacity": {
"cpu": "17",
"memory": "113855Mi",
"nunet.gpu.nvidia.tesla.p100.16gb": "3",
"nunet.gpu.nvidia.rtx.3080.10gb": "5",
"nunet.gpu.nvidia.rtx.3090.24gb": "2",
"nunet.cpu.mhz": "114299"
},
"allocatable": {
"cpu": "17",
"memory": "113855Mi",
"nunet.gpu.nvidia.tesla.p100.16gb": "3",
"nunet.gpu.nvidia.rtx.3080.10gb": "5",
"nunet.gpu.nvidia.rtx.3090.24gb": "2",
"nunet.cpu.mhz": "114299"
},
"conditions": [
{
"type": "Ready",
"status": "True",
"reason": "KubeletReady",
"message": "kubelet is posting ready status"
}
],
"addresses": [
{
"type": "InternalIP",
"address": "192.168.100.1"
}
],
"daemonEndpoints": {
"kubeletEndpoint": {
"Port": 10250
}
},
"nodeInfo": {
"architecture": "amd64",
"containerRuntimeVersion": "docker://19.3",
"kubeletVersion": "v1.20.0",
"operatingSystem": "linux"
}
}
}
Maintainer: Sam (please tag all edit merge requests accordingly)