Kubernetes Integration Specs

March 22, 2024

Kubernetes Integration Specs and Architecture Overview #

The Kubernetes integration work can be broken down into three sections:

  • Configuration required on the Kubernetes cluster.
  • Deployment of a middleware server (Interlink) that will act as the interface between Kubernetes and NuNet.
  • Changes we need to make to DMS in order to support the Kubernetes integration.

This blog post aims to give an overview of all of these items at a level where we can start to consider the technical changes we will need to make to NuNet DMS, the Interlink server / Interlink virtual kubelet, and the NuNet Interlink plugin we will have to build.

Because there are three parts to the puzzle, it is important to recognise that there are dependencies across all three systems that we need to specify and agree on. This blog post does not seek to replace detailed specification documentation; it aims to provoke thought and discussion across the teams that are responsible for each of the components.

For the time being we are considering that the functionality required is for single pods to complete batch-processing workloads, as opposed to hosting services that will be accessed from within the cluster. It is a non-trivial task to support/integrate with the networking requirements of any cluster, so we must limit the scope of jobs in this way.

Below is a diagram that was part of a presentation I made a few weeks ago and that serves as the basis for the system architecture discussed in this blog post.

General Kubernetes deployment requirements / specs #

Kubernetes-NuNet integration using Interlink schematics, developed by Sam around 2024-02-10 - 2024-03-23. This is a basis for the further specification and sequence diagrams / types presented below. The boxes marked with red bold borders and bold black text are entities further depicted in the sequence diagrams, which participate in the Kubernetes - NuNet orchestration.

Kubernetes Cluster Configuration #

Before a job can be deployed via Kubernetes NuNet integration, certain configuration steps need to be completed on both the Kubernetes cluster and the Interlink gateway.

The cluster configuration will depend on the scale and type of jobs which are to be deployed on NuNet. It will also depend on the existing cluster configuration, so it will have to be specified on a case-by-case basis. I propose we build a simple testing environment that just covers the basics initially. The configuration steps can be expanded at a later date as we have actual use cases to support.

Network placement #

This diagram depicts the various servers that will be involved in the solution and the proposed placement of components on those servers.

sequenceDiagram
    participant KMN as Kubernetes Master Node
    participant KWN as Kubernetes Worker Node
    participant NIGW as Nunet Interlink Gateway
    participant NNW as Nunet Network

    Note over KMN: On Premise Standard Kubernetes (Scheduler, API Server)
    Note over KWN: On Premise Custom install (Virtual Kubelet)
    Note over NIGW: External server (Interlink Service, Nunet interlink Plugin, Nunet DMS)
    Note over NNW: Nunet Network (DMS compute providers)
  

Service Components #

This diagram depicts the various services / protocols and their installation requirements.

sequenceDiagram
    participant KS as Kubernetes API on Master Node
    participant VK as Virtual Kubelet on Worker node
    participant OS as OAuth Proxy on External Server
    participant IMS as Interlink Middleware Server on External Server
    participant NP as Nunet Interlink Plugin on External Server
    participant NDMSSP as Nunet DMS SP on External Server

    loop 0: Virtual Kubelet registers with the K8s API server
       KS-->>VK: The virtual kubelet needs to use a certificate issued by the cluster administrator to authenticate and join the cluster.
    end
    loop 1: Virtual Kubelet authenticates via the OAuth proxy
       VK-->>OS: The virtual kubelet needs a certificate issued by the Interlink administrator / OAuth server
    end
    loop 2: Virtual Kubelet registers with the Interlink server API
       VK-->>IMS: The process to be defined
    end
    loop 3: Interlink accesses the local NuNet plugin via API
       IMS-->>NP: The process to be defined
    end
    loop 4: NuNet plugin accesses the local DMS API
       NP-->>NDMSSP: The process to be defined
    end
    end
  

Kubernetes API #

The Kubernetes API is the core of Kubernetes, and we will need to ensure the Interlink virtual kubelet is given the correct rights to be able to read from and write to the API. We should consider it good practice that there will be a NuNet namespace configured on the cluster and that the virtual kubelet is given appropriate permissions within that namespace.

It is highly likely that we will want to specify custom resource definitions for some of the NuNet hardware types so that they can be explicitly used in the job specifications. The reasoning is that NuNet nodes will have a wide variety of hardware compared to a standard cluster, where people may be used to just specifying nvidia.com/GPU because all the GPUs on a node are the same, whereas the virtual kubelet that represents a node to the cluster will actually represent a multitude of hardware.

Virtual kubelet (on a Kubernetes worker node) #

The Interlink Virtual Kubelet will need to be installed on a node in the cluster by a cluster administrator. Interlink provides a Helm chart to assist in this installation process.

The following services are all installed on a single server instance referred to as the NuNet Interlink Gateway.

Oauth proxy #

The OAuth proxy is part of the example Interlink architecture. In testing we have just used GitHub to generate a key that grants access to the Interlink middleware. The proxy sits in front of Interlink in order to facilitate this and to abstract the authentication process away from the Interlink software.

Interlink middleware server #

The Interlink middleware server is the endpoint that the virtual kubelet points to. The virtual kubelet will make requests to the Interlink server, which will in turn pass those requests to a custom NuNet “provider” or “plugin”, which may also be referred to as a sidecar in the Interlink documentation.

NuNet Interlink plugin #

The NuNet Interlink plugin will receive requests from the Interlink middleware and must either serve a response based on information it has cached or proxy a request to the local DMS that is running on the “Interlink gateway” (a server configured with the Interlink software, the NuNet plugin, the NuNet DMS and an OAuth proxy). A rough illustration of this cache-or-proxy behaviour is sketched below.
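
As a minimal sketch of this cache-or-proxy behaviour, assuming the plugin is written in Go: the DMS host/port, the cache TTL and the helper names are assumptions for discussion, while the status path follows the /api/v1/job/status/{nunetjobid} endpoint used later in this document.

package nunetplugin

import (
    "io"
    "net/http"
    "sync"
    "time"
)

// dmsBaseURL is the local SP DMS API on the same gateway host (address assumed).
const dmsBaseURL = "http://127.0.0.1:9999/api/v1"

// statusCache keeps the last response per NuNet job ID so that frequent kubelet
// polls do not all hit the local DMS. The 10 second TTL is arbitrary.
var (
    cacheMu     sync.Mutex
    statusCache = map[string]cachedStatus{}
)

type cachedStatus struct {
    body    []byte
    fetched time.Time
}

// jobStatus returns the cached status while it is fresh, otherwise proxies the
// request to the SP DMS running on the same gateway and refreshes the cache.
func jobStatus(jobID string) ([]byte, error) {
    cacheMu.Lock()
    c, ok := statusCache[jobID]
    cacheMu.Unlock()
    if ok && time.Since(c.fetched) < 10*time.Second {
        return c.body, nil
    }
    resp, err := http.Get(dmsBaseURL + "/job/status/" + jobID)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return nil, err
    }
    cacheMu.Lock()
    statusCache[jobID] = cachedStatus{body: body, fetched: time.Now()}
    cacheMu.Unlock()
    return body, nil
}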

Service provider DMS #

As stated in the section above, the service provider DMS is installed on the same machine as the rest of the Interlink software and is effectively the orchestrator for all the Kubernetes jobs that will be handled on NuNet.

Required functionality #

The current functionality we are looking to support is:

  • Dynamically report available resources to the cluster.
  • Create/start pods / jobs.
  • Delete pods / jobs.
  • Check the status of pods / jobs.
  • Check log files of pods / jobs.

Although dynamically reporting available resources is the first logical step, it is not currently implemented in Interlink and is not the core of the functionality. We will start with the creation of the job. A sketch of how this functionality could map onto the plugin's HTTP API follows.
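
To make this list concrete, here is a minimal sketch of the HTTP surface the NuNet Interlink plugin could expose to the Interlink middleware, assuming the plugin is written in Go. Only /create and /resources appear elsewhere in this document; the other route names and the listen address are assumptions to be confirmed against the Interlink plugin (sidecar) API.

package main

import (
    "log"
    "net/http"
)

func main() {
    mux := http.NewServeMux()

    // Route names other than /create and /resources are assumptions pending
    // the Interlink plugin API specification.
    mux.HandleFunc("/create", handleCreate)       // create/start pods / jobs
    mux.HandleFunc("/delete", handleDelete)       // delete pods / jobs
    mux.HandleFunc("/status", handleStatus)       // check the status of pods / jobs
    mux.HandleFunc("/getLogs", handleLogs)        // check log files of pods / jobs
    mux.HandleFunc("/resources", handleResources) // report available NuNet resources

    // The listen address is an assumption; only the Interlink middleware on
    // the same gateway host needs to reach the plugin.
    log.Fatal(http.ListenAndServe("127.0.0.1:4000", mux))
}

func handleCreate(w http.ResponseWriter, r *http.Request)    {} // see Loop 4 below
func handleDelete(w http.ResponseWriter, r *http.Request)    {} // to be defined
func handleStatus(w http.ResponseWriter, r *http.Request)    {} // see the Get Job Status section
func handleLogs(w http.ResponseWriter, r *http.Request)      {} // to be defined
func handleResources(w http.ResponseWriter, r *http.Request) {} // see the resource reporting section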

Create Job - Sequence diagram #

The following sequence diagram shows the steps required in order to create a job on NuNet that is then represented as a pod to the Kubernetes cluster.

sequenceDiagram
    participant KS as Kubernetes Scheduler
    participant VK as Virtual Kubelet
    participant OS as OAuth Proxy
    participant IMS as Interlink Middleware Server
    participant NP as Nunet Interlink Plugin
    participant NDMSSP as Nunet DMS SP
    participant NDMSCP as Nunet DMS CP

    loop 1: Kubernetes schedules a job on a virtual kubelet
        KS->>+VK: Decide placement & request resources
        VK->>+OS: Request authentication
    end
    loop 2: Virtual Kubelet forwards request to Interlink
    VK->>+IMS: Forward request w/ token
    IMS->>IMS: Interlink creates record for the job id.
    end
    loop 3: Interlink forwards request to NuNet interlink plugin
    IMS->>+NP: Route/transform request
    end
    loop 4: NuNet interlink plugin transforms data and creates a request to the local SP DMS
    NP->>+NDMSSP: Forward transformed job request
    end
    loop 5: SPDMS schedules job on remote DMS
    NDMSSP-->>+NDMSCP: Service Provider DMS requests job
    NDMSCP-->>+NDMSSP: Compute Provider DMS schedules job and returns job id to service provider
    end
 
  

Loop 1: Kubernetes schedules a job on a virtual kubelet #

Input #

Kubernetes job description:

apiVersion: batch/v1
kind: Job
metadata:
  name: tensorflow-gpu-job
  namespace: interlink
spec:
  template:
    spec:
      containers:
      - name: tensorflow-container
        image: tensorflow/tensorflow:latest-gpu  # Use the appropriate TensorFlow GPU image
        resources:
          limits:
            cpu: "8"
            memory: 128Gi
            nunet.gpu.nvidia.v100.40gb: 1  # Assuming this is a valid CRD
          requests:
            cpu: "8"
            memory: 128Gi
        command: ["python", "-u"]
        args: ["path/to/your/tensorflow/script.py"]  # path to your TensorFlow script
      restartPolicy: Never
      nodeSelector:
        kubernetes.io/hostname: nunet  # Ensure the pod is scheduled on the node called 'nunet'
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: gpu-type  # This label should be set on your 'nunet' node to match the GPU type
                operator: In
                values:
                - nunet.gpu.nvidia.v100.40gb
  backoffLimit: 4
  1. The Kubernetes scheduler reads the job description and takes into account the NuNet node selector specified in the job. It creates the pod spec and assigns it to the virtual kubelet. The pod is given an ID by the cluster.

  2. The virtual kubelet watches the API for new jobs assigned to it.

  • Place in the network:
    1. Kubernetes cluster; the Kubernetes API sends data to the virtual kubelet running on a Kubernetes worker node

Output #

{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "tensorflow-gpu-job-xxxxx",  // A unique generated name for the pod
    "namespace": "interlink",
    "ownerReferences": [
      {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "name": "tensorflow-gpu-job",
        "uid": "uid-of-the-job-object",  // UID generated by Kubernetes for the Job
        "controller": true
      }
    ],
    "labels": {
      "job-name": "tensorflow-gpu-job"
    }
  },
  "spec": {
    "containers": [
      {
        "name": "tensorflow-container",
        "image": "tensorflow/tensorflow:latest-gpu",
        "command": ["python", "-u"],
        "args": ["path/to/your/tensorflow/script.py"],
        "resources": {
          "limits": {
            "cpu": "8",
            "memory": "128Gi",
            "nunet.gpu.nvidia.v100.40gb": "1"
          },
          "requests": {
            "cpu": "8",
            "memory": "128Gi"
          }
        }
      }
    ],
    "restartPolicy": "Never",
    "nodeSelector": {
      "kubernetes.io/hostname": "nunet"
    },
    "affinity": {
      "nodeAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": {
          "nodeSelectorTerms": [
            {
              "matchExpressions": [
                {
                  "key": "gpu-type",
                  "operator": "In",
                  "values": [
                    "nunet.gpu.nvidia.v100.40gb"
                  ]
                }
              ]
            }
          ]
        }
      }
    }
  }
}
  • We need to consider access to custom container registries and the ability to pass secrets for access to the registry and/or other job data sources.

  • There is no output at this time, as the virtual kubelet needs to pass the job request on to the Interlink server (see Loop 2).

Loop 2: Virtual kubelet forwards the request to Interlink #

Input #

  • Data:
    1. The virtual kubelet authenticates to the Interlink server, possibly via an OAuth server, to retrieve an access token
    2. The virtual kubelet forwards the pod spec to the Interlink server in this escaped JSON string format (note this is from the Interlink demo job and not the pod description above):
[{\"pod\":{\"metadata\":{\"name\":\"interlink-quickstart\",\"namespace\":\"default\",\"uid\":\"1df0a37e-a29b-426b-9b14-43f95dfc5a2f\",\"resourceVersion\":\"728753\",\"creationTimestamp\":\"2024-04-04T09:39:16Z\",\"annotations\":{\"kubectl.kubernetes.io/last-applied-configuration\":\"{\\\"apiVersion\\\":\\\"v1\\\",\\\"kind\\\":\\\"Pod\\\",\\\"metadata\\\":{\\\"annotations\\\":{},\\\"name\\\":\\\"interlink-quickstart\\\",\\\"namespace\\\":\\\"default\\\"},\\\"spec\\\":{\\\"automountServiceAccountToken\\\":false,\\\"containers\\\":[{\\\"args\\\":[\\\"-c\\\",\\\"sleep 600 \\\\u0026\\\\u0026 echo 'FINISHED!'\\\"],\\\"command\\\":[\\\"/bin/sh\\\"],\\\"image\\\":\\\"busybox\\\",\\\"imagePullPolicy\\\":\\\"Always\\\",\\\"name\\\":\\\"my-container\\\",\\\"resources\\\":{\\\"limits\\\":{\\\"cpu\\\":\\\"1\\\",\\\"memory\\\":\\\"1Gi\\\"},\\\"requests\\\":{\\\"cpu\\\":\\\"1\\\",\\\"memory\\\":\\\"1Gi\\\"}}}],\\\"nodeSelector\\\":{\\\"kubernetes.io/hostname\\\":\\\"nunet-node\\\"},\\\"tolerations\\\":[{\\\"key\\\":\\\"virtual-node.interlink/no-schedule\\\",\\\"operator\\\":\\\"Exists\\\"},{\\\"effect\\\":\\\"NoExecute\\\",\\\"key\\\":\\\"node.kubernetes.io/not-ready\\\",\\\"operator\\\":\\\"Exists\\\",\\\"tolerationSeconds\\\":300},{\\\"effect\\\":\\\"NoExecute\\\",\\\"key\\\":\\\"node.kubernetes.io/unreachable\\\",\\\"operator\\\":\\\"Exists\\\",\\\"tolerationSeconds\\\":300}]}}\\n\"},\"managedFields\":[{\"manager\":\"kubectl-client-side-apply\",\"operation\":\"Update\",\"apiVersion\":\"v1\",\"time\":\"2024-04-04T09:39:16Z\",\"fieldsType\":\"FieldsV1\",\"fieldsV1\":{\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:kubectl.kubernetes.io/last-applied-configuration\":{}}},\"f:spec\":{\"f:automountServiceAccountToken\":{},\"f:containers\":{\"k:{\\\"name\\\":\\\"my-container\\\"}\":{\".\":{},\"f:args\":{},\"f:command\":{},\"f:image\":{},\"f:imagePullPolicy\":{},\"f:name\":{},\"f:resources\":{\".\":{},\"f:limits\":{\".\":{},\"f:cpu\":{},\"f:memory\":{}},\"f:requests\":{\".\":{},\"f:cpu\":{},\"f:memory\":{}}},\"f:terminationMessagePath\":{},\"f:terminationMessagePolicy\":{}}},\"f:dnsPolicy\":{},\"f:enableServiceLinks\":{},\"f:nodeSelector\":{},\"f:restartPolicy\":{},\"f:schedulerName\":{},\"f:securityContext\":{},\"f:terminationGracePeriodSeconds\":{},\"f:tolerations\":{}}}}]},\"spec\":{\"containers\":[{\"name\":\"my-container\",\"image\":\"busybox\",\"command\":[\"/bin/sh\"],\"args\":[\"-c\",\"sleep 600 \\u0026\\u0026 echo 
'FINISHED!'\"],\"resources\":{\"limits\":{\"cpu\":\"1\",\"memory\":\"1Gi\"},\"requests\":{\"cpu\":\"1\",\"memory\":\"1Gi\"}},\"terminationMessagePath\":\"/dev/termination-log\",\"terminationMessagePolicy\":\"File\",\"imagePullPolicy\":\"Always\"}],\"restartPolicy\":\"Always\",\"terminationGracePeriodSeconds\":30,\"dnsPolicy\":\"ClusterFirst\",\"nodeSelector\":{\"kubernetes.io/hostname\":\"nunet-node\"},\"serviceAccountName\":\"default\",\"serviceAccount\":\"default\",\"automountServiceAccountToken\":false,\"nodeName\":\"nunet-node\",\"securityContext\":{},\"schedulerName\":\"default-scheduler\",\"tolerations\":[{\"key\":\"virtual-node.interlink/no-schedule\",\"operator\":\"Exists\"},{\"key\":\"node.kubernetes.io/not-ready\",\"operator\":\"Exists\",\"effect\":\"NoExecute\",\"tolerationSeconds\":300},{\"key\":\"node.kubernetes.io/unreachable\",\"operator\":\"Exists\",\"effect\":\"NoExecute\",\"tolerationSeconds\":300}],\"priority\":0,\"enableServiceLinks\":true,\"preemptionPolicy\":\"PreemptLowerPriority\"},\"status\":{\"phase\":\"Pending\",\"conditions\":[{\"type\":\"Initialized\",\"status\":\"True\",\"lastProbeTime\":null,\"lastTransitionTime\":null},{\"type\":\"Ready\",\"status\":\"True\",\"lastProbeTime\":null,\"lastTransitionTime\":null},{\"type\":\"PodScheduled\",\"status\":\"True\",\"lastProbeTime\":null,\"lastTransitionTime\":null}],\"hostIP\":\"10.244.0.18\",\"podIP\":\"10.244.0.18\",\"startTime\":\"2024-04-04T09:39:16Z\",\"containerStatuses\":[{\"name\":\"my-container\",\"state\":{\"running\":{\"startedAt\":\"2024-04-04T09:39:16Z\"}},\"lastState\":{},\"ready\":true,\"restartCount\":1,\"image\":\"busybox\",\"imageID\":\"\"}]}},\"container\":[{\"name\":\"\",\"configMaps\":null,\"secrets\":null,\"emptyDirs\":null}]}]
  1. The Interlink server creates an entry in its database to store the pod data (including the pod ID)
  • Place in the network: NuNet Interlink gateway - Interlink

Output #

  • The Interlink server sends a 200 response back to the virtual kubelet (to be confirmed)

Loop 3: Interlink forwards the request to the NuNet Interlink plugin #

Input #

  • Data: The pod spec received by Interlink is then passed to the NuNet Interlink plugin (running as a local REST API endpoint)

    POST /create Request body:

[{'metadata': {'name': 'interlink-quickstart', 'namespace': 'default', 'uid': '1df0a37e-a29b-426b-9b14-43f95dfc5a2f', 'resourceVersion': '728753', 'creationTimestamp': '2024-04-04T09:39:16Z', 'annotations': {'kubectl.kubernetes.io/last-applied-configuration': '{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"interlink-quickstart","namespace":"default"},"spec":{"automountServiceAccountToken":false,"containers":[{"args":["-c","sleep 600 \\u0026\\u0026 echo \'FINISHED!\'"],"command":["/bin/sh"],"image":"busybox","imagePullPolicy":"Always","name":"my-container","resources":{"limits":{"cpu":"1","memory":"1Gi"},"requests":{"cpu":"1","memory":"1Gi"}}}],"nodeSelector":{"kubernetes.io/hostname":"nunet-node"},"tolerations":[{"key":"virtual-node.interlink/no-schedule","operator":"Exists"},{"effect":"NoExecute","key":"node.kubernetes.io/not-ready","operator":"Exists","tolerationSeconds":300},{"effect":"NoExecute","key":"node.kubernetes.io/unreachable","operator":"Exists","tolerationSeconds":300}]}}\n'}, 'managedFields': [{'manager': 'kubectl-client-side-apply', 'operation': 'Update', 'apiVersion': 'v1', 'time': '2024-04-04T09:39:16Z', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:metadata': {'f:annotations': {'.': {}, 'f:kubectl.kubernetes.io/last-applied-configuration': {}}}, 'f:spec': {'f:automountServiceAccountToken': {}, 'f:containers': {'k:{"name":"my-container"}': {'.': {}, 'f:args': {}, 'f:command': {}, 'f:image': {}, 'f:imagePullPolicy': {}, 'f:name': {}, 'f:resources': {'.': {}, 'f:limits': {'.': {}, 'f:cpu': {}, 'f:memory': {}}, 'f:requests': {'.': {}, 'f:cpu': {}, 'f:memory': {}}}, 'f:terminationMessagePath': {}, 'f:terminationMessagePolicy': {}}}, 'f:dnsPolicy': {}, 'f:enableServiceLinks': {}, 'f:nodeSelector': {}, 'f:restartPolicy': {}, 'f:schedulerName': {}, 'f:securityContext': {}, 'f:terminationGracePeriodSeconds': {}, 'f:tolerations': {}}}}]}, 'spec': {'containers': [{'name': 'my-container', 'image': 'busybox', 'command': ['/bin/sh'], 'args': ['-c', "sleep 600 && echo 'FINISHED!'"], 'resources': {'limits': {'cpu': '1', 'memory': '1Gi'}, 'requests': {'cpu': '1', 'memory': '1Gi'}}, 'terminationMessagePath': '/dev/termination-log', 'terminationMessagePolicy': 'File', 'imagePullPolicy': 'Always'}], 'restartPolicy': 'Always', 'terminationGracePeriodSeconds': 30, 'dnsPolicy': 'ClusterFirst', 'nodeSelector': {'kubernetes.io/hostname': 'nunet-node'}, 'serviceAccountName': 'default', 'serviceAccount': 'default', 'automountServiceAccountToken': False, 'nodeName': 'nunet-node', 'securityContext': {}, 'schedulerName': 'default-scheduler', 'tolerations': [{'key': 'virtual-node.interlink/no-schedule', 'operator': 'Exists'}, {'key': 'node.kubernetes.io/not-ready', 'operator': 'Exists', 'effect': 'NoExecute', 'tolerationSeconds': 300}, {'key': 'node.kubernetes.io/unreachable', 'operator': 'Exists', 'effect': 'NoExecute', 'tolerationSeconds': 300}], 'priority': 0, 'enableServiceLinks': True, 'preemptionPolicy': 'PreemptLowerPriority'}, 'status': {'phase': 'Pending', 'conditions': [{'type': 'Initialized', 'status': 'True', 'lastProbeTime': None, 'lastTransitionTime': None}, {'type': 'Ready', 'status': 'True', 'lastProbeTime': None, 'lastTransitionTime': None}, {'type': 'PodScheduled', 'status': 'True', 'lastProbeTime': None, 'lastTransitionTime': None}], 'hostIP': '10.244.0.18', 'podIP': '10.244.0.18', 'startTime': '2024-04-04T09:39:16Z', 'containerStatuses': [{'name': 'my-container', 'state': {'running': {'startedAt': '2024-04-04T09:39:16Z'}}, 'lastState': {}, 'ready': True, 
'restartCount': 1, 'image': 'busybox', 'imageID': ''}]}}]
  • Place in the network: Interlink NuNet Gateway Server

Output #

  • Data:
    1. The NuNet plugin sends a 200 response to the Interlink server
  • Place in the network
    1. Interlink NuNet Gateway Server

Loop 4: NuNet Interlink plugin transforms the data and creates a request to the local SP DMS #

Input #

  • Data:
    1. The pod data sent to the NuNet Interlink plugin is parsed to determine the required resources (see the sketch after this list).
    2. The peer list is parsed to determine a suitable target for the pod.
  • Place in the network:
    1. Interlink NuNet Gateway Server
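
A minimal sketch of step 1 above, assuming the plugin is written in Go and uses the upstream Kubernetes API types to decode the pod spec; the jobRequirements structure is hypothetical, and the nunet.gpu.* prefix mirrors the draft resource names used in this document.

package nunetplugin

import (
    "encoding/json"
    "strings"

    v1 "k8s.io/api/core/v1"
)

// jobRequirements is a hypothetical intermediate structure the plugin could
// build before assembling the DMS job payload shown below.
type jobRequirements struct {
    Image    string
    CPUCores int64
    MemoryMi int64
    GPUs     map[string]int64 // keyed by the nunet.gpu.* resource name
}

// parsePod extracts the resource limits from a single pod spec. The wrapper
// format Interlink actually sends (an array around the pod object) is omitted.
func parsePod(raw []byte) (*jobRequirements, error) {
    var pod v1.Pod
    if err := json.Unmarshal(raw, &pod); err != nil {
        return nil, err
    }
    req := &jobRequirements{GPUs: map[string]int64{}}
    for _, c := range pod.Spec.Containers {
        req.Image = c.Image // single-container batch pods assumed for now
        limits := c.Resources.Limits
        if cpu, ok := limits[v1.ResourceCPU]; ok {
            req.CPUCores += cpu.Value()
        }
        if mem, ok := limits[v1.ResourceMemory]; ok {
            req.MemoryMi += mem.Value() / (1024 * 1024)
        }
        for name, qty := range limits {
            if strings.HasPrefix(string(name), "nunet.gpu.") {
                req.GPUs[string(name)] += qty.Value()
            }
        }
    }
    return req, nil
}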

Output #

  • Data:

    The NuNet Interlink plugin creates the payload for the job. I have added a few more sections for discussion.


{
    "job_id": 3456,
    "job": "example_docker_image", // virtual machines and wasm workloads are also possible
    "jobparams": "params go here"
    "jobenv": {
        "env1": "value1",
        "env2": ""value2,
    },
    "secrets": {
      "secret1": "<encryptedblob?>"
    },
    "network": {
            "vpn??subnet": <uuid>,
            "vpn??maxlatency": 100
            "ipaddr": 10.0.0.11,
            "dnsname": "job.subdomain.domain.com" ,
            "ingressproxyid": <peerid of proxy>,
            "egressproxyid":
            "fwrules": {
              
            },
    },
    "requiredCapability": {
        "executor": "docker", // or vm or wasm or others executors
        "type": {
          "description": "batch"
        },
        "resources": { 
            "cpu": {
                "cores": 1,
                "frequency": 10000
                },
            "ram": 1024Gi,
            "disk": 0,
            "power": 0,
            "vcpu": 128,
            "gpu": {
                "brand": "nvidia",
                "model": "V100"
                "vram": "40Gi"
            }
          },
        "libraries": ["pytorch@1.0.0"],
        "locality": ["us-east-1", "us-east-2"],
        "storage": ["ramdisk", "localVolume", "ipfs", "aws-s3", "filecoin", "google-drive"],
        "network": {
            "open_ports": [80, 443, 8080],
            "vpn??": true,

            // TODO: specify it more
        },
        "price": {
          "currency":"ntx",
          "max_per_hour": 500,
          "max_total": 10000,
          "preference": 1
        },
        "time": {
          "units":"seconds",
          "max_time":6300,
          "preference": 2
        },
        "kyc": ["iamx"]
    }
}
  1. The NuNet Interlink plugin sends the payload to the local SP DMS via the NuNet API endpoint (see the documentation here: https://app.gitbook.com/o/HmQiiAfFnBUd24KadDsO/s/nUIl2GGpV9Wq3xiFlif2/public-technical-documentation/device-management-service/proposed/orchestrator#id-1.-job-posting):
/orchestrator/postJob

method: HTTP POST

  1. The local SP DMS returns a job ID to the NuNet Interlink plugin (a sketch of this step follows below)
  2. Interlink updates its database record for the pod to include the NuNet job ID (there is now a mapping between the Kubernetes pod ID and the NuNet job ID)
  • Place in the network
    1. Interlink NuNet Gateway Server
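
A sketch of steps 1 and 2, again assuming a Go plugin; the /orchestrator/postJob path comes from the linked documentation, while the DMS address, the shape of the response and the in-memory mapping are assumptions for discussion (the Interlink database record would hold the same mapping).

package nunetplugin

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "sync"
)

// podToJob maps the Kubernetes pod UID to the NuNet job ID so that later
// status, log and delete requests can be routed to the right job.
var podToJob sync.Map

// postJobResponse is an assumed response shape; the actual DMS response is
// still to be specified.
type postJobResponse struct {
    JobID int64 `json:"job_id"`
}

// postJob submits the payload built from the pod spec to the local SP DMS and
// records the pod-UID -> job-ID mapping.
func postJob(podUID string, payload []byte) (int64, error) {
    // Host and port of the local SP DMS are assumptions.
    resp, err := http.Post("http://127.0.0.1:9999/api/v1/orchestrator/postJob",
        "application/json", bytes.NewReader(payload))
    if err != nil {
        return 0, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return 0, fmt.Errorf("DMS returned %s", resp.Status)
    }
    var out postJobResponse
    if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
        return 0, err
    }
    podToJob.Store(podUID, out.JobID)
    return out.JobID, nil
}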

Loop 5: SPDMS Schedules job on remote DMS #

Input #

  • Data:
    1. The job is then scheduled on the remote CP DMS
  • Place in the network:
    1. NuNet Compute Provider

Output #

  • Data:

    1. The remote CP DMS returns the status of the job creation to the SP DMS
    2. The SP DMS updates the record in its database with the latest status.
  • Place in the network

    1. Interlink NuNet Gateway Server

The job is scheduled successfully.

General sequence diagram (Get Job Status) #

sequenceDiagram
    participant KS as Kubernetes Scheduler
    participant VK as Virtual Kubelet
    participant OS as OAuth Proxy
    participant IMS as Interlink Middleware Server
    participant NP as Nunet Interlink Plugin
    participant NDMSSP as Nunet DMS SP
    participant NDMSCP as Nunet DMS CP

    loop 1: SP DMS polls CP DMS for status
    NDMSSP->>+NDMSCP: SP requests status
    NDMSCP-->>-NDMSSP: CP returns status
    end

    loop 2: Kubernetes API requests status information for a pod
    KS->>+VK: K8s API polls the virtual kubelet for status information
    VK->>+IMS: Virtual kubelet passes request to Interlink
    IMS->>+NP: Interlink passes request to Nunet Interlink plugin
    NP->>+NDMSSP: Nunet Interlink plugin sends request to service provider
    NDMSSP-->>-NP: SP returns response + 200 header
    NP-->>-IMS: Nunet plugin reformats response and forwards + 200
    IMS-->>-VK: Interlink forwards response + 200 header
    VK-->>-KS: VKubelet forwards response + 200 header
    end
  

Loop 1: Nunet Service Provider requests job status #

Input #

This is standard DMS functionality. DMSs keep track of what other nodes are doing for them and store that data in their local database. NOTE: although the current DMS process is to keep a socket open between the two DMSs, it may be better to just allow a DMS to poll (ping) another DMS for status information as required, or to use a gossipsub topic to send and receive updates (a sketch of the gossipsub option follows).
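
For discussion, a minimal sketch of the gossipsub alternative using go-libp2p-pubsub; the topic name and message format are assumptions, and this is not how DMS currently exchanges status information.

package main

import (
    "context"
    "fmt"

    "github.com/libp2p/go-libp2p"
    pubsub "github.com/libp2p/go-libp2p-pubsub"
)

func main() {
    ctx := context.Background()

    // A bare libp2p host; the real DMS host has its own configuration.
    h, err := libp2p.New()
    if err != nil {
        panic(err)
    }

    ps, err := pubsub.NewGossipSub(ctx, h)
    if err != nil {
        panic(err)
    }

    // The topic name is an assumption for discussion.
    topic, err := ps.Join("nunet/job-status")
    if err != nil {
        panic(err)
    }
    sub, err := topic.Subscribe()
    if err != nil {
        panic(err)
    }

    // The CP side would publish status updates, e.g.:
    //   topic.Publish(ctx, []byte(`{"jobid":"...","jobstatus":"running"}`))

    // The SP side consumes updates and writes them to its local database.
    for {
        msg, err := sub.Next(ctx)
        if err != nil {
            return
        }
        fmt.Printf("status update from %s: %s\n", msg.ReceivedFrom, msg.Data)
    }
}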

Loop 2: Kubernetes API requests status information for a pod #

Kubernetes API polls the virtual kubelet for status information #

We do not need to worry about the specifics of this, as it is handled by the Interlink virtual kubelet.

  • Place in the network
    1. virtual kubelet (Kubernetes Worker node)

We do not need to worry about this step either, as it is handled by the Interlink server.

  • Place in the network: Interlink NuNet gateway

The NuNet Interlink plugin extracts the relevant data.

  • Data:
    1. The NuNet Interlink plugin makes a request to the SP DMS for the job status
     /api/v1/job/status/{nunetjobid}
    
  • Place in the network:
    1. Interlink NuNet Gateway Server

Output #

  • Data:
    1. response payload
     {
      jobid: DID?
      jobstatus: running
      starttime:
      duration:
      remainingtime:
    }
    
    1. jobstatus is stored (somewhere); a sketch of mapping it onto a Kubernetes pod phase follows
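
A small sketch, assuming the plugin is written in Go, of how the draft status payload above could be mapped onto the pod phases that the virtual kubelet ultimately reports back to the cluster; the field names follow the draft payload and the mapping itself is an assumption.

package nunetplugin

// dmsJobStatus mirrors the draft response payload above; the field names and
// types are placeholders until the DMS status endpoint is finalised.
type dmsJobStatus struct {
    JobID         string `json:"jobid"`
    JobStatus     string `json:"jobstatus"`
    StartTime     string `json:"starttime"`
    Duration      int64  `json:"duration"`
    RemainingTime int64  `json:"remainingtime"`
}

// toPodPhase maps a NuNet job status onto a Kubernetes pod phase. Only
// "running" appears in the draft payload above; the other status strings are
// assumptions to be confirmed against DMS.
func toPodPhase(s dmsJobStatus) string {
    switch s.JobStatus {
    case "running":
        return "Running"
    case "finished", "completed":
        return "Succeeded"
    case "failed":
        return "Failed"
    default:
        return "Pending"
    }
}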

Sequence diagram (Update VKublet with Available Resources) #

As part of this solution it is fairly important for the resources available on NuNet to be evaluated and for suitable information to be passed back to the virtual kubelet to allow the cluster to make scheduling decisions.

Interlink currently sets this information in a text file in a directory where the virtual kubelet is installed on the worker node.

I am proposing that we add functionality to Interlink to automatically update this resource information based on data returned from the plugins it uses. There are several things we need to consider here, as the logic for each plugin will differ based on the type of network / system the plugin is supporting.

In the case of NuNet, where we are likely to have many nodes of differing specifications with different types of GPU available, we should try to group these nodes together into similar types and make the different groups available to different virtual kubelets.

We will need a method to filter the node types that we want to advertise to the cluster (a sketch follows this paragraph). This can initially be done by using channels and onboarding specific hardware types to specific channels, but going forward we should be thinking about how a service provider is tasked with discovering only relevant peers. We need to narrow down the search parameters during the DMS discovery phase, e.g. geo location, hardware types, ping response times, etc. Where should this functionality reside: the network package, the orchestrator, etc.?
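
As a starting point for that filtering discussion, a minimal Go sketch of a client-side filter over the peer list format shown in Loop 1 below; narrowing the search during DMS discovery would eventually replace this.

package nunetplugin

// peerSummary is a trimmed-down view of the entries in the DMS peer list shown
// in Loop 1 below; only the fields needed for filtering are included.
type peerSummary struct {
    PeerID      string `json:"peer_id"`
    IsAvailable bool   `json:"is_available"`
    HasGPU      bool   `json:"has_gpu"`
    GPUInfo     []struct {
        Name     string `json:"name"`
        TotVRAM  int64  `json:"tot_vram"`
        FreeVRAM int64  `json:"free_vram"`
    } `json:"gpu_info"`
}

// filterPeers keeps only the peers we want to advertise to a given virtual
// kubelet, here simply available peers with a particular GPU model. Location,
// ping response times and channel membership could be added as further filters.
func filterPeers(peers []peerSummary, gpuModel string) []peerSummary {
    var out []peerSummary
    for _, p := range peers {
        if !p.IsAvailable || !p.HasGPU {
            continue
        }
        for _, g := range p.GPUInfo {
            if g.Name == gpuModel {
                out = append(out, p)
                break
            }
        }
    }
    return out
}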

sequenceDiagram
    participant NDMSCP as Nunet DMS CP
    participant NDMSSP as Nunet DMS SP
    participant NP as Nunet Interlink Plugin
    participant IMS as Interlink Middleware Server
    participant OS as OAuth Proxy
    participant VK as Virtual Kubelet
    participant KS as Kubernetes Scheduler

    loop 0:
    NDMSSP->>+NDMSCP: Ping request
    NDMSCP-->>-NDMSSP: Return current resource availability
    end
    loop 1: 
    NP->>+NDMSSP: Request resource availability of DHT (Handshake) peers
    NDMSSP-->>-NP: Return resource availability of DHT (Handshake) peers
    end
    loop 2:
    IMS->>+NP: Interlink requests resource availability of NuNet
    NP-->>-IMS: NuNet plugin returns latest resource availability 
    end
    loop 3:
    VK->>+IMS: Interlink Virtual Kubelet requests latest resource availability
    IMS-->>-VK: IMS returns latest availability
    VK->>+KS: Interlink Virtual Kubelet publishes latest availability to the API
    end
  

Loop 0 - CP -> SP Resource Availability Update #

The DMS behind the virtual kubelet (service provider role) requests current resource availability from all the connected compute providers in its DHT (“Handshake”) peers list. It will then update its local database.

Request

New format libp2p ping ?

Response format

New format libp2p ping response

Loop 1 - Nunet plugin requests availability update from DMS #

The Nunet plugin periodically requests the latest availability from the local DMS.

Request:

Proposed: /api/v1/network/peers/list/resources (?)
Current: /api/v1/peers/dht/dump

Response:

[
    {
        "peer_id": "QmdRxGHC4QZhdsXyUUMhpu1GgeBQoLpjEU663dgMrMAj9k",
        "is_available": true,
        "has_gpu": false,
        "allow_cardano": false,
        "gpu_info": null,
        "tokenomics_addrs": "addr_test1qq0aq505npfft4t2ql7gd7w442dpmarnfthlzy73pczzccj9q695zk5p2eyx8gqz68s4d7s6q5fpa0953taavkmxhupq0gzhse",
        "tokenomics_blockchain": "",
        "available_resources": {
            "id": 1,
            "tot_cpu_hz": 4000,
            "price_cpu": 0,
            "ram": 4000,
            "price_ram": 0,
            "vcpu": 1,
            "disk": 0,
            "price_disk": 0,
            "ntx_price": 0
        },
        "services": []
    },
    {
        "peer_id": "QmQ5CXpo9QaJXn9L5yJatBphnitKPB5jsdDpHNiaNV4dZm",
        "is_available": false,
        "has_gpu": true,
        "allow_cardano": false,
        "gpu_info": [
            {
                "name": "Tesla P100-PCIE-16GB",
                "tot_vram": 16384,
                "free_vram": 15972
            },
            {
                "name": "Tesla P100-PCIE-16GB",
                "tot_vram": 16384,
                "free_vram": 16237
            },
            {
                "name": "Tesla P100-PCIE-16GB",
                "tot_vram": 16384,
                "free_vram": 16221
            }
        ],
        "tokenomics_addrs": "addr_test1qr2p9uxv7a9mv8vty4nzl93elecpwjculxrx37zhdmm7gcptr70ngj0235495mpdtq6jy8let52598fls6aslqplv3jq9qcdke",
        "tokenomics_blockchain": "",
        "available_resources": {
            "id": 0,
            "tot_cpu_hz": 0,
            "price_cpu": 0,
            "ram": 0,
            "price_ram": 0,
            "vcpu": 0,
            "disk": 0,
            "price_disk": 0,
            "ntx_price": 0
        },
        "services": null
    },
    {
        "peer_id": "Qmab2uhVmFQ2ZkmporYr9wztX8oCkJkWj55zrua78Le5dC",
        "is_available": true,
        "has_gpu": true,
        "allow_cardano": false,
        "gpu_info": [
            {
                "name": "NVIDIA GeForce RTX 3080",
                "tot_vram": 10240,
                "free_vram": 9777
            },
            {
                "name": "NVIDIA GeForce RTX 3090",
                "tot_vram": 24576,
                "free_vram": 24048
            },
            {
                "name": "NVIDIA GeForce RTX 3090",
                "tot_vram": 24576,
                "free_vram": 24245
            },
            {
                "name": "NVIDIA GeForce RTX 3080",
                "tot_vram": 10240,
                "free_vram": 9995
            },
            {
                "name": "NVIDIA GeForce RTX 3080",
                "tot_vram": 10240,
                "free_vram": 9995
            },
            {
                "name": "NVIDIA GeForce RTX 3080",
                "tot_vram": 10240,
                "free_vram": 9995
            }
        ],
        "tokenomics_addrs": "0x87DA03a4C593FE69fe98440B6c3d37348c93A8FB",
        "tokenomics_blockchain": "",
        "available_resources": {
            "id": 1,
            "tot_cpu_hz": 114299,
            "price_cpu": 0,
            "ram": 113855,
            "price_ram": 0,
            "vcpu": 17,
            "disk": 0,
            "price_disk": 0,
            "ntx_price": 0
        },
        "services": [
            {
                "ID": 2,
                "CreatedAt": "2023-09-06T16:38:29.206826938-04:00",
                "UpdatedAt": "2023-09-06T16:44:38.211393386-04:00",
                "DeletedAt": null,
                "TxHash": "19405c405240aeadace33e027573bbc02b1d0636801576d6eff9be54448585e2",
                "TransactionType": "",
                "JobStatus": "running",
                "JobDuration": 5,
                "EstimatedJobDuration": 10,
                "ServiceName": "registry.gitlab.com/nunet/ml-on-gpu/ml-on-gpu-service/develop/tensorflow",
                "ContainerID": "6b89f0ca28d6ff88f145977cfb9d7d0383210883caa32fb5a3b07da637a70ad5",
                "ResourceRequirements": 2,
                "ImageID": "registry.gitlab.com/nunet/ml-on-gpu/ml-on-gpu-service/develop/tensorflow",
                "LogURL": "https://log.nunet.io/api/v1/logbin/8ff56588-c29e-4cee-9da4-702c9359a436/raw",
                "LastLogFetch": "2023-09-06T20:44:38.211356996Z",
                "ServiceProviderAddr": "",
                "ComputeProviderAddr": "",
                "MetadataHash": "",
                "WithdrawHash": "",
                "RefundHash": "",
                "Distribute_50Hash": "",
                "Distribute_75Hash": "",
                "SignatureDatum": "",
                "MessageHashDatum": "",
                "Datum": "",
                "SignatureAction": "",
                "MessageHashAction": "",
                "Action": ""
            }
        ]
    },

The NuNet plugin then consolidates this information:

  • nunet.gpu.nvidia.tesla.p100.16gb x3
  • nunet.gpu.nvidia.rtx.3080.10gb x5
  • nunet.gpu.nvidia.rtx.3090.24gb x2
  • nunet.cpu.cores.17
  • nunet.cpu.mhz.114299
  • nunet.ram.gb.113855

Careful consideration needs to be taken in order to ensure that what is reported makes sense to the cluster / scheduler. We may have to specify custom resource definitions for the GPUs, and maybe for node types, so that RAM and disk I/O are taken into account. A sketch of the consolidation step is shown below.
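
Below is a sketch of that consolidation step, again assuming a Go plugin; the model-to-resource-name mapping only covers the models in the example above, and the counting rules (including how peers marked unavailable are treated) are open questions.

package nunetplugin

// gpuEntry and peerInfo mirror the relevant fields of the peer list response
// shown above.
type gpuEntry struct {
    Name    string `json:"name"`
    TotVRAM int64  `json:"tot_vram"`
}

type peerInfo struct {
    IsAvailable        bool       `json:"is_available"`
    GPUInfo            []gpuEntry `json:"gpu_info"`
    AvailableResources struct {
        TotCPUHz int64 `json:"tot_cpu_hz"`
        RAM      int64 `json:"ram"`
        VCPU     int64 `json:"vcpu"`
    } `json:"available_resources"`
}

// modelToResource maps DMS GPU model strings to the proposed extended resource
// names; it only covers the models that appear in the example above.
var modelToResource = map[string]string{
    "Tesla P100-PCIE-16GB":    "nunet.gpu.nvidia.tesla.p100.16gb",
    "NVIDIA GeForce RTX 3080": "nunet.gpu.nvidia.rtx.3080.10gb",
    "NVIDIA GeForce RTX 3090": "nunet.gpu.nvidia.rtx.3090.24gb",
}

// consolidate turns the raw peer list into the aggregate map served on the
// plugin's /resources endpoint. Peers marked unavailable are skipped here,
// although the worked example above counts their GPUs; this is an open question.
func consolidate(peers []peerInfo) map[string]int64 {
    out := map[string]int64{}
    for _, p := range peers {
        if !p.IsAvailable {
            continue
        }
        for _, g := range p.GPUInfo {
            if name, ok := modelToResource[g.Name]; ok {
                out[name]++
            }
        }
        // Units are as reported by the DMS peer list.
        out["nunet.cpu.cores"] += p.AvailableResources.VCPU
        out["nunet.cpu.mhz"] += p.AvailableResources.TotCPUHz
        out["nunet.ram.gb"] += p.AvailableResources.RAM
    }
    return out
}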

Request:

/resources

Response:

{
  "nunet.gpu.nvidia.tesla.p100.16gb": 3,
  "nunet.gpu.nvidia.rtx.3080.10gb": 5,
  "nunet.gpu.nvidia.rtx.3090.24gb": 2,
  "nunet.cpu.cores": 17,
  "nunet.cpu.mhz": 114299,
  "nunet.ram.gb": 113855
}

Request:

????

Response from Interlink

????

PUT to the K8s API

PUT /api/v1/nodes/{nunet-node-1}/status
{
  "kind": "Node",
  "apiVersion": "v1",
  "metadata": {
    "name": "nunet-node-1",
    "labels": {
      "type": "virtual-kubelet"
    }
  },
  "status": {
    "capacity": {
      "cpu": "17",
      "memory": "113855Mi",
      "nunet.gpu.nvidia.tesla.p100.16gb": "3",
      "nunet.gpu.nvidia.rtx.3080.10gb": "5",
      "nunet.gpu.nvidia.rtx.3090.24gb": "2",
      "nunet.cpu.mhz": "114299"
    },
    "allocatable": {
      "cpu": "17",
      "memory": "113855Mi",
      "nunet.gpu.nvidia.tesla.p100.16gb": "3",
      "nunet.gpu.nvidia.rtx.3080.10gb": "5",
      "nunet.gpu.nvidia.rtx.3090.24gb": "2",
      "nunet.cpu.mhz": "114299"
    },
    "conditions": [
      {
        "type": "Ready",
        "status": "True",
        "reason": "KubeletReady",
        "message": "kubelet is posting ready status"
      }
    ],
    "addresses": [
      {
        "type": "InternalIP",
        "address": "192.168.100.1"
      }
    ],
    "daemonEndpoints": {
      "kubeletEndpoint": {
        "Port": 10250
      }
    },
    "nodeInfo": {
      "architecture": "amd64",
      "containerRuntimeVersion": "docker://19.3",
      "kubeletVersion": "v1.20.0",
      "operatingSystem": "linux"
    }
  }
}

Maintainer: Sam (please tag all edit merge requests accordingly)