Skip to main content

How To Use Gpu In Kubernetes

FPT Cloud provides Kubernetes with NVIDIA GPUs , offering the following key features:

  • Flexible GPU configuration with multiple GPU types, optional GPU memory, applied to each Worker Group.
  • Automated GPU resource management and allocation in Kubernetes with the NVIDIA Operator.
  • Visualization and monitoring of GPUs using NVIDIA DCGM (Data Center GPU Manager).
  • Automatic scaling of Containers/Nodes with Autoscaler when applications require increased/decreased GPU resources.
  • GPU sharing support with the Multi-Instance mechanism, optimizing GPU resource usage and costs.

FPT Cloud utilizes the NVIDIA GPU Operator , providing an automated tool for managing all the necessary software components to use GPUs on Kubernetes. The GPU Operator enables users to utilize GPU resources similar to using CPUs in a Kubernetes cluster. The components of the Operator include:

  • NVIDIA Drivers (CUDA, MIG, ...)
  • NVIDIA Device Plugin
  • NVIDIA Container Toolkit
  • NVIDIA GPU Feature Discovery
  • NVIDIA Data Center GPU Manager (Monitoring)

Currently, FPT Cloud supports Kubernetes using NVIDIA A30 GPUs with the following MIG profiles:

No.GPU A30 ProfileStrategyNumber instanceInstance resource
1all-1g.6gbsingle41g.6gb
2all-2g.12gbsingle22g.12gb
3all-balancedmixed21g.6gb
all-balancedmixed12g.12gb
4none (no label)none00 (Entire)
Example:
If the configuration strategy single: all-1g.6gb is selected, the A30 GPU on the worker node is divided into 4 MIG devices with logical GPU resources equivalent to ¼ of the physical GPU and 6GB of GPU RAM each.
Note:
  • MIG configuration applies to all cards attached to the worker.
  • MIG strategy across worker groups in the same cluster must be of the same type (single/mixed/none).