Run a RayService
This page shows how to leverage Kueue’s scheduling and resource management capabilities when running a RayService.
Kueue manages a RayService through the RayCluster created for it. The RayService therefore needs the kueue.x-k8s.io/queue-name: user-queue label, which is propagated to the underlying RayCluster to trigger Kueue’s management.
This guide is for batch users who have a basic understanding of Kueue. For more information, see Kueue’s overview.
Before you begin
- Make sure you are using Kueue v0.6.0 or newer and KubeRay v1.3.0 or newer.
- Check Administer cluster quotas for details on the initial Kueue setup.
- See KubeRay Installation for installation and configuration details of KubeRay.
Note
RayService is managed by Kueue through RayCluster. Prior to Kueue v0.8.1, in order to use RayCluster you need to restart Kueue after the installation. You can do it by running:

kubectl delete pods -l control-plane=controller-manager -n kueue-system
RayService definition
When running RayService on Kueue, take into consideration the following aspects:
a. Queue selection
The target local queue should be specified in the metadata.labels section of the RayService configuration; this label is propagated to its RayCluster.
metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue
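If you want to confirm the propagation, one option (assuming the default namespace) is to list the RayClusters that KubeRay creates together with their labels:

kubectl get rayclusters -n default --show-labels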
b. Configure the resource needs
The resource needs of the workload can be configured in the spec.rayClusterConfig.
spec:
  rayClusterConfig:
    headGroupSpec:
      template:
        spec:
          containers:
            - resources:
                requests:
                  cpu: "1"
    workerGroupSpecs:
      - template:
          spec:
            containers:
              - resources:
                  requests:
                    cpu: "1"
c. Limitations
- Limited Worker Groups: Because a Kueue workload can have a maximum of 8 PodSets, the maximum number of spec.rayClusterConfig.workerGroupSpecs is 7.
- In-Tree Autoscaling Disabled: Kueue manages resource allocation for the RayService; therefore, the internal autoscaling mechanisms need to be disabled, as shown in the sketch after this list.
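KubeRay exposes the in-tree autoscaler through the enableInTreeAutoscaling field of the RayCluster spec, which defaults to false. A minimal sketch of keeping it disabled explicitly in the rayClusterConfig:

spec:
  rayClusterConfig:
    # Keep KubeRay's in-tree autoscaler off; Kueue owns resource allocation.
    enableInTreeAutoscaling: false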
Example RayService
The RayService looks like the following:
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: test-rayservice
  namespace: default
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  # serveConfigV2 takes a yaml multi-line scalar, which should be a Ray Serve multi-application config. See https://docs.ray.io/en/latest/serve/multi-app.html.
  serveConfigV2: |
    applications:
      - name: fruit_app
        import_path: fruit.deployment_graph
        route_prefix: /fruit
        runtime_env:
          working_dir: "https://github.com/ray-project/test_dag/archive/78b4a5da38796123d9f9ffff59bab2792a043e95.zip"
        deployments:
          - name: MangoStand
            num_replicas: 2
            max_replicas_per_node: 1
            user_config:
              price: 3
            ray_actor_options:
              num_cpus: 0.1
          - name: OrangeStand
            num_replicas: 1
            user_config:
              price: 2
            ray_actor_options:
              num_cpus: 0.1
          - name: PearStand
            num_replicas: 1
            user_config:
              price: 1
            ray_actor_options:
              num_cpus: 0.1
          - name: FruitMarket
            num_replicas: 1
            ray_actor_options:
              num_cpus: 0.1
      - name: math_app
        import_path: conditional_dag.serve_dag
        route_prefix: /calc
        runtime_env:
          working_dir: "https://github.com/ray-project/test_dag/archive/78b4a5da38796123d9f9ffff59bab2792a043e95.zip"
        deployments:
          - name: Adder
            num_replicas: 1
            user_config:
              increment: 3
            ray_actor_options:
              num_cpus: 0.1
          - name: Multiplier
            num_replicas: 1
            user_config:
              factor: 5
            ray_actor_options:
              num_cpus: 0.1
          - name: Router
            num_replicas: 1
  rayClusterConfig:
    rayVersion: '2.46.0' # should match the Ray version in the image of the containers
    ######################headGroupSpecs#################################
    # Ray head pod template.
    headGroupSpec:
      # The `rayStartParams` are used to configure the `ray start` command.
      # See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
      # See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
      rayStartParams: {}
      # pod template
      template:
        spec:
          containers:
            - name: ray-head
              image: rayproject/ray:2.46.0
              resources:
                limits:
                  cpu: 4
                  memory: 6Gi
                requests:
                  cpu: 2
                  memory: 4Gi
    workerGroupSpecs:
      # the pod replicas in this group typed worker
      - replicas: 1
        minReplicas: 1
        maxReplicas: 5
        # logical group name, for this called small-group, also can be functional
        groupName: small-group
        # The `rayStartParams` are used to configure the `ray start` command.
        # See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
        # See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
        rayStartParams: {}
        # pod template
        template:
          spec:
            containers:
              - name: ray-worker # must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name', or '123-abc')
                image: rayproject/ray:2.46.0
                resources:
                  limits:
                    cpu: "2"
                    memory: "4Gi"
                  requests:
                    cpu: "1"
                    memory: "2Gi"
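Assuming the manifest above is saved as rayservice-sample.yaml (a filename chosen for illustration), you can create the RayService and then verify that Kueue created and admitted a Workload for its RayCluster:

kubectl create -f rayservice-sample.yaml
kubectl -n default get workloads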
Note
The example above comes from here and only has the queue-name label added and the requests updated.