codeflare_sdk.common.utils package

Submodules

codeflare_sdk.common.utils.constants module

codeflare_sdk.common.utils.constants.RAY_VERSION = '2.52.1'

The following tags define the default runtime image for the Ray cluster:

  • For Python 3.11: ray:2.52.1-py311-cu121

  • For Python 3.12: ray:2.52.1-py312-cu128

codeflare_sdk.common.utils.demos module

codeflare_sdk.common.utils.demos.copy_demo_nbs(dir: str = './demo-notebooks', overwrite: bool = False)[source]

Copy the demo notebooks from the package to the current working directory

overwrite=True will overwrite any files that exactly match files written by copy_demo_nbs in the target directory. Any files that exist in the directory that don’t match these values will remain untouched.

Args:
dir (str):

The directory to copy the demo notebooks to. Defaults to “./demo-notebooks”.

overwrite (bool):

Whether to overwrite files in the directory if it already exists. Defaults to False.

Raises:
FileExistsError:

If the directory already exists.
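
The overwrite rule described above can be illustrated with a small stand-alone sketch. It is not the SDK's implementation: `copy_tree_selectively` and the file names are hypothetical, and treating an existing directory as an error unless `overwrite=True` is an assumption drawn from the Raises note.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def copy_tree_selectively(src: Path, dst: Path, overwrite: bool = False) -> List[str]:
    """Copy every file under src into dst, touching only files that src provides."""
    if dst.exists() and not overwrite:
        raise FileExistsError(f"Directory {dst} already exists")
    copied = []
    for source_file in sorted(p for p in src.rglob("*") if p.is_file()):
        relative = source_file.relative_to(src)
        target = dst / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source_file, target)  # replaces a same-named file only
        copied.append(relative.as_posix())
    return copied


def demo() -> dict:
    """Show that unrelated files in the target directory are left alone."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "package-notebooks"), Path(tmp, "demo-notebooks")
        src.mkdir()
        (src / "example.ipynb").write_text("{}")  # hypothetical notebook name
        dst.mkdir()
        (dst / "example.ipynb").write_text("edited")
        (dst / "my_notes.txt").write_text("keep me")
        copied = copy_tree_selectively(src, dst, overwrite=True)
        return {
            "copied": copied,
            "notebook": (dst / "example.ipynb").read_text(),
            "notes": (dst / "my_notes.txt").read_text(),
        }
```

In this sketch the same-named notebook is replaced, while `my_notes.txt` survives because the source tree has no file of that name.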

codeflare_sdk.common.utils.generate_cert module

codeflare_sdk.common.utils.generate_cert.cleanup_expired_certificates(dry_run=True)[source]

Removes TLS certificates that have expired.

Args:
dry_run (bool):

If True (default), only lists expired certificates without deleting them. Set to False to actually delete expired certificates.

Returns:

list: List of certificate paths that were (or would be) removed.

Example:
>>> # Check what would be deleted
>>> expired = cleanup_expired_certificates(dry_run=True)
>>> print(f"Found {len(expired)} expired certificates")
>>>
>>> # Actually delete them
>>> cleanup_expired_certificates(dry_run=False)

codeflare_sdk.common.utils.generate_cert.cleanup_old_certificates(days=30, dry_run=True)[source]

Removes TLS certificates older than a specified number of days.

Args:
days (int):

Remove certificates created more than this many days ago. Default is 30.

dry_run (bool):

If True (default), only lists old certificates without deleting them. Set to False to actually delete old certificates.

Returns:

list: List of certificate paths that were (or would be) removed.

Example:
>>> # Check certificates older than 90 days
>>> old = cleanup_old_certificates(days=90, dry_run=True)
>>> print(f"Found {len(old)} certificates older than 90 days")
>>>
>>> # Delete certificates older than 30 days
>>> cleanup_old_certificates(days=30, dry_run=False)
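
The dry-run pattern can be sketched without the SDK. The helper below is hypothetical and makes two assumptions: certificates live in one subdirectory per cluster, and a directory's modification time stands in for its creation time.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path
from typing import List

SECONDS_PER_DAY = 86400


def remove_old_dirs(base: Path, days: int = 30, dry_run: bool = True) -> List[str]:
    """Return subdirectories of base older than `days`; delete them unless dry_run."""
    cutoff = time.time() - days * SECONDS_PER_DAY
    selected = []
    for entry in sorted(base.iterdir()):
        if entry.is_dir() and entry.stat().st_mtime < cutoff:
            selected.append(str(entry))
            if not dry_run:
                shutil.rmtree(entry)
    return selected


def demo() -> tuple:
    """Backdate one directory, then compare a dry run with a real run."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        old, new = base / "old-cluster-default", base / "new-cluster-default"
        old.mkdir()
        new.mkdir()
        stale = time.time() - 90 * SECONDS_PER_DAY
        os.utime(old, (stale, stale))  # pretend the directory is 90 days old
        listed = remove_old_dirs(base, days=30, dry_run=True)
        survived_dry_run = old.exists()
        removed = remove_old_dirs(base, days=30, dry_run=False)
        return (
            [Path(p).name for p in listed],
            survived_dry_run,
            [Path(p).name for p in removed],
            old.exists(),
            new.exists(),
        )
```

Both calls report the same paths; only the second one deletes anything, which is why running with `dry_run=True` first is a cheap safety check.
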
codeflare_sdk.common.utils.generate_cert.cleanup_tls_cert(cluster_name, namespace)[source]

Removes TLS certificates and keys for a specific Ray cluster.

This should be called when a cluster is deleted to clean up sensitive key material.

Args:
cluster_name (str):

The name of the Ray cluster.

namespace (str):

The Kubernetes namespace where the Ray cluster is located.

Returns:

bool: True if certificates were removed, False if they didn’t exist.

Example:
>>> cleanup_tls_cert("my-cluster", "default")
True
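
The True/False return contract can be sketched as follows. `remove_cert_dir` is a hypothetical stand-in, and the `{cluster_name}-{namespace}` directory layout is assumed from the storage description given for generate_tls_cert.

```python
import shutil
import tempfile
from pathlib import Path


def remove_cert_dir(base: Path, cluster_name: str, namespace: str) -> bool:
    """Delete the per-cluster certificate directory; report whether it existed."""
    cert_dir = base / f"{cluster_name}-{namespace}"
    if not cert_dir.exists():
        return False
    shutil.rmtree(cert_dir)
    return True


def demo() -> tuple:
    """The first call removes the directory, the second finds nothing to remove."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        cert_dir = base / "my-cluster-default"
        cert_dir.mkdir()
        (cert_dir / "tls.key").write_text("placeholder")
        first = remove_cert_dir(base, "my-cluster", "default")
        second = remove_cert_dir(base, "my-cluster", "default")
        return first, second
```
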
codeflare_sdk.common.utils.generate_cert.export_env(cluster_name, namespace)[source]

Sets environment variables to configure TLS for a Ray cluster.

Args:
cluster_name (str):

The name of the Ray cluster.

namespace (str):

The Kubernetes namespace where the Ray cluster is located.

Environment Variables Set:
  • RAY_USE_TLS: Enables TLS for Ray.

  • RAY_TLS_SERVER_CERT: Path to the TLS server certificate.

  • RAY_TLS_SERVER_KEY: Path to the TLS server private key.

  • RAY_TLS_CA_CERT: Path to the CA certificate.
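
A minimal sketch of how these four variables might be populated. The helper names are hypothetical, the value "1" for RAY_USE_TLS is an assumption, and the file names are taken from the files that generate_tls_cert is documented to create.

```python
import os
from pathlib import Path
from typing import Dict


def tls_env(cert_dir: Path) -> Dict[str, str]:
    """Build the Ray TLS environment for certificates stored in cert_dir."""
    return {
        "RAY_USE_TLS": "1",  # assumed value for "enabled"
        "RAY_TLS_SERVER_CERT": str(cert_dir / "tls.crt"),
        "RAY_TLS_SERVER_KEY": str(cert_dir / "tls.key"),
        "RAY_TLS_CA_CERT": str(cert_dir / "ca.crt"),
    }


def apply_tls_env(cert_dir: Path) -> None:
    """Export the variables into the current process environment."""
    os.environ.update(tls_env(cert_dir))
```

Building the mapping separately from applying it keeps the logic easy to inspect before the process environment is changed.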

codeflare_sdk.common.utils.generate_cert.generate_ca_cert(days: int = 30)[source]

Generates a self-signed CA certificate and private key, encoded in base64 format.

The certificate includes RFC 5280 compliant extensions:

  • BasicConstraints (CA:TRUE)

  • KeyUsage (keyCertSign, cRLSign)

  • SubjectKeyIdentifier

Similar to:

openssl req -x509 -nodes -newkey rsa:3072 -keyout ca.key -days 1826 -out ca.crt -subj '/CN=root-ca'

Args:
days (int):

The number of days for which the CA certificate will be valid. Default is 30.

Returns:
Tuple[str, str]:

A tuple containing the base64-encoded private key and CA certificate.
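
Because both values are returned base64-encoded, a caller has to decode them before handing them to tools that expect raw key or certificate bytes. The sketch below uses a hypothetical helper and placeholder inputs so that it runs without the SDK; what the decoded bytes contain (for example PEM text) is not stated above and is left open here.

```python
import base64
from typing import Tuple


def decode_ca_pair(key_b64: str, cert_b64: str) -> Tuple[bytes, bytes]:
    """Decode a (private key, certificate) pair of base64 strings into raw bytes."""
    return base64.b64decode(key_b64), base64.b64decode(cert_b64)


# Stand-in values; real input would be the tuple returned by generate_ca_cert().
placeholder_key = base64.b64encode(b"placeholder key bytes").decode()
placeholder_cert = base64.b64encode(b"placeholder certificate bytes").decode()

key_bytes, cert_bytes = decode_ca_pair(placeholder_key, placeholder_cert)
```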

codeflare_sdk.common.utils.generate_cert.generate_tls_cert(cluster_name, namespace, days=30, force_regenerate=False)[source]

Generates a TLS certificate and key for a Ray cluster, saving them locally along with the CA certificate.

The certificate includes RFC 5280 compliant extensions:

  • BasicConstraints (CA:FALSE)

  • KeyUsage (digitalSignature, keyEncipherment)

  • ExtendedKeyUsage (serverAuth, clientAuth for mTLS)

  • SubjectAlternativeName (localhost, 127.0.0.1, ::1)

  • SubjectKeyIdentifier

  • AuthorityKeyIdentifier

Files are created with restricted permissions (0600) for security.

Certificates are stored in a user-private directory:

  • Default: ~/.local/share/codeflare/tls/{cluster_name}-{namespace}/

  • Override via the CODEFLARE_TLS_DIR environment variable

Args:
cluster_name (str):

The name of the Ray cluster.

namespace (str):

The Kubernetes namespace where the Ray cluster is located.

days (int):

The number of days for which the TLS certificate will be valid. Default is 30.

force_regenerate (bool):

If True, regenerates certificates even if they already exist. Useful when the server CA secret has been rotated and existing certificates are no longer valid. Default is False.

Files Created:
  • ca.crt: The CA certificate (permissions: 0600).

  • tls.crt: The TLS certificate signed by the CA (permissions: 0600).

  • tls.key: The private key for the TLS certificate (permissions: 0600).
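
The storage rules above (per-cluster directory, optional override, 0600 files) can be sketched with the standard library. The helper names are hypothetical, and the sketch assumes that CODEFLARE_TLS_DIR replaces the base directory rather than the full per-cluster path. Permission bits behave this way on POSIX systems only.

```python
import os
import stat
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_tls_dir(cluster_name: str, namespace: str,
                    env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the certificate directory, honouring CODEFLARE_TLS_DIR if set."""
    env = os.environ if env is None else env
    override = env.get("CODEFLARE_TLS_DIR")
    root = Path(override) if override else Path.home() / ".local" / "share" / "codeflare" / "tls"
    return root / f"{cluster_name}-{namespace}"


def write_private_file(path: Path, data: bytes) -> None:
    """Write data so that only the owner can read or write the file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.chmod(path, 0o600)  # enforce the mode even if the file already existed


def demo() -> tuple:
    """Write a placeholder key under an overridden base directory."""
    with tempfile.TemporaryDirectory() as tmp:
        cert_dir = resolve_tls_dir("my-cluster", "default", {"CODEFLARE_TLS_DIR": tmp})
        key_path = cert_dir / "tls.key"
        write_private_file(key_path, b"placeholder key material")
        mode = stat.S_IMODE(key_path.stat().st_mode)
        return cert_dir.name, oct(mode)
```

Creating the file with `os.open` and an explicit mode avoids a window in which the key would be readable by other users.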

Raises:
Exception:

If an error occurs while retrieving the CA secret.

Example:
>>> # Normal generation
>>> generate_tls_cert("my-cluster", "default")
>>>
>>> # Force regeneration if the CA was rotated
>>> generate_tls_cert("my-cluster", "default", force_regenerate=True)

codeflare_sdk.common.utils.generate_cert.get_secret_name(cluster_name, namespace, api_instance)[source]

Retrieves the name of the Kubernetes secret containing the CA certificate for the given Ray cluster.

Args:
cluster_name (str):

The name of the Ray cluster.

namespace (str):

The Kubernetes namespace where the Ray cluster is located.

api_instance (client.CoreV1Api):

An instance of the Kubernetes CoreV1Api.

Returns:
str:

The name of the Kubernetes secret containing the CA certificate.

Raises:
KeyError:

If no secret matching the cluster name is found.
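
The lookup-or-KeyError behaviour can be sketched over a plain list of secret names. This is an illustration only: `find_ca_secret` is hypothetical, the `-ca-secret-` naming pattern is an assumption, and a real call would obtain the names from the Kubernetes API through `api_instance`.

```python
from typing import List


def find_ca_secret(secret_names: List[str], cluster_name: str) -> str:
    """Return the first secret name that looks like the cluster's CA secret."""
    marker = f"{cluster_name}-ca-secret-"  # hypothetical naming pattern
    for name in secret_names:
        if marker in name:
            return name
    raise KeyError(f"No CA secret found for cluster {cluster_name!r}")


def lookup_fails(secret_names: List[str], cluster_name: str) -> bool:
    """Return True when the lookup raises KeyError."""
    try:
        find_ca_secret(secret_names, cluster_name)
    except KeyError:
        return True
    return False
```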

codeflare_sdk.common.utils.generate_cert.list_tls_certificates()[source]

Lists all TLS certificate directories and their details.

Returns:
list: List of dictionaries containing certificate information:
  • cluster_name: Name of the cluster

  • namespace: Kubernetes namespace

  • path: Full path to certificate directory

  • created: Creation time of the directory

  • size: Total size of certificates in bytes

  • cert_expiry: Expiration date of tls.crt (if readable)

Example:
>>> certs = list_tls_certificates()
>>> for cert in certs:
...     print(f"{cert['cluster_name']}/{cert['namespace']}: expires {cert['cert_expiry']}")
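
The returned dictionaries lend themselves to simple aggregation. The sketch below sums the `size` field per namespace; the helper and the sample records are hypothetical stand-ins shaped like the documented keys.

```python
from collections import defaultdict
from typing import Dict, List


def size_by_namespace(certs: List[dict]) -> Dict[str, int]:
    """Sum the 'size' field of each certificate record per namespace."""
    totals: Dict[str, int] = defaultdict(int)
    for cert in certs:
        totals[cert["namespace"]] += cert["size"]
    return dict(totals)


# Hypothetical records using a subset of the documented keys.
sample_certs = [
    {"cluster_name": "alpha", "namespace": "default", "size": 4096},
    {"cluster_name": "beta", "namespace": "default", "size": 2048},
    {"cluster_name": "gamma", "namespace": "research", "size": 1024},
]
totals = size_by_namespace(sample_certs)
```
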
codeflare_sdk.common.utils.generate_cert.refresh_tls_cert(cluster_name, namespace, days=30)[source]

Refreshes TLS certificates by removing old ones and generating new ones.

This is useful when the server CA secret has been rotated and existing client certificates are no longer valid.

Args:
cluster_name (str):

The name of the Ray cluster.

namespace (str):

The Kubernetes namespace where the Ray cluster is located.

days (int):

The number of days for which the new TLS certificate will be valid. Default is 30.

Returns:

bool: True if certificates were successfully refreshed.

Example:
>>> # Server CA was rotated, refresh client certificates
>>> refresh_tls_cert("my-cluster", "default")
>>> export_env("my-cluster", "default")
>>> # Now you can reconnect with fresh certificates

codeflare_sdk.common.utils.k8s_utils module

Kubernetes utility functions for the CodeFlare SDK.

codeflare_sdk.common.utils.k8s_utils.get_current_namespace()[source]

Retrieves the current Kubernetes namespace.

Returns:
str:

The current namespace or None if not found.
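
One common way to discover the namespace from inside a pod is to read the service account file that Kubernetes mounts. The sketch below shows that approach with a None fallback; it is an assumption about the kind of lookup involved, not a description of how this function is implemented, and `read_namespace` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Standard in-cluster location of the namespace file.
SERVICE_ACCOUNT_NAMESPACE = Path("/var/run/secrets/kubernetes.io/serviceaccount/namespace")


def read_namespace(path: Path = SERVICE_ACCOUNT_NAMESPACE) -> Optional[str]:
    """Return the namespace stored at path, or None if it cannot be read."""
    try:
        value = path.read_text().strip()
    except OSError:
        return None
    return value or None


def demo() -> tuple:
    """Read a namespace from a temporary file and from a missing file."""
    with tempfile.TemporaryDirectory() as tmp:
        ns_file = Path(tmp) / "namespace"
        ns_file.write_text("demo-namespace\n")
        return read_namespace(ns_file), read_namespace(Path(tmp) / "missing")
```

Since the documented return value may be None, callers should check for it before using the result as a namespace string.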

codeflare_sdk.common.utils.utils module

codeflare_sdk.common.utils.utils.get_ray_image_for_python_version(python_version=None, warn_on_unsupported=True)[source]

Get the appropriate Ray image for a given Python version. If no version is provided, the current runtime Python version is used. This avoids hard-coding image versions in tests.

Args:
python_version (str):

Python version string (e.g. “3.11”). If None, detects the current version.

warn_on_unsupported (bool):

If True, warns and returns None for unsupported versions. If False, silently falls back to the Python 3.12 image.
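
The selection and fallback rules can be sketched as a small lookup. `pick_image` is a hypothetical helper; the tags come from the constants module description, and the registry prefix is omitted because it is not given there.

```python
import sys
import warnings
from typing import Optional

# Tags as listed for the constants module; no registry prefix is assumed.
IMAGE_TAGS = {
    "3.11": "ray:2.52.1-py311-cu121",
    "3.12": "ray:2.52.1-py312-cu128",
}


def pick_image(python_version: Optional[str] = None,
               warn_on_unsupported: bool = True) -> Optional[str]:
    """Return the image tag for a Python version, following the documented fallback."""
    if python_version is None:
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    if python_version in IMAGE_TAGS:
        return IMAGE_TAGS[python_version]
    if warn_on_unsupported:
        warnings.warn(f"No Ray image is defined for Python {python_version}")
        return None
    return IMAGE_TAGS["3.12"]  # silent fallback to the Python 3.12 image
```

Note the asymmetry: with `warn_on_unsupported=True` the caller must handle None, while the silent mode always yields a usable image.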

codeflare_sdk.common.utils.utils.update_image(image) → str[source]

update_image() sets the image config parameter to a preset image based on the Python version when no image is specified. It delegates to the centralized function in utils.py.

codeflare_sdk.common.utils.validation module

Validation utilities for the CodeFlare SDK.

This module contains validation functions used across the SDK for ensuring configuration compatibility and correctness.

codeflare_sdk.common.utils.validation.extract_ray_version_from_image(image_name: str) → str | None[source]

Extract Ray version from a container image name.

Supports various image naming patterns:

  • quay.io/modh/ray:2.47.1-py311-cu121

  • ray:2.47.1

  • some-registry/ray:2.47.1-py311

  • quay.io/modh/ray@sha256:… (falls back to None)

Args:

image_name: The container image name/tag

Returns:

The extracted Ray version, or None if not found
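
One way to handle the patterns listed above is to look at the image tag and match a leading `major.minor.patch` version. The sketch below is a hypothetical re-implementation for illustration; the SDK's actual parsing rules may differ.

```python
import re
from typing import Optional


def extract_version(image_name: str) -> Optional[str]:
    """Return the Ray version found at the start of the image tag, if any."""
    if "@" in image_name:
        return None  # digest references carry no version tag
    _, separator, tag = image_name.rpartition(":")
    if not separator or "/" in tag:
        return None  # no tag present; a colon before a slash is a registry port
    match = re.match(r"(\d+\.\d+\.\d+)", tag)
    return match.group(1) if match else None
```

Checking for a slash in the candidate tag keeps a registry port such as `localhost:5000/ray` from being mistaken for a version.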

codeflare_sdk.common.utils.validation.validate_ray_version_compatibility(image_name: str, sdk_ray_version: str = '2.52.1') → Tuple[bool, bool, str][source]

Validate that the Ray version in the runtime image matches the SDK’s Ray version.

Args:

image_name: The container image name/tag

sdk_ray_version: The Ray version used by the CodeFlare SDK

Returns:
tuple: (is_compatible, is_warning, message)
  • is_compatible: True if versions match or cannot be determined, False if mismatch

  • is_warning: True if this is a warning (non-fatal), False otherwise

  • message: Descriptive message about the validation result
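
The three-part result can be read as a small decision table: a match is compatible, an undeterminable version is compatible but flagged as a warning, and a mismatch is incompatible. The sketch below encodes that reading with hypothetical helpers and message strings; it is not the SDK's code.

```python
import re
from typing import Optional, Tuple


def _tag_version(image_name: str) -> Optional[str]:
    """Find a major.minor.patch version directly after a colon, if present."""
    match = re.search(r":(\d+\.\d+\.\d+)", image_name)
    return match.group(1) if match else None


def check_versions(image_name: str, sdk_ray_version: str = "2.52.1") -> Tuple[bool, bool, str]:
    """Return (is_compatible, is_warning, message) for an image and SDK version."""
    image_version = _tag_version(image_name)
    if image_version is None:
        # Unknown version: treated as compatible, but surfaced as a warning.
        return True, True, f"Cannot determine Ray version from image '{image_name}'"
    if image_version != sdk_ray_version:
        return False, False, (
            f"Ray version mismatch: image has {image_version}, SDK uses {sdk_ray_version}"
        )
    return True, False, "Ray versions match"
```

A caller would typically raise on `is_compatible == False`, log the message when `is_warning` is True, and otherwise proceed silently.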

Module contents

Common utilities for the CodeFlare SDK.

codeflare_sdk.common.utils.get_current_namespace()[source]

Retrieves the current Kubernetes namespace.

Returns:
str:

The current namespace or None if not found.