Cloud Functions

Functions

class nvcf.api.asset.AssetAPI(api_client)
delete(
asset_id: str,
starfleet_api_key: str | None = None,
) None

Delete a given asset, removing the ability to use it in future invocations.

Parameters:
  • asset_id -- A unique identifier for the asset.

  • starfleet_api_key -- An API key with access to manage assets.

Returns:

JSON Response of NVCF asset information.

Return type:

dict

info(
asset_id: str,
starfleet_api_key: str | None = None,
) dict

Get metadata about a given asset.

Parameters:
  • asset_id -- A unique identifier for the asset.

  • starfleet_api_key -- An API key with access to manage assets.

Returns:

JSON Response of NVCF asset information.

Return type:

dict

list(starfleet_api_key: str | None = None) dict

List assets available to the account.

Parameters:

starfleet_api_key -- An API key with access to manage assets.

Returns:

Keyed list of assets.

Return type:

dict

upload(
path: str,
description: str,
starfleet_api_key: str | None = None,
) dict

Upload an asset for use in future invocations.

Parameters:
  • path -- Path of the asset file to upload.

  • description -- Description of the asset.

  • starfleet_api_key -- An API key with access to manage assets.

Returns:

JSON Response of NVCF asset information.

Return type:

dict

class nvcf.api.function.FunctionAPI(api_client)
create(
name: str,
inference_url: str,
health_uri: str | None = None,
container_image: str | None = None,
helm_chart: str | None = None,
helm_chart_service: str | None = None,
models: list[str] | None = None,
function_id: str | None = None,
inference_port: int | None = None,
container_args: str | None = None,
api_body_format: str | None = None,
container_environment_variables: list[str] | None = None,
tags: list[str] | None = None,
resources: list[str] | None = None,
*,
function_type: str = 'DEFAULT',
health_expected_status_code: int | None = None,
health_port: int | None = None,
health_timeout: str | None = None,
health_protocol: str | None = None,
description: str | None = None,
secrets: list[str] | None = None,
json_secrets: list[tuple[str, bytes]] | None = None,
logs_telemetry_id: str | None = None,
metrics_telemetry_id: str | None = None,
traces_telemetry_id: str | None = None,
rate_limit_pattern: str | None = None,
rate_limit_exempt_nca_ids: list[str] | None = None,
rate_limit_sync_check: bool | None = None,
) DotDict

Create a function with the provided specification.

Parameters:
  • name -- Display name of the function.

  • inference_url -- Endpoint you wish to use to do invocations.

  • health_uri -- Health endpoint for inferencing.

  • container_image -- Container Image.

  • models -- NGC models. In form [override_name:]model

  • helm_chart -- Helm Chart URL.

  • helm_chart_service -- Helm chart service name; only necessary when a helm chart is specified.

  • function_id -- If provided, generate another version of the same function.

  • inference_port -- Optional port override which inference is forwarded to.

  • container_args -- Optional arguments to provide to the container.

  • api_body_format -- Optional body format to use.

  • container_environment_variables -- List of key pair values to pass as variables to container. In form ["key1:value1", "key2:value2"]

  • tags -- Optional list of tags to create the function with.

  • resources -- Optional list of resources.

Keyword Arguments:
  • function_type -- Used to indicate a streaming function, defaults to DEFAULT.

  • health_port -- Port number where the health listener is running.

  • health_protocol -- HTTP/gRPC protocol type for health endpoint. Choices: ["HTTP", "gRPC"].

  • health_timeout -- ISO 8601 duration string in PnDTnHnMn.nS format.

  • health_expected_status_code -- Expected return status code considered as successful.

  • description -- Optional function/version description.

  • secrets -- Optional secret key/value pairs. In form ["key1:value1", "key2:value2"]

  • json_secrets -- Optional secret key/value pairs. In form [("key", {"jsonkey": 1, "jsonkey2": {"nestedkey1": "nestedvalue"}})]

  • logs_telemetry_id -- UUID of telemetry log endpoint to map to function.

  • metrics_telemetry_id -- UUID of telemetry metrics endpoint to map to function.

  • traces_telemetry_id -- UUID of telemetry traces endpoint to map to function.

  • rate_limit_pattern -- Rate limit, format NUMBER-S|M|H|D, ex: 3-S.

  • rate_limit_exempt_nca_ids -- NCA IDs exempt from the rate limit.

  • rate_limit_sync_check -- Rate limit sync check.

Raises:
  • InvalidArgumentError -- Thrown if none of container image, models, or helm chart is provided.

  • ResourceNotFoundException -- If the image, model, or helm chart cannot be found.

Returns:

Function Response provided by NVCF

Return type:

DotDict
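As a sketch of the secrets and json_secrets argument shapes described above (the key names and values here are placeholders, not part of the SDK):

```python
import json

# Plain secrets are "key:value" strings (placeholder values):
secrets = ["API_TOKEN:abc123"]

# json_secrets entries are (key, bytes) pairs per the signature above,
# so a JSON document is serialized and encoded before being passed:
json_secrets = [
    ("config", json.dumps({"jsonkey": 1, "jsonkey2": {"nestedkey1": "nestedvalue"}}).encode()),
]

# clt.cloud_function.functions.create(..., secrets=secrets, json_secrets=json_secrets)
```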

delete(function_id: str, function_version_id: str)

Delete a function version id.

Parameters:
  • function_id -- Function's ID.

  • function_version_id -- Function's version ID.

Returns:

JSON Response of NVCF function information.

Return type:

DotDict

info(
function_id: str,
function_version_id: str,
) DotDict

Get information about a given function version.

Parameters:
  • function_id -- Function's ID.

  • function_version_id -- Function's version ID.

Returns:

JSON Response of NVCF function information.

Return type:

DotDict

invoke(
function_id: str,
payload: dict,
function_version_id: str | None = None,
starfleet_api_key: str | None = None,
asset_ids: list[str] | None = None,
output_zip_path: str | None = None,
polling_request_timeout: int | None = 300,
pending_request_timeout: int | None = 600,
pending_request_interval: float | None = 1.0,
) DotDict
Parameters:
  • function_id -- ID of NVCF Function being invoked.

  • payload -- JSON payload specific to the function you are invoking. The shape should adhere to your function's API spec.

  • starfleet_api_key -- Key with invocation access to the function.

  • function_version_id -- Optionally provide a version id to invoke a specific version of a function.

  • asset_ids -- Asset ids that are referenced in the payload.

  • output_zip_path -- If the output provides a zip file, this is the location to save it.

Raises:

NgcException -- Matching HTTP Response code if fails in any way.

Returns:

Dictionary corresponding to JSON response from function invoked.

invoke_grpc(
function_id: str,
starfleet_api_key: str,
function_request: Any,
grpc_stub_function: Callable,
function_version_id: str | None = None,
) Any
Parameters:
  • function_id -- ID of GRPC NVCF Function being invoked.

  • starfleet_api_key -- Key with invocation access to the function.

  • function_request -- GRPC Payload specific to the function you are invoking.

  • grpc_stub_function -- GRPC Stub function to invoke.

  • function_version_id -- Optionally provide a version id to invoke a specific version of a function.

Raises:

NgcException -- Matching HTTP Response code if fails in any way.

Returns:

GRPC Response of function invocation.

Return type:

Any

invoke_grpc_triton(
function_id: str,
function_request,
starfleet_api_key: str | None = None,
function_version_id: str | None = None,
)
Parameters:
  • function_id -- ID of Triton based GRPC NVCF Function being invoked.

  • function_request -- GRPC Payload specific to the function you are invoking.

  • starfleet_api_key -- Key with invocation access to the function.

  • function_version_id -- Optionally provide a version id to invoke a specific version of a function.

Returns:

GRPC Response of function invocation

Return type:

ModelInferResponse

invoke_stream(
function_id: str,
payload: dict,
starfleet_api_key: str | None = None,
function_version_id: str | None = None,
asset_ids: list[str] | None = None,
request_timeout: int | None = 300,
) Generator[bytes, None, None]
Parameters:
  • function_id -- ID of NVCF Function being invoked.

  • payload -- JSON payload specific to the function you are invoking. The shape should adhere to your function's API spec.

  • starfleet_api_key -- Key with invocation access to the function.

  • function_version_id -- Optionally provide a version id to invoke a specific version of a function.

Raises:

NgcException -- Matching HTTP Response code if fails in any way.

Returns:

Streaming response of function invocation.

Return type:

Generator[bytes, None, None]

invoke_stream_grpc(
function_id: str,
starfleet_api_key: str,
function_request: Any,
grpc_stub_function: Callable,
function_version_id: str | None = None,
) Any
Parameters:
  • function_id -- ID of GRPC NVCF Function being invoked.

  • starfleet_api_key -- Key with invocation access to the function.

  • function_request -- GRPC Payload specific to the function you are invoking.

  • grpc_stub_function -- GRPC Stub function to invoke.

  • function_version_id -- Optionally provide a version id to invoke a specific version of a function.

Raises:

NgcException -- Matching HTTP Response code if fails in any way.

Returns:

GRPC Response of function invocation.

Return type:

Any

invoke_stream_grpc_triton(
function_id: str,
function_request: Any,
starfleet_api_key: str | None = None,
function_version_id: str | None = None,
) list[Any]
Parameters:
  • function_id -- ID of Triton based GRPC NVCF Function being invoked.

  • function_request -- GRPC Payload specific to the function you are invoking.

  • starfleet_api_key -- Key with invocation access to the function.

  • function_version_id -- Optionally provide a version id to invoke a specific version of a function.

Returns:

Streaming response of function invocation

Return type:

ModelStreamInferResponse

list(
function_id: str | None = None,
name_pattern: str | None = None,
access_filter: list[str] | None = None,
) DotDict

List functions available to the organization currently set.

Parameters:
  • function_id -- Optional parameter to list only versions of a specific function. Defaults to None.

  • name_pattern -- Optional parameter to filter functions that contain this name. Supports wildcards.

  • access_filter -- Optional parameter to filter functions by their access: ["private", "public", "authorized"].

Returns:

Keyed List of Functions.

Return type:

dict
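The name_pattern wildcard can be illustrated with shell-style matching on the client side (the server applies the real filter and its exact pattern semantics may differ; the names below are made up):

```python
from fnmatch import fnmatch

# Hypothetical function names to illustrate the wildcard syntax:
names = ["peek-sdk-demo-sdxl", "prod-llm", "demo-echo"]
matches = [n for n in names if fnmatch(n, "*demo*")]  # names containing "demo"

# The equivalent server-side call would be:
# clt.cloud_function.functions.list(name_pattern="*demo*", access_filter=["private"])
```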

class nvcf.api.authorization.FunctionAuthorizationAPI(api_client: Client = None)
add(
function_id: str,
function_version_id: str | None = None,
nca_id: str | None = None,
) DotDict

Authorize additional NCA IDs to invoke this function/function version.

Parameters:
  • function_id -- Function's ID.

  • function_version_id -- Function's version ID.

  • nca_id -- NCA ID of party you wish to authorize.

Returns:

JSON Response of NVCF function information.

Return type:

dict

clear(
function_id: str,
function_version_id: str | None = None,
) DotDict

Delete all extra account authorizations for a given function/function version.

Parameters:
  • function_id -- Function's ID.

  • function_version_id -- Function's version ID.

Returns:

JSON Response of NVCF function information.

Return type:

dict

info(
function_id: str,
function_version_id: str | None = None,
) DotDict

Get account authorization about a given function/function version.

Parameters:
  • function_id -- Function's ID.

  • function_version_id -- Function's version ID.

Returns:

JSON Response of NVCF function information.

Return type:

dict

remove(
function_id: str,
function_version_id: str | None = None,
nca_id: str | None = None,
) DotDict

Remove authorization for clients to invoke this function/function version.

Parameters:
  • function_id -- Function's ID.

  • function_version_id -- Function's version ID.

  • nca_id -- NCA ID of the party whose authorization you wish to remove.

Returns:

JSON Response of NVCF function information.

Return type:

dict

class nvcf.api.deployment_spec.DeploymentSpecification(
backend: str,
gpu: str,
min_instances: int,
max_instances: int,
instance_type: str | None = None,
availability_zones: list[str] | None = None,
max_request_concurrency: int | None = None,
configuration: dict | None = None,
)

Represents a deployment specification for NVCF.

class nvcf.api.deploy.DeployAPI(api_client: Client = None)
create(
function_id: str,
function_version_id: str,
deployment_specifications: list[DeploymentSpecification] | None = None,
targeted_deployment_specifications: list[TargetedDeploymentSpecification] | None = None,
) DotDict

Create a deployment with a function id, version and a set of available deployment specifications.

delete(
function_id: str,
function_version_id: str,
*,
graceful: bool = False,
)

Delete a given deployment.

info(
function_id: str,
function_version_id: str,
) DotDict

Get information about a given function's deployment.

query_logs(
function_id: str,
function_version_id: str,
start_time: datetime | None = None,
end_time: datetime | None = None,
duration: timedelta | None = None,
) Iterator[dict]

Deployment logs.

Parameters:
  • function_id -- Id of function logs are pulled from.

  • duration -- Specifies the duration of time, either after begin-time or before end-time. Format: [nD][nH][nM][nS]. Default: 1 day, doesn't respect decimal measurements.

  • start_time -- Specifies the start time for querying logs. Default: None.

  • end_time -- Specifies the end time for querying logs. Default: Now.

  • function_version_id -- Optional version to specify for function id.

Returns:

Iterator used to receive logs one by one.

Return type:

Iterator
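A sketch of building the time-window arguments for query_logs; the function_id and function_version_id values are assumed to come from an existing deployment:

```python
from datetime import datetime, timedelta, timezone

# Query the last six hours of deployment logs:
end_time = datetime.now(timezone.utc)
start_time = end_time - timedelta(hours=6)

# for entry in clt.cloud_function.functions.deployments.query_logs(
#     function_id=function_id,
#     function_version_id=function_version_id,
#     start_time=start_time,
#     end_time=end_time,
# ):
#     print(entry)
```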

update(
function_id: str,
function_version_id: str,
deployment_specifications: list[DeploymentSpecification] | None = None,
targeted_deployment_specifications: list[TargetedDeploymentSpecification] | None = None,
) DotDict

Update a given deployment.

class nvcf.api.task.TaskAPI(api_client: Client = None)
cancel(task_id)

Cancel a task.

Parameters:

task_id -- The task to cancel

create(
name: str,
container_image: str | None = None,
container_args: str | None = None,
container_environment_variables: list[str] | None = None,
gpu_specification: GPUSpecification | None = None,
models: list[str] | None = None,
resources: list[str] | None = None,
tags: list[str] | None = None,
description: str | None = None,
max_runtime_duration: Duration | None = None,
max_queued_duration: Duration | None = None,
termination_grace_period_duration: Duration | None = None,
result_handling_strategy: str = 'UPLOAD',
result_location: list[str] | None = None,
secrets: list[str] | None = None,
helm_chart: str | None = None,
logs_telemetry_id: str | None = None,
metrics_telemetry_id: str | None = None,
traces_telemetry_id: str | None = None,
)

Create a task with the provided specification.

Parameters:
  • name -- Display name of the task.

  • container_image -- Container image.

  • container_args -- Container args.

  • container_environment_variables -- Container environment variables.

  • gpu_specification -- GPU specifications.

  • models -- NGC models.

  • resources -- NGC resources.

  • tags -- Optional list of tags to create the task with.

  • max_runtime_duration -- Maximum runtime duration for task. Defaults to forever.

  • max_queued_duration -- Maximum queued duration for task. Defaults to 72 hours.

  • termination_grace_period_duration -- Grace period after termination. Defaults to 1 hour.

  • description -- Description of the task.

  • result_handling_strategy -- How results should be handled.

  • result_location -- Where results should be stored. Required if result_handling_strategy is UPLOAD.

  • secrets -- Optional secret key/value pairs. Form: ["key1:value1", "key2:value2"].

  • helm_chart -- Helm Chart URL.

  • logs_telemetry_id -- UUID of telemetry log endpoint to map to task.

  • metrics_telemetry_id -- UUID of telemetry metrics endpoint to map to task.

  • traces_telemetry_id -- UUID of telemetry traces endpoint to map to task.

Raises:
  • InvalidArgumentError -- If result handling strategy is set to upload and required fields aren't provided.

  • ResourceNotFound -- If the image or resource cannot be found.

Returns:

Task Response provided by NVCF

Return type:

DotDict

delete(task_id)

Delete a task.

Parameters:

task_id -- The task to delete.

events(
task_id: str,
limit: int = 100,
) Iterable[DotDict]

Get a list of the task's events.

Returns:

Iterable of DotDict of task events.

Return type:

Iterable[DotDict]

info(task_id: str) DotDict

Get information about a given task.

Parameters:

task_id -- The task to get information about.

Returns:

DotDict of task information.

Return type:

dict

list(limit: int = 100) list[DotDict]

List tasks available to the organization currently set.

Returns:

A list of task DotDicts.

logs(
task_id: str,
limit: int = 100,
start_time: datetime | None = None,
end_time: datetime | None = None,
duration: timedelta | None = None,
) Iterable[DotDict]

Task deployment logs.

Returns:

Iterable of DotDict of task logs.

Return type:

Iterable[DotDict]

results(
task_id: str,
limit: int = 100,
) Iterable[DotDict]

Get a list of the task's results.

Returns:

Iterable of DotDict of task results.

Return type:

Iterable[DotDict]

Examples

Import the NGCSDK

from ngcsdk import Client
from nvcf.api.deployment_spec import TargetedDeploymentSpecification
from nvcf.api.invocation_handler import HTTPSInvocationHandler

Create and configure your client - put in your api_key, org_name, team name

key = "nvapi-***"
clt = Client()
clt.configure(key)

List your current functions

functions = clt.cloud_function.functions.list(access_filter=["private"])["functions"]
functions[0]

Create a new function with existing NGC Resources

genslm_function = clt.cloud_function.functions.create(
    name="peek-sdk-demo-sdxl",
    inference_url="echo",
    container_image="stg.nvcr.io/cv5p43s0htqh/echo:latest",
    models=["nvidia/genslm:latest"],
    container_environment_variables=["key:value", "CUSTOM:ENV_VAR"],
    inference_port=8000,
    container_args="bash exec",
)

Delete Function

clt.cloud_function.functions.delete(function_id=function_id, function_version_id=function_version_id)

Deployment

List Available GPUs

clt.cloud_function.gpus.list().keys()
l40s_instance = clt.cloud_function.gpus.info("L40S").L40S[0]
l40s_instance.name, l40s_instance.currentInstances

Create a deployment

function_id, function_version_id = genslm_function.get("function").get("id"), genslm_function.get("function").get("versionId")
dep_specs = [TargetedDeploymentSpecification(gpu="L40S", instance_type=l40s_instance.name, min_instances=1, max_instances=1)]
genslm_deployment = clt.cloud_function.functions.deployments.create(function_id=function_id, function_version_id=function_version_id, targeted_deployment_specifications=dep_specs)

Undeploy

clt.cloud_function.functions.deployments.delete(function_id=function_id, function_version_id=function_version_id)

Function Authorizations

clt.cloud_function.functions.authorizations.add(function_id=function_id, function_version_id=function_version_id, nca_id="")
clt.cloud_function.functions.authorizations.clear(function_id=function_id, function_version_id=function_version_id)

Function Invocations

# You can always provide a new key, or you can use the one you provided in the client configuration
SAK = "nvapi-XXX"

Asset management

clt.cloud_function.assets.list(SAK)
resp = clt.cloud_function.assets.upload("/path/to/image.png", "description", SAK)
clt.cloud_function.assets.delete(resp.get("asset_id"), SAK)

HTTPS Blocking Invocation

payload = {
  "messages": [
    {
      "role": "user",
      "content": "Give me some python code to read a file named temp.txt"
    }
  ],
  "temperature": 0.2,
  "top_p": 0.7,
  "max_tokens": 512,
  "stream": False
}
clt.cloud_function.functions.invoke(function_id="eb1100de-60bf-4e9a-8617-b7d4652e0c37", payload=payload, starfleet_api_key=SAK)

Multiple Invocations

To reuse an HTTP session you can use the context manager.

with HTTPSInvocationHandler(starfleet_api_key=SAK) as invocation_handler:
    # Non-stream
    invocation_handler.make_invocation_request("0acb2d4a-47ed-45b8-935c-d93cba5b7485", payload)
    # Stream
    for line in invocation_handler.make_streaming_invocation_request(
        function_id="845ba840-8d13-4151-9682-0b4cdbc93d7b", data=payload
    ):
        print(line.decode("utf-8"))

Using assets in invocations

r = clt.cloud_function.assets.upload("/path/to/image.png", "description", SAK)
asset_id = r["asset_id"]

payload = {
    "inputs": [
        {
            "name": "image",
            "shape": [1],
            "datatype": "BYTES",
            "data": [f"{asset_id}"],
        },
    ],
    "outputs": [{"name": "image_generated", "datatype": "BYTES", "shape": [1]}],
}

resp = clt.cloud_function.functions.invoke(function_id, payload, starfleet_api_key=SAK, asset_ids=[asset_id], output_zip_path="/path/to/output.zip")

HTTPS Streaming Invocation

payload = {
  "messages": [
    {
      "role": "user",
      "content": "Give me some python code to read a file named temp.txt"
    }
  ],
  "temperature": 0.2,
  "top_p": 0.7,
  "max_tokens": 512,
  "stream": True
}
invoke_request = clt.cloud_function.functions.invoke_stream(function_id="eb1100de-60bf-4e9a-8617-b7d4652e0c37", payload=payload, starfleet_api_key=SAK)
for chunk in invoke_request:
    print(chunk.decode('utf-8'))

GRPC Invocation

MODEL_INFER_REQUEST = ModelInferRequest(
    model_name="simple_identity",
    inputs=[
        ModelInferRequest.InferInputTensor(
            name="INPUT0",
            datatype="BYTES",
            shape=[1, 1],
            contents=InferTensorContents(bytes_contents=[b"hello world"]),
        ),
    ],
    outputs=[ModelInferRequest.InferRequestedOutputTensor(name="OUTPUT0")],
)

MODEL_STREAM_REQUEST = ModelInferRequest(
    model_name="DialoGPT",
    inputs=[
        ModelInferRequest.InferInputTensor(
            name="new_inputs",
            datatype="BYTES",
            shape=[1, 1],
            contents=InferTensorContents(bytes_contents=[b"hello world"]),
        ),
    ],
    outputs=[ModelInferRequest.InferRequestedOutputTensor(name="response")],
)
clt.cloud_function.functions.invoke_grpc_triton(function_id="7dc17808-b067-41dd-bf32-78d242379b6f", function_request=MODEL_INFER_REQUEST, starfleet_api_key=SAK)

Multiple Invocations

To reuse an HTTP session you can use the context manager.

with TritonGRPCInvocationHandler(SAK) as invocation_handler:
    # Blocking
    invocation_handler.make_invocation_request(
        function_id=function_id,
        function_request=MODEL_INFER_REQUEST,
        function_version_id=function_version_id,
    )
    # Streaming
    for resp in clt.cloud_function.functions.invoke_stream_grpc_triton(function_id, MODEL_STREAM_REQUEST, SAK, function_version_id):
        print(resp)

Tasks

from isodate import parse_duration
from nvcf.api.deployment_spec import GPUSpecification
clt.cloud_function.tasks.list()
max_runtime_dur = parse_duration("PT1H")
output = clt.cloud_function.tasks.create(
    "sdk-example",
    container_image="stg.nvcr.io/ax3ysqem02xw/tasks_sample:0.0.4",
    gpu_specification=GPUSpecification(gpu="T10", instance_type="g6.full", backend="GFN"),
    max_runtime_duration=max_runtime_dur,
    result_handling_strategy="NONE",
)
clt.cloud_function.tasks.info(output.task.id)
for res in clt.cloud_function.tasks.results(output.task.id):
    print(res)
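Task execution is asynchronous, so a common pattern is polling info() until the task finishes. A minimal sketch, assuming the info() response exposes a task.status field; the terminal state names below are illustrative, not taken from the SDK docs:

```python
import time

# Illustrative terminal states; consult the status values your tasks
# actually return for the authoritative set:
TERMINAL_STATES = {"COMPLETED", "CANCELED", "ERRORED"}

def wait_for_task(tasks_api, task_id, interval=10.0):
    """Poll tasks_api.info(task_id) until the task reaches a terminal state."""
    while True:
        status = tasks_api.info(task_id).task.status
        if status in TERMINAL_STATES:
            return status
        time.sleep(interval)

# status = wait_for_task(clt.cloud_function.tasks, output.task.id)
```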