Cloudfunction¶
Functions¶
- class nvcf.api.asset.AssetAPI(api_client)¶
- delete(
- asset_id: str,
- starfleet_api_key: str | None = None,
Delete a given asset, removing the ability to use it in future invocations.
- Parameters:
asset_id -- A unique identifier for the asset.
starfleet_api_key -- An API key with access to manage assets.
- Returns:
JSON Response of NVCF asset information.
- Return type:
dict
- info(
- asset_id: str,
- starfleet_api_key: str | None = None,
Get metadata about a given asset.
- Parameters:
asset_id -- A unique identifier for the asset.
starfleet_api_key -- An API key with access to manage assets.
- Returns:
JSON Response of NVCF asset information.
- Return type:
dict
- list(starfleet_api_key: str | None = None) dict ¶
List assets available to the account.
- Parameters:
starfleet_api_key -- An API key with access to manage assets.
- Returns:
Keyed List of Assets.
- Return type:
dict
- upload(
- path: str,
- description: str,
- starfleet_api_key: str | None = None,
Upload a local file as an asset, together with a description.
- Parameters:
path -- Path to the local file to upload.
description -- Description of the asset.
starfleet_api_key -- An API key with access to manage assets.
- Returns:
JSON Response of NVCF asset information.
- Return type:
dict
- class nvcf.api.function.FunctionAPI(api_client)¶
- create(
- name: str,
- inference_url: str,
- health_uri: str | None = None,
- container_image: str | None = None,
- helm_chart: str | None = None,
- helm_chart_service: str | None = None,
- models: list[str] | None = None,
- function_id: str | None = None,
- inference_port: int | None = None,
- container_args: str | None = None,
- api_body_format: str | None = None,
- container_environment_variables: list[str] | None = None,
- tags: list[str] | None = None,
- resources: list[str] | None = None,
- *,
- function_type: str = 'DEFAULT',
- health_expected_status_code: int | None = None,
- health_port: int | None = None,
- health_timeout: str | None = None,
- health_protocol: str | None = None,
- description: str | None = None,
- secrets: list[str] | None = None,
- json_secrets: list[tuple[str, bytes]] | None = None,
- logs_telemetry_id: str | None = None,
- metrics_telemetry_id: str | None = None,
- traces_telemetry_id: str | None = None,
- rate_limit_pattern: str | None = None,
- rate_limit_exempt_nca_ids: list[str] | None = None,
- rate_limit_sync_check: bool | None = None,
Create a function from the provided specification.
- Parameters:
name -- Display name of the function.
inference_url -- Endpoint you wish to use to do invocations.
health_uri -- Health endpoint for inferencing
container_image -- Container Image.
models -- NGC models. In form [override_name:]model
helm_chart -- Helm Chart URL.
helm_chart_service -- Only necessary when helm chart is specified.
function_id -- If provided, generate another version of the same function.
inference_port -- Optional port override which inference is forwarded to.
container_args -- Optional list of arguments to provide to container.
api_body_format -- Optional body format to use.
container_environment_variables -- List of key pair values to pass as variables to container. In form ["key1:value1", "key2:value2"]
tags -- Optional list of tags to create the function with.
resources -- Optional list of resources.
- Keyword Arguments:
function_type -- Used to indicate a streaming function, defaults to DEFAULT.
health_port -- Port number where the health listener is running.
health_protocol -- HTTP/gRPC protocol type for health endpoint. Choices: ["HTTP", "gRPC"].
health_timeout -- ISO 8601 duration string in PnDTnHnMn.nS format.
health_expected_status_code -- Expected return status code considered as successful.
description -- Optional function/version description.
secrets -- Optional secret key/value pairs. In form ["key1:value1", "key2:value2"]
json_secrets -- Optional secret key/value pairs with JSON values. In form [("key", {"jsonkey": 1, "jsonkey2": {"nestedkey1": "nestedvalue"}})]
logs_telemetry_id -- UUID of telemetry log endpoint to map to function.
metrics_telemetry_id -- UUID of telemetry metrics endpoint to map to function.
traces_telemetry_id -- UUID of telemetry traces endpoint to map to function.
rate_limit_pattern -- Rate limit, format NUMBER-S|M|H|D, ex: 3-S.
rate_limit_exempt_nca_ids -- NCA IDs exempt from the rate limit.
rate_limit_sync_check -- Rate limit sync check.
- Raises:
InvalidArgumentError -- If none of container image, models, or helm chart is provided.
ResourceNotFoundException -- If the image or model or helm chart cannot be found.
- Returns:
Function Response provided by NVCF
- Return type:
DotDict
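Several `create` arguments are plain strings with a fixed shape: `key:value` pairs for `container_environment_variables` and `secrets`, `NUMBER-S|M|H|D` for `rate_limit_pattern`, and an ISO 8601 duration for `health_timeout`. The following is a minimal sketch of building and checking such strings before passing them to `create`. The helper names (`to_key_value_list`, `check_rate_limit_pattern`, `to_iso8601_duration`) are hypothetical and not part of the SDK, and the checks reflect only the formats documented above; the service may apply stricter rules.

```python
import re
from datetime import timedelta

# Hypothetical helpers for illustration; none of these are SDK functions.


def to_key_value_list(pairs: dict) -> list:
    """Render a mapping as ["key1:value1", "key2:value2"]."""
    for key in pairs:
        # Assumption: the first ':' separates key from value, so a key
        # containing ':' would be ambiguous.
        if ":" in key:
            raise ValueError(f"key must not contain ':': {key!r}")
    return [f"{key}:{value}" for key, value in pairs.items()]


_RATE_LIMIT = re.compile(r"[0-9]+-[SMHD]")


def check_rate_limit_pattern(pattern: str) -> str:
    """Return the pattern unchanged if it looks like NUMBER-S|M|H|D, e.g. 3-S."""
    if _RATE_LIMIT.fullmatch(pattern) is None:
        raise ValueError(f"expected NUMBER-S|M|H|D, got {pattern!r}")
    return pattern


def to_iso8601_duration(delta: timedelta) -> str:
    """Format a non-negative timedelta in the PnDTnHnMn.nS style."""
    if delta < timedelta(0):
        raise ValueError("duration must not be negative")
    hours, rest = divmod(delta.seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    date_part = f"{delta.days}D" if delta.days else ""
    time_part = ""
    if hours:
        time_part += f"{hours}H"
    if minutes:
        time_part += f"{minutes}M"
    if delta.microseconds:
        # Keep fractional seconds, dropping trailing zeros (4.500000 -> 4.5).
        time_part += f"{seconds}.{delta.microseconds:06d}".rstrip("0") + "S"
    elif seconds:
        time_part += f"{seconds}S"
    if not date_part and not time_part:
        return "PT0S"
    return "P" + date_part + ("T" + time_part if time_part else "")
```

The results can then be passed as `container_environment_variables=to_key_value_list({...})`, `rate_limit_pattern=check_rate_limit_pattern("3-S")`, and `health_timeout=to_iso8601_duration(timedelta(seconds=30))`.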
- delete(function_id: str, function_version_id: str)¶
Delete a function version.
- Parameters:
function_id -- Function's ID.
function_version_id -- Function's version ID.
- Returns:
JSON Response of NVCF function information.
- Return type:
DotDict
- info(
- function_id: str,
- function_version_id: str,
Get information about a given function version.
- Parameters:
function_id -- Function's ID.
function_version_id -- Function's version ID.
- Returns:
JSON Response of NVCF function information.
- Return type:
DotDict
- invoke(
- function_id: str,
- payload: dict,
- function_version_id: str | None = None,
- starfleet_api_key: str | None = None,
- asset_ids: list[str] | None = None,
- output_zip_path: str | None = None,
- polling_request_timeout: int | None = 300,
- pending_request_timeout: int | None = 600,
- pending_request_interval: float | None = 1.0,
- Parameters:
function_id -- ID of NVCF Function being invoked.
payload -- JSON payload specific to the function you are invoking. The shape should adhere to your function's API spec.
starfleet_api_key -- Key with invocation access to the function.
function_version_id -- Optionally provide a version id to invoke a specific version of a function.
asset_ids -- Asset ids that are referenced in the payload.
output_zip_path -- If the output is a zip file, this is the location to save it.
- Raises:
NgcException -- Matching HTTP Response code if fails in any way.
- Returns:
Dictionary corresponding to JSON response from function invoked.
- invoke_grpc(
- function_id: str,
- starfleet_api_key: str,
- function_request: Any,
- grpc_stub_function: Callable,
- function_version_id: str | None = None,
- Parameters:
function_id -- ID of GRPC NVCF Function being invoked.
starfleet_api_key -- Key with invocation access to the function.
function_request -- GRPC Payload specific to the function you are invoking.
grpc_stub_function -- GRPC Stub function to invoke.
function_version_id -- Optionally provide a version id to invoke a specific version of a function.
- Raises:
NgcException -- Matching HTTP Response code if fails in any way.
- Returns:
GRPC Response of function invocation.
- Return type:
Any
- invoke_grpc_triton(
- function_id: str,
- function_request,
- starfleet_api_key: str | None = None,
- function_version_id: str | None = None,
- Parameters:
function_id -- ID of Triton based GRPC NVCF Function being invoked.
function_request -- GRPC Payload specific to the function you are invoking.
starfleet_api_key -- Key with invocation access to the function.
function_version_id -- Optionally provide a version id to invoke a specific version of a function.
- Returns:
GRPC Response of function invocation
- Return type:
ModelInferResponse
- invoke_stream(
- function_id: str,
- payload: dict,
- starfleet_api_key: str | None = None,
- function_version_id: str | None = None,
- asset_ids: list[str] | None = None,
- request_timeout: int | None = 300,
- Parameters:
function_id -- ID of NVCF Function being invoked.
payload -- JSON payload specific to the function you are invoking. The shape should adhere to your function's API spec.
starfleet_api_key -- Key with invocation access to the function.
function_version_id -- Optionally provide a version id to invoke a specific version of a function.
- Raises:
NgcException -- Matching HTTP Response code if fails in any way.
- Returns:
Streaming response of function invocation.
- Return type:
Generator[bytes, None, None]
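Because `invoke_stream` yields raw `bytes`, a multi-byte UTF-8 character could in principle be split across two chunks, in which case decoding each chunk on its own would raise an error. Whether the service ever splits characters this way is not stated here, so the following is only a defensive sketch. It uses the standard library's incremental decoder; the function name `iter_decoded_text` is hypothetical.

```python
import codecs
from typing import Iterable, Iterator


def iter_decoded_text(chunks: Iterable[bytes], encoding: str = "utf-8") -> Iterator[str]:
    """Decode a stream of byte chunks, tolerating characters split across chunks."""
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        # Incomplete trailing bytes are buffered until the next chunk arrives.
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush the decoder; raises UnicodeDecodeError if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```

Any iterable of `bytes` works as input, so the generator returned by `invoke_stream` can be passed in directly.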
- invoke_stream_grpc(
- function_id: str,
- starfleet_api_key: str,
- function_request: Any,
- grpc_stub_function: Callable,
- function_version_id: str | None = None,
- Parameters:
function_id -- ID of GRPC NVCF Function being invoked.
starfleet_api_key -- Key with invocation access to the function.
function_request -- GRPC Payload specific to the function you are invoking.
grpc_stub_function -- GRPC Stub function to invoke.
function_version_id -- Optionally provide a version id to invoke a specific version of a function.
- Raises:
NgcException -- Matching HTTP Response code if fails in any way.
- Returns:
GRPC Response of function invocation.
- Return type:
Any
- invoke_stream_grpc_triton(
- function_id: str,
- function_request: Any,
- starfleet_api_key: str | None = None,
- function_version_id: str | None = None,
- Parameters:
function_id -- ID of Triton based GRPC NVCF Function being invoked.
function_request -- GRPC Payload specific to the function you are invoking.
starfleet_api_key -- Key with invocation access to the function.
function_version_id -- Optionally provide a version id to invoke a specific version of a function.
- Returns:
Streaming response of function invocation
- Return type:
ModelStreamInferResponse
- list(
- function_id: str | None = None,
- name_pattern: str | None = None,
- access_filter: list[str] | None = None,
List functions available to the organization currently set.
- Parameters:
function_id -- Optional parameter to list only versions of a specific function. Defaults to None.
name_pattern -- Optional parameter to filter functions that contain this name. Supports wildcards.
access_filter -- Optional parameter to filter functions by their access. Choices: ["private" (to the account), "public", "authorized"].
- Returns:
Keyed List of Functions.
- Return type:
dict
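The `access_filter` values form a small closed set, so a typo can be caught locally before the request is sent. A minimal sketch, assuming the three documented values are the only accepted ones; `check_access_filter` is a hypothetical helper, not an SDK function.

```python
ALLOWED_ACCESS_FILTERS = ("private", "public", "authorized")


def check_access_filter(values):
    """Return the filters with duplicates removed, rejecting unknown values."""
    unknown = [value for value in values if value not in ALLOWED_ACCESS_FILTERS]
    if unknown:
        raise ValueError(f"unknown access filter(s): {unknown}")
    # dict.fromkeys keeps the first occurrence of each value, in order.
    return list(dict.fromkeys(values))
```

The checked list can then be passed as `access_filter=check_access_filter(["private", "public"])`.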
- class nvcf.api.authorization.FunctionAuthorizationAPI(api_client: Client = None)¶
- add(
- function_id: str,
- function_version_id: str | None = None,
- nca_id: str | None = None,
Authorize additional NCA ids to invoke this function/function version.
- Parameters:
function_id -- Function's ID.
function_version_id -- Function's version ID.
nca_id -- NCA ID of party you wish to authorize.
- Returns:
JSON Response of NVCF function information.
- Return type:
dict
- clear(
- function_id: str,
- function_version_id: str | None = None,
Delete all extra account authorizations for a given function/function version.
- Parameters:
function_id -- Function's ID.
function_version_id -- Function's version ID.
- Returns:
JSON Response of NVCF function information.
- Return type:
dict
- info(
- function_id: str,
- function_version_id: str | None = None,
Get account authorization about a given function/function version.
- Parameters:
function_id -- Function's ID.
function_version_id -- Function's version ID.
- Returns:
JSON Response of NVCF function information.
- Return type:
dict
- remove(
- function_id: str,
- function_version_id: str | None = None,
- nca_id: str | None = None,
Remove authorization for clients to invoke this function/function version.
- Parameters:
function_id -- Function's ID.
function_version_id -- Function's version ID.
nca_id -- NCA ID of party whose authorization you wish to remove.
- Returns:
JSON Response of NVCF function information.
- Return type:
dict
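`add` authorizes one NCA ID per call, so authorizing several parties means calling it in a loop. The sketch below shows one way to do that against any object exposing the `add(function_id, function_version_id, nca_id)` signature documented above. `authorize_many` is a hypothetical helper, and `RecordingAuthAPI` is a stand-in used only so the example runs without network access; how you obtain a real `FunctionAuthorizationAPI` instance is outside this sketch.

```python
from typing import Any, Iterable, Optional


def authorize_many(
    auth_api: Any,
    function_id: str,
    nca_ids: Iterable[str],
    function_version_id: Optional[str] = None,
) -> list:
    """Call auth_api.add once per NCA ID and collect the responses."""
    responses = []
    for nca_id in nca_ids:
        responses.append(
            auth_api.add(
                function_id=function_id,
                function_version_id=function_version_id,
                nca_id=nca_id,
            )
        )
    return responses


class RecordingAuthAPI:
    """Stand-in for illustration only: records calls instead of contacting NVCF."""

    def __init__(self):
        self.calls = []

    def add(self, function_id, function_version_id=None, nca_id=None):
        self.calls.append((function_id, function_version_id, nca_id))
        return {"nca_id": nca_id}


stand_in = RecordingAuthAPI()
results = authorize_many(stand_in, "example-function-id", ["nca-a", "nca-b"])
```

If one call fails partway through, the earlier authorizations remain in place; `info` can be used to inspect the resulting state.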
- class nvcf.api.deployment_spec.DeploymentSpecification(
- backend: str,
- gpu: str,
- min_instances: int,
- max_instances: int,
- instance_type: str | None = None,
- availability_zones: list[str] | None = None,
- max_request_concurrency: int | None = None,
- configuration: dict | None = None,
Represents a deployment specification for NVCF.
- class nvcf.api.deploy.DeployAPI(api_client: Client = None)¶
- create(
- function_id: str,
- function_version_id: str,
- deployment_specifications: list[DeploymentSpecification] | None = None,
- targeted_deployment_specifications: list[TargetedDeploymentSpecification] | None = None,
Create a deployment with a function id, version and a set of available deployment specifications.
- delete(
- function_id: str,
- function_version_id: str,
- *,
- graceful: bool = False,
Delete a given deployment.
- info(
- function_id: str,
- function_version_id: str,
Get information about a given function's deployment.
- query_logs(
- function_id: str,
- function_version_id: str,
- start_time: datetime | None = None,
- end_time: datetime | None = None,
- duration: timedelta | None = None,
Deployment logs.
- Parameters:
function_id -- Id of function logs are pulled from.
duration -- Specifies the duration of time, either after start_time or before end_time. Format: [nD][nH][nM][nS]. Default: 1 day. Decimal values are not supported.
start_time -- Specifies the start time for querying logs. Default: None.
end_time -- Specifies the end time for querying logs. Default: Now.
function_version_id -- Optional version to specify for function id.
- Returns:
Iterator used to receive logs one by one.
- Return type:
Iterator
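The signature takes `duration` as a `timedelta`, while the description gives a string format, `[nD][nH][nM][nS]`. A minimal sketch of converting that string form into a `timedelta` follows, assuming whole-number components in the order days, hours, minutes, seconds and uppercase unit letters. `parse_duration_spec` is a hypothetical helper, not part of the SDK.

```python
import re
from datetime import timedelta

# Each component is optional, but they must appear in D, H, M, S order.
_DURATION_SPEC = re.compile(
    r"(?:(?P<days>\d+)D)?"
    r"(?:(?P<hours>\d+)H)?"
    r"(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+)S)?"
)


def parse_duration_spec(spec: str) -> timedelta:
    """Convert a string such as "1D2H30M" into a timedelta."""
    match = _DURATION_SPEC.fullmatch(spec)
    # The empty string matches the all-optional pattern, so reject it explicitly.
    if not spec or match is None:
        raise ValueError(f"expected [nD][nH][nM][nS], got {spec!r}")
    parts = {
        name: int(value)
        for name, value in match.groupdict().items()
        if value is not None
    }
    return timedelta(**parts)
```

The result can be passed as `duration=parse_duration_spec("2H")` together with either `start_time` or `end_time`.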
- update(
- function_id: str,
- function_version_id: str,
- deployment_specifications: list[DeploymentSpecification] | None = None,
- targeted_deployment_specifications: list[TargetedDeploymentSpecification] | None = None,
Update a given deployment.
- class nvcf.api.task.TaskAPI(api_client: Client = None)¶
- cancel(task_id)¶
Cancel a task.
- Parameters:
task_id -- The task to cancel
- create(
- name: str,
- container_image: str | None = None,
- container_args: str | None = None,
- container_environment_variables: list[str] | None = None,
- gpu_specification: GPUSpecification | None = None,
- models: list[str] | None = None,
- resources: list[str] | None = None,
- tags: list[str] | None = None,
- description: str | None = None,
- max_runtime_duration: Duration | None = None,
- max_queued_duration: Duration | None = None,
- termination_grace_period_duration: Duration | None = None,
- result_handling_strategy: str = 'UPLOAD',
- result_location: list[str] | None = None,
- secrets: list[str] | None = None,
- helm_chart: str | None = None,
- logs_telemetry_id: str | None = None,
- metrics_telemetry_id: str | None = None,
- traces_telemetry_id: str | None = None,
Create a task with the specification provided by input.
- Parameters:
name -- Display name of the task.
container_image -- Container image.
container_args -- Container args.
container_environment_variables -- Container environment variables.
gpu_specification -- GPU specifications.
models -- NGC models.
resources -- NGC resources.
tags -- Optional list of tags to create the task with.
max_runtime_duration -- Maximum runtime duration for task. Defaults to forever.
max_queued_duration -- Maximum queued duration for task. Defaults to 72 hours.
termination_grace_period_duration -- Grace period after termination. Defaults to 1 hour.
description -- Description of the task.
result_handling_strategy -- How results should be handled.
result_location -- Where results should be stored. Required if result_handling_strategy is UPLOAD.
secrets -- Optional secret key/value pairs. Form: ["key1:value1", "key2:value2"].
helm_chart -- Helm Chart URL.
logs_telemetry_id -- UUID of telemetry log endpoint to map to task.
metrics_telemetry_id -- UUID of telemetry metrics endpoint to map to task.
traces_telemetry_id -- UUID of telemetry traces endpoint to map to task.
- Raises:
InvalidArgumentError -- If result handling strategy is set to upload and required fields aren't provided.
ResourceNotFound -- If the image or resource cannot be found.
- Returns:
Task Response provided by NVCF
- Return type:
DotDict
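Since `result_handling_strategy` defaults to `'UPLOAD'` and `result_location` is required in that case, calling `create` without either argument is expected to fail. The rule can be checked locally first. This is a sketch of the documented rule only, not the SDK's own validation; `check_result_handling` is a hypothetical helper, it raises the built-in `ValueError` rather than the SDK's `InvalidArgumentError`, and the location string in the usage line is a placeholder.

```python
from typing import Optional, Sequence


def check_result_handling(
    result_handling_strategy: str = "UPLOAD",
    result_location: Optional[Sequence[str]] = None,
) -> None:
    """Raise ValueError if UPLOAD is selected without a result location."""
    if result_handling_strategy == "UPLOAD" and not result_location:
        raise ValueError(
            "result_location is required when result_handling_strategy is 'UPLOAD'"
        )


# Passes: results are not uploaded, so no location is needed.
check_result_handling("NONE")
# Passes: placeholder location supplied for the UPLOAD strategy.
check_result_handling("UPLOAD", ["placeholder-location"])
```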
- delete(task_id)¶
Delete a task.
- Parameters:
task_id -- The task to delete.
- events(
- task_id: str,
- limit: int = 100,
Get a list of the task's events.
- Returns:
Iterable of DotDict of task events.
- Return type:
Iterable[DotDict]
- info(task_id: str) DotDict ¶
Get information about a given task.
- Parameters:
task_id -- The task to get information about.
- Returns:
DotDict of task information.
- Return type:
DotDict
- list(limit: int = 100) list[DotDict] ¶
List tasks available to the organization currently set.
- Returns:
A list of task DotDicts.
- logs(
- task_id: str,
- limit: int = 100,
- start_time: datetime | None = None,
- end_time: datetime | None = None,
- duration: timedelta | None = None,
Task deployment logs.
- Returns:
Iterable of DotDict of task logs.
- Return type:
Iterable[DotDict]
- results(
- task_id: str,
- limit: int = 100,
Get a list of the task's results.
- Returns:
Iterable of DotDict of task results
- Return type:
Iterable[DotDict]
Examples¶
Import the NGCSDK¶
from ngcsdk import Client
from nvcf.api.deployment_spec import TargetedDeploymentSpecification
from nvcf.api.invocation_handler import HTTPSInvocationHandler
Create and configure your client - put in your api_key, org_name, team name¶
key = "nvapi-***"
clt = Client()
clt.configure(key)
List your current functions¶
functions = clt.cloud_function.functions.list(access_filter=["private"])["functions"]
functions[0]
Create a new function with existing NGC Resources¶
genslm_function = clt.cloud_function.functions.create(
    name="peek-sdk-demo-sdxl", inference_url="echo",
    container_image="stg.nvcr.io/cv5p43s0htqh/echo:latest", models=["nvidia/genslm:latest"],
    container_environment_variables=["key:value", "CUSTOM:ENV_VAR"],
    inference_port=8000, container_args="bash exec",
)
Delete Function¶
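# function_id and function_version_id are the IDs returned by create(), extracted under "Create a deployment".
clt.cloud_function.functions.delete(function_id=function_id, function_version_id=function_version_id)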
Deployment¶
List Available GPUS¶
clt.cloud_function.gpus.list().keys()
l40s_instance = clt.cloud_function.gpus.info("L40S").L40S[0]
l40s_instance.name, l40s_instance.currentInstances
Create a deployment¶
function_id, function_version_id = genslm_function.get("function").get("id"), genslm_function.get("function").get("versionId")
dep_specs = [TargetedDeploymentSpecification(gpu="L40S", instance_type=l40s_instance.name, min_instances=1, max_instances=1)]
genslm_deployment = clt.cloud_function.functions.deployments.create(function_id=function_id, function_version_id=function_version_id, targeted_deployment_specifications=dep_specs)
Undeploy¶
clt.cloud_function.functions.deployments.delete(function_id=function_id, function_version_id=function_version_id)
Function Invocations¶
## You can always provide a new key, or you can use the one you provided in the client configuration
SAK = "nvapi-XXX"
Asset management¶
clt.cloud_function.assets.list(SAK)
resp = clt.cloud_function.assets.upload("/path/to/image.png", "description", SAK)
clt.cloud_function.assets.delete(resp.get("asset_id"), SAK)
HTTPS Blocking Invocation¶
payload = {
    "messages": [
        {
            "role": "user",
            "content": "Give me some python code to read a file named temp.txt"
        }
    ],
    "temperature": 0.2,
    "top_p": 0.7,
    "max_tokens": 512,
    "stream": False
}
clt.cloud_function.functions.invoke(
    function_id="eb1100de-60bf-4e9a-8617-b7d4652e0c37",
    payload=payload,
    starfleet_api_key=SAK,
)
Multiple Invocations¶
To reuse an HTTP session you can use the context manager.
with HTTPSInvocationHandler(starfleet_api_key=SAK) as invocation_handler:
    # Non-stream
    invocation_handler.make_invocation_request("0acb2d4a-47ed-45b8-935c-d93cba5b7485", payload)
    # Stream
    for line in invocation_handler.make_streaming_invocation_request(
        function_id="845ba840-8d13-4151-9682-0b4cdbc93d7b", data=payload
    ):
        print(line.decode("utf-8"))
Using assets in invocations¶
r = clt.cloud_function.assets.upload("/path/to/image.png", "description", SAK)
asset_id = r["asset_id"]
payload = {
    "inputs": [
        {
            "name": "image",
            "shape": [1],
            "datatype": "BYTES",
            "data": [f"{asset_id}"],
        },
    ],
    "outputs": [{"name": "image_generated", "datatype": "BYTES", "shape": [1]}],
}
resp = clt.cloud_function.functions.invoke(
    function_id,
    payload,
    starfleet_api_key=SAK,
    asset_ids=[asset_id],
    output_zip_path="/path/to/output.zip",
)
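In the example above the same asset ID appears twice: inside the payload and in the `asset_ids` argument. One way to keep the two in sync is to build both from a single value. The sketch below does that; `build_image_invocation` is a hypothetical helper, and the input and output names (`image`, `image_generated`) are taken from the example and are specific to that function's API.

```python
def build_image_invocation(asset_id: str):
    """Return (payload, asset_ids) that reference the same uploaded asset."""
    asset_ids = [asset_id]
    payload = {
        "inputs": [
            {
                "name": "image",
                "shape": [1],
                "datatype": "BYTES",
                # The payload carries the asset ID rather than the image bytes.
                "data": list(asset_ids),
            },
        ],
        "outputs": [{"name": "image_generated", "datatype": "BYTES", "shape": [1]}],
    }
    return payload, asset_ids


payload, asset_ids = build_image_invocation("example-asset-id")
```

Both values can then be handed to `invoke` as `payload` and `asset_ids=asset_ids`.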
HTTPS Streaming Invocation¶
payload = {
    "messages": [
        {
            "role": "user",
            "content": "Give me some python code to read a file named temp.txt"
        }
    ],
    "temperature": 0.2,
    "top_p": 0.7,
    "max_tokens": 512,
    "stream": True
}
invoke_request = clt.cloud_function.functions.invoke_stream(
    function_id="eb1100de-60bf-4e9a-8617-b7d4652e0c37",
    payload=payload,
    starfleet_api_key=SAK,
)
for chunk in invoke_request:
    print(chunk.decode("utf-8"))
GRPC Invocation¶
# ModelInferRequest and InferTensorContents are Triton gRPC protobuf message classes; import them before running this example.
MODEL_INFER_REQUEST = ModelInferRequest(
    model_name="simple_identity",
    inputs=[
        ModelInferRequest.InferInputTensor(
            name="INPUT0",
            datatype="BYTES",
            shape=[1, 1],
            contents=InferTensorContents(bytes_contents=[b"hello world"]),
        ),
    ],
    outputs=[ModelInferRequest.InferRequestedOutputTensor(name="OUTPUT0")],
)
MODEL_STREAM_REQUEST = ModelInferRequest(
    model_name="DialoGPT",
    inputs=[
        ModelInferRequest.InferInputTensor(
            name="new_inputs",
            datatype="BYTES",
            shape=[1, 1],
            contents=InferTensorContents(bytes_contents=[bytes("hello world", "utf-8")]),
        ),
    ],
    outputs=[ModelInferRequest.InferRequestedOutputTensor(name="response")],
)
clt.cloud_function.functions.invoke_grpc_triton(
    function_id="7dc17808-b067-41dd-bf32-78d242379b6f",
    function_request=MODEL_INFER_REQUEST,
    starfleet_api_key=SAK,
    debug=True,
)
Multiple Invocations¶
To reuse a connection across several calls, you can use the handler as a context manager.
# TritonGRPCInvocationHandler is not covered by the imports at the top of these examples; import it first.
with TritonGRPCInvocationHandler(SAK) as invocation_handler:
    # Blocking: pass a request instance, not the ModelInferRequest class itself
    invocation_handler.make_invocation_request(
        function_id=function_id,
        function_request=MODEL_INFER_REQUEST,
        function_version_id=function_version_id,
    )
# Streaming
for resp in clt.cloud_function.functions.invoke_stream_grpc_triton(function_id, MODEL_INFER_REQUEST, SAK, function_version_id):
    print(resp)
Tasks¶
from isodate import parse_duration
from nvcf.api.deployment_spec import GPUSpecification
clt.cloud_function.tasks.list()
max_runtime_dur = parse_duration("PT1H")
output = clt.cloud_function.tasks.create(
    "sdk-example",
    container_image="stg.nvcr.io/ax3ysqem02xw/tasks_sample:0.0.4",
    gpu_specification=GPUSpecification(gpu="T10", instance_type="g6.full", backend="GFN"),
    max_runtime_duration=max_runtime_dur,
    result_handling_strategy="NONE",
)
clt.cloud_function.tasks.info(output.task.id)
for res in clt.cloud_function.tasks.results(output.task.id):
    print(res)