Package inference
Interface GrpcService.InferStatisticsOrBuilder
- All Superinterfaces:
com.google.protobuf.MessageLiteOrBuilder, com.google.protobuf.MessageOrBuilder
- All Known Implementing Classes:
GrpcService.InferStatistics, GrpcService.InferStatistics.Builder
- Enclosing class:
GrpcService
public static interface GrpcService.InferStatisticsOrBuilder
extends com.google.protobuf.MessageOrBuilder
-
Method Summary
Methods inherited from interface com.google.protobuf.MessageLiteOrBuilder
isInitialized
Methods inherited from interface com.google.protobuf.MessageOrBuilder
findInitializationErrors, getAllFields, getDefaultInstanceForType, getDescriptorForType, getField, getInitializationErrorString, getOneofFieldDescriptor, getRepeatedField, getRepeatedFieldCount, getUnknownFields, hasField, hasOneof
-
Method Details
-
hasSuccess
boolean hasSuccess()
Cumulative count and duration for successful inference requests. The "success" count and cumulative duration include cache hits.
.inference.StatisticDuration success = 1;
- Returns:
- Whether the success field is set.
-
getSuccess
GrpcService.StatisticDuration getSuccess()
Cumulative count and duration for successful inference requests. The "success" count and cumulative duration include cache hits.
.inference.StatisticDuration success = 1;
- Returns:
- The success.
-
getSuccessOrBuilder
GrpcService.StatisticDurationOrBuilder getSuccessOrBuilder()
Cumulative count and duration for successful inference requests. The "success" count and cumulative duration include cache hits.
.inference.StatisticDuration success = 1;
-
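The success field is a StatisticDuration holding a cumulative request count and a cumulative duration, from which an average latency can be derived. A minimal sketch of that arithmetic, assuming StatisticDuration exposes a request count and a cumulative total in nanoseconds (the `count`/`totalNs` inputs below stand in for values such a message would carry; the helper name is hypothetical):

```java
public class SuccessLatency {
    /** Average latency in microseconds over `count` requests taking `totalNs` in total. */
    static double avgMicros(long count, long totalNs) {
        if (count == 0) {
            return 0.0; // no requests recorded yet; avoid division by zero
        }
        return (totalNs / (double) count) / 1_000.0; // ns -> us
    }

    public static void main(String[] args) {
        // 4 successful requests with 8 ms cumulative duration -> 2000 us average
        System.out.println(avgMicros(4, 8_000_000L));
    }
}
```

Note that because the "success" statistics include cache hits, this average blends cached and fully computed requests.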
hasFail
boolean hasFail()
Cumulative count and duration for failed inference requests.
.inference.StatisticDuration fail = 2;
- Returns:
- Whether the fail field is set.
-
getFail
GrpcService.StatisticDuration getFail()
Cumulative count and duration for failed inference requests.
.inference.StatisticDuration fail = 2;
- Returns:
- The fail.
-
getFailOrBuilder
GrpcService.StatisticDurationOrBuilder getFailOrBuilder()
Cumulative count and duration for failed inference requests.
.inference.StatisticDuration fail = 2;
-
hasQueue
boolean hasQueue()
The count and cumulative duration that inference requests wait in scheduling or other queues. The "queue" count and cumulative duration include cache hits.
.inference.StatisticDuration queue = 3;
- Returns:
- Whether the queue field is set.
-
getQueue
GrpcService.StatisticDuration getQueue()
The count and cumulative duration that inference requests wait in scheduling or other queues. The "queue" count and cumulative duration include cache hits.
.inference.StatisticDuration queue = 3;
- Returns:
- The queue.
-
getQueueOrBuilder
GrpcService.StatisticDurationOrBuilder getQueueOrBuilder()
The count and cumulative duration that inference requests wait in scheduling or other queues. The "queue" count and cumulative duration include cache hits.
.inference.StatisticDuration queue = 3;
-
hasComputeInput
boolean hasComputeInput()
The count and cumulative duration to prepare input tensor data as required by the model framework / backend. For example, this duration should include the time to copy input tensor data to the GPU. The "compute_input" count and cumulative duration do not account for requests that were a cache hit. See the "cache_hit" field for more info.
.inference.StatisticDuration compute_input = 4;
- Returns:
- Whether the computeInput field is set.
-
getComputeInput
GrpcService.StatisticDuration getComputeInput()
The count and cumulative duration to prepare input tensor data as required by the model framework / backend. For example, this duration should include the time to copy input tensor data to the GPU. The "compute_input" count and cumulative duration do not account for requests that were a cache hit. See the "cache_hit" field for more info.
.inference.StatisticDuration compute_input = 4;
- Returns:
- The computeInput.
-
getComputeInputOrBuilder
GrpcService.StatisticDurationOrBuilder getComputeInputOrBuilder()
The count and cumulative duration to prepare input tensor data as required by the model framework / backend. For example, this duration should include the time to copy input tensor data to the GPU. The "compute_input" count and cumulative duration do not account for requests that were a cache hit. See the "cache_hit" field for more info.
.inference.StatisticDuration compute_input = 4;
-
hasComputeInfer
boolean hasComputeInfer()
The count and cumulative duration to execute the model. The "compute_infer" count and cumulative duration do not account for requests that were a cache hit. See the "cache_hit" field for more info.
.inference.StatisticDuration compute_infer = 5;
- Returns:
- Whether the computeInfer field is set.
-
getComputeInfer
GrpcService.StatisticDuration getComputeInfer()
The count and cumulative duration to execute the model. The "compute_infer" count and cumulative duration do not account for requests that were a cache hit. See the "cache_hit" field for more info.
.inference.StatisticDuration compute_infer = 5;
- Returns:
- The computeInfer.
-
getComputeInferOrBuilder
GrpcService.StatisticDurationOrBuilder getComputeInferOrBuilder()
The count and cumulative duration to execute the model. The "compute_infer" count and cumulative duration do not account for requests that were a cache hit. See the "cache_hit" field for more info.
.inference.StatisticDuration compute_infer = 5;
-
hasComputeOutput
boolean hasComputeOutput()
The count and cumulative duration to extract output tensor data produced by the model framework / backend. For example, this duration should include the time to copy output tensor data from the GPU. The "compute_output" count and cumulative duration do not account for requests that were a cache hit. See the "cache_hit" field for more info.
.inference.StatisticDuration compute_output = 6;
- Returns:
- Whether the computeOutput field is set.
-
getComputeOutput
GrpcService.StatisticDuration getComputeOutput()
The count and cumulative duration to extract output tensor data produced by the model framework / backend. For example, this duration should include the time to copy output tensor data from the GPU. The "compute_output" count and cumulative duration do not account for requests that were a cache hit. See the "cache_hit" field for more info.
.inference.StatisticDuration compute_output = 6;
- Returns:
- The computeOutput.
-
getComputeOutputOrBuilder
GrpcService.StatisticDurationOrBuilder getComputeOutputOrBuilder()
The count and cumulative duration to extract output tensor data produced by the model framework / backend. For example, this duration should include the time to copy output tensor data from the GPU. The "compute_output" count and cumulative duration do not account for requests that were a cache hit. See the "cache_hit" field for more info.
.inference.StatisticDuration compute_output = 6;
-
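The three compute fields split per-request work into input preparation, model execution, and output extraction, all excluding cache hits. A minimal sketch of combining them into an average end-to-end compute time, assuming each StatisticDuration carries a count and a cumulative nanosecond total and that the three compute counts agree (they all count the same non-cached executions; the helper name is hypothetical):

```java
public class ComputeBreakdown {
    /**
     * Average end-to-end compute nanoseconds per non-cached request,
     * summing the input, infer, and output phases.
     */
    static long avgComputeNs(long inferCount, long inputNs, long inferNs, long outputNs) {
        if (inferCount == 0) {
            return 0L; // nothing executed outside the cache yet
        }
        return (inputNs + inferNs + outputNs) / inferCount;
    }

    public static void main(String[] args) {
        // 2 executions: 0.2 ms input prep + 1 ms inference + 0.3 ms output extraction total
        System.out.println(avgComputeNs(2, 200_000L, 1_000_000L, 300_000L));
    }
}
```

Comparing this figure with the average derived from the "success" field shows how much of the end-to-end latency is compute versus queueing and cache handling.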
hasCacheHit
boolean hasCacheHit()
The count of response cache hits and the cumulative duration to look up and extract output tensor data from the Response Cache on a cache hit. For example, this duration should include the time to copy output tensor data from the Response Cache to the response object. On cache hits, Triton does not need to go to the model/backend for the output tensor data, so the "compute_input", "compute_infer", and "compute_output" fields are not updated. Assuming the response cache is enabled for a given model, a cache hit occurs for a request to that model when the request metadata (model name, model version, model inputs) hashes to an existing entry in the cache. On a cache miss, the request hash and response output tensor data are added to the cache. See the response cache docs for more info: https://github.com/triton-inference-server/server/blob/main/docs/response_cache.md
.inference.StatisticDuration cache_hit = 7;
- Returns:
- Whether the cacheHit field is set.
-
getCacheHit
GrpcService.StatisticDuration getCacheHit()
The count of response cache hits and the cumulative duration to look up and extract output tensor data from the Response Cache on a cache hit. For example, this duration should include the time to copy output tensor data from the Response Cache to the response object. On cache hits, Triton does not need to go to the model/backend for the output tensor data, so the "compute_input", "compute_infer", and "compute_output" fields are not updated. Assuming the response cache is enabled for a given model, a cache hit occurs for a request to that model when the request metadata (model name, model version, model inputs) hashes to an existing entry in the cache. On a cache miss, the request hash and response output tensor data are added to the cache. See the response cache docs for more info: https://github.com/triton-inference-server/server/blob/main/docs/response_cache.md
.inference.StatisticDuration cache_hit = 7;
- Returns:
- The cacheHit.
-
getCacheHitOrBuilder
GrpcService.StatisticDurationOrBuilder getCacheHitOrBuilder()
The count of response cache hits and the cumulative duration to look up and extract output tensor data from the Response Cache on a cache hit. For example, this duration should include the time to copy output tensor data from the Response Cache to the response object. On cache hits, Triton does not need to go to the model/backend for the output tensor data, so the "compute_input", "compute_infer", and "compute_output" fields are not updated. Assuming the response cache is enabled for a given model, a cache hit occurs for a request to that model when the request metadata (model name, model version, model inputs) hashes to an existing entry in the cache. On a cache miss, the request hash and response output tensor data are added to the cache. See the response cache docs for more info: https://github.com/triton-inference-server/server/blob/main/docs/response_cache.md
.inference.StatisticDuration cache_hit = 7;
-
hasCacheMiss
boolean hasCacheMiss()
The count of response cache misses and the cumulative duration to look up and insert output tensor data from the computed response into the cache. For example, this duration should include the time to copy output tensor data from the response object to the Response Cache. Assuming the response cache is enabled for a given model, a cache miss occurs for a request to that model when the request metadata does NOT hash to an existing entry in the cache. See the response cache docs for more info: https://github.com/triton-inference-server/server/blob/main/docs/response_cache.md
.inference.StatisticDuration cache_miss = 8;
- Returns:
- Whether the cacheMiss field is set.
-
getCacheMiss
GrpcService.StatisticDuration getCacheMiss()
The count of response cache misses and the cumulative duration to look up and insert output tensor data from the computed response into the cache. For example, this duration should include the time to copy output tensor data from the response object to the Response Cache. Assuming the response cache is enabled for a given model, a cache miss occurs for a request to that model when the request metadata does NOT hash to an existing entry in the cache. See the response cache docs for more info: https://github.com/triton-inference-server/server/blob/main/docs/response_cache.md
.inference.StatisticDuration cache_miss = 8;
- Returns:
- The cacheMiss.
-
getCacheMissOrBuilder
GrpcService.StatisticDurationOrBuilder getCacheMissOrBuilder()
The count of response cache misses and the cumulative duration to look up and insert output tensor data from the computed response into the cache. For example, this duration should include the time to copy output tensor data from the response object to the Response Cache. Assuming the response cache is enabled for a given model, a cache miss occurs for a request to that model when the request metadata does NOT hash to an existing entry in the cache. See the response cache docs for more info: https://github.com/triton-inference-server/server/blob/main/docs/response_cache.md
.inference.StatisticDuration cache_miss = 8;
-
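Since every cached-model request lands in exactly one of "cache_hit" or "cache_miss", their counts together give the cache hit rate. A minimal sketch, assuming StatisticDuration exposes a request count (the hit/miss inputs below stand in for the two counts; the helper name is hypothetical):

```java
public class CacheStats {
    /** Fraction of cache lookups that were hits; 0.0 when the cache saw no traffic. */
    static double hitRate(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        // 3 hits and 1 miss -> 75% hit rate
        System.out.println(hitRate(3, 1));
    }
}
```

A persistently low hit rate for a model with the response cache enabled suggests its request metadata (inputs, version) rarely repeats, so the cache may not be worth its insertion overhead.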