Package com.gencior.triton.core.pojo
Class TritonInferResponseStatistics
java.lang.Object
com.gencior.triton.core.pojo.TritonInferResponseStatistics
Encapsulates statistics for inference response times.
This class represents statistics about the timing of the stages of an inference response: inference compute time, output preparation time, and the breakdown of responses by outcome (success, failure, empty response, cancellation). These metrics are obtained from the Triton Inference Server gRPC API.
This is an immutable object that wraps the gRPC message InferResponseStatistics.
- Since: 1.0.0
- Author: sachachoumiloff
Method Summary
- fromProto
- getCancel(): Returns statistics for cancelled inference requests.
- getComputeInfer(): Returns statistics for the time spent in model inference computation.
- getComputeOutput(): Returns statistics for the time spent preparing output tensors.
- getEmptyResponse(): Returns statistics for inference requests with empty responses.
- getFail(): Returns statistics for inference requests that failed.
- getSuccess(): Returns statistics for successfully completed inference requests.
- toString()
Method Details
fromProto
getComputeInfer
Returns statistics for the time spent in model inference computation.
- Returns: duration statistics for inference computation
getComputeOutput
Returns statistics for the time spent preparing output tensors.
- Returns: duration statistics for output preparation
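The two compute statistics are most useful as per-response averages. The following is a minimal sketch, not the library's API: `DurationStat` is a hypothetical stand-in for whatever type the getters return, and it assumes that type carries a sample count and a cumulative total in nanoseconds, as Triton's `StatisticDuration` message does. The values in `main` are illustrative only.

```java
import java.util.Locale;

final class ComputeTimeSketch {

    /** Hypothetical stand-in for the duration statistics returned by the getters. */
    static final class DurationStat {
        final long count;   // number of recorded samples
        final long totalNs; // cumulative time in nanoseconds

        DurationStat(long count, long totalNs) {
            this.count = count;
            this.totalNs = totalNs;
        }
    }

    /** Mean duration in milliseconds, or 0.0 when nothing was recorded. */
    static double meanMillis(DurationStat stat) {
        if (stat.count == 0) {
            return 0.0; // avoid dividing by zero on an idle model
        }
        return (stat.totalNs / (double) stat.count) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Illustrative values, not real server output.
        DurationStat computeInfer = new DurationStat(4, 8_000_000L);
        DurationStat computeOutput = new DurationStat(4, 2_000_000L);
        System.out.printf(Locale.ROOT, "infer: %.2f ms, output: %.2f ms%n",
                meanMillis(computeInfer), meanMillis(computeOutput));
    }
}
```

In real code the two `DurationStat` instances would be built from the values returned by `getComputeInfer()` and `getComputeOutput()`; how to read the count and total from those return values depends on their actual type, which this page does not show.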
getSuccess
Returns statistics for successfully completed inference requests.
- Returns: duration statistics for successful requests
getFail
Returns statistics for inference requests that failed.
- Returns: duration statistics for failed requests
getEmptyResponse
Returns statistics for inference requests with empty responses.
- Returns: duration statistics for empty responses
getCancel
Returns statistics for cancelled inference requests.
- Returns: duration statistics for cancelled requests
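The four outcome statistics (success, fail, empty response, cancel) can be combined into a distribution. The sketch below assumes that each statistic exposes a count and that the four outcomes are mutually exclusive and together cover every response; neither assumption is confirmed by this page. `OutcomeShareSketch` and its methods are hypothetical names, and the counts in `main` are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OutcomeShareSketch {

    /** Share of each outcome in [0, 1]; every share is 0.0 when there were no responses. */
    static Map<String, Double> shares(long success, long fail, long empty, long cancel) {
        long total = success + fail + empty + cancel;
        Map<String, Double> out = new LinkedHashMap<>(); // keeps insertion order for printing
        out.put("success", ratio(success, total));
        out.put("fail", ratio(fail, total));
        out.put("emptyResponse", ratio(empty, total));
        out.put("cancel", ratio(cancel, total));
        return out;
    }

    private static double ratio(long part, long total) {
        return total == 0 ? 0.0 : (double) part / total;
    }

    public static void main(String[] args) {
        // Illustrative counts; in practice these would come from the counts
        // behind getSuccess(), getFail(), getEmptyResponse() and getCancel().
        shares(6, 2, 1, 1).forEach((name, share) ->
                System.out.println(name + " = " + share));
    }
}
```

Returning plain ratios rather than formatted percentages leaves rounding and display to the caller, which keeps the helper easy to reuse in alerts or dashboards.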
toString