Package com.gencior.triton.core.pojo
Class TritonInferStatistics
java.lang.Object
com.gencior.triton.core.pojo.TritonInferStatistics
Encapsulates comprehensive inference statistics for a model.
This class aggregates timing and success/failure metrics for all inference requests, including queue times, compute times, and cache hit/miss statistics. It provides detailed insights into model performance and inference processing stages.
This is an immutable object that wraps the gRPC message InferStatistics.
Since: 1.0.0
Author: sachachoumiloff
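The immutable-wrapper pattern described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in: `RawStats` plays the role of a source message and `StatsView` the role of the wrapper, and neither is the real `InferStatistics` or `TritonInferStatistics` type, whose fields and signatures are not shown on this page. It only demonstrates a static `fromProto`-style factory that snapshots values into final fields.

```java
import java.util.Objects;

final class WrapperSketch {

    /** Hypothetical mutable stand-in for a source message (not the real gRPC class). */
    static final class RawStats {
        long successCount;
        long failCount;

        RawStats(long successCount, long failCount) {
            this.successCount = successCount;
            this.failCount = failCount;
        }
    }

    /** Hypothetical immutable view: all fields are final and set once by the factory. */
    static final class StatsView {
        private final long successCount;
        private final long failCount;

        private StatsView(long successCount, long failCount) {
            this.successCount = successCount;
            this.failCount = failCount;
        }

        /** Static factory in the style of fromProto: copies values out of the source. */
        static StatsView fromProto(RawStats proto) {
            Objects.requireNonNull(proto, "proto");
            return new StatsView(proto.successCount, proto.failCount);
        }

        long getSuccessCount() {
            return successCount;
        }

        long getFailCount() {
            return failCount;
        }
    }

    public static void main(String[] args) {
        RawStats raw = new RawStats(10, 2);
        StatsView view = StatsView.fromProto(raw);
        raw.successCount = 99; // later changes to the source do not affect the view
        System.out.println("success=" + view.getSuccessCount() + " fail=" + view.getFailCount());
    }
}
```

Because the view has no setters and a private constructor, callers can share it across threads without synchronization, which is the usual motivation for this kind of wrapper.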
Method Summary
Method: Description
fromProto (static, returns TritonInferStatistics)
getCacheHit(): Returns statistics for inferences that hit the result cache.
getCacheMiss(): Returns statistics for inferences that missed the result cache.
getComputeInfer(): Returns statistics for time spent in model inference computation.
getComputeInput(): Returns statistics for time spent preparing input tensors.
getComputeOutput(): Returns statistics for time spent extracting output tensors.
getFail(): Returns statistics for failed inferences.
getQueue(): Returns statistics for time spent in the request queue.
getSuccess(): Returns statistics for successfully completed inferences.
Method Details
fromProto
getSuccess
Returns statistics for successfully completed inferences.
getFail
Returns statistics for failed inferences.
getQueue
Returns statistics for time spent in the request queue.
getComputeInput
Returns statistics for time spent preparing input tensors.
getComputeOutput
Returns statistics for time spent extracting output tensors.
getComputeInfer
Returns statistics for time spent in model inference computation.
getCacheHit
Returns statistics for inferences that hit the result cache.
getCacheMiss
Returns statistics for inferences that missed the result cache.
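As an illustration of how the per-stage statistics returned by these getters might be consumed, here is a minimal sketch that turns count and cumulative-time pairs into average latencies and a cache hit ratio. The return types of the getters are not shown on this page, so the `Duration` class below is a hypothetical stand-in that assumes each statistic carries a request count and a total time in nanoseconds. The stage names and sample numbers are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StageLatencySketch {

    /** Hypothetical count plus cumulative-nanoseconds pair; not a class from this package. */
    static final class Duration {
        final long count;
        final long totalNs;

        Duration(long count, long totalNs) {
            this.count = count;
            this.totalNs = totalNs;
        }

        /** Mean time per request in microseconds, or 0.0 when nothing was recorded. */
        double averageMicros() {
            return count == 0 ? 0.0 : (totalNs / 1_000.0) / count;
        }
    }

    /** Maps each stage name to its average latency, preserving insertion order. */
    static Map<String, Double> averageMicrosByStage(Map<String, Duration> stages) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Duration> e : stages.entrySet()) {
            out.put(e.getKey(), e.getValue().averageMicros());
        }
        return out;
    }

    /** Fraction of cache lookups that hit, or 0.0 when there were no lookups. */
    static double cacheHitRatio(Duration hit, Duration miss) {
        long total = hit.count + miss.count;
        return total == 0 ? 0.0 : (double) hit.count / total;
    }

    public static void main(String[] args) {
        // Sample values only; in practice these would come from the getters above.
        Map<String, Duration> stages = new LinkedHashMap<>();
        stages.put("queue", new Duration(4, 2_000_000L));
        stages.put("computeInput", new Duration(4, 400_000L));
        stages.put("computeInfer", new Duration(4, 8_000_000L));
        stages.put("computeOutput", new Duration(4, 600_000L));

        averageMicrosByStage(stages)
                .forEach((stage, us) -> System.out.printf("%-14s %.1f us%n", stage, us));
        System.out.printf("cache hit ratio: %.2f%n",
                cacheHitRatio(new Duration(1, 50_000L), new Duration(3, 9_000_000L)));
    }
}
```

Guarding the division against a zero count matters here because a freshly loaded model reports no requests for any stage.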