Class TritonInferResponseStatistics

java.lang.Object
com.gencior.triton.core.pojo.TritonInferResponseStatistics

public final class TritonInferResponseStatistics extends Object
Encapsulates statistics for inference response times.

This class holds timing statistics for the stages of an inference response: inference compute time, output processing time, and the outcome of each response (success, failure, empty response, or cancellation). The metrics are obtained from the Triton Inference Server gRPC API.

This is an immutable object that wraps the gRPC message InferResponseStatistics.
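
The immutable-wrapper pattern described above can be sketched roughly as follows. This is an illustration only, not this library's code: ProtoStatsStandIn is a hypothetical stand-in for the generated gRPC message, ResponseStatsWrapper is a hypothetical wrapper, and the fromProto signature shown here is an assumption, since this page does not list the real one.

```java
import java.util.Objects;

// Entry point for the sketch; every type below is a hypothetical stand-in.
final class ImmutableWrapperSketch {
    public static void main(String[] args) {
        ProtoStatsStandIn proto = new ProtoStatsStandIn(1_500_000L, 250_000L);
        ResponseStatsWrapper stats = ResponseStatsWrapper.fromProto(proto);
        System.out.println("compute infer ns:  " + stats.getComputeInferNs());
        System.out.println("compute output ns: " + stats.getComputeOutputNs());
    }
}

// Hypothetical stand-in for the generated gRPC message class.
final class ProtoStatsStandIn {
    final long computeInferNs;
    final long computeOutputNs;

    ProtoStatsStandIn(long computeInferNs, long computeOutputNs) {
        this.computeInferNs = computeInferNs;
        this.computeOutputNs = computeOutputNs;
    }
}

// Immutable wrapper: final class, final fields, private constructor,
// a static factory, and getters only (no setters).
final class ResponseStatsWrapper {
    private final long computeInferNs;
    private final long computeOutputNs;

    private ResponseStatsWrapper(long computeInferNs, long computeOutputNs) {
        this.computeInferNs = computeInferNs;
        this.computeOutputNs = computeOutputNs;
    }

    // Assumed factory shape: copies the values out of the message once,
    // so later changes to the source cannot affect the wrapper.
    static ResponseStatsWrapper fromProto(ProtoStatsStandIn proto) {
        Objects.requireNonNull(proto, "proto");
        return new ResponseStatsWrapper(proto.computeInferNs, proto.computeOutputNs);
    }

    long getComputeInferNs() {
        return computeInferNs;
    }

    long getComputeOutputNs() {
        return computeOutputNs;
    }
}
```

Copying values in a private constructor keeps the wrapper safe to share across threads without synchronization, which is the usual reason for making such value objects immutable.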

Since:
1.0.0
Author:
sachachoumiloff
  • Method Details

    • fromProto

    • getComputeInfer

      public TritonStatisticDuration getComputeInfer()
      Returns statistics for the time spent in model inference computation.
      Returns:
      duration statistics for inference computation
    • getComputeOutput

      public TritonStatisticDuration getComputeOutput()
      Returns statistics for the time spent preparing output tensors.
      Returns:
      duration statistics for output preparation
    • getSuccess

      public TritonStatisticDuration getSuccess()
      Returns statistics for successfully completed inference requests.
      Returns:
      duration statistics for successful requests
    • getFail

      public TritonStatisticDuration getFail()
      Returns statistics for inference requests that failed.
      Returns:
      duration statistics for failed requests
    • getEmptyResponse

      public TritonStatisticDuration getEmptyResponse()
      Returns statistics for inference requests with empty responses.
      Returns:
      duration statistics for empty responses
    • getCancel

      public TritonStatisticDuration getCancel()
      Returns statistics for cancelled inference requests.
      Returns:
      duration statistics for cancelled requests
    • toString

      public String toString()
      Overrides:
      toString in class Object
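
As a rough illustration of how the values returned by the getters above might be consumed, the sketch below derives an average latency and a failure rate. It assumes each duration statistic carries a count and a cumulative time in nanoseconds, as the underlying Triton StatisticDuration message does. DurationStat is a hypothetical stand-in, because the accessors of TritonStatisticDuration are not shown on this page, and the failure rate deliberately ignores empty and cancelled responses for simplicity.

```java
final class ResponseStatsReport {

    // Hypothetical stand-in for TritonStatisticDuration: an event count
    // plus the cumulative duration of those events in nanoseconds.
    static final class DurationStat {
        final long count;
        final long totalNanos;

        DurationStat(long count, long totalNanos) {
            this.count = count;
            this.totalNanos = totalNanos;
        }
    }

    // Average duration per event in milliseconds; 0.0 when nothing was recorded.
    static double averageMillis(DurationStat stat) {
        if (stat.count == 0) {
            return 0.0;
        }
        return (stat.totalNanos / (double) stat.count) / 1_000_000.0;
    }

    // Share of failed responses among successful and failed ones.
    static double failureRate(DurationStat success, DurationStat fail) {
        long total = success.count + fail.count;
        return total == 0 ? 0.0 : (double) fail.count / total;
    }

    public static void main(String[] args) {
        // Made-up sample values, standing in for getComputeInfer(),
        // getSuccess() and getFail() results.
        DurationStat computeInfer = new DurationStat(4, 8_000_000L);
        DurationStat success = new DurationStat(3, 9_000_000L);
        DurationStat fail = new DurationStat(1, 1_000_000L);

        System.out.println("avg compute infer (ms): " + averageMillis(computeInfer));
        System.out.println("failure rate: " + failureRate(success, fail));
    }
}
```

Guarding against a zero count matters in practice, since a freshly loaded model may report statistics before any response has been produced.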