Class InferInput

java.lang.Object
com.gencior.triton.core.InferInput

public class InferInput extends Object
An InferInput object describes an input tensor for an inference request.

This class is designed to align with the Python tritonclient.grpc.InferInput implementation. It provides a high-level API that serializes data into the raw byte format required by Triton, and it supports various data types and shared memory regions.

Key features include:

  • Automatic Little-Endian serialization for numeric types.
  • Length-prefixed serialization for BYTES (String) data types.
  • Data size validation against the provided tensor shape.
  • Support for System and CUDA Shared Memory parameters.
Author:
sachachoumiloff
  • Constructor Details

    • InferInput

      public InferInput(String name, long[] shape, TritonDataType datatype)
      Creates an InferInput instance.
      Parameters:
      name - The name of the input.
      shape - The shape of the associated input.
      datatype - The TritonDataType of the associated input.
  • Method Details

    • getName

      public String getName()
      Returns:
      The name of the input associated with this Input Tensor.
    • getDatatype

      public TritonDataType getDatatype()
      Returns:
      The datatype of the input associated with this object.
    • getShape

      public long[] getShape()
      Returns:
      The current shape of the input as an array of longs.
    • setShape

      public InferInput setShape(long[] shape)
      Updates the shape of the input. Useful for models with dynamic input shapes.
      Parameters:
      shape - The new shape for the associated input.
      Returns:
      This InferInput instance for method chaining.
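
      The class validates data size against the tensor shape, so it can help to see how a shape maps to an element count. The sketch below is only an illustration: ShapeSketch, elementCount and requireMatches are hypothetical names, not part of InferInput, and rejecting negative dimensions is an assumption about how a concrete request shape is treated.

```java
// Hypothetical helper, not part of the documented InferInput API.
final class ShapeSketch {

    // Multiplies all dimensions; an empty shape is treated as a scalar (1 element).
    static long elementCount(long[] shape) {
        long count = 1;
        for (long dim : shape) {
            if (dim < 0) {
                // Assumption: a request shape must be fully resolved, so no -1 wildcards.
                throw new IllegalArgumentException("Dimension must be non-negative: " + dim);
            }
            count = Math.multiplyExact(count, dim); // throws ArithmeticException on overflow
        }
        return count;
    }

    // Throws if the number of supplied elements differs from what the shape implies.
    static void requireMatches(long[] shape, int dataLength) {
        long expected = elementCount(shape);
        if (expected != dataLength) {
            throw new IllegalArgumentException(
                "Shape expects " + expected + " elements but got " + dataLength);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementCount(new long[] {2, 3, 4})); // prints 24
        requireMatches(new long[] {2, 3, 4}, 24);               // passes silently
    }
}
```

      When the shape changes between requests, the element count has to be recomputed before new data is checked against it.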
    • setData

      public InferInput setData(float[] data)
      Sets the tensor data from a float array. Validates that the input matches the expected "FP32" datatype and size.
      Parameters:
      data - The float array to be used as input.
      Returns:
      This InferInput instance for method chaining.
      Throws:
      TritonDataTypeException - If the input datatype is not FP32.
      TritonShapeMismatchException - If the data size does not match the tensor shape.
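
      To make the Little-Endian serialization concrete, here is a minimal sketch of how a float array could be turned into raw FP32 bytes with java.nio. Fp32EncodeSketch is a hypothetical class written for illustration; it is not the library's actual implementation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration of FP32 Little-Endian serialization.
final class Fp32EncodeSketch {

    // Writes each float as 4 bytes, least significant byte first.
    static byte[] encode(float[] data) {
        ByteBuffer buffer = ByteBuffer
            .allocate(data.length * Float.BYTES)
            .order(ByteOrder.LITTLE_ENDIAN);
        for (float value : data) {
            buffer.putFloat(value);
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        // 1.0f is 0x3F800000 in IEEE 754, so the bytes are 00 00 80 3F.
        System.out.println(Arrays.toString(encode(new float[] {1.0f}))); // prints [0, 0, -128, 63]
    }
}
```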
    • setData

      public InferInput setData(double[] data)
      Sets the tensor data from a double array (FP64).
      Parameters:
      data - The double array to be used as input.
      Returns:
      This InferInput instance for method chaining.
    • setData

      public InferInput setData(int[] data)
      Sets the tensor data from an int array. Supports INT32, INT16, or INT8 datatypes.
      Parameters:
      data - The int array to be used as input.
      Returns:
      This InferInput instance for method chaining.
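
      Because one int array can feed INT32, INT16, or INT8 tensors, each element has to be written at the width of the target datatype. The sketch below shows one plausible way to do that. IntEncodeSketch and its bytesPerElement parameter are hypothetical, and rejecting out-of-range values (rather than truncating them) is an assumption, not documented behavior.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration of width-aware integer serialization.
final class IntEncodeSketch {

    // bytesPerElement: 4 for INT32, 2 for INT16, 1 for INT8.
    static byte[] encode(int[] data, int bytesPerElement) {
        if (bytesPerElement != 1 && bytesPerElement != 2 && bytesPerElement != 4) {
            throw new IllegalArgumentException("Unsupported width: " + bytesPerElement);
        }
        ByteBuffer buffer = ByteBuffer
            .allocate(data.length * bytesPerElement)
            .order(ByteOrder.LITTLE_ENDIAN);
        for (int value : data) {
            if (bytesPerElement == 1) {
                // Assumption: values outside the INT8 range are an error.
                if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
                    throw new ArithmeticException("Value does not fit in INT8: " + value);
                }
                buffer.put((byte) value);
            } else if (bytesPerElement == 2) {
                if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
                    throw new ArithmeticException("Value does not fit in INT16: " + value);
                }
                buffer.putShort((short) value);
            } else {
                buffer.putInt(value);
            }
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode(new int[] {1, -2}, 2))); // prints [1, 0, -2, -1]
    }
}
```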
    • setData

      public InferInput setData(long[] data)
      Sets the tensor data from a long array (INT64).
      Parameters:
      data - The long array to be used as input.
      Returns:
      This InferInput instance for method chaining.
    • setData

      public InferInput setData(boolean[] data)
      Sets the tensor data from a boolean array. Internally converts booleans to 1-byte integers (1 for true, 0 for false).
      Parameters:
      data - The boolean array to be used as input.
      Returns:
      This InferInput instance for method chaining.
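
      The boolean conversion described above can be sketched in a few lines. BoolEncodeSketch is a hypothetical class used only for illustration. Treating any non-zero byte as true when decoding is an assumption.

```java
import java.util.Arrays;

// Hypothetical illustration of the 1-byte-per-boolean layout.
final class BoolEncodeSketch {

    // true becomes 1, false becomes 0.
    static byte[] encode(boolean[] data) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ? 1 : 0);
        }
        return out;
    }

    // Assumption: any non-zero byte decodes to true.
    static boolean[] decode(byte[] raw) {
        boolean[] out = new boolean[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] != 0;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode(new boolean[] {true, false, true}))); // prints [1, 0, 1]
    }
}
```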
    • setData

      public InferInput setData(byte[] data)
      Sets the tensor data from raw bytes. Use this if you have already serialized the data externally.
      Parameters:
      data - The raw byte array.
      Returns:
      This InferInput instance for method chaining.
    • setData

      public InferInput setData(String[] data)
      Sets the tensor data from a String array (BYTES datatype). Each string is serialized with a 4-byte Little-Endian length prefix followed by the UTF-8 encoded string bytes.
      Parameters:
      data - The string array to be used as input.
      Returns:
      This InferInput instance for method chaining.
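
      The length-prefixed layout described above can be illustrated with a short, self-contained sketch. BytesEncodeSketch is a hypothetical class, not the library's code. It follows the stated format: a 4-byte Little-Endian length, then the UTF-8 bytes, repeated for each string.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of BYTES (length-prefixed) serialization.
final class BytesEncodeSketch {

    static byte[] encode(String[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String s : data) {
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            // The prefix holds the byte length of the UTF-8 payload, not the character count.
            byte[] prefix = ByteBuffer
                .allocate(Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(utf8.length)
                .array();
            out.write(prefix, 0, prefix.length);
            out.write(utf8, 0, utf8.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode(new String[] {"ab", "c"})));
        // prints [2, 0, 0, 0, 97, 98, 1, 0, 0, 0, 99]
    }
}
```

      Note that the prefix counts bytes, so a string with multi-byte UTF-8 characters has a prefix larger than its character count.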
    • getTensor

      Returns the underlying Protobuf message builder result.
      Returns:
      The InferInputTensor message.
    • getRawContent

      public byte[] getRawContent()
      Returns the serialized binary content of the tensor.
      Returns:
      A byte array containing the tensor data.
    • hasRawContent

      public boolean hasRawContent()
      Returns:
      true if raw content has been set and is not empty.
    • getDataAsFloatArray

      public float[] getDataAsFloatArray()
      Reconstructs a float array from the internal raw content.
      Returns:
      A float array representation of the data.
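
      Reconstruction is the reverse of Little-Endian serialization. As a rough sketch under that assumption, raw FP32 bytes can be viewed as floats through a java.nio buffer. Fp32DecodeSketch is a hypothetical class, and the length check is an added safeguard, not documented behavior.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.Arrays;

// Hypothetical illustration of decoding Little-Endian FP32 bytes.
final class Fp32DecodeSketch {

    static float[] decode(byte[] raw) {
        if (raw.length % Float.BYTES != 0) {
            throw new IllegalArgumentException("Length is not a multiple of 4: " + raw.length);
        }
        // The float view inherits the byte order set on the wrapping buffer.
        FloatBuffer view = ByteBuffer.wrap(raw)
            .order(ByteOrder.LITTLE_ENDIAN)
            .asFloatBuffer();
        float[] out = new float[view.remaining()];
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = {0, 0, (byte) 0x80, 0x3F}; // 1.0f in Little-Endian
        System.out.println(Arrays.toString(decode(raw))); // prints [1.0]
    }
}
```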
    • getDataAsDoubleArray

      public double[] getDataAsDoubleArray()
      Reconstructs a double array from the internal raw content.
      Returns:
      A double array representation of the data.
    • getDataAsIntArray

      public int[] getDataAsIntArray()
      Reconstructs an int array from the internal raw content.
      Returns:
      An int array representation of the data.
    • getDataAsLongArray

      public long[] getDataAsLongArray()
      Reconstructs a long array from the internal raw content.
      Returns:
      A long array representation of the data.
    • getDataAsBooleanArray

      public boolean[] getDataAsBooleanArray()
      Reconstructs a boolean array from the internal raw content.
      Returns:
      A boolean array representation of the data.
    • getDataAsStringArray

      public String[] getDataAsStringArray()
      Reconstructs a String array from the internal raw content (BYTES format).
      Returns:
      A String array representation of the data.
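
      Decoding the BYTES format means walking the buffer one length prefix at a time. The sketch below assumes the layout used for String input (a 4-byte Little-Endian length followed by UTF-8 bytes). BytesDecodeSketch is a hypothetical class, and its error handling for truncated input is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of decoding length-prefixed BYTES content.
final class BytesDecodeSketch {

    static String[] decode(byte[] raw) {
        ByteBuffer buffer = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        List<String> out = new ArrayList<>();
        while (buffer.hasRemaining()) {
            if (buffer.remaining() < Integer.BYTES) {
                throw new IllegalArgumentException("Truncated length prefix");
            }
            int length = buffer.getInt();
            if (length < 0 || length > buffer.remaining()) {
                throw new IllegalArgumentException("Invalid element length: " + length);
            }
            byte[] utf8 = new byte[length];
            buffer.get(utf8); // advances the position past this element
            out.add(new String(utf8, StandardCharsets.UTF_8));
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        byte[] raw = {2, 0, 0, 0, 97, 98, 1, 0, 0, 0, 99};
        System.out.println(Arrays.toString(decode(raw))); // prints [ab, c]
    }
}
```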