Class InferStreamHandle

java.lang.Object
com.gencior.triton.core.InferStreamHandle
All Implemented Interfaces:
AutoCloseable

public final class InferStreamHandle extends Object implements AutoCloseable
Handle for a streaming inference call, providing lifecycle control.

This class is protocol-agnostic and works with both gRPC and HTTP streaming implementations. The cancellation mechanism is injected by the transport layer.

Allows the caller to cancel an ongoing stream (e.g., abort LLM token generation mid-response), check completion status, and block until the stream finishes.

Implements AutoCloseable for use in try-with-resources blocks, where closing automatically cancels the stream if it is still running.

Usage — Wait for full completion:


 InferStreamHandle handle = client.inferStream("llm", inputs, listener);
 handle.await(60, TimeUnit.SECONDS);
 

Usage — Cancel generation early:


 InferStreamHandle handle = client.inferStream("llm", inputs, result -> {
     // Consume each streamed token as it arrives
     String token = result.asStringArray("text_output")[0];
     System.out.print(token);
 });
 // Cancel after 5 seconds if still running
 Thread.sleep(5000);
 if (!handle.isDone()) {
     handle.cancel();
 }
 

Usage — Auto-close with try-with-resources:


 try (InferStreamHandle handle = client.inferStream("llm", inputs, listener)) {
     handle.await(60, TimeUnit.SECONDS);
 } // stream is cancelled here if still running
 
Since:
1.0.0
Author:
sachachoumiloff
  • Constructor Details

    • InferStreamHandle

      public InferStreamHandle(Runnable cancelAction, CompletableFuture<Void> completionFuture)
      Creates a new stream handle.
      Parameters:
      cancelAction - the action to execute when cancel is requested (e.g., gRPC context cancellation, HTTP request abort)
      completionFuture - the future that completes when the stream ends
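The constructor's two arguments can be illustrated with a minimal sketch of how a transport might wire them together. Everything here is hypothetical: `HandleSketch` is a stand-in with the same constructor shape (the internals of the real `InferStreamHandle` are not documented on this page), and `TransportWiringDemo` only imitates a gRPC call or HTTP request with a flag.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical transport: imitates a gRPC call or HTTP request with a flag.
final class TransportWiringDemo {
    final AtomicBoolean aborted = new AtomicBoolean(false);
    final CompletableFuture<Void> done = new CompletableFuture<>();

    // The transport decides what "cancel" means and injects it as a Runnable.
    HandleSketch open() {
        Runnable cancelAction = () -> {
            aborted.set(true);   // e.g. cancel the gRPC context or abort the HTTP request
            done.cancel(false);  // mark the stream as finished
        };
        return new HandleSketch(cancelAction, done);
    }

    public static void main(String[] args) {
        TransportWiringDemo transport = new TransportWiringDemo();
        HandleSketch handle = transport.open();
        System.out.println("done before cancel: " + handle.isDone());
        handle.cancel();
        System.out.println("aborted: " + transport.aborted.get() + ", done: " + handle.isDone());
    }
}

// Hypothetical stand-in with the same constructor shape as InferStreamHandle.
final class HandleSketch {
    private final Runnable cancelAction;
    private final CompletableFuture<Void> completionFuture;

    HandleSketch(Runnable cancelAction, CompletableFuture<Void> completionFuture) {
        this.cancelAction = cancelAction;
        this.completionFuture = completionFuture;
    }

    void cancel() {
        cancelAction.run();
    }

    boolean isDone() {
        return completionFuture.isDone();
    }
}
```

Keeping the cancel action as a plain `Runnable` is what lets the handle stay protocol-agnostic: the handle never needs to know which transport created it.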
  • Method Details

    • cancel

      public void cancel()
      Cancels the streaming call, aborting token generation on the server.

      The exact mechanism depends on the transport: gRPC sends an RST_STREAM frame, while HTTP aborts the connection. This method is safe to call multiple times and from any thread.
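One common way to make a cancel call safe to repeat and safe across threads is to guard the action with an atomic flag. The sketch below shows that technique only; `OnceOnlyCancel` and `IdempotentCancelDemo` are hypothetical names, and this page does not say whether `InferStreamHandle` uses this approach internally.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class IdempotentCancelDemo {
    // Calls cancel() from several threads at once and returns how often the action ran.
    static int cancelFromThreads(int threads) {
        AtomicInteger runs = new AtomicInteger();
        OnceOnlyCancel guard = new OnceOnlyCancel(runs::incrementAndGet);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await(); // line all workers up so they race on cancel()
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                guard.cancel();
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("cancel action ran " + cancelFromThreads(8) + " time(s)");
    }
}

// Hypothetical guard: only the first caller to flip the flag runs the action.
final class OnceOnlyCancel {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);
    private final Runnable cancelAction;

    OnceOnlyCancel(Runnable cancelAction) {
        this.cancelAction = cancelAction;
    }

    void cancel() {
        if (cancelled.compareAndSet(false, true)) {
            cancelAction.run();
        }
    }
}
```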

    • isDone

      public boolean isDone()
      Returns whether the stream has finished (completed, errored, or cancelled).
      Returns:
      true if the stream has completed, failed, or been cancelled; false if it is still running
    • await

      public void await() throws InterruptedException, ExecutionException
      Blocks until the stream completes, with no upper bound on the wait. Use await(long, TimeUnit) to limit how long the caller blocks.
      Throws:
      InterruptedException - if the current thread is interrupted
      ExecutionException - if the stream completed with an error
    • await

      public void await(long timeout, TimeUnit unit) throws InterruptedException, ExecutionException, TimeoutException
      Blocks until the stream completes or the timeout expires.
      Parameters:
      timeout - the maximum time to wait
      unit - the time unit of the timeout argument
      Throws:
      InterruptedException - if the current thread is interrupted
      ExecutionException - if the stream completed with an error
      TimeoutException - if the timeout expired before the stream completed
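The three checked exceptions call for different reactions, which the following sketch separates. It assumes that `await(timeout, unit)` behaves like `CompletableFuture.get(timeout, unit)` on the completion future, so it uses a plain `CompletableFuture<Void>` as a stand-in; `AwaitOutcomeDemo` and `waitFor` are hypothetical names. Note that `CompletableFuture.get` throws the unchecked `CancellationException` for a cancelled future; whether the real `await` surfaces that case the same way is not stated on this page.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class AwaitOutcomeDemo {
    // Waits on the stand-in completion future and maps each outcome to a label.
    static String waitFor(CompletableFuture<Void> completion, long timeoutMillis) {
        try {
            completion.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return "completed";
        } catch (TimeoutException e) {
            // Stream is still running; the caller may keep waiting or cancel.
            return "timed out";
        } catch (ExecutionException e) {
            // The stream's own failure is wrapped; the cause carries the detail.
            return "failed: " + e.getCause().getMessage();
        } catch (CancellationException e) {
            return "cancelled";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> ok = CompletableFuture.completedFuture(null);
        CompletableFuture<Void> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("stream reset"));
        CompletableFuture<Void> pending = new CompletableFuture<>();

        System.out.println(waitFor(ok, 50));
        System.out.println(waitFor(failed, 50));
        System.out.println(waitFor(pending, 50));
    }
}
```

A timeout does not by itself stop the stream, so a caller that gives up after `TimeoutException` would typically follow up with `cancel()` or rely on try-with-resources.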
    • close

      public void close()
      Cancels the stream if it is still running. Equivalent to calling cancel().
      Specified by:
      close in interface AutoCloseable
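The "cancel only if still running" behaviour of `close()` can be observed with a small stand-in. `ClosableHandleSketch` and `CloseCancelsDemo` are hypothetical; the check against the completion future is an assumption about how such a handle could decide whether the stream is still running, not a description of the real implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

final class CloseCancelsDemo {
    // Returns how many times the cancel action ran once the try block was left.
    static int cancelsOnClose(boolean finishFirst) {
        AtomicInteger cancels = new AtomicInteger();
        CompletableFuture<Void> completion = new CompletableFuture<>();
        try (ClosableHandleSketch handle =
                 new ClosableHandleSketch(cancels::incrementAndGet, completion)) {
            if (finishFirst) {
                completion.complete(null); // stream ends normally inside the block
            }
            // otherwise the block is left while the stream is still running
        }
        return cancels.get();
    }

    public static void main(String[] args) {
        System.out.println("left early, cancels: " + cancelsOnClose(false));
        System.out.println("finished first, cancels: " + cancelsOnClose(true));
    }
}

// Hypothetical stand-in: close() cancels only when the stream has not finished.
final class ClosableHandleSketch implements AutoCloseable {
    private final Runnable cancelAction;
    private final CompletableFuture<Void> completion;

    ClosableHandleSketch(Runnable cancelAction, CompletableFuture<Void> completion) {
        this.cancelAction = cancelAction;
        this.completion = completion;
    }

    @Override
    public void close() {
        if (!completion.isDone()) {
            cancelAction.run();
        }
    }
}
```

Because `close()` declares no checked exception, the try-with-resources block needs no extra `catch` clause for the close itself.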