Package com.gencior.triton.core
Class InferStreamHandle
java.lang.Object
com.gencior.triton.core.InferStreamHandle
- All Implemented Interfaces:
AutoCloseable
Handle for a streaming inference call, providing lifecycle control.
This class is protocol-agnostic and works with both gRPC and HTTP streaming implementations. The cancellation mechanism is injected by the transport layer.
Allows the caller to cancel an ongoing stream (e.g., abort LLM token generation mid-response), check completion status, and block until the stream finishes.
Implements AutoCloseable for use in try-with-resources blocks,
where closing automatically cancels the stream if it is still running.
Usage — Wait for full completion:
InferStreamHandle handle = client.inferStream("llm", inputs, listener);
handle.await(60, TimeUnit.SECONDS);
Usage — Cancel generation early:
InferStreamHandle handle = client.inferStream("llm", inputs, result -> {
    String token = result.asStringArray("text_output")[0];
});
// Cancel after 5 seconds if still running
Thread.sleep(5000);
if (!handle.isDone()) {
    handle.cancel();
}
Usage — Auto-close with try-with-resources:
try (InferStreamHandle handle = client.inferStream("llm", inputs, listener)) {
    handle.await(60, TimeUnit.SECONDS);
} // stream is cancelled here if still running
- Since:
- 1.0.0
- Author:
- sachachoumiloff
Constructor Summary
Constructors
- InferStreamHandle(Runnable cancelAction, CompletableFuture<Void> completionFuture): Creates a new stream handle.
Method Summary
Modifier and type, method, description:
- void await(): Blocks until the stream completes.
- void await(long timeout, TimeUnit unit): Blocks until the stream completes or the timeout expires.
- void cancel(): Cancels the streaming call, aborting token generation on the server.
- void close(): Cancels the stream if still running.
- boolean isDone(): Returns whether the stream has finished (completed, errored, or cancelled).
Constructor Details
InferStreamHandle
public InferStreamHandle(Runnable cancelAction, CompletableFuture<Void> completionFuture)
Creates a new stream handle.
- Parameters:
- cancelAction - the action to execute when cancellation is requested (e.g., gRPC context cancellation, HTTP request abort)
- completionFuture - the future that completes when the stream ends
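The constructor's split between a cancel action and a completion future can be illustrated with a minimal sketch. WiringSketchHandle and FakeTransport below are hypothetical stand-ins written for this example, not the library's classes. They only mirror the documented constructor shape and show one way a transport layer might inject its own abort logic.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in that mirrors only the documented constructor,
// cancel() and isDone(); it is NOT the library's InferStreamHandle.
final class WiringSketchHandle {
    private final Runnable cancelAction;
    private final CompletableFuture<Void> completionFuture;

    WiringSketchHandle(Runnable cancelAction, CompletableFuture<Void> completionFuture) {
        this.cancelAction = cancelAction;
        this.completionFuture = completionFuture;
    }

    void cancel() {
        // Delegate to whatever the transport injected; skip if already finished.
        if (!completionFuture.isDone()) {
            cancelAction.run();
        }
    }

    boolean isDone() {
        return completionFuture.isDone();
    }
}

// Hypothetical transport: a real one would cancel a gRPC context
// or abort an HTTP request inside the cancel action.
final class FakeTransport {
    final AtomicInteger aborts = new AtomicInteger();
    final CompletableFuture<Void> streamEnd = new CompletableFuture<>();

    WiringSketchHandle openStream() {
        Runnable cancelAction = () -> {
            aborts.incrementAndGet();   // stands in for the real abort call
            streamEnd.cancel(false);    // mark the stream as finished
        };
        return new WiringSketchHandle(cancelAction, streamEnd);
    }
}
```

Because the handle only sees a Runnable and a future, the same handle code would work unchanged with either transport, which is the protocol-agnostic property described in the class overview.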
Method Details
cancel
public void cancel()
Cancels the streaming call, aborting token generation on the server.
The exact mechanism depends on the transport: gRPC sends an RST_STREAM frame, and HTTP aborts the connection. Safe to call multiple times and from any thread.
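One common way to make a cancel call safe to repeat and safe across threads is an atomic flag. The sketch below assumes that approach. OnceOnlyCancel is a hypothetical class for illustration, and this documentation does not show how InferStreamHandle actually implements the guarantee.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical guard; the library's actual implementation is not shown here.
final class OnceOnlyCancel {
    private final Runnable cancelAction;
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    OnceOnlyCancel(Runnable cancelAction) {
        this.cancelAction = cancelAction;
    }

    void cancel() {
        // compareAndSet lets exactly one caller win, whichever thread it runs on.
        if (cancelled.compareAndSet(false, true)) {
            cancelAction.run();
        }
    }

    // Calls cancel() from several threads at once and reports how often the action ran.
    static int runsUnderContention(int threads) {
        AtomicInteger runs = new AtomicInteger();
        OnceOnlyCancel guard = new OnceOnlyCancel(runs::incrementAndGet);
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();   // release all workers together
                    guard.cancel();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return runs.get();
    }
}
```

With this guard, later calls to cancel() become no-ops, so callers do not need to coordinate with each other before cancelling.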
isDone
public boolean isDone()
Returns whether the stream has finished (completed, errored, or cancelled).
- Returns:
- true if the stream is done
await
public void await() throws InterruptedException, ExecutionException
Blocks until the stream completes.
- Throws:
- InterruptedException - if the current thread is interrupted
- ExecutionException - if the stream completed with an error
await
public void await(long timeout, TimeUnit unit) throws InterruptedException, ExecutionException, TimeoutException
Blocks until the stream completes or the timeout expires.
- Parameters:
- timeout - the maximum time to wait
- unit - the time unit
- Throws:
- InterruptedException - if the current thread is interrupted
- ExecutionException - if the stream completed with an error
- TimeoutException - if the timeout expired before the stream completed
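The three exceptions listed for the timed await correspond to three distinct outcomes that a caller will usually want to handle separately. The following sketch shows one way to do that. AwaitOutcomeSketch and its TimedAwait interface are hypothetical names introduced for this example. TimedAwait has the same shape as the documented await(long, TimeUnit), so a method reference such as handle::await should fit it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class AwaitOutcomeSketch {
    // Hypothetical functional interface with the same shape as the
    // documented await(long timeout, TimeUnit unit) method.
    interface TimedAwait {
        void await(long timeout, TimeUnit unit)
                throws InterruptedException, ExecutionException, TimeoutException;
    }

    // Waits on the stream and maps each documented exception to a short outcome label.
    static String describe(TimedAwait stream, long timeout, TimeUnit unit) {
        try {
            stream.await(timeout, unit);
            return "completed";
        } catch (TimeoutException e) {
            // The wait gave up; the stream itself may still be running.
            return "timed out";
        } catch (ExecutionException e) {
            // The stream ended with an error; the cause carries the original failure.
            Throwable cause = e.getCause() != null ? e.getCause() : e;
            return "failed: " + cause.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the interrupt for callers
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        // A plain CompletableFuture stands in for a stream handle in this demo.
        CompletableFuture<Void> pending = new CompletableFuture<>();
        System.out.println(describe(pending::get, 50, TimeUnit.MILLISECONDS));
    }
}
```

A timeout only ends the wait, not the stream, so a caller that sees the "timed out" outcome would typically follow up with cancel() or rely on try-with-resources to do so.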
close
public void close()
Cancels the stream if still running. Equivalent to cancel().
- Specified by:
- close in interface AutoCloseable