Package inference
Interface ModelConfigOuterClass.ModelSequenceBatching.StrategyDirectOrBuilder
- All Superinterfaces:
com.google.protobuf.MessageLiteOrBuilder,com.google.protobuf.MessageOrBuilder
- All Known Implementing Classes:
ModelConfigOuterClass.ModelSequenceBatching.StrategyDirect,ModelConfigOuterClass.ModelSequenceBatching.StrategyDirect.Builder
- Enclosing class:
ModelConfigOuterClass.ModelSequenceBatching
public static interface ModelConfigOuterClass.ModelSequenceBatching.StrategyDirectOrBuilder
extends com.google.protobuf.MessageOrBuilder
Method Summary
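Modifier and Type    Method                            Description
long                 getMaxQueueDelayMicroseconds()    Maximum time, in microseconds, that a candidate request is delayed in the sequence batch scheduling queue to wait for additional requests.
float                getMinimumSlotUtilization()       Minimum slot utilization that must be satisfied to execute the batch before max_queue_delay_microseconds expires.
Methods inherited from interface com.google.protobuf.MessageLiteOrBuilder
isInitialized
Methods inherited from interface com.google.protobuf.MessageOrBuilder
findInitializationErrors, getAllFields, getDefaultInstanceForType, getDescriptorForType, getField, getInitializationErrorString, getOneofFieldDescriptor, getRepeatedField, getRepeatedFieldCount, getUnknownFields, hasField, hasOneof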
Method Details
getMaxQueueDelayMicroseconds
long getMaxQueueDelayMicroseconds()
The maximum time, in microseconds, that a candidate request will be delayed in the sequence batch scheduling queue to wait for additional requests for batching. Default is 0.
uint64 max_queue_delay_microseconds = 1;
- Returns:
- The maxQueueDelayMicroseconds.
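The proto field is a uint64, which protobuf's Java code generator represents as a signed long, so values above Long.MAX_VALUE would appear negative. The following is a minimal sketch of how a caller might convert the returned value to a java.time.Duration. The class QueueDelaySketch and the helper toDuration are hypothetical names used for illustration and are not part of this API.

```java
import java.time.Duration;
import java.time.temporal.ChronoUnit;

public class QueueDelaySketch {
    // Hypothetical helper: takes the raw value returned by
    // getMaxQueueDelayMicroseconds() and converts it to a Duration.
    public static Duration toDuration(long maxQueueDelayMicroseconds) {
        // A negative long here means the unsigned field value does not
        // fit in a signed 64-bit integer; report it in unsigned form.
        if (maxQueueDelayMicroseconds < 0) {
            throw new IllegalArgumentException(
                "delay too large: "
                    + Long.toUnsignedString(maxQueueDelayMicroseconds)
                    + " microseconds");
        }
        return Duration.of(maxQueueDelayMicroseconds, ChronoUnit.MICROS);
    }

    public static void main(String[] args) {
        // 0 is the documented default for the field.
        System.out.println(toDuration(0L));
        System.out.println(toDuration(1500L));
    }
}
```

With a real message, the argument would be the result of calling getMaxQueueDelayMicroseconds() on a StrategyDirect or its Builder.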
getMinimumSlotUtilization
float getMinimumSlotUtilization()
The minimum slot utilization that must be satisfied to execute the batch before 'max_queue_delay_microseconds' expires. For example, a value of 0.5 indicates that the batch should be executed as soon as 50% or more of the slots are ready, even if the 'max_queue_delay_microseconds' timeout has not expired. The default is 0.0, indicating that a batch will be executed before the 'max_queue_delay_microseconds' timeout expires if at least one batch slot is ready. 'max_queue_delay_microseconds' will be ignored unless minimum_slot_utilization is set to a non-zero value.
float minimum_slot_utilization = 2;
- Returns:
- The minimumSlotUtilization.
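The interaction between the two fields can be sketched as a small decision function. This is only one reading of the field comments above, not the scheduler's actual implementation. The class DirectStrategySketch, the interface DirectSettings (a stand-in that mirrors the two getters of StrategyDirectOrBuilder so the example compiles without protobuf), and the parameters readySlots, totalSlots, and waitedMicros are all assumptions made for illustration.

```java
public class DirectStrategySketch {
    /** Hypothetical stand-in with the same two getters as StrategyDirectOrBuilder. */
    public interface DirectSettings {
        long getMaxQueueDelayMicroseconds();
        float getMinimumSlotUtilization();
    }

    /** Hypothetical factory for fixed settings, used only by this sketch. */
    public static DirectSettings settings(final long delayMicros, final float minUtilization) {
        return new DirectSettings() {
            @Override public long getMaxQueueDelayMicroseconds() { return delayMicros; }
            @Override public float getMinimumSlotUtilization() { return minUtilization; }
        };
    }

    /** Decides whether a batch should run now, per the documented field semantics. */
    public static boolean shouldExecute(DirectSettings s, int readySlots,
                                        int totalSlots, long waitedMicros) {
        if (readySlots <= 0 || totalSlots <= 0) {
            return false; // nothing to execute
        }
        float minUtilization = s.getMinimumSlotUtilization();
        if (minUtilization == 0.0f) {
            // Default: the delay is ignored and one ready slot is enough.
            return true;
        }
        float utilization = (float) readySlots / totalSlots;
        if (utilization >= minUtilization) {
            return true; // enough slots are ready, no need to wait further
        }
        // Otherwise wait until the maximum queue delay has elapsed.
        // The field is a uint64, so compare the values as unsigned.
        return Long.compareUnsigned(waitedMicros, s.getMaxQueueDelayMicroseconds()) >= 0;
    }

    public static void main(String[] args) {
        DirectSettings s = settings(1000L, 0.5f);
        System.out.println(shouldExecute(s, 2, 4, 0L));    // 50% ready
        System.out.println(shouldExecute(s, 1, 4, 999L));  // 25% ready, delay not expired
        System.out.println(shouldExecute(s, 1, 4, 1000L)); // 25% ready, delay expired
    }
}
```

Because StrategyDirect and StrategyDirect.Builder both implement StrategyDirectOrBuilder, a function written against the real interface would accept either a built message or a builder that is still being populated.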