Package inference

Interface ModelConfigOuterClass.ModelInstanceGroupOrBuilder

All Superinterfaces:
com.google.protobuf.MessageLiteOrBuilder, com.google.protobuf.MessageOrBuilder
All Known Implementing Classes:
ModelConfigOuterClass.ModelInstanceGroup, ModelConfigOuterClass.ModelInstanceGroup.Builder
Enclosing class:
ModelConfigOuterClass

public static interface ModelConfigOuterClass.ModelInstanceGroupOrBuilder extends com.google.protobuf.MessageOrBuilder
  • Method Details

    • getName

      String getName()
      @@  .. cpp:var:: string name
      @@
      @@     Optional name of this group of instances. If not specified, the
      @@     name will be formed as <model name>_<group number>. The name of
      @@     individual instances will be further formed by a unique instance
      @@     number and GPU index:
      @@
       
      string name = 1;
      Returns:
      The name.
    • getNameBytes

      com.google.protobuf.ByteString getNameBytes()
      @@  .. cpp:var:: string name
      @@
      @@     Optional name of this group of instances. If not specified, the
      @@     name will be formed as <model name>_<group number>. The name of
      @@     individual instances will be further formed by a unique instance
      @@     number and GPU index:
      @@
       
      string name = 1;
      Returns:
      The bytes for name.
    • getKindValue

      int getKindValue()
      @@  .. cpp:var:: Kind kind
      @@
      @@     The kind of this instance group. Default is KIND_AUTO. If
      @@     KIND_AUTO or KIND_GPU then both 'count' and 'gpu' are valid and
      @@     may be specified. If KIND_CPU or KIND_MODEL only 'count' is valid
      @@     and 'gpu' cannot be specified.
      @@
       
      .inference.ModelInstanceGroup.Kind kind = 4;
      Returns:
      The enum numeric value on the wire for kind.
    • getKind

      ModelConfigOuterClass.ModelInstanceGroup.Kind getKind()
      @@  .. cpp:var:: Kind kind
      @@
      @@     The kind of this instance group. Default is KIND_AUTO. If
      @@     KIND_AUTO or KIND_GPU then both 'count' and 'gpu' are valid and
      @@     may be specified. If KIND_CPU or KIND_MODEL only 'count' is valid
      @@     and 'gpu' cannot be specified.
      @@
       
      .inference.ModelInstanceGroup.Kind kind = 4;
      Returns:
      The kind.
    • getCount

      int getCount()
      @@  .. cpp:var:: int32 count
      @@
      @@     For a group assigned to GPU, the number of instances created for
      @@     each GPU listed in 'gpus'. For a group assigned to CPU the number
      @@     of instances created. Default is 1.
       
      int32 count = 2;
      Returns:
      The count.
    • hasRateLimiter

      boolean hasRateLimiter()
      @@  .. cpp:var:: ModelRateLimiter rate_limiter
      @@
      @@     The rate limiter specific settings to be associated with this
      @@     instance group. Optional, if not specified no rate limiting
      @@     will be applied to this instance group.
      @@
       
      .inference.ModelRateLimiter rate_limiter = 6;
      Returns:
      Whether the rateLimiter field is set.
    • getRateLimiter

      ModelConfigOuterClass.ModelRateLimiter getRateLimiter()
      @@  .. cpp:var:: ModelRateLimiter rate_limiter
      @@
      @@     The rate limiter specific settings to be associated with this
      @@     instance group. Optional, if not specified no rate limiting
      @@     will be applied to this instance group.
      @@
       
      .inference.ModelRateLimiter rate_limiter = 6;
      Returns:
      The rateLimiter.
    • getRateLimiterOrBuilder

      ModelConfigOuterClass.ModelRateLimiterOrBuilder getRateLimiterOrBuilder()
      @@  .. cpp:var:: ModelRateLimiter rate_limiter
      @@
      @@     The rate limiter specific settings to be associated with this
      @@     instance group. Optional, if not specified no rate limiting
      @@     will be applied to this instance group.
      @@
       
      .inference.ModelRateLimiter rate_limiter = 6;
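
      In a model's config.pbtxt, the rate_limiter field above attaches rate-limiter settings to an instance_group entry. A minimal sketch, assuming the Triton rate-limiter config schema; the resource name "R1" and the counts are illustrative values, not defaults:

      ```proto
      instance_group [
        {
          count: 2
          kind: KIND_GPU
          rate_limiter {
            resources [
              {
                name: "R1"   # illustrative resource name
                count: 4
              }
            ]
            priority: 2
          }
        }
      ]
      ```

      If rate_limiter is omitted, no rate limiting is applied to the instance group, as the comment above notes.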
    • getGpusList

      List<Integer> getGpusList()
      @@  .. cpp:var:: int32 gpus (repeated)
      @@
      @@     GPU(s) where instances should be available. For each GPU listed,
      @@     'count' instances of the model will be available. Setting 'gpus'
      @@     to empty (or not specifying at all) is equivalent to listing all
      @@     available GPUs.
      @@
       
      repeated int32 gpus = 3;
      Returns:
      A list containing the gpus.
    • getGpusCount

      int getGpusCount()
      @@  .. cpp:var:: int32 gpus (repeated)
      @@
      @@     GPU(s) where instances should be available. For each GPU listed,
      @@     'count' instances of the model will be available. Setting 'gpus'
      @@     to empty (or not specifying at all) is equivalent to listing all
      @@     available GPUs.
      @@
       
      repeated int32 gpus = 3;
      Returns:
      The count of gpus.
    • getGpus

      int getGpus(int index)
      @@  .. cpp:var:: int32 gpus (repeated)
      @@
      @@     GPU(s) where instances should be available. For each GPU listed,
      @@     'count' instances of the model will be available. Setting 'gpus'
      @@     to empty (or not specifying at all) is equivalent to listing all
      @@     available GPUs.
      @@
       
      repeated int32 gpus = 3;
      Parameters:
      index - The index of the element to return.
      Returns:
      The gpus at the given index.
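
      Taken together, name, kind, count, and gpus are typically written in config.pbtxt as an instance_group entry. A sketch with illustrative values; this entry places two instances of the model on each of GPUs 0 and 1:

      ```proto
      instance_group [
        {
          name: "group0"   # optional; defaults to <model name>_<group number>
          kind: KIND_GPU
          count: 2
          gpus: [ 0, 1 ]   # omit to use all available GPUs
        }
      ]
      ```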
    • getSecondaryDevicesList

      List<ModelConfigOuterClass.ModelInstanceGroup.SecondaryDevice> getSecondaryDevicesList()
      @@  .. cpp:var:: SecondaryDevice secondary_devices (repeated)
      @@
      @@     Secondary devices that are required by instances specified by this
      @@     instance group. Optional.
      @@
       
      repeated .inference.ModelInstanceGroup.SecondaryDevice secondary_devices = 8;
    • getSecondaryDevices

      ModelConfigOuterClass.ModelInstanceGroup.SecondaryDevice getSecondaryDevices(int index)
      @@  .. cpp:var:: SecondaryDevice secondary_devices (repeated)
      @@
      @@     Secondary devices that are required by instances specified by this
      @@     instance group. Optional.
      @@
       
      repeated .inference.ModelInstanceGroup.SecondaryDevice secondary_devices = 8;
    • getSecondaryDevicesCount

      int getSecondaryDevicesCount()
      @@  .. cpp:var:: SecondaryDevice secondary_devices (repeated)
      @@
      @@     Secondary devices that are required by instances specified by this
      @@     instance group. Optional.
      @@
       
      repeated .inference.ModelInstanceGroup.SecondaryDevice secondary_devices = 8;
    • getSecondaryDevicesOrBuilderList

      List<? extends ModelConfigOuterClass.ModelInstanceGroup.SecondaryDeviceOrBuilder> getSecondaryDevicesOrBuilderList()
      @@  .. cpp:var:: SecondaryDevice secondary_devices (repeated)
      @@
      @@     Secondary devices that are required by instances specified by this
      @@     instance group. Optional.
      @@
       
      repeated .inference.ModelInstanceGroup.SecondaryDevice secondary_devices = 8;
    • getSecondaryDevicesOrBuilder

      ModelConfigOuterClass.ModelInstanceGroup.SecondaryDeviceOrBuilder getSecondaryDevicesOrBuilder(int index)
      @@  .. cpp:var:: SecondaryDevice secondary_devices (repeated)
      @@
      @@     Secondary devices that are required by instances specified by this
      @@     instance group. Optional.
      @@
       
      repeated .inference.ModelInstanceGroup.SecondaryDevice secondary_devices = 8;
    • getProfileList

      List<String> getProfileList()
      @@  .. cpp:var:: string profile (repeated)
      @@
      @@     For TensorRT models containing multiple optimization profiles, this
      @@     parameter specifies a set of optimization profiles available to this
      @@     instance group. The inference server will choose the optimal profile
      @@     based on the shapes of the input tensors. This field should lie
      @@     between 0 and <TotalNumberOfOptimizationProfilesInPlanModel> - 1
      @@     and be specified only for TensorRT backend, otherwise an error will
      @@     be generated. If not specified, the server will select the first
      @@     optimization profile by default.
      @@
       
      repeated string profile = 5;
      Returns:
      A list containing the profile.
    • getProfileCount

      int getProfileCount()
      @@  .. cpp:var:: string profile (repeated)
      @@
      @@     For TensorRT models containing multiple optimization profiles, this
      @@     parameter specifies a set of optimization profiles available to this
      @@     instance group. The inference server will choose the optimal profile
      @@     based on the shapes of the input tensors. This field should lie
      @@     between 0 and <TotalNumberOfOptimizationProfilesInPlanModel> - 1
      @@     and be specified only for TensorRT backend, otherwise an error will
      @@     be generated. If not specified, the server will select the first
      @@     optimization profile by default.
      @@
       
      repeated string profile = 5;
      Returns:
      The count of profile.
    • getProfile

      String getProfile(int index)
      @@  .. cpp:var:: string profile (repeated)
      @@
      @@     For TensorRT models containing multiple optimization profiles, this
      @@     parameter specifies a set of optimization profiles available to this
      @@     instance group. The inference server will choose the optimal profile
      @@     based on the shapes of the input tensors. This field should lie
      @@     between 0 and <TotalNumberOfOptimizationProfilesInPlanModel> - 1
      @@     and be specified only for TensorRT backend, otherwise an error will
      @@     be generated. If not specified, the server will select the first
      @@     optimization profile by default.
      @@
       
      repeated string profile = 5;
      Parameters:
      index - The index of the element to return.
      Returns:
      The profile at the given index.
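
      The profile field restricts which TensorRT optimization profiles an instance group may use. A sketch, assuming profile values are the string form of profile indices per the comment above; the indices "0" and "2" are illustrative:

      ```proto
      instance_group [
        {
          kind: KIND_GPU
          count: 1
          profile: [ "0", "2" ]   # only valid for the TensorRT backend
        }
      ]
      ```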
    • getProfileBytes

      com.google.protobuf.ByteString getProfileBytes(int index)
      @@  .. cpp:var:: string profile (repeated)
      @@
      @@     For TensorRT models containing multiple optimization profiles, this
      @@     parameter specifies a set of optimization profiles available to this
      @@     instance group. The inference server will choose the optimal profile
      @@     based on the shapes of the input tensors. This field should lie
      @@     between 0 and <TotalNumberOfOptimizationProfilesInPlanModel> - 1
      @@     and be specified only for TensorRT backend, otherwise an error will
      @@     be generated. If not specified, the server will select the first
      @@     optimization profile by default.
      @@
       
      repeated string profile = 5;
      Parameters:
      index - The index of the value to return.
      Returns:
      The bytes of the profile at the given index.
    • getPassive

      boolean getPassive()
      @@  .. cpp:var:: bool passive
      @@
      @@     Whether the instances within this instance group will be accepting
      @@     inference requests from the scheduler. If true, the instances will
      @@     not be added to the scheduler. Default value is false.
      @@
       
      bool passive = 7;
      Returns:
      The passive.
    • getHostPolicy

      String getHostPolicy()
      @@  .. cpp:var:: string host_policy
      @@
      @@     The host policy name that the instance is to be associated with.
      @@     The default value is set to reflect the device kind of the instance,
      @@     for instance, KIND_CPU is "cpu", KIND_MODEL is "model" and
      @@     KIND_GPU is "gpu_<gpu_id>".
      @@
       
      string host_policy = 9;
      Returns:
      The hostPolicy.
    • getHostPolicyBytes

      com.google.protobuf.ByteString getHostPolicyBytes()
      @@  .. cpp:var:: string host_policy
      @@
      @@     The host policy name that the instance is to be associated with.
      @@     The default value is set to reflect the device kind of the instance,
      @@     for instance, KIND_CPU is "cpu", KIND_MODEL is "model" and
      @@     KIND_GPU is "gpu_<gpu_id>".
      @@
       
      string host_policy = 9;
      Returns:
      The bytes for hostPolicy.
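
      The passive and host_policy fields round out an instance_group entry. A sketch with a hypothetical host policy name ("numa0" is not a default, just an example of overriding the kind-derived default such as "cpu" or "gpu_0"); a passive instance is created but not added to the scheduler:

      ```proto
      instance_group [
        {
          kind: KIND_CPU
          count: 1
          passive: true          # instance will not receive scheduled requests
          host_policy: "numa0"   # hypothetical policy name
        }
      ]
      ```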