datasetinsights.estimators

datasetinsights.estimators.base

class datasetinsights.estimators.base.Estimator

Bases: object

Abstract base class for estimator.

An estimator is the master class of all modeling operations. At minimum, it includes:

1. input data and output data transformations (e.g. input image cropping, removing unused output labels…) when applicable.

2. neural network graph (model) for either pytorch or tensorflow.

3. procedures to execute model training and evaluation.

One estimator could support multiple tasks (e.g. Mask R-CNN can be used for semantic segmentation and object detection).
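A minimal sketch of a concrete subclass, shown for orientation; the class name and method bodies are illustrative assumptions, not a real estimator:

    from datasetinsights.estimators.base import Estimator

    class MyEstimator(Estimator):  # hypothetical subclass for illustration
        def train(self, **kwargs):
            # run data transformations and the training procedure here
            ...

        def evaluate(self, **kwargs):
            # run evaluation and report metrics here
            ...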

abstract evaluate(**kwargs)

Abstract method to evaluate estimators

abstract train(**kwargs)

Abstract method to train estimators

datasetinsights.estimators.base.create_estimator(name, config, *, tb_log_dir=None, no_cuda=None, checkpoint_dir=None, kfp_metrics_dir='/home/docs/checkouts/readthedocs.org/user_builds/datasetinsights/checkouts/0.2.2/metrics/20201027-200642', kfp_metrics_filename='mlpipeline-metrics.json', no_val=None, **kwargs)

Create a new instance of an estimator subclass

Parameters
  • name (str) – unique identifier for an estimator subclass

  • config (dict) – parameters specific to each estimator subclass, used to create an estimator instance

Returns

an instance of the specified estimator subclass
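A hedged usage sketch; the name string and config keys below are assumptions about the registry layout, not values confirmed by this page:

    from datasetinsights.estimators.base import create_estimator

    # "FasterRCNN" as the registry key and the config contents are
    # illustrative assumptions; consult the configs shipped with the package.
    estimator = create_estimator(
        name="FasterRCNN",
        config={"backbone": "resnet50"},
        tb_log_dir="/tmp/tb_logs",          # tensorboard log directory
        checkpoint_dir="/tmp/checkpoints",  # where model checkpoints are written
    )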

datasetinsights.estimators.deeplab

class datasetinsights.estimators.deeplab.DeeplabV3(*, config, writer, checkpointer, device, checkpoint_file=None, **kwargs)

Bases: datasetinsights.estimators.base.Estimator

DeeplabV3 Model https://arxiv.org/abs/1706.05587

Parameters
  • config (CfgNode) – estimator config

  • writer – Tensorboard writer object

  • checkpointer – Model checkpointer callback to save models

  • device – model training on device (cpu|cuda)

backbone

model backbone (resnet50|resnet101)

num_classes

number of classes for semantic segmentation

model

tensorflow or pytorch graph

writer

Tensorboard writer object

checkpointer

Model checkpointer callback to save models

device

model training on device (cpu|cuda)

optimizer

pytorch optimizer

lr_scheduler

pytorch learning rate scheduler

evaluate(**kwargs)

Abstract method to evaluate estimators

load(path)

Load Estimator from path

Parameters

path (str) – full path to the serialized estimator

save(path)

Serialize Estimator to path

Parameters

path (str) – full path to save serialized estimator

Returns

saved full path of the serialized estimator

train(**kwargs)

Abstract method to train estimators

class datasetinsights.estimators.deeplab.Normalize(mean, std)

Bases: object

class datasetinsights.estimators.deeplab.RandomCrop(size)

Bases: object

class datasetinsights.estimators.deeplab.ToTensor

Bases: object

Convert a pair of (image, target) to tensor
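A minimal sketch of what a paired to-tensor transform of this kind typically does, assuming PIL inputs; not necessarily this class's exact implementation:

    import numpy as np
    import torch
    from torchvision.transforms import functional as F

    class PairToTensor:  # illustrative stand-in for ToTensor
        def __call__(self, image, target):
            image = F.to_tensor(image)  # PIL image -> float tensor in [0, 1]
            # segmentation targets stay integer class ids, so no scaling
            target = torch.as_tensor(np.array(target), dtype=torch.int64)
            return image, target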

datasetinsights.estimators.deeplab.pad_if_smaller(img, size, fill=0)

datasetinsights.estimators.densedepth

class datasetinsights.estimators.densedepth.Decoder(num_features=2208, decoder_width=0.5)

Bases: torch.nn.modules.module.Module

forward(features)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class datasetinsights.estimators.densedepth.DenseDepth(config, writer, checkpointer, device, checkpoint_file, **kwargs)

Bases: datasetinsights.estimators.base.Estimator

config

estimator config

writer

Tensorboard writer object

model

tensorflow or pytorch graph

checkpointer

Model checkpointer callback to save models

device

model training on device (cpu|cuda)

optimizer

pytorch optimizer

evaluate(**kwargs)

Abstract method to evaluate estimators

load(path)

Load Estimator from path

Parameters

path (str) – full path to the serialized estimator

save(path)

Serialize Estimator to path

Parameters

path (str) – full path to save serialized estimator

Returns

saved full path of the serialized estimator

train(**kwargs)

Abstract method to train estimators

class datasetinsights.estimators.densedepth.DenseDepthModel

Bases: torch.nn.modules.module.Module

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class datasetinsights.estimators.densedepth.Encoder

Bases: torch.nn.modules.module.Module

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class datasetinsights.estimators.densedepth.RandomChannelSwap(probability)

Bases: object

Swap the color channels of the image

Parameters

probability – the probability of swapping the image's color channels

class datasetinsights.estimators.densedepth.ToTensor

Bases: object

Convert the image and depth to tensor.

class datasetinsights.estimators.densedepth.UpSample(skip_input, output_features)

Bases: torch.nn.modules.container.Sequential

forward(x, concat_with)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

datasetinsights.estimators.faster_rcnn

faster rcnn pytorch train and evaluate.

exception datasetinsights.estimators.faster_rcnn.BadLoss

Bases: Exception

Exception raised when a bad training loss is encountered.

class datasetinsights.estimators.faster_rcnn.BoxListToTensor

Bases: object

Transform bboxes to Tensor.

class datasetinsights.estimators.faster_rcnn.FasterRCNN(*, config, logdir, kfp_writer, checkpointer, box_score_thresh=0.05, no_cuda=None, checkpoint_file=None, **kwargs)

Bases: datasetinsights.estimators.base.Estimator

Faster-RCNN train/evaluate implementation for object detection.

https://github.com/pytorch/vision/tree/master/references/detection https://arxiv.org/abs/1506.01497

Parameters
  • config (CfgNode) – estimator config

  • box_score_thresh (float) – (optional) box score threshold; the default is 0.05

  • distributed – whether or not the estimator is distributed

  • kfp_metrics_filename – Kubeflow metrics filename

  • kfp_metrics_dir – path to the directory where Kubeflow metrics files are stored

model

pytorch model

writer

Tensorboard writer object

kfp_writer

KubeflowPipelineWriter object

checkpointer

Model checkpointer callback to save models

device

model training on device (cpu|cuda)

static collate_fn(batch)

Prepare batch to be in the format Faster RCNN expects.

Parameters

batch – mini batch of the form ((x0,x1,x2…xn),(y0,y1,y2…yn))

Returns

mini batch in the form [(x0,y0), (x1,y1), (x2,y2)… (xn,yn)]
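Functionally this is a transpose of the batch; a minimal sketch consistent with the documented shapes (the implementation itself is an assumption):

    def collate_sketch(batch):
        # ((x0, x1, ..., xn), (y0, y1, ..., yn)) -> [(x0, y0), ..., (xn, yn)]
        return list(zip(*batch))

    print(collate_sketch((("x0", "x1"), ("y0", "y1"))))
    # [('x0', 'y0'), ('x1', 'y1')]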

static create_optimizer_lrs(config, params)

create optimizer and learning rate scheduler.

Parameters
  • config (CfgNode) – estimator config

  • params – model parameters

Returns

optimizer – pytorch optimizer

lr_scheduler – pytorch LR scheduler

static create_sampler(is_distributed, *, dataset, is_train)

create a data sampler.

Parameters
  • is_distributed – whether or not the model is distributed

  • dataset – dataset obj; must have __len__ and __getitem__

  • is_train – whether or not the sampler is for training data

Returns

(torch.utils.data.Sampler)

Return type

data_sampler
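A plausible sketch of the branching this helper performs, built from standard pytorch samplers; the actual implementation may differ:

    from torch.utils.data import RandomSampler, SequentialSampler
    from torch.utils.data.distributed import DistributedSampler

    def create_sampler_sketch(is_distributed, *, dataset, is_train):
        if is_distributed:
            # shard the dataset across processes; shuffle only for training
            return DistributedSampler(dataset, shuffle=is_train)
        return RandomSampler(dataset) if is_train else SequentialSampler(dataset)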

evaluate(test_data, **kwargs)

evaluate given dataset.

evaluate_per_epoch(*, data_loader, epoch, label_mappings, max_detections_per_img=100, synchronize_metrics=True)

Evaluate model performance per epoch.

Note, torchvision’s implementation of faster rcnn requires input and gt data for training mode and returns a dictionary of losses (which we need to record the loss). We also need to get the raw predictions, which is only possible in model.eval() mode, to calculate the evaluation metric.

Parameters
  • data_loader (DataLoader) – pytorch dataloader

  • epoch (int) – current epoch, used for logging

  • label_mappings (dict) – a dict of {label_id: label_name} mapping

  • max_detections_per_img – max number of targets or predictions allowed per example

  • is_distributed – whether or not the model is distributed

  • synchronize_metrics – whether or not to synchronize evaluation metrics across processes

static get_transform()

transform bounding box and tensor.

load(path)

Load Estimator from path.

Parameters

path (str) – full path to the serialized estimator

log_metric_val(label_mappings, epoch)

log metric values.

Parameters
  • label_mappings (dict) – a dict of {label_id: label_name} mapping

  • epoch (int) – current epoch, used for logging

predict(pil_img, box_score_thresh=0.5)

Get prediction from one image using loaded model.

Parameters
  • pil_img (PIL Image) – PIL image from dataset.

  • box_score_thresh (float) – box score threshold used to filter out lower score bounding boxes. Defaults to 0.5.

Returns

high predicted score bboxes from the model.

Return type

filtered_pred_annotations (List[BBox2D])

save(path)

Serialize Estimator to path.

Parameters

path (str) – full path to save serialized estimator

Returns

saved full path of the serialized estimator

train(train_data, val_data=None, **kwargs)

start training, save trained model per epoch.

Parameters
  • train_data – Directory on localhost where train dataset is located.

  • val_data – Directory on localhost where validation dataset is located.

train_loop(*, train_dataloader, label_mappings, val_dataloader, train_sampler=None)

train on whole range of epochs.

Parameters
  • train_dataloader (torch.utils.data.DataLoader) –

  • label_mappings (dict) – a dict of {label_id: label_name} mapping

  • val_dataloader (torch.utils.data.DataLoader) –

  • train_sampler – (torch.utils.data.Sampler)

train_one_epoch(*, optimizer, data_loader, epoch, lr_scheduler, accumulation_steps)

train per epoch.

Parameters
  • optimizer – pytorch optimizer

  • data_loader (DataLoader) – pytorch dataloader

  • epoch (int) – current epoch

  • lr_scheduler – Pytorch LR scheduler

  • accumulation_steps (int) – accumulated gradients are only updated after X steps. This creates an effective batch size of batch_size * accumulation_steps (see the sketch below).
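A generic sketch of gradient accumulation under that description; the tiny model and random data are placeholders, not the estimator's real loop:

    import torch
    from torch import nn

    model = nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    data_loader = [(torch.randn(8, 4), torch.randint(0, 2, (8,))) for _ in range(4)]
    accumulation_steps = 2

    for step, (images, targets) in enumerate(data_loader):
        loss = loss_fn(model(images), targets)
        (loss / accumulation_steps).backward()  # accumulate scaled gradients
        if (step + 1) % accumulation_steps == 0:
            optimizer.step()       # effective batch size = 8 * 2
            optimizer.zero_grad()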

class datasetinsights.estimators.faster_rcnn.Loss

Bases: object

Record Loss during epoch.

compute()

compute avg loss.

Returns (float): avg. loss

reset()

reset loss.

update(avg_loss, batch_size)

update loss.
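A minimal sketch of how such a running-loss tracker could be implemented; the method names mirror the documented API, the internals are assumptions:

    class RunningLoss:  # illustrative stand-in for Loss
        def __init__(self):
            self._total, self._count = 0.0, 0

        def update(self, avg_loss, batch_size):
            # accumulate batch-size-weighted loss
            self._total += avg_loss * batch_size
            self._count += batch_size

        def compute(self):
            # average loss over everything seen since the last reset
            return self._total / self._count if self._count else 0.0

        def reset(self):
            self._total, self._count = 0.0, 0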

class datasetinsights.estimators.faster_rcnn.ToTensor

Bases: object

transform to tensor.

datasetinsights.estimators.faster_rcnn.canonical2list(bbox: datasetinsights.io.bbox.BBox2D)

convert a BBox2D into a single list.

Parameters

bbox

Returns

attribute list of BBox2D

datasetinsights.estimators.faster_rcnn.convert_bboxes2canonical(bboxes)

convert bounding boxes to canonical.

convert bounding boxes from the format used by pytorch torchvision’s faster rcnn model into our canonical format, a list of list of BBox2Ds. Faster RCNN format: https://github.com/pytorch/vision/blob/master/torchvision/models/detection/faster_rcnn.py#L45

Parameters

bboxes (List[Dict[str, torch.Tensor]]) – A list of dictionaries. Each item in the list corresponds to the bounding boxes for one example. The dictionary must have the keys ‘boxes’ and ‘labels’. The value for ‘boxes’ is the ground-truth boxes in FloatTensor[N, 4], in [x1, y1, x2, y2] format, with values between 0 and H and 0 and W. The value for ‘labels’ is the Int64Tensor[N] class label for each ground-truth box. If the dictionary has the key ‘scores’, then those values are used for the confidence score of the BBox2D; otherwise the score is set to 1.

Returns (list[List[BBox2D]]):

Each element in the list corresponds to the list of bounding boxes for an example.
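A usage sketch with illustrative tensor values (the box, label, and score numbers are made up):

    import torch
    from datasetinsights.estimators.faster_rcnn import convert_bboxes2canonical

    preds = [{
        "boxes": torch.tensor([[10.0, 20.0, 110.0, 220.0]]),  # [x1, y1, x2, y2]
        "labels": torch.tensor([3]),
        "scores": torch.tensor([0.92]),  # optional; score defaults to 1 if absent
    }]
    canonical = convert_bboxes2canonical(preds)  # -> [[BBox2D, ...]]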

datasetinsights.estimators.faster_rcnn.create_dataloader(distributed, dataset, sampler, train, *, batch_size=1, num_workers=0, collate_fn=None)

load dataset and create dataloader.

Parameters
  • distributed – whether or not the dataloader is distributed

  • dataset – dataset obj; must have __len__ and __getitem__

  • sampler – (torch.utils.data.Sampler)

  • train – whether or not the sampler is for training data

  • batch_size – number of examples per mini batch

  • num_workers – number of dataloader worker processes

  • collate_fn – prepares the batch to be in the format Faster RCNN expects

Returns data_loader:

torch.utils.data.DataLoader
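A minimal sketch of how the documented arguments map onto a pytorch DataLoader; the toy dataset is a placeholder and the exact mapping is an assumption:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    dataset = TensorDataset(torch.arange(8).float())
    loader = DataLoader(
        dataset,
        batch_size=2,
        sampler=None,      # e.g. a DistributedSampler when distributed
        num_workers=0,
        collate_fn=None,   # e.g. FasterRCNN.collate_fn for detection data
    )
    for batch in loader:
        print(batch)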

datasetinsights.estimators.faster_rcnn.create_dataset(config, data_path, split)

download dataset from source.

Parameters
  • config (CfgNode) – estimator config

  • data_path – Directory on localhost where datasets are located.

  • split – train, val, test

Returns dataset: dataset obj; must have __len__ and __getitem__

datasetinsights.estimators.faster_rcnn.dataloader_creator(config, dataset, sampler, split, distributed)

initiate data loading.

Parameters
  • config (CfgNode) – estimator config

  • dataset – dataset obj; must have __len__ and __getitem__

  • sampler – (torch.utils.data.Sampler)

  • split – train, val, test

  • distributed – whether or not the dataloader is distributed

Returns data_loader:

torch.utils.data.DataLoader

datasetinsights.estimators.faster_rcnn.gather_gt_preds(*, gt_preds, device, max_boxes=100)

gather list of prediction.

Parameters
  • gt_preds (list(tuple(list(BBox2d), list(BBox2d)))) – A list of tuples where the first element in each tuple is a list of bounding boxes corresponding to the targets in an example, and the second element in the tuple corresponds to the predictions in that example

  • device

  • max_boxes – the maximum number of boxes allowed for either targets or predictions per image

Returns (list(tuple(list(BBox2d), list(BBox2d)))): a list in the same format as gt_preds but containing all the information across processes, e.g. if rank 0 has gt_preds = [([box_0], []), ([box_1], [box_2, box_2.5])] and rank 1 has gt_preds = [([], [box_3]), ([box_4, box_5], [])], then this function will return [([box_0], []), ([box_1], [box_2, box_2.5]), ([], [box_3]), ([box_4, box_5], [])]. The returned list is consistent across all processes.

datasetinsights.estimators.faster_rcnn.list2canonical(box_list)

convert a list into a Bbox2d.

Parameters

box_list – box represented in list format

Returns

BBox2d

datasetinsights.estimators.faster_rcnn.list3d_2canonical(batch)

convert 3d list to canonical.

convert a list of list of padded targets and predictions per example, where bounding boxes are represented by lists, into the same format except that the boxes are represented by the BBox2d class and the padded boxes are removed.

Parameters

batch – [[[gt], [prds]], [[gt], [prds]], …] where gt and prds are lists of lists

Returns

[([gt], [preds]), ([gt], [preds]), …] where gt and preds are lists of BBox2ds

datasetinsights.estimators.faster_rcnn.metric_per_class_plot(metric_name, data, label_mappings, figsize=(20, 10))

Bar plot for metric per class.

Parameters
  • metric_name (str) – metric name.

  • data (dict) – a dictionary of metric per label.

  • label_mappings (dict) – a dict of {label_id: label_name} mapping

  • figsize (tuple) – figure size of the plot. Default is (20, 10)

Returns (matplotlib.pyplot.figure):

a bar plot for metric per class.
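A hedged sketch of such a per-class bar plot; the data and label_mappings values are illustrative, and mAP is assumed as the metric name:

    import matplotlib.pyplot as plt

    data = {1: 0.62, 2: 0.85, 3: 0.41}  # {label_id: metric value}
    label_mappings = {1: "car", 2: "person", 3: "bike"}

    fig, ax = plt.subplots(figsize=(20, 10))
    ax.bar([label_mappings[k] for k in data], list(data.values()))
    ax.set_title("mAP per class")  # metric_name
    ax.set_ylabel("mAP")
    fig.savefig("metric_per_class.png")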

datasetinsights.estimators.faster_rcnn.pad_box_lists(gt_preds: List[Tuple[List[datasetinsights.io.bbox.BBox2D], List[datasetinsights.io.bbox.BBox2D]]], max_boxes_per_img=100)

Pad the list of boxes.

Pad the list of boxes and targets with placeholder boxes so that all targets and predictions have the same number of elements.

Parameters
  • gt_preds (list(tuple(list(BBox2d), list(BBox2d)))) – A list of tuples where the first element in each tuple is a list of bounding boxes corresponding to the targets in an example, and the second element in the tuple corresponds to the predictions in that example

  • max_boxes_per_img – maximum number of target boxes and predicted boxes per image

Returns: same format as gt_preds but all examples will have the same number of targets and predictions. If there are fewer targets or predictions than max_boxes_per_img, then boxes with nan values are added.

datasetinsights.estimators.faster_rcnn.prepare_bboxes(bboxes: List[datasetinsights.io.bbox.BBox2D]) → Dict[str, torch.Tensor]

Prepare bounding boxes for model training.

Parameters
  • bboxes – mini batch of bounding boxes (not including images). Each example is a list of bounding boxes. Torchvision's implementation of Faster-RCNN requires bounding boxes to be in the format of a dictionary: {'labels': [label ids, …], 'boxes': [[xleft, ytop, xright, ybottom of box1], [xleft, ytop, xright, ybottom of box2], …]}

Returns

bounding boxes in the form that Faster RCNN expects
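For reference, a sketch of the dictionary shape described above; the tensor values are illustrative, not produced by this function:

    import torch

    target = {
        "labels": torch.tensor([1, 2]),  # one class id per box
        "boxes": torch.tensor([
            [10.0, 20.0, 50.0, 80.0],    # [xleft, ytop, xright, ybottom]
            [30.0, 40.0, 90.0, 120.0],
        ]),
    }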

datasetinsights.estimators.faster_rcnn.reduce_dict(input_dict, average=True)

Reduce the values in dictionary.

Reduce the values in the dictionary from all processes so that all processes have the averaged results.

Parameters
  • input_dict (dict) – all the values will be reduced

  • average (bool) – whether to do average or sum

Returns

dict with the same fields as input_dict, after reduction.
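A minimal sketch of such a reduction with torch.distributed, assuming the process group is already initialized; not necessarily this module's exact implementation:

    import torch
    import torch.distributed as dist

    def reduce_dict_sketch(input_dict, average=True):
        world_size = dist.get_world_size()
        keys = sorted(input_dict)  # fixed key order across processes
        values = torch.stack([input_dict[k] for k in keys])
        dist.all_reduce(values)    # element-wise sum across processes
        if average:
            values /= world_size
        return dict(zip(keys, values))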

datasetinsights.estimators.faster_rcnn.tensorlist2canonical(tensor_list)

convert tensorlist to canonical.

Converts the gt and predictions into the canonical format and removes the boxes with nan values that were added for padding.

Parameters

tensor_list – [tensor([gt, prds]), tensor([gt, prds]), …]

Returns (list(tuple(list(BBox2d), (Bbox2d)))): A list of tuples where

the first element in each tuple is a list of bounding boxes corresponding to the targets in an example, and the second element in the tuple corresponds to the predictions in that example

class datasetinsights.estimators.DeeplabV3(*, config, writer, checkpointer, device, checkpoint_file=None, **kwargs)

Bases: datasetinsights.estimators.base.Estimator

DeeplabV3 Model https://arxiv.org/abs/1706.05587

Parameters
  • config (CfgNode) – estimator config

  • writer – Tensorboard writer object

  • checkpointer – Model checkpointer callback to save models

  • device – model training on device (cpu|cuda)

backbone

model backbone (resnet50|resnet101)

num_classes

number of classes for semantic segmentation

model

tensorflow or pytorch graph

writer

Tensorboard writer object

checkpointer

Model checkpointer callback to save models

device

model training on device (cpu|cuda)

optimizer

pytorch optimizer

lr_scheduler

pytorch learning rate scheduler

evaluate(**kwargs)

Abstract method to evaluate estimators

load(path)

Load Estimator from path

Parameters

path (str) – full path to the serialized estimator

save(path)

Serialize Estimator to path

Parameters

path (str) – full path to save serialized estimator

Returns

saved full path of the serialized estimator

train(**kwargs)

Abstract method to train estimators

class datasetinsights.estimators.DenseDepth(config, writer, checkpointer, device, checkpoint_file, **kwargs)

Bases: datasetinsights.estimators.base.Estimator

config

estimator config

writer

Tensorboard writer object

model

tensorflow or pytorch graph

checkpointer

Model checkpointer callback to save models

device

model training on device (cpu|cuda)

optimizer

pytorch optimizer

evaluate(**kwargs)

Abstract method to evaluate estimators

load(path)

Load Estimator from path

Parameters

path (str) – full path to the serialized estimator

save(path)

Serialize Estimator to path

Parameters

path (str) – full path to save serialized estimator

Returns

saved full path of the serialized estimator

train(**kwargs)

Abstract method to train estimators

class datasetinsights.estimators.Estimator

Bases: object

Abstract base class for estimator.

An estimator is the master class of all modeling operations. At minimum, it includes:

1. input data and output data transformations (e.g. input image cropping, removing unused output labels…) when applicable.

2. neural network graph (model) for either pytorch or tensorflow.

3. procedures to execute model training and evaluation.

One estimator could support multiple tasks (e.g. Mask R-CNN can be used for semantic segmentation and object detection).

abstract evaluate(**kwargs)

Abstract method to evaluate estimators

abstract train(**kwargs)

Abstract method to train estimators

class datasetinsights.estimators.FasterRCNN(*, config, logdir, kfp_writer, checkpointer, box_score_thresh=0.05, no_cuda=None, checkpoint_file=None, **kwargs)

Bases: datasetinsights.estimators.base.Estimator

Faster-RCNN train/evaluate implementation for object detection.

https://github.com/pytorch/vision/tree/master/references/detection https://arxiv.org/abs/1506.01497

Parameters
  • config (CfgNode) – estimator config

  • box_score_thresh (float) – (optional) box score threshold; the default is 0.05

  • distributed – whether or not the estimator is distributed

  • kfp_metrics_filename – Kubeflow metrics filename

  • kfp_metrics_dir – path to the directory where Kubeflow metrics files are stored

model

pytorch model

writer

Tensorboard writer object

kfp_writer

KubeflowPipelineWriter object

checkpointer

Model checkpointer callback to save models

device

model training on device (cpu|cuda)

static collate_fn(batch)

Prepare batch to be in the format Faster RCNN expects.

Parameters

batch – mini batch of the form ((x0,x1,x2…xn),(y0,y1,y2…yn))

Returns

mini batch in the form [(x0,y0), (x1,y1), (x2,y2)… (xn,yn)]

static create_optimizer_lrs(config, params)

create optimizer and learning rate scheduler.

Parameters
  • config (CfgNode) – estimator config

  • params – model parameters

Returns

optimizer – pytorch optimizer

lr_scheduler – pytorch LR scheduler

static create_sampler(is_distributed, *, dataset, is_train)

create a data sampler.

Parameters
  • is_distributed – whether or not the model is distributed

  • dataset – dataset obj; must have __len__ and __getitem__

  • is_train – whether or not the sampler is for training data

Returns

(torch.utils.data.Sampler)

Return type

data_sampler

evaluate(test_data, **kwargs)

evaluate given dataset.

evaluate_per_epoch(*, data_loader, epoch, label_mappings, max_detections_per_img=100, synchronize_metrics=True)

Evaluate model performance per epoch.

Note, torchvision’s implementation of faster rcnn requires input and gt data for training mode and returns a dictionary of losses (which we need to record the loss). We also need to get the raw predictions, which is only possible in model.eval() mode, to calculate the evaluation metric.

Parameters
  • data_loader (DataLoader) – pytorch dataloader

  • epoch (int) – current epoch, used for logging

  • label_mappings (dict) – a dict of {label_id: label_name} mapping

  • max_detections_per_img – max number of targets or predictions allowed per example

  • is_distributed – whether or not the model is distributed

  • synchronize_metrics – whether or not to synchronize evaluation metrics across processes

static get_transform()

transform bounding box and tensor.

load(path)

Load Estimator from path.

Parameters

path (str) – full path to the serialized estimator

log_metric_val(label_mappings, epoch)

log metric values.

Parameters
  • label_mappings (dict) – a dict of {label_id: label_name} mapping

  • epoch (int) – current epoch, used for logging

predict(pil_img, box_score_thresh=0.5)

Get prediction from one image using loaded model.

Parameters
  • pil_img (PIL Image) – PIL image from dataset.

  • box_score_thresh (float) – box score threshold used to filter out lower score bounding boxes. Defaults to 0.5.

Returns

high predicted score bboxes from the model.

Return type

filtered_pred_annotations (List[BBox2D])

save(path)

Serialize Estimator to path.

Parameters

path (str) – full path to save serialized estimator

Returns

saved full path of the serialized estimator

train(train_data, val_data=None, **kwargs)

start training, save trained model per epoch.

Parameters
  • train_data – Directory on localhost where train dataset is located.

  • val_data – Directory on localhost where validation dataset is located.

train_loop(*, train_dataloader, label_mappings, val_dataloader, train_sampler=None)

train on whole range of epochs.

Parameters
  • train_dataloader (torch.utils.data.DataLoader) –

  • label_mappings (dict) – a dict of {label_id: label_name} mapping

  • val_dataloader (torch.utils.data.DataLoader) –

  • train_sampler – (torch.utils.data.Sampler)

train_one_epoch(*, optimizer, data_loader, epoch, lr_scheduler, accumulation_steps)

train per epoch.

Parameters
  • optimizer – pytorch optimizer

  • data_loader (DataLoader) – pytorch dataloader

  • epoch (int) – current epoch

  • lr_scheduler – Pytorch LR scheduler

  • accumulation_steps (int) – accumulated gradients are only updated after X steps. This creates an effective batch size of batch_size * accumulation_steps.

datasetinsights.estimators.convert_bboxes2canonical(bboxes)

convert bounding boxes to canonical.

convert bounding boxes from the format used by pytorch torchvision’s faster rcnn model into our canonical format, a list of list of BBox2Ds. Faster RCNN format: https://github.com/pytorch/vision/blob/master/torchvision/models/detection/faster_rcnn.py#L45

Parameters

bboxes (List[Dict[str, torch.Tensor]]) – A list of dictionaries. Each item in the list corresponds to the bounding boxes for one example. The dictionary must have the keys ‘boxes’ and ‘labels’. The value for ‘boxes’ is the ground-truth boxes in FloatTensor[N, 4], in [x1, y1, x2, y2] format, with values between 0 and H and 0 and W. The value for ‘labels’ is the Int64Tensor[N] class label for each ground-truth box. If the dictionary has the key ‘scores’, then those values are used for the confidence score of the BBox2D; otherwise the score is set to 1.

Returns (list[List[BBox2D]]):

Each element in the list corresponds to the list of bounding boxes for an example.

datasetinsights.estimators.create_estimator(name, config, *, tb_log_dir=None, no_cuda=None, checkpoint_dir=None, kfp_metrics_dir='/home/docs/checkouts/readthedocs.org/user_builds/datasetinsights/checkouts/0.2.2/metrics/20201027-200642', kfp_metrics_filename='mlpipeline-metrics.json', no_val=None, **kwargs)

Create a new instance of an estimator subclass

Parameters
  • name (str) – unique identifier for an estimator subclass

  • config (dict) – parameters specific to each estimator subclass, used to create an estimator instance

Returns

an instance of the specified estimator subclass