datasetinsights.evaluation_metrics

datasetinsights.evaluation_metrics.average_log10_error

Average Log10 Error metric.
The average log10 error can be described as:
\[\frac{1}{n}\sum_{p}^{n}\lvert\log_{10}(y_p)-\log_{10}(\hat{y}_p)\rvert\]

class datasetinsights.evaluation_metrics.average_log10_error.AverageLog10Error
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

Average Log10 Error metric.
The metric is defined for grayscale depth images.

- sum_of_log10_error (float) – the sum of the log10 errors for all the images in a batch
- num_samples (int) – the number of samples in all mini-batches

compute()
reset()
update(output)
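A minimal usage sketch for the depth-error metrics; the same update/compute/reset pattern applies to AverageRelativeError, RootMeanSquareError, and ThresholdAccuracy below. It assumes the Ignite-style convention that update takes a (y_pred, y) pair of depth maps; the arrays here are made up.

```python
import numpy as np

from datasetinsights.evaluation_metrics.average_log10_error import (
    AverageLog10Error,
)

# Hypothetical predicted and ground-truth depth maps for a mini-batch
# of two 240x320 grayscale depth images.
y_pred = np.random.uniform(0.1, 10.0, size=(2, 240, 320))
y_true = np.random.uniform(0.1, 10.0, size=(2, 240, 320))

metric = AverageLog10Error()
metric.update((y_pred, y_true))  # accumulate errors for one mini-batch
print(metric.compute())          # average log10 error over all samples seen
metric.reset()                   # clear accumulated state
```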
datasetinsights.evaluation_metrics.average_precision

class datasetinsights.evaluation_metrics.average_precision.AveragePrecision(config: datasetinsights.evaluation_metrics.average_precision.DetectionConfig = None)
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

compute(boxes: Dict[str, Tuple[List[datasetinsights.io.bbox.BBox3D], List[datasetinsights.io.bbox.BBox3D]]] = None) → Dict[str, float]
Calculate the mean AP for all classes over all distance thresholds defined in the config. The equation is described in the second figure of the nuScenes paper (https://arxiv.org/pdf/1903.11027.pdf): it is the normalized sum of the ROC curves for each class and distance.
- Parameters
boxes – the predicted and ground truth bounding boxes per sample.
- Returns
a dictionary mapping each label to its average precision (averaged across all distance thresholds for that label)

reset()
update(boxes)
class datasetinsights.evaluation_metrics.average_precision.DetectionConfig(class_range: Dict[str, int] = {'barrier': 30, 'bicycle': 40, 'bus': 50, 'car': 50, 'construction_vehicle': 50, 'motorcycle': 40, 'pedestrian': 40, 'traffic_cone': 30, 'trailer': 50, 'truck': 50}, dist_fcn: str = 'center_distance', dist_ths: List[float] = [0.5, 1.0, 2.0, 4.0], min_recall: float = 0.1, min_precision: float = 0.1, max_boxes_per_sample: float = 500, mean_ap_weight: int = 5)
Bases: object

Data class that specifies the detection evaluation settings.

classmethod deserialize(content)
Initialize from a serialized dictionary.

serialize() → dict
Serialize instance into a JSON-friendly format.

datasetinsights.evaluation_metrics.average_precision.calc_ap(*, precision, min_recall: float, min_precision: float) → float
Calculate average precision.

datasetinsights.evaluation_metrics.average_precision.center_distance(*, gt_box: datasetinsights.io.bbox.BBox3D, pred_box: datasetinsights.io.bbox.BBox3D) → float
L2 distance between the box centers (xy only).
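A short round-trip sketch for the config; field names come from the signature above, and it assumes the constructor arguments are stored as attributes of the same name. Overriding dist_ths is just an illustration.

```python
from datasetinsights.evaluation_metrics.average_precision import (
    DetectionConfig,
)

config = DetectionConfig(dist_ths=[0.5, 1.0])    # custom distance thresholds
payload = config.serialize()                     # JSON-friendly dict
restored = DetectionConfig.deserialize(payload)  # rebuild from the dict
assert restored.dist_ths == config.dist_ths
```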
datasetinsights.evaluation_metrics.average_precision_2d

Average Precision metrics for 2D object detection.
This module provides average precision metrics to evaluate 2D object detection models, such as the metrics defined in COCO evaluation. The most commonly used are MeanAveragePrecisionIOU50 and MeanAveragePrecisionAverageOverIOU, which provide the average precision over all labels considered.
These metrics are based on an open-source reference implementation.
AveragePrecision provides AP for each label under a given IOU. AveragePrecisionIOU50 provides AP for each label at IOU=50%. MeanAveragePrecisionIOU50 provides mean AP over all labels at IOU=50%. MeanAveragePrecisionAverageOverIOU provides mean AP over all labels and IOU=[0.5:0.95:0.05].
class datasetinsights.evaluation_metrics.average_precision_2d.AveragePrecision(iou_threshold=0.5, interpolation='EveryPointInterpolation', max_detections=100)
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Average Precision metrics.
This metric calculates average precision (AP) for each label under an IOU threshold (default: 0.5). The maximum number of detections per image is limited (default: 100).
- Parameters
iou_threshold (float) – IOU threshold (default: 0.5)
interpolation (string) – AP interpolation method name for AP calculation
max_detections (int) – max detections per image (default: 100)

TYPE = 'metric_per_label'

compute()
Compute AP for each label.
- Returns
a dictionary of AP scores per label.
- Return type
dict

static every_point_interpolated_ap(recall, precision)
Calculate the interpolation performed over all points.
- Parameters
recall (list) – recall history of the prediction
precision (list) – precision history of the prediction
- Returns
average precision for all-points interpolation
- Return type
float

static n_point_interpolated_ap(recall, precision, point=11)
Calculate the n-point interpolation.
- Parameters
recall (list) – recall history of the prediction
precision (list) – precision history of the prediction
point (int) – n, for n-point interpolation
- Returns
average precision for n-point interpolation
- Return type
float

reset()
Reset AP metrics.

update(mini_batch)
Update records per mini-batch.
- Parameters
mini_batch (list(list)) – a list containing batch_size pairs of gt bboxes and pred bboxes, one pair per image. For example, if batch size = 2, mini_batch looks like [[gt_bboxes1, pred_bboxes1], [gt_bboxes2, pred_bboxes2]], where gt_bboxes1 and pred_bboxes1 contain the gt bboxes and pred bboxes of one image.
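A minimal usage sketch; it assumes BBox2D's constructor takes label, x, y, w, h and an optional score (per datasetinsights.io.bbox), and the boxes here are made up.

```python
from datasetinsights.evaluation_metrics.average_precision_2d import (
    AveragePrecision,
)
from datasetinsights.io.bbox import BBox2D

# One image: a single ground-truth box and one scored prediction.
gt_bboxes = [BBox2D(label="car", x=10, y=20, w=50, h=40)]
pred_bboxes = [BBox2D(label="car", x=12, y=22, w=48, h=38, score=0.9)]

metric = AveragePrecision(iou_threshold=0.5)
metric.update([[gt_bboxes, pred_bboxes]])  # mini-batch of one image
print(metric.compute())                    # e.g. {"car": <AP at IOU=0.5>}
```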
class datasetinsights.evaluation_metrics.average_precision_2d.AveragePrecisionIOU50
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Average Precision at \(IOU=50\%\).
This implementation calculates AP for each label at \(IOU=50\%\), providing a mapping from label to average precision. The maximum number of detections per image is 100.

TYPE = 'metric_per_label'

compute()
reset()
update(mini_batch)
class datasetinsights.evaluation_metrics.average_precision_2d.MeanAveragePrecisionAverageOverIOU
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Mean Average Precision metrics.
This implementation computes the Mean Average Precision (mAP) metric, defined as the Average Precision averaged over all labels and \(IOU = 0.5, 0.55, 0.60, \ldots, 0.95\). The max detections per image is limited to 100.
\[mAP = \frac{1}{N_\text{IOU}N_\text{label}}\sum_{\text{label}, \text{IOU}}AP(\text{label}, \text{IOU})\]

IOU_THRESHOULDS = array([0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95])

TYPE = 'scalar'

compute()
Compute mAP over IOU.

reset()
update(mini_batch)
class datasetinsights.evaluation_metrics.average_precision_2d.MeanAveragePrecisionIOU50
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Mean Average Precision metrics at \(IOU=50\%\).
This implementation calculates mAP at \(IOU=50\%\), averaged across all labels:
\[mAP(\text{IOU=50})=\frac{1}{N_{\text{label}}}\sum_{\text{label}}AP(\text{label}, \text{IOU=50})\]
where AP is the AveragePrecision metric computed separately for each label.

TYPE = 'scalar'

compute()
reset()
update(mini_batch)
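A sketch of the scalar mAP metrics, which follow the same update contract as AveragePrecision above; the dataset and its boxes are hypothetical, and the BBox2D signature is assumed as in the earlier example.

```python
from datasetinsights.evaluation_metrics.average_precision_2d import (
    MeanAveragePrecisionAverageOverIOU,
    MeanAveragePrecisionIOU50,
)
from datasetinsights.io.bbox import BBox2D

# Two hypothetical images, each with one gt box and one scored prediction.
dataset = [
    ([BBox2D(label="car", x=10, y=20, w=50, h=40)],
     [BBox2D(label="car", x=12, y=22, w=48, h=38, score=0.9)]),
    ([BBox2D(label="pedestrian", x=5, y=5, w=20, h=60)],
     [BBox2D(label="pedestrian", x=6, y=4, w=22, h=58, score=0.8)]),
]

map_50 = MeanAveragePrecisionIOU50()
map_coco = MeanAveragePrecisionAverageOverIOU()
for gt_bboxes, pred_bboxes in dataset:
    map_50.update([[gt_bboxes, pred_bboxes]])
    map_coco.update([[gt_bboxes, pred_bboxes]])

print(map_50.compute())    # single scalar: mAP at IOU=50%
print(map_coco.compute())  # single scalar: mAP averaged over IOU 0.5:0.95
```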
datasetinsights.evaluation_metrics.average_precision_config
datasetinsights.evaluation_metrics.average_recall_2d

Average Recall metrics for 2D object detection.
This module provides average recall metrics to evaluate 2D object detection models, such as the metrics defined in COCO evaluation. The most commonly used is MeanAverageRecallAverageOverIOU, which provides the average recall over all labels considered.

class datasetinsights.evaluation_metrics.average_recall_2d.AverageRecall(iou_threshold=0.5, max_detections=100)
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Average Recall metrics.
This metric calculates average recall (AR) for each label under an IOU threshold (default: 0.5). The maximum number of detections per image is limited (default: 100).
- Parameters
iou_threshold (float) – IOU threshold (default: 0.5)
max_detections (int) – max detections per image (default: 100)

TYPE = 'metric_per_label'

compute()
Compute AR for each label.
- Returns
a dictionary of AR scores per label.
- Return type
dict

reset()
Reset AR metrics.

update(mini_batch)
Update records per mini-batch.
- Parameters
mini_batch (list(list)) – a list containing batch_size pairs of gt bboxes and pred bboxes, one pair per image. For example, if batch size = 2, mini_batch looks like [[gt_bboxes1, pred_bboxes1], [gt_bboxes2, pred_bboxes2]], where gt_bboxes1 and pred_bboxes1 contain the gt bboxes and pred bboxes of one image.
class datasetinsights.evaluation_metrics.average_recall_2d.MeanAverageRecallAverageOverIOU
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Mean Average Recall metrics.
This implementation computes the Mean Average Recall (mAR) metric, defined as the Average Recall averaged over all labels and \(IOU = 0.5:0.95:0.05\). The max detections per image is limited to 100.
\[mAR = \frac{1}{N_\text{IOU}N_\text{label}}\sum_{\text{label}, \text{IOU}}AR(\text{label}, \text{IOU})\]

IOU_THRESHOULDS = array([0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95])

TYPE = 'scalar'

compute()
Compute mAR over IOU.

reset()
update(mini_batch)
datasetinsights.evaluation_metrics.average_relative_error

Average Relative Error metrics.
The average relative error can be described as:
\[\frac{1}{n}\sum_{p}^{n}\frac{\lvert y_p - \hat{y}_p \rvert}{y_p}\]

class datasetinsights.evaluation_metrics.average_relative_error.AverageRelativeError
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

Average Relative Error metric.
The metric is defined for grayscale depth images.

- sum_of_relative_error (float) – the sum of the relative errors for all the images in a batch
- num_samples (int) – the number of samples in all mini-batches

compute()
reset()
update(output)
datasetinsights.evaluation_metrics.base

class datasetinsights.evaluation_metrics.base.EvaluationMetric
Bases: object

Abstract base class for metrics.

COMPUTE_TYPE = ''

abstract compute()

static create(name, **kwargs)
Create a new instance of the metric subclass.
- Parameters
name (str) – unique identifier for a metric subclass
config (dict) – parameters specific to each metric subclass, used to create a metric instance
- Returns
an instance of the specified metric subclass

static find(name)
Find an EvaluationMetric subclass based on the given name.
- Parameters
name (str) – unique identifier for a metric subclass
- Returns
a label of the specified metric subclass

abstract reset()
abstract update(output)
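A sketch of the factory pattern; it assumes subclasses register under their class names (e.g. 'AveragePrecisionIOU50', a name taken from this page) and that keyword arguments are forwarded to the subclass constructor.

```python
from datasetinsights.evaluation_metrics.base import EvaluationMetric

# Look up a metric subclass by its registered name, then instantiate it.
metric_cls = EvaluationMetric.find("AveragePrecisionIOU50")
metric = EvaluationMetric.create("AveragePrecisionIOU50")
```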
datasetinsights.evaluation_metrics.confusion_matrix

datasetinsights.evaluation_metrics.confusion_matrix.precision_recall(gt_bboxes, pred_bboxes, iou_thresh=0.5)
Calculate precision and recall per image.

datasetinsights.evaluation_metrics.confusion_matrix.prediction_records(gt_bboxes, pred_bboxes, iou_thresh=0.5)
Calculate prediction results per image.
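A per-image sketch; the return shape of precision_recall (a (precision, recall) pair) is an assumption, and box construction follows the BBox2D signature assumed in the AveragePrecision example above.

```python
from datasetinsights.evaluation_metrics.confusion_matrix import (
    precision_recall,
)
from datasetinsights.io.bbox import BBox2D

gt_bboxes = [BBox2D(label="car", x=10, y=20, w=50, h=40)]
pred_bboxes = [BBox2D(label="car", x=12, y=22, w=48, h=38, score=0.9)]

# Assumed return shape: (precision, recall) for this single image.
precision, recall = precision_recall(gt_bboxes, pred_bboxes, iou_thresh=0.5)
```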
datasetinsights.evaluation_metrics.exceptions

exception datasetinsights.evaluation_metrics.exceptions.NoSampleError
Bases: Exception

Raised when the number of samples is zero.
datasetinsights.evaluation_metrics.iou

IoU evaluation metrics.

class datasetinsights.evaluation_metrics.iou.IoU(num_classes, output_transform=<function IoU.<lambda>>)
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

Intersection over Union (IoU) metric per class.
The metric is defined for a pair of grayscale semantic segmentation images.
- Parameters
num_classes – number of classes in the ground truth image
output_transform – function that transforms the output pair of images

- cm (ignite.metrics.ConfusionMatrix) – PyTorch Ignite confusion matrix

compute()
reset()
update(output)
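A segmentation sketch; it assumes the Ignite convention that update receives (y_pred, y), with y_pred as per-class scores of shape (N, C, H, W) and y as integer labels of shape (N, H, W). The tensors here are random stand-ins.

```python
import torch

from datasetinsights.evaluation_metrics.iou import IoU

num_classes = 3
# Random stand-ins for model logits and ground-truth label maps.
y_pred = torch.rand(2, num_classes, 64, 64)          # (N, C, H, W) scores
y_true = torch.randint(0, num_classes, (2, 64, 64))  # (N, H, W) labels

metric = IoU(num_classes=num_classes)
metric.update((y_pred, y_true))
print(metric.compute())  # per-class IoU values
```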
datasetinsights.evaluation_metrics.records

class datasetinsights.evaluation_metrics.records.Records(iou_threshold=0.5)
Bases: object

Save prediction records during update.

- iou_threshold (float) – IOU threshold
- match_results (list) – saved results (TP/FP)

- Parameters
iou_threshold (float) – IOU threshold (default: 0.5)

add_records(gt_bboxes, pred_bboxes)
Add ground truth and prediction records.
- Parameters
gt_bboxes – ground truth bboxes in the current image
pred_bboxes – sorted prediction bboxes in the current image

reset()
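A brief sketch of accumulating match results across images; per the docstring, pred_bboxes must be sorted (here by score, assuming BBox2D exposes a score attribute, as in the earlier examples).

```python
from datasetinsights.evaluation_metrics.records import Records
from datasetinsights.io.bbox import BBox2D

records = Records(iou_threshold=0.5)

gt_bboxes = [BBox2D(label="car", x=10, y=20, w=50, h=40)]
pred_bboxes = [BBox2D(label="car", x=12, y=22, w=48, h=38, score=0.9)]

# Sort predictions by descending score before recording them.
pred_sorted = sorted(pred_bboxes, key=lambda b: b.score, reverse=True)
records.add_records(gt_bboxes, pred_sorted)
print(records.match_results)  # accumulated TP/FP decisions
records.reset()
```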
datasetinsights.evaluation_metrics.root_mean_square_error

Root Mean Square Error metrics.
The root mean square error can be described as:
\[\sqrt{\frac{1}{n}\sum_{p}^{n}(y_p - \hat{y}_p)^2}\]

class datasetinsights.evaluation_metrics.root_mean_square_error.RootMeanSquareError
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

Root Mean Square Error metric.
The metric is defined for grayscale depth images.

- sum_of_root_mean_square_error (float) – the sum of the RMSE for all the images in a batch
- num_samples (int) – the number of samples in all mini-batches

compute()
reset()
update(output)
datasetinsights.evaluation_metrics.threshold_accuracy

Threshold Accuracy metric.
The threshold accuracy can be described as the fraction of pixels whose predicted and ground truth depths agree within a threshold \(\delta\):
\[\frac{1}{n}\sum_{p}^{n}\left[\max\left(\frac{y_p}{\hat{y}_p}, \frac{\hat{y}_p}{y_p}\right) < \delta\right]\]

class datasetinsights.evaluation_metrics.threshold_accuracy.ThresholdAccuracy(threshold)
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

Threshold accuracy metric.
The metric is defined for grayscale depth images.

- sum_of_threshold_acc (int) – the sum of threshold accuracies for all the images in a batch
- num_samples (int) – the number of samples in all mini-batches

compute()
reset()
update(output)
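A depth sketch showing the threshold argument; \(\delta = 1.25\) is the customary first threshold from the depth-estimation literature and is an assumption here, as is the (y_pred, y) update contract.

```python
import numpy as np

from datasetinsights.evaluation_metrics.threshold_accuracy import (
    ThresholdAccuracy,
)

y_pred = np.random.uniform(0.1, 10.0, size=(2, 240, 320))
y_true = np.random.uniform(0.1, 10.0, size=(2, 240, 320))

# delta = 1.25 is the common first threshold in depth evaluation.
metric = ThresholdAccuracy(threshold=1.25)
metric.update((y_pred, y_true))
print(metric.compute())  # fraction of pixels within the threshold
```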
The classes above are also re-exported at the package level as datasetinsights.evaluation_metrics.AverageLog10Error, AveragePrecision (the 2D variant), AveragePrecisionIOU50, AverageRecall, AverageRelativeError, EvaluationMetric, IoU, MeanAveragePrecisionAverageOverIOU, MeanAveragePrecisionIOU50, MeanAverageRecallAverageOverIOU, RootMeanSquareError, and ThresholdAccuracy. Their documentation is identical to the per-module entries above.