datasetinsights.evaluation_metrics
datasetinsights.evaluation_metrics.average_log10_error

Average Log10 Error metric.

The average log10 error over n depth samples can be described as:

\[Log10Error = \frac{1}{n}\sum_{p=1}^{n}|\log_{10}(y_p) - \log_{10}(\hat{y}_p)|\]

where \(y_p\) is the ground truth depth and \(\hat{y}_p\) the predicted depth.
class datasetinsights.evaluation_metrics.average_log10_error.AverageLog10Error
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

Average Log10 Error metric.

The metric is defined for grayscale depth images.
- Attributes
sum_of_log10_error (float) – the sum of the log10 errors for all the images in a batch
num_samples (int) – the number of samples in all mini-batches
compute()

reset()

update(output)
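A minimal numpy sketch of the quantity this metric accumulates (illustrative only, not the library's implementation; depth values are assumed strictly positive):

    import numpy as np

    def avg_log10_error(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        """Mean absolute log10 error between ground-truth and predicted depth maps."""
        return float(np.mean(np.abs(np.log10(y_true) - np.log10(y_pred))))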
datasetinsights.evaluation_metrics.average_precision
class datasetinsights.evaluation_metrics.average_precision.AveragePrecision(config: datasetinsights.evaluation_metrics.average_precision.DetectionConfig = None)
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric
compute(boxes: Dict[str, Tuple[List[datasetinsights.io.bbox.BBox3D], List[datasetinsights.io.bbox.BBox3D]]] = None) → Dict[str, float]
Calculate the mean AP for all classes over all distance thresholds defined in the config. The equation is described in the second figure of the nuScenes paper (https://arxiv.org/pdf/1903.11027.pdf); it is the normalized sum of the ROC curves for each class and distance.
- Parameters
boxes – the predicted and ground truth bounding boxes per sample
- Returns
a dictionary mapping each label to its average precision (averaged across all distance thresholds for that label)
reset()

update(boxes)
class datasetinsights.evaluation_metrics.average_precision.DetectionConfig(class_range: Dict[str, int] = {'barrier': 30, 'bicycle': 40, 'bus': 50, 'car': 50, 'construction_vehicle': 50, 'motorcycle': 40, 'pedestrian': 40, 'traffic_cone': 30, 'trailer': 50, 'truck': 50}, dist_fcn: str = 'center_distance', dist_ths: List[float] = [0.5, 1.0, 2.0, 4.0], min_recall: float = 0.1, min_precision: float = 0.1, max_boxes_per_sample: float = 500, mean_ap_weight: int = 5)
Bases: object

Data class that specifies the detection evaluation settings.
classmethod deserialize(content)
Initialize from serialized dictionary.

serialize() → dict
Serialize instance into json-friendly format.
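A hypothetical usage sketch, assuming the keyword overrides shown in the signature above:

    from datasetinsights.evaluation_metrics.average_precision import (
        AveragePrecision,
        DetectionConfig,
    )

    # Evaluate with tighter center-distance thresholds than the defaults.
    config = DetectionConfig(dist_ths=[0.5, 1.0], min_recall=0.1, min_precision=0.1)
    metric = AveragePrecision(config=config)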
datasetinsights.evaluation_metrics.average_precision.calc_ap(*, precision, min_recall: float, min_precision: float) → float
Calculate average precision.
datasetinsights.evaluation_metrics.average_precision.center_distance(*, gt_box: datasetinsights.io.bbox.BBox3D, pred_box: datasetinsights.io.bbox.BBox3D) → float
L2 distance between the box centers (xy only).
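Illustrative only (assuming each box exposes an xy-plane center as a coordinate pair), the quantity reduces to a planar Euclidean distance:

    import numpy as np

    def xy_center_distance(gt_center_xy, pred_center_xy) -> float:
        """L2 distance between two box centers, ignoring the z axis."""
        gt = np.asarray(gt_center_xy, dtype=float)
        pred = np.asarray(pred_center_xy, dtype=float)
        return float(np.linalg.norm(gt - pred))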
datasetinsights.evaluation_metrics.average_precision_2d

References.

We implement the average precision metrics for object detection based on: https://github.com/rafaelpadilla/Object-Detection-Metrics#average-precision

We optimize the metric update algorithm based on: https://github.com/rafaelpadilla/Object-Detection-Metrics/blob/master/lib/Evaluator.py
class datasetinsights.evaluation_metrics.average_precision_2d.AveragePrecision(iou_threshold=0.5, interpolation='EveryPointInterpolation', max_detections=100)
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Average Precision metrics.

This metric calculates average precision (AP) for each label under an IoU threshold (default: 0.5). The maximum number of detections per image is limited (default: 100).
- Parameters
iou_threshold (float) – IoU threshold (default: 0.5)
interpolation (string) – AP interpolation method name used for the AP calculation
max_detections (int) – max detections per image (default: 100)

TYPE = 'metric_per_label'
compute()
Compute AP for each label.
- Returns
a dictionary of AP scores per label
- Return type
dict
static every_point_interpolated_ap(recall, precision)
Calculate the interpolation performed over all points.
- Parameters
recall (list) – recall history of the prediction
precision (list) – precision history of the prediction
- Returns
average precision with all-points interpolation
- Return type
float
static n_point_interpolated_ap(recall, precision, point=11)
Calculate the n-point interpolation.
- Parameters
recall (list) – recall history of the prediction
precision (list) – precision history of the prediction
point (int) – n, the number of recall points used for n-point interpolation (default: 11)
- Returns
average precision with n-point interpolation
- Return type
float
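For intuition, an illustrative sketch of the classic n-point interpolation (not necessarily the library's exact implementation): precision is sampled at n evenly spaced recall levels, taking the maximum precision among operating points with recall at or above each level.

    import numpy as np

    def n_point_interpolated_ap(recall, precision, point=11):
        """Classic n-point interpolated AP (illustrative sketch)."""
        recall = np.asarray(recall, dtype=float)
        precision = np.asarray(precision, dtype=float)
        ap = 0.0
        for r in np.linspace(0.0, 1.0, point):
            mask = recall >= r
            # Precision contribution is 0 when no prediction reaches recall r.
            ap += precision[mask].max() if mask.any() else 0.0
        return ap / point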
reset()
Reset AP metrics.

update(mini_batch)
Update records per mini batch.
- Parameters
mini_batch (list(list)) – a list containing batch_size pairs of gt bboxes and pred bboxes, one pair per image. For example, if batch size = 2, mini_batch looks like: [[gt_bboxes1, pred_bboxes1], [gt_bboxes2, pred_bboxes2]], where gt_bboxes1 and pred_bboxes1 contain the gt bboxes and pred bboxes of one image.
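For example, a hypothetical update/compute cycle (loader is a placeholder yielding per-image ground-truth and prediction bbox lists):

    from datasetinsights.evaluation_metrics.average_precision_2d import AveragePrecision

    metric = AveragePrecision(iou_threshold=0.5, max_detections=100)
    metric.reset()
    for gt_bboxes, pred_bboxes in loader:
        metric.update([[gt_bboxes, pred_bboxes]])  # mini-batch of size 1
    ap_per_label = metric.compute()  # e.g. {"car": 0.71, "pedestrian": 0.55}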
class datasetinsights.evaluation_metrics.average_precision_2d.AveragePrecisionIOU50
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Average Precision at IOU = 50%.

This implementation calculates AP at IOU = 50% for each label.

TYPE = 'metric_per_label'
compute()

reset()

update(mini_batch)
class datasetinsights.evaluation_metrics.average_precision_2d.MeanAveragePrecisionAverageOverIOU
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Mean Average Precision metrics.

This implementation computes the Mean Average Precision (mAP) metric, defined as the Average Precision averaged over all labels and IOU thresholds 0.5:0.95:0.05. The max detections per image is limited to 100.
\[mAP^{IoU=0.5:0.95:0.05} = \mathrm{mean}_{label,\,IoU}\ AP^{label,\,IoU}\]

IOU_THRESHOULDS = array([0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95])
TYPE = 'scalar'

compute()
Compute mAP over the IOU thresholds.

reset()

update(mini_batch)
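Conceptually (an illustrative sketch, not the class internals), the scalar result is the mean of per-label AP values across the IOU_THRESHOULDS above:

    import numpy as np

    # ap_per_threshold maps iou_threshold -> {label: ap}; a hypothetical intermediate.
    def mean_ap_over_iou(ap_per_threshold: dict) -> float:
        aps = [ap for per_label in ap_per_threshold.values() for ap in per_label.values()]
        return float(np.mean(aps))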
class datasetinsights.evaluation_metrics.average_precision_2d.MeanAveragePrecisionIOU50
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Mean Average Precision metrics at IOU=50%.

This implementation calculates mAP at IOU=50%:

\[mAP^{IoU=50} = \mathrm{mean}_{label}\ AP^{label,\,IoU=50}\]
TYPE = 'scalar'

compute()

reset()

update(mini_batch)
datasetinsights.evaluation_metrics.average_precision_config

datasetinsights.evaluation_metrics.average_recall_2d

References.

http://cocodataset.org/#detection-eval
https://arxiv.org/pdf/1502.05082.pdf
https://github.com/rafaelpadilla/Object-Detection-Metrics/issues/22
class datasetinsights.evaluation_metrics.average_recall_2d.AverageRecall(iou_threshold=0.5, max_detections=100)
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Average Recall metrics.

This metric calculates average recall (AR) for each label under an IoU threshold (default: 0.5). The maximum number of detections per image is limited (default: 100).
- Parameters
iou_threshold (float) – IoU threshold (default: 0.5)
max_detections (int) – max detections per image (default: 100)
TYPE = 'metric_per_label'

compute()
Compute AR for each label.
- Returns
a dictionary of AR scores per label
- Return type
dict
reset()
Reset AR metrics.

update(mini_batch)
Update records per mini batch.
- Parameters
mini_batch (list(list)) – a list containing batch_size pairs of gt bboxes and pred bboxes, one pair per image. For example, if batch size = 2, mini_batch looks like: [[gt_bboxes1, pred_bboxes1], [gt_bboxes2, pred_bboxes2]], where gt_bboxes1 and pred_bboxes1 contain the gt bboxes and pred bboxes of one image.
class datasetinsights.evaluation_metrics.average_recall_2d.MeanAverageRecallAverageOverIOU
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

2D Bounding Box Mean Average Recall metrics.

This implementation computes the Mean Average Recall (mAR) metric, defined as the Average Recall averaged over all labels and IOU thresholds 0.5:0.95:0.05. The max detections per image is limited to 100.

\[mAR^{IoU=0.5:0.95:0.05} = \mathrm{mean}_{label,\,IoU}\ AR^{label,\,IoU}\]

IOU_THRESHOULDS = array([0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95])
TYPE = 'scalar'

compute()
Compute mAR over the IOU thresholds.

reset()

update(mini_batch)
datasetinsights.evaluation_metrics.average_relative_error

Average Relative Error metrics.

The average relative error over n depth samples can be described as:

\[RelError = \frac{1}{n}\sum_{p=1}^{n}\frac{|y_p - \hat{y}_p|}{y_p}\]
class datasetinsights.evaluation_metrics.average_relative_error.AverageRelativeError
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

Average Relative Error metric.

The metric is defined for grayscale depth images.
- Attributes
sum_of_relative_error (float) – the sum of the relative errors for all the images in a batch
num_samples (int) – the number of samples in all mini-batches
compute()

reset()

update(output)
datasetinsights.evaluation_metrics.base

class datasetinsights.evaluation_metrics.base.EvaluationMetric
Bases: object

Abstract base class for metrics.
COMPUTE_TYPE = ''

abstract compute()

static create(name, **kwargs)
Create a new instance of the metric subclass.
- Parameters
name (str) – unique identifier for a metric subclass
config (dict) – parameters specific to each metric subclass used to create a metric instance
- Returns
an instance of the specified metric subclass
static find(name)
Find an EvaluationMetric subclass based on the given name.
- Parameters
name (str) – unique identifier for a metric subclass
- Returns
a label of the specified metric subclass
abstract reset()

abstract update(output)
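A hypothetical sketch of a custom metric implementing this abstract interface (the registration mechanics behind create/find are not shown on this page and are assumed to discover subclasses automatically):

    from datasetinsights.evaluation_metrics.base import EvaluationMetric

    class MeanAbsoluteError(EvaluationMetric):
        """Illustrative subclass; accumulates state in update, reports in compute."""

        COMPUTE_TYPE = "scalar"

        def __init__(self):
            self.sum_of_abs_error = 0.0
            self.num_samples = 0

        def update(self, output):
            # output is assumed to be a (prediction, ground truth) array pair.
            y_pred, y_true = output
            self.sum_of_abs_error += float(abs(y_pred - y_true).sum())
            self.num_samples += len(y_true)

        def compute(self):
            return self.sum_of_abs_error / self.num_samples

        def reset(self):
            self.sum_of_abs_error = 0.0
            self.num_samples = 0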
datasetinsights.evaluation_metrics.confusion_matrix

datasetinsights.evaluation_metrics.confusion_matrix.precision_recall(gt_bboxes, pred_bboxes, iou_thresh=0.5)
Calculate precision and recall per image.

datasetinsights.evaluation_metrics.confusion_matrix.prediction_records(gt_bboxes, pred_bboxes, iou_thresh=0.5)
Calculate prediction results per image.
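For intuition (illustrative only, assuming predictions are matched to ground truth at the given IoU threshold), per-image precision and recall reduce to ratios of match counts:

    def precision_recall_from_counts(num_tp: int, num_pred: int, num_gt: int):
        """Precision and recall from true-positive, prediction, and gt counts."""
        precision = num_tp / num_pred if num_pred else 0.0
        recall = num_tp / num_gt if num_gt else 0.0
        return precision, recall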
datasetinsights.evaluation_metrics.exceptions

exception datasetinsights.evaluation_metrics.exceptions.NoSampleError
Bases: Exception

Raised when the number of samples is zero.
datasetinsights.evaluation_metrics.iou

IoU evaluation metrics.

class datasetinsights.evaluation_metrics.iou.IoU(num_classes, output_transform=<function IoU.<lambda>>)
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

Intersection over Union (IoU) metric per class.

The metric is defined for a pair of grayscale semantic segmentation images.
- Parameters
num_classes – number of classes in the ground truth image
output_transform – function that transforms the output pair of images
- Attributes
cm (ignite.metrics.ConfusionMatrix) – pytorch ignite confusion matrix
compute()

reset()

update(output)
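As a reminder of the underlying quantity, a minimal numpy sketch of per-class IoU for a pair of label maps (illustrative only; the class itself builds on an ignite confusion matrix):

    import numpy as np

    def per_class_iou(gt: np.ndarray, pred: np.ndarray, num_classes: int) -> np.ndarray:
        """Intersection over union per class; NaN for classes absent from both maps."""
        ious = np.full(num_classes, np.nan)
        for c in range(num_classes):
            inter = np.logical_and(gt == c, pred == c).sum()
            union = np.logical_or(gt == c, pred == c).sum()
            if union > 0:
                ious[c] = inter / union
        return ious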
datasetinsights.evaluation_metrics.records

class datasetinsights.evaluation_metrics.records.Records(iou_threshold=0.5)
Bases: object

Save prediction records during update.

- Attributes
iou_threshold (float) – iou threshold
match_results (list) – the saved match results (TP/FP)
- Parameters
iou_threshold (float) – iou threshold (default: 0.5)
add_records(gt_bboxes, pred_bboxes)
Add ground truth and prediction records.
- Parameters
gt_bboxes – ground truth bboxes in the current image
pred_bboxes – sorted prediction bboxes in the current image

reset()
datasetinsights.evaluation_metrics.root_mean_square_error

Root Mean Square Error metrics.

The root mean square error over n depth samples can be described as:

\[RMSE = \sqrt{\frac{1}{n}\sum_{p=1}^{n}(y_p - \hat{y}_p)^2}\]
class datasetinsights.evaluation_metrics.root_mean_square_error.RootMeanSquareError
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

Root Mean Square Error metric.

The metric is defined for grayscale depth images.
- Attributes
sum_of_root_mean_square_error (float) – the sum of RMSE for all the images in a batch
num_samples (int) – the number of samples in all mini-batches
compute()

reset()

update(output)
datasetinsights.evaluation_metrics.threshold_accuracy

Threshold Accuracy metric.

The threshold accuracy over n depth samples can be described as:

\[\delta = \frac{1}{n}\sum_{p=1}^{n}\left[\max\left(\frac{y_p}{\hat{y}_p}, \frac{\hat{y}_p}{y_p}\right) < threshold\right]\]
class datasetinsights.evaluation_metrics.threshold_accuracy.ThresholdAccuracy(threshold)
Bases: datasetinsights.evaluation_metrics.base.EvaluationMetric

Threshold accuracy metric.

The metric is defined for grayscale depth images.
- Attributes
sum_of_threshold_acc (int) – the sum of threshold accuracies for all the images in a batch
num_samples (int) – the number of samples in all mini-batches
compute()

reset()

update(output)
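An illustrative numpy sketch of the per-image quantity, assuming the standard depth-estimation delta-accuracy definition (threshold is commonly 1.25, 1.25**2, or 1.25**3):

    import numpy as np

    def threshold_accuracy(y_true: np.ndarray, y_pred: np.ndarray, threshold: float = 1.25) -> float:
        """Fraction of pixels whose depth ratio falls within the threshold."""
        ratio = np.maximum(y_true / y_pred, y_pred / y_true)
        return float(np.mean(ratio < threshold))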
The following classes are re-exported at the package level as datasetinsights.evaluation_metrics.<Name> and are documented above: AverageLog10Error, AveragePrecision, AveragePrecisionIOU50, AverageRecall, AverageRelativeError, EvaluationMetric, IoU, MeanAveragePrecisionAverageOverIOU, MeanAveragePrecisionIOU50, MeanAverageRecallAverageOverIOU, RootMeanSquareError, ThresholdAccuracy. For example, datasetinsights.evaluation_metrics.IoU is an alias of datasetinsights.evaluation_metrics.iou.IoU.