datasetinsights.stats

datasetinsights.stats.visualization

datasetinsights.stats.statistics

class datasetinsights.stats.statistics.RenderedObjectInfo(data_root='/data', version='0.0.1', def_id=None)

Bases: object

Rendered Object Info in Captures

This metric stores common object info captured by a sensor in the simulation environment. It can be used to calculate object statistics such as object count, object rotation and visible pixels.

raw_table

Rendered object info stored as a tidy pandas DataFrame. Columns: "label_id", "instance_id", "visible_pixels", "capture_id", "label_name".

Type: pd.DataFrame

Examples:

>>> # set the data root path to where data was stored
>>> data_root = "$HOME/data"
>>> # use rendered object info definition id
>>> definition_id = "659c6e36-f9f8-4dd6-9651-4a80e51eabc4"
>>> # pass the definition id by keyword; the second positional argument is version
>>> roinfo = RenderedObjectInfo(data_root, def_id=definition_id)
>>> # total object count per label dataframe
>>> roinfo.total_counts()
label_id label_name count
       1    object1    10
       2    object2    21
>>> # object count per capture dataframe
>>> roinfo.per_capture_counts()
capture_id  count
    qwerty     10
    asdfgh     21

COUNT_COLUMN = 'count'

INDEX_COLUMN = 'capture_id'

LABEL = 'label_id'

LABEL_READABLE = 'label_name'

VALUE_COLUMN = 'values'

num_captures()

Total number of captures

Returns: Total number of captures
Return type: integer

per_capture_counts()

Aggregate Object Counts Per Capture

Returns

Per-capture object counts table with columns “capture_id”, “count”.

Return type

pd.DataFrame

total_counts()

Aggregate Total Object Counts Per Label

Returns

Total object counts table with columns “label_id”, “label_name”, “count”.

Return type

pd.DataFrame
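
The aggregations described by num_captures(), per_capture_counts() and total_counts() can be illustrated with a small dependency-free sketch. It assumes raw_table rows are represented as plain dicts with the documented column names; the helper names below are hypothetical, and the class itself works on a pandas DataFrame, so its implementation may differ.

```python
from collections import Counter

# Hypothetical rows standing in for raw_table (one row per rendered object).
rows = [
    {"capture_id": "qwerty", "label_id": 1, "label_name": "object1"},
    {"capture_id": "qwerty", "label_id": 2, "label_name": "object2"},
    {"capture_id": "asdfgh", "label_id": 2, "label_name": "object2"},
]


def count_per_label(rows):
    """Count objects per (label_id, label_name) pair across all captures."""
    counts = Counter((r["label_id"], r["label_name"]) for r in rows)
    return [
        {"label_id": label_id, "label_name": name, "count": n}
        for (label_id, name), n in sorted(counts.items())
    ]


def count_per_capture(rows):
    """Count objects per capture_id, regardless of label."""
    counts = Counter(r["capture_id"] for r in rows)
    return [{"capture_id": cid, "count": n} for cid, n in sorted(counts.items())]


def count_captures(rows):
    """Number of distinct captures that contain at least one object."""
    return len({r["capture_id"] for r in rows})
```

With a pandas DataFrame the same grouping would typically be expressed with groupby on the label or capture columns.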

class datasetinsights.stats.RenderedObjectInfo(data_root='/data', version='0.0.1', def_id=None)

Documented above as datasetinsights.stats.statistics.RenderedObjectInfo; the attributes, examples and methods listed there apply unchanged.

datasetinsights.stats.bar_plot(df, x, y, title=None, x_title=None, y_title=None, x_tickangle=0, **kwargs)

Create plotly bar plot

Parameters

df (pd.DataFrame) – A pandas dataframe that contains bar plot data.
x (str) – The column name of the data in x-axis.
y (str) – The column name of the data in y-axis.
title (str, optional) – The title of this plot.
x_title (str, optional) – The x-axis title.
y_title (str, optional) – The y-axis title.
x_tickangle (int, optional) – X-axis text tickangle (default: 0)

This method can also take additional keyword arguments that are passed to the [plotly.express.bar](https://plotly.com/python-api-reference/generated/plotly.express.bar.html#plotly.express.bar) method.

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({"id": [0, 1, 2], "name": ["a", "b", "c"],
...                    "count": [10, 20, 30]})
>>> bar_plot(df, x="id", y="count", hover_name="name")

datasetinsights.stats.grid_plot(images, figsize=(3, 5), img_type='rgb', titles=None)

Plot 2D array of images in grid.

Parameters

images (list) – 2D array of images.
figsize (tuple) – target figure size of each image in the grid. Defaults to (3, 5).
img_type (string) – image plot type (“rgb”, “gray”). Defaults to “rgb”.
titles (list[str]) – a list of titles. Defaults to None.

Returns: A matplotlib figure containing the combined grid plot.

datasetinsights.stats.histogram_plot(df, x, max_samples=None, title=None, x_title=None, y_title=None, **kwargs)

Create plotly histogram plot

Parameters

df (pd.DataFrame) – A pandas dataframe that contains raw data.
x (str) – The column name of the raw data for histogram plot.
title (str, optional) – The title of this plot.
x_title (str, optional) – The x-axis title.
y_title (str, optional) – The y-axis title.

This method can also take additional keyword arguments that are passed to the [plotly.express.histogram](https://plotly.com/python-api-reference/generated/plotly.express.histogram.html) method.

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({"id": [0, 1, 2], "count": [10, 20, 30]})
>>> histogram_plot(df, x="count")

Histnorm plot using probability density:

>>> histogram_plot(df, x="count", histnorm="probability density")
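
To make the "probability density" normalization concrete, here is a minimal sketch of the arithmetic: each bar height is the bin count divided by (number of samples × bin width), so the bar areas sum to 1. The function name and the explicit bin edges are assumptions for illustration only; plotly chooses bins automatically and its binning may differ.

```python
def density_histogram(values, bin_edges):
    """Return per-bin densities: count / (n_samples * bin_width)."""
    n = len(values)
    last = len(bin_edges) - 2
    densities = []
    for i, (lo, hi) in enumerate(zip(bin_edges[:-1], bin_edges[1:])):
        # Bins are half-open [lo, hi); the final bin also includes its right edge.
        count = sum(1 for v in values if lo <= v < hi or (i == last and v == hi))
        densities.append(count / (n * (hi - lo)))
    return densities


values = [10, 20, 30]   # the "count" column from the example above
edges = [0, 20, 40]     # hypothetical bin edges
densities = density_histogram(values, edges)

# Total area under the bars (height * width) should be 1.
area = sum(d * (hi - lo) for d, lo, hi in zip(densities, edges[:-1], edges[1:]))
```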

datasetinsights.stats.model_performance_box_plot(title=None, mean_ap=None, mean_ap_50=None, mean_ar=None, range=[0, 1.0], **kwargs)

Create a box plot for one model performance.

Parameters

title (str) – title of the plot.
mean_ap (list) – a list of mAP values.
mean_ap_50 (list) – a list of mAP values at IOU 0.5.
mean_ar (list) – a list of mAR values.
range (list) – the range of y axis. Defaults to [0, 1.0].

Returns: A plotly.graph_objects.Figure containing the box plot
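
As a rough guide to what each box conveys, the sketch below computes a five-number summary for one list of metric values. The helper name and the sample values are hypothetical, and the quartile method used here (inclusive interpolation from the standard library) is an assumption; plotly's own quartile computation may give slightly different values.

```python
from statistics import quantiles


def five_number_summary(scores):
    """Return (min, q1, median, q3, max) for a list of metric values."""
    q1, median, q3 = quantiles(scores, n=4, method="inclusive")
    return min(scores), q1, median, q3, max(scores)


# Hypothetical mAP values from five evaluation runs.
mean_ap = [0.40, 0.45, 0.50, 0.55, 0.60]
summary = five_number_summary(mean_ap)
```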

datasetinsights.stats.model_performance_comparison_box_plot(title=None, mean_ap_base=None, mean_ap_50_base=None, mean_ar_base=None, mean_ap_new=None, mean_ap_50_new=None, mean_ar_new=None, range=[0, 1.0], **kwargs)

Create a box plot for a base and new model performance.

Parameters

title (str) – title of the plot.
mean_ap_base (list) – a list of base mAP values.
mean_ap_50_base (list) – a list of base mAP values at IOU 0.5.
mean_ar_base (list) – a list of base mAR values.
mean_ap_new (list) – a list of new mAP values.
mean_ap_50_new (list) – a list of new mAP values at IOU 0.5.
mean_ar_new (list) – a list of new mAR values.
range (list) – the range of y axis. Defaults to [0, 1.0].

Returns: A plotly.graph_objects.Figure containing the box plot

datasetinsights.stats.plot_bboxes(image, bboxes, label_mappings=None, colors=None)

Plot an image with bounding boxes.

For a ground truth image, a color is randomly selected for each bounding box. For predictions, the color of a bounding box is coded based on the IOU value between the predicted and ground truth bounding boxes. A prediction is considered a true positive if IOU >= 0.5. Only predicted bounding boxes with score >= 0.5 are visualized. A predicted bounding box is drawn in green if it can be matched to a ground truth bounding box, and in red if it cannot.

Parameters

image (PIL Image) – a PIL image.
bboxes (list) – a list of BBox2D objects.
label_mappings (dict) – a dict of {label_id: label_name} mapping. Defaults to None.
colors (list) – a color list for boxes. Defaults to None. If colors = None, it will randomly assign PIL.COLORS for each box.

Returns

a PIL image with bounding boxes drawn.

Return type

PIL Image
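
The color rule for predictions described above can be sketched as follows. This is an illustration only: boxes are assumed to be plain (x, y, width, height) tuples rather than BBox2D objects, both helper names are hypothetical, and the matching ignores labels and one-to-one assignment, which the library may handle differently.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap extent along each axis, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def prediction_color(pred_box, score, gt_boxes, iou_thresh=0.5, score_thresh=0.5):
    """Return "green", "red", or None (box not drawn) for one prediction."""
    if score < score_thresh:
        return None  # low-confidence predictions are not visualized
    matched = any(iou(pred_box, gt) >= iou_thresh for gt in gt_boxes)
    return "green" if matched else "red"
```

For example, a prediction that coincides with a ground truth box and has score 0.9 maps to "green", while the same box with score 0.3 is skipped.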

datasetinsights.stats.plot_keypoints(image, annotations, templates, visual_width=6)

Plot an image with keypoint data.

Currently only used for ground truth info. Keypoints and colors are defined in templates.

Parameters

image (PIL Image) – a PIL image.
annotations (list) – a list of keypoint annotation data.
templates (list) – a list of keypoint templates.
visual_width (int) – the width of the visual elements

Returns

a PIL image with keypoints drawn.

Return type

PIL Image

datasetinsights.stats.rotation_plot(df, x, y, z=None, max_samples=None, title=None, **kwargs)

Create a plotly 3d rotation plot.

Parameters

df (pd.DataFrame) – A pandas dataframe that contains the raw data.
x (str) – The column name containing x rotations.
y (str) – The column name containing y rotations.
z (str, optional) – The column name containing z rotations.
title (str, optional) – The title of this plot.

This method can also take additional keyword arguments that are passed to [plotly.graph_objects.Scatter3d](https://plotly.com/python-api-reference/generated/plotly.graph_objects.Scatter3d.html).

Returns: A plotly.graph_objects.Figure containing the scatter plot