datasetinsights.stats

datasetinsights.stats.statistics

class datasetinsights.stats.statistics.RenderedObjectInfo(data_root='/data', version='0.0.1', def_id=None)

Bases: object

Rendered Object Info in Captures

This metric stores common object info captured by a sensor in the simulation environment. It can be used to calculate object statistics such as object count, object rotation and visible pixels.

raw_table

Rendered object info stored as a tidy pandas dataframe. Columns "label_id", "instance_id", "visible_pixels", "capture_id", "label_name".

Type

pd.DataFrame

Examples:

>>> from datasetinsights.stats import RenderedObjectInfo
>>> # set the data root path to where data was stored
>>> data_root = "$HOME/data"
>>> # use rendered object info definition id
>>> definition_id = "659c6e36-f9f8-4dd6-9651-4a80e51eabc4"
>>> # def_id is the third parameter, so pass it by keyword
>>> roinfo = RenderedObjectInfo(data_root, def_id=definition_id)
>>> # total object count per label dataframe
>>> roinfo.total_counts()
label_id label_name count
       1    object1    10
       2    object2    21
>>> # object count per capture dataframe
>>> roinfo.per_capture_counts()
capture_id  count
    qwerty     10
    asdfgh     21

COUNT_COLUMN = 'count'
INDEX_COLUMN = 'capture_id'
LABEL = 'label_id'
LABEL_READABLE = 'label_name'
VALUE_COLUMN = 'values'

num_captures()

Total number of captures

Returns

Total number of captures

Return type

integer

per_capture_counts()

Aggregate Object Counts Per Capture

Returns

Object counts per capture table.

Columns “capture_id”, “count”

Return type

pd.DataFrame

total_counts()

Aggregate Total Object Counts Per Label

Returns

Total object counts table.

Columns “label_id”, “label_name”, “count”

Return type

pd.DataFrame
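
The aggregations described above can be illustrated on a small hand-built table. This is a minimal sketch with plain pandas, assuming a dataframe that has the documented raw_table columns; the sample rows are made up, and this is not necessarily how RenderedObjectInfo computes its results internally.

```python
import pandas as pd

# Hypothetical stand-in for raw_table: one row per rendered object instance.
# Column names follow the documentation; the values are invented.
raw_table = pd.DataFrame(
    {
        "capture_id": ["qwerty", "qwerty", "asdfgh"],
        "instance_id": [1, 2, 1],
        "label_id": [1, 2, 1],
        "label_name": ["object1", "object2", "object1"],
        "visible_pixels": [120, 45, 300],
    }
)

# Total object count per label: count rows within each (label_id, label_name).
total_counts = (
    raw_table.groupby(["label_id", "label_name"])
    .size()
    .reset_index(name="count")
)

# Object count per capture: count rows within each capture_id.
per_capture_counts = (
    raw_table.groupby("capture_id").size().reset_index(name="count")
)

# Number of distinct captures that appear in the table.
num_captures = raw_table["capture_id"].nunique()

print(total_counts)
print(per_capture_counts)
print(num_captures)
```

In this sketch a capture that contains no objects never appears in the table, so it is not counted; the library may treat such captures differently.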

class datasetinsights.stats.RenderedObjectInfo(data_root='/data', version='0.0.1', def_id=None)

Same class as datasetinsights.stats.statistics.RenderedObjectInfo, exposed at the package level. See the attribute and method documentation above.

datasetinsights.stats.bar_plot(df, x, y, title=None, x_title=None, y_title=None, x_tickangle=0, **kwargs)

Create plotly bar plot

Parameters
  • df (pd.DataFrame) – A pandas dataframe that contains bar plot data.

  • x (str) – The column name of the data in x-axis.

  • y (str) – The column name of the data in y-axis.

  • title (str, optional) – The title of this plot.

  • x_title (str, optional) – The x-axis title.

  • y_title (str, optional) – The y-axis title.

  • x_tickangle (int, optional) – X-axis text tickangle (default: 0)

This method can also take additional keyword arguments, which are passed to the [plotly.express.bar](https://plotly.com/python-api-reference/generated/plotly.express.bar.html#plotly.express.bar) method.

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({"id": [0, 1, 2], "name": ["a", "b", "c"],
...                    "count": [10, 20, 30]})
>>> bar_plot(df, x="id", y="count", hover_name="name")

datasetinsights.stats.grid_plot(images, figsize=(3, 5), img_type='rgb', titles=None)

Plot 2D array of images in grid.

Parameters
  • images (list) – 2D array of images.

  • figsize (tuple) – target figure size of each image in the grid. Defaults to (3, 5).

  • img_type (string) – image plot type (“rgb”, “gray”). Defaults to “rgb”.

  • titles (list[str]) – a list of titles. Defaults to None.

Returns

A matplotlib figure containing the combined grid plot.

datasetinsights.stats.histogram_plot(df, x, max_samples=None, title=None, x_title=None, y_title=None, **kwargs)

Create plotly histogram plot

Parameters
  • df (pd.DataFrame) – A pandas dataframe that contains raw data.

  • x (str) – The column name of the raw data for histogram plot.

  • title (str, optional) – The title of this plot.

  • x_title (str, optional) – The x-axis title.

  • y_title (str, optional) – The y-axis title.

This method can also take additional keyword arguments, which are passed to the [plotly.express.histogram](https://plotly.com/python-api-reference/generated/plotly.express.histogram.html) method.

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({"id": [0, 1, 2], "count": [10, 20, 30]})
>>> histogram_plot(df, x="count")

Histnorm plot using probability density:

>>> histogram_plot(df, x="count", histnorm="probability density")

datasetinsights.stats.model_performance_box_plot(title=None, mean_ap=None, mean_ap_50=None, mean_ar=None, range=[0, 1.0], **kwargs)

Create a box plot for one model performance

Parameters
  • title (str) – title of the plot

  • mean_ap (list) – a list of mAP values

  • mean_ap_50 (list) – a list of mAP values at IOU 0.5

  • mean_ar (list) – a list of mAR values

  • range (list) – the range of y axis. Defaults to [0, 1.0]

Returns

A plotly.graph_objects.Figure containing the box plot

datasetinsights.stats.model_performance_comparison_box_plot(title=None, mean_ap_base=None, mean_ap_50_base=None, mean_ar_base=None, mean_ap_new=None, mean_ap_50_new=None, mean_ar_new=None, range=[0, 1.0], **kwargs)

Create a box plot for a base and new model performance

Parameters
  • title (str) – title of the plot

  • mean_ap_base (list) – a list of base mAP values

  • mean_ap_50_base (list) – a list of base mAP values at IOU 0.5

  • mean_ar_base (list) – a list of base mAR values

  • mean_ap_new (list) – a list of new mAP values

  • mean_ap_50_new (list) – a list of new mAP values at IOU 0.5

  • mean_ar_new (list) – a list of new mAR values

  • range (list) – the range of y axis. Defaults to [0, 1.0]

Returns

A plotly.graph_objects.Figure containing the box plot

datasetinsights.stats.plot_bboxes(image, bboxes, label_mappings=None, colors=None)

Plot an image with bounding boxes.

For a ground truth image, a color is randomly selected for each bounding box. For a prediction, the color of a bounding box is coded based on the IOU value between the predicted and ground truth bounding boxes. A prediction is considered a true positive if IOU >= 0.5. Only predicted bounding boxes with score >= 0.5 are visualized. A predicted bounding box is drawn in green if it can be matched to a ground truth bounding box, and in red if it cannot.

Parameters
  • image (PIL Image) – a PIL image.

  • bboxes (list) – a list of BBox2D objects.

  • label_mappings (dict) – a dict of {label_id: label_name} mapping. Defaults to None.

  • colors (list) – a color list for boxes. Defaults to None. If colors = None, it will randomly assign PIL.COLORS for each box.

Returns

a PIL image with bounding boxes drawn.

Return type

PIL Image
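
The color rule for predictions described above can be sketched in plain Python. This is an illustration only: boxes are assumed to be (x, y, width, height) tuples rather than BBox2D objects, the helper names iou and prediction_color are hypothetical, and the sketch ignores labels and one-to-one matching, which the library may take into account.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def prediction_color(pred_box, score, gt_boxes,
                     iou_threshold=0.5, score_threshold=0.5):
    """Return "green", "red", or None (not drawn) for one predicted box."""
    # Low-confidence predictions are not visualized at all.
    if score < score_threshold:
        return None
    # Green if any ground truth box overlaps enough, red otherwise.
    matched = any(iou(pred_box, gt) >= iou_threshold for gt in gt_boxes)
    return "green" if matched else "red"


ground_truth = [(0, 0, 10, 10)]
print(prediction_color((1, 1, 10, 10), 0.9, ground_truth))
print(prediction_color((20, 20, 5, 5), 0.9, ground_truth))
print(prediction_color((0, 0, 10, 10), 0.3, ground_truth))
```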

datasetinsights.stats.plot_keypoints(image, annotations, templates, visual_width=6)

Plot an image with keypoint data.

Currently only used for ground truth info. Keypoints and colors are defined in templates.

Parameters
  • image (PIL Image) – a PIL image.

  • annotations (list) – a list of keypoint annotation data.

  • templates (list) – a list of keypoint templates.

  • visual_width (int) – the width of the visual elements

Returns

a PIL image with keypoints drawn.

Return type

PIL Image

datasetinsights.stats.rotation_plot(df, x, y, z=None, max_samples=None, title=None, **kwargs)

Create a plotly 3d rotation plot

Parameters
  • df (pd.DataFrame) – A pandas dataframe that contains the raw data.

  • x (str) – The column name containing x rotations.

  • y (str) – The column name containing y rotations.

  • z (str, optional) – The column name containing z rotations.

  • title (str, optional) – The title of this plot.

This method can also take additional keyword arguments, which are passed to the [plotly.graph_objects.Scatter3d](https://plotly.com/python-api-reference/generated/plotly.graph_objects.Scatter3d.html) method.

Returns

A plotly.graph_objects.Figure containing the scatter plot