datasetinsights.datasets.unity_perception¶

datasetinsights.datasets.unity_perception.captures¶

Load Synthetic dataset captures and annotations tables

class datasetinsights.datasets.unity_perception.captures.Captures(data_root='/data', version='0.0.1')¶

Bases: object

Load captures table

A capture record stores the relationship between a captured file, a collection of annotations, and extra metadata that describes this capture. For more detail, see schema design here:

captures

Examples:

>>> captures = Captures(data_root="/data")
>>> # The Captures class automatically loads the captures (e.g. lidar scan,
>>> # image, depth map) and the annotations (e.g. semantic segmentation
>>> # labels, bounding boxes).
>>> data = captures.filter(def_id="6716c783-1c0e-44ae-b1b5-7f068454b66e")
>>> # data holds the captures and annotations filtered by the annotation
>>> # definition id.

captures¶

a collection of captures without annotations

Type: pd.DataFrame

annotations¶

a collection of annotations

Type: pd.DataFrame

FILE_PATTERN = '**/captures_*.json'¶

TABLE_NAME = 'captures'¶

filter(def_id)¶

Get captures and annotations filtered by annotation definition id

Parameters

def_id (str) – annotation definition id used to filter results

Returns

A pandas dataframe with captures and annotations. Columns: ‘id’ (capture id), ‘sequence_id’, ‘step’, ‘timestamp’, ‘sensor’, ‘ego’, ‘filename’, ‘format’, ‘annotation.id’, ‘annotation.annotation_definition’, ‘annotation.values’

Raises

DefinitionIDError – Raised if none of the annotation records in the combined annotation and captures dataframe match the def_id specified as a parameter.

Example Returned Dataframe (first row):

label_id(int)	sequence_id(str)	step(int)	timestamp (float)	sensor (dict)	ego (dict)	filename(str)	format (str)	annotation.id (str)	annotation.annotation_definition(str)	annotation.values
2	None	50	4.9	{‘sensor_id’: ‘dDa873b…’, ‘ego_id’: ‘44ca9…’, ‘modality’: ‘camera’,’translation’: [0.0, 0.0, 0.0], ‘rotation’: [0.0, 0.0, 0.0, 1.0],’scale’: 0.344577253}	…	RGB3/asd.png	PNG	ace0	6716c	[{‘label_id’: 34, ‘label_name’: ‘snack_chips_pringles’,…’height’: 118.0}, {‘label_id’: 35, ‘… ‘height’: 91.0}…]
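
The filtering behavior described above can be illustrated without the library. The following is a minimal sketch using plain Python dictionaries instead of pandas; the record layout, the helper name filter_by_definition, and the local DefinitionIDError class are assumptions made for illustration, not the datasetinsights implementation.

```python
class DefinitionIDError(Exception):
    """Local stand-in for the library exception of the same name."""


def filter_by_definition(captures, def_id):
    """Return one flat row per annotation whose definition id equals def_id."""
    rows = []
    for capture in captures:
        for annotation in capture.get("annotations", []):
            if annotation["annotation_definition"] != def_id:
                continue
            # Copy capture-level fields, then add prefixed annotation fields.
            row = {k: v for k, v in capture.items() if k != "annotations"}
            row["annotation.id"] = annotation["id"]
            row["annotation.annotation_definition"] = def_id
            row["annotation.values"] = annotation["values"]
            rows.append(row)
    if not rows:
        raise DefinitionIDError(f"No annotations match definition id {def_id}")
    return rows


# Hypothetical sample records, loosely shaped like the example row above.
sample_captures = [
    {
        "id": "cap-1",
        "step": 50,
        "filename": "RGB3/asd.png",
        "annotations": [
            {"id": "ace0", "annotation_definition": "6716c", "values": [{"label_id": 34}]},
            {"id": "bd11", "annotation_definition": "other", "values": []},
        ],
    }
]

filtered = filter_by_definition(sample_captures, "6716c")

# An id that matches nothing raises instead of returning an empty result.
try:
    filter_by_definition(sample_captures, "missing-id")
except DefinitionIDError as err:
    missing_message = str(err)
```

Raising on an empty result, rather than returning an empty table, makes a mistyped definition id fail loudly at the call site.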

datasetinsights.datasets.unity_perception.exceptions¶

exception datasetinsights.datasets.unity_perception.exceptions.DefinitionIDError¶

Bases: Exception

Raised when a given definition id can’t be found.

datasetinsights.datasets.unity_perception.metrics¶

Load Synthetic dataset Metrics

class datasetinsights.datasets.unity_perception.metrics.Metrics(data_root='/data', version='0.0.1')¶

Bases: object

Load metrics table

Metrics store extra metadata that can be used to describe a particular sequence, capture or annotation. Metric records are stored as an arbitrary number (M) of key-value pairs. For more detail, see schema design doc: metrics

metrics¶

a collection of metrics records

Type: dask.bag.core.Bag

Examples

>>> metrics = Metrics(data_root="/data")
>>> metrics_df = metrics.filter_metrics(def_id="my_definition_id")
>>> # metrics_df now contains all the metrics data corresponding to
>>> # "my_definition_id"

One example of metrics_df (first row shown below):

label_id(int)	instance_id(int)	visible_pixels(int)
2	2	2231

FILE_PATTERN = '**/metrics_*.json'¶

TABLE_NAME = 'metrics'¶

filter_metrics(def_id)¶

Get all metrics filtered by a given metric definition id

Parameters

def_id (str) – metric definition id used to filter results

Returns

A pandas dataframe (pd.DataFrame) with columns: “label_id”, “capture_id”, “annotation_id”, “sequence_id”, “step”

Raises

DefinitionIDError – Raised if no metrics records match the given def_id
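
One way to picture how record-level columns (such as capture_id) and value-level columns (such as visible_pixels) can end up in the same table is the sketch below. It assumes each metric record carries a "metric_definition" field and a "values" list of key-value dictionaries; that layout and the helper name flatten_metrics are assumptions for illustration, and the error handling documented above is omitted for brevity.

```python
def flatten_metrics(records, def_id):
    """Return one flat row per entry in the 'values' list of matching records."""
    rows = []
    for record in records:
        if record["metric_definition"] != def_id:
            continue
        for value in record["values"]:
            # Record-level identifiers are repeated on every value row.
            row = {
                "capture_id": record["capture_id"],
                "annotation_id": record["annotation_id"],
                "sequence_id": record["sequence_id"],
                "step": record["step"],
            }
            row.update(value)
            rows.append(row)
    return rows


# Hypothetical metric records; the ids are made up for this example.
sample_metrics = [
    {
        "capture_id": "cap-1",
        "annotation_id": None,
        "sequence_id": "seq-1",
        "step": 50,
        "metric_definition": "my_definition_id",
        "values": [
            {"label_id": 2, "instance_id": 2, "visible_pixels": 2231},
            {"label_id": 3, "instance_id": 5, "visible_pixels": 410},
        ],
    },
    {
        "capture_id": "cap-1",
        "annotation_id": None,
        "sequence_id": "seq-1",
        "step": 50,
        "metric_definition": "another_definition",
        "values": [{"count": 7}],
    },
]

metric_rows = flatten_metrics(sample_metrics, "my_definition_id")
```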

datasetinsights.datasets.unity_perception.references¶

Load Synthetic dataset references tables

class datasetinsights.datasets.unity_perception.references.AnnotationDefinitions(data_root, version='0.0.1')¶

Bases: object

Load annotation_definitions table

For more detail, see schema design here: annotation_definitions.json

table¶

a collection of annotation_definitions records

Type: pd.DataFrame

FILE_PATTERN = '**/annotation_definitions.json'¶

TABLE_NAME = 'annotation_definitions'¶

find_by_name(pattern)¶

Get the annotation definition by matching patterns

This method matches the regex pattern against annotation definition names and returns the single matching definition.

Parameters

pattern (str) – the regex pattern

Returns

a dictionary containing the annotation definition

Return type

dict

Raises

NoRecordError – If no annotation definition matches the given pattern
DuplicateRecordError – If more than one record is found by the given pattern
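
The lookup-by-name contract (exactly one match, otherwise an error) can be sketched with the standard re module. The helper name find_definition_by_name, the use of re.search, and the local exception classes are assumptions for illustration; the library may match names differently.

```python
import re


class NoRecordError(Exception):
    """Local stand-in for the library exception of the same name."""


class DuplicateRecordError(Exception):
    """Local stand-in for the library exception of the same name."""


def find_definition_by_name(definitions, pattern):
    """Return the single definition whose name matches the regex pattern."""
    matches = [d for d in definitions if re.search(pattern, d["name"])]
    if not matches:
        raise NoRecordError(f"No definition name matches pattern {pattern!r}")
    if len(matches) > 1:
        raise DuplicateRecordError(
            f"{len(matches)} definition names match pattern {pattern!r}"
        )
    return matches[0]


# Hypothetical definitions; real records also carry description, format and spec.
definitions = [
    {"id": 1, "name": "bounding box"},
    {"id": 2, "name": "semantic segmentation"},
    {"id": 3, "name": "instance segmentation"},
]

box_definition = find_definition_by_name(definitions, r"^bounding")

# A pattern that is too loose matches two names and is rejected.
try:
    find_definition_by_name(definitions, r"segmentation")
except DuplicateRecordError as err:
    duplicate_message = str(err)
```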

get_definition(def_id)¶

Get the annotation definition for a given definition id

Parameters

def_id (int) – annotation definition id used to filter results

Returns

a dictionary containing the annotation definition

Return type

dict

Raises

NoRecordError – If no annotation definition is found for the given definition id

load_annotation_definitions(data_root, version)¶

Load annotation definition files.

For more detail, see schema design here: annotation_definitions.json

Parameters

data_root (str) – the root directory of the dataset containing tables
version (str) – desired schema version

Returns

A pandas dataframe with annotation definition records. Columns: ‘id’ (annotation id), ‘name’ (annotation name), ‘description’ (string description), ‘format’ (string describing format), ‘spec’ (format-specific specification for the annotation values)

class datasetinsights.datasets.unity_perception.references.Egos(data_root, version='0.0.1')¶

Bases: object

Load egos table

For more detail, see schema design here: egos.json

table¶

a collection of egos records

Type: pd.DataFrame

FILE_PATTERN = '**/egos.json'¶

TABLE_NAME = 'egos'¶

load_egos(data_root, version)¶

Load egos files. For more detail, see schema design here: egos.json

Parameters

data_root (str) – the root directory of the dataset containing ego tables
version (str) – desired schema version

Returns

A pandas dataframe with all ego records, with two columns: ‘id’ (ego id) and ‘description’

class datasetinsights.datasets.unity_perception.references.MetricDefinitions(data_root, version='0.0.1')¶

Bases: object

Load metric_definitions table

For more detail, see schema design here:

metric_definitions.json

table¶

a collection of metric_definitions records with columns: id (id for metric definition), name, description, spec (definition specific spec)

Type: pd.DataFrame

FILE_PATTERN = '**/metric_definitions.json'¶

TABLE_NAME = 'metric_definitions'¶

get_definition(def_id)¶

Get the metric definition for a given definition id

Parameters

def_id (int) – metric definition id used to filter results

Returns

a dictionary containing the metric definition

load_metric_definitions(data_root, version)¶

Load metric definition files.

For more detail, see schema design here: metric_definitions.json

Parameters

data_root (str) – the root directory of the dataset containing tables
version (str) – desired schema version

Returns

A pandas dataframe with metric definition records, with columns: id (id for metric definition), name, description, spec (definition specific spec)

class datasetinsights.datasets.unity_perception.references.Sensors(data_root, version='0.0.1')¶

Bases: object

Load sensors table

For more detail, see schema design here:

sensors.json

table¶

a collection of sensors records with columns: ‘id’ (sensor id), ‘ego_id’, ‘modality’ ({camera, lidar, radar, sonar,…} – Sensor modality), ‘description’

Type: pd.DataFrame

FILE_PATTERN = '**/sensors.json'¶

TABLE_NAME = 'sensors'¶

load_sensors(data_root, version)¶

Load sensors files.

For more detail, see schema design here:

sensors.json

Parameters

data_root (str) – the root directory of the dataset containing tables
version (str) – desired schema version

Returns

A pandas dataframe with all sensors records, with columns: ‘id’ (sensor id), ‘ego_id’, ‘modality’ ({camera, lidar, radar, sonar,…} – Sensor modality), ‘description’

datasetinsights.datasets.unity_perception.tables¶

class datasetinsights.datasets.unity_perception.tables.FileType(value)¶

Bases: enum.Enum

An enumeration.

BINARY = 'binary'¶

CAPTURE = 'capture'¶

METRIC = 'metric'¶

REFERENCE = 'reference'¶

class datasetinsights.datasets.unity_perception.tables.Table(file, pattern, filetype)¶

Bases: tuple

property file¶: Alias for field number 0

property filetype¶: Alias for field number 2

property pattern¶: Alias for field number 1
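
The field positions listed above follow from Table being a named tuple. A minimal sketch of an equivalent definition is shown below, built from the documented field order and enum values; the concrete values stored in captures_table are hypothetical.

```python
from collections import namedtuple
from enum import Enum


class FileType(Enum):
    """Mirror of the documented enumeration values."""

    BINARY = "binary"
    CAPTURE = "capture"
    METRIC = "metric"
    REFERENCE = "reference"


# Field order matches the documented aliases: file=0, pattern=1, filetype=2.
Table = namedtuple("Table", ["file", "pattern", "filetype"])

# Hypothetical entry describing where capture files might be found.
captures_table = Table("captures", "**/captures_*.json", FileType.CAPTURE)

# Named access and positional access refer to the same values.
file_name, pattern, filetype = captures_table
```

Because a named tuple is still a tuple, entries can be unpacked positionally or read by attribute, whichever is clearer at the call site.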

datasetinsights.datasets.unity_perception.tables.glob(data_root, pattern)¶

Find all matching files in a directory.

Parameters

data_root (str) – directory containing capture files
pattern (str) – Unix file pattern

Yields

str – matched filenames in a directory
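
A recursive pattern such as '**/captures_*.json' can be tried out with pathlib from the standard library. The sketch below is an illustration of the matching behavior only; the helper name glob_files and the directory layout are assumptions, and the library function may be implemented differently.

```python
import tempfile
from pathlib import Path


def glob_files(data_root, pattern):
    """Yield paths under data_root that match a Unix-style pattern."""
    for path in sorted(Path(data_root).glob(pattern)):
        yield str(path)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical dataset layout with two capture files and one reference file.
    dataset_dir = Path(root) / "Dataset"
    dataset_dir.mkdir()
    (dataset_dir / "captures_000.json").write_text("{}")
    (dataset_dir / "captures_001.json").write_text("{}")
    (dataset_dir / "sensors.json").write_text("{}")

    # '**' lets the pattern match files in nested directories.
    matched = [Path(p).name for p in glob_files(root, "**/captures_*.json")]
```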

datasetinsights.datasets.unity_perception.tables.load_table(json_file, table_name, version, **kwargs)¶

Load records from json files into a pandas table

Parameters

json_file (str) – path to the json file.
table_name (str) – table name in the json file to be loaded
version (str) – requested version of this table
**kwargs – arbitrary keyword arguments to be passed to pandas’ json_normalize method.

Returns

a pandas dataframe of the loaded table.

Raises

VersionError – If the version in json file does not match the requested version.
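
The load-then-verify flow can be sketched with the standard json module. This example assumes the json file has a top-level "version" key and stores its records under a key named after the table; both are assumptions, as are the helper name load_records and the local VersionError class. It returns a list of dictionaries and leaves out the conversion to a pandas table.

```python
import json
import tempfile
from pathlib import Path


class VersionError(Exception):
    """Local stand-in for the library exception of the same name."""


def load_records(json_file, table_name, version):
    """Read a json file, check its version, and return the named table's records."""
    with open(json_file, encoding="utf-8") as handle:
        data = json.load(handle)
    found = data.get("version")
    if found != version:
        raise VersionError(f"Requested version {version}, file has {found}")
    return data[table_name]


with tempfile.TemporaryDirectory() as root:
    # Hypothetical sensors file with a single record.
    sensors_file = Path(root) / "sensors.json"
    payload = {"version": "0.0.1", "sensors": [{"id": "s1", "modality": "camera"}]}
    sensors_file.write_text(json.dumps(payload), encoding="utf-8")

    records = load_records(sensors_file, "sensors", "0.0.1")

    # Requesting a different version is rejected before any records are returned.
    try:
        load_records(sensors_file, "sensors", "9.9.9")
    except VersionError as err:
        version_message = str(err)
```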

datasetinsights.datasets.unity_perception.validation¶

Validate Simulation Data

exception datasetinsights.datasets.unity_perception.validation.DuplicateRecordError¶

Bases: Exception

Raised when the definition file has duplicate definition ids

exception datasetinsights.datasets.unity_perception.validation.NoRecordError¶

Bases: Exception

Raised when no record is found matching a given definition id

exception datasetinsights.datasets.unity_perception.validation.VersionError¶

Bases: Exception

Raised when the data file version does not match the requested version

datasetinsights.datasets.unity_perception.validation.check_duplicate_records(table, column, table_name)¶

Check if table has duplicate records for a given column

Parameters

table (pd.DataFrame) – a pandas dataframe
column (str) – the column where no duplication is allowed
table_name (str) – table name

Raises

DuplicateRecordError – If duplicate records are found in a column
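
The duplicate check amounts to counting the values in one column. The sketch below works on a list of dictionaries rather than a pandas dataframe; the helper name check_duplicates, the error message format, and the local exception class are assumptions for illustration.

```python
from collections import Counter


class DuplicateRecordError(Exception):
    """Local stand-in for the library exception of the same name."""


def check_duplicates(rows, column, table_name):
    """Raise if any value appears more than once in the given column."""
    counts = Counter(row[column] for row in rows)
    duplicated = [value for value, count in counts.items() if count > 1]
    if duplicated:
        raise DuplicateRecordError(
            f"Duplicate {column} values in table {table_name}: {duplicated}"
        )


# Hypothetical sensor records; the second list repeats the id "s1".
unique_rows = [{"id": "s1"}, {"id": "s2"}]
repeated_rows = [{"id": "s1"}, {"id": "s2"}, {"id": "s1"}]

# Unique ids pass silently and the function returns None.
unique_result = check_duplicates(unique_rows, "id", "sensors")

try:
    check_duplicates(repeated_rows, "id", "sensors")
except DuplicateRecordError as err:
    duplicate_error = str(err)
```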

datasetinsights.datasets.unity_perception.validation.verify_version(json_data, version)¶

Verify json schema version

Parameters

json_data (json) – a json object loaded from file.
version (str) – string of the requested version.

Raises

VersionError – If the version in json file does not match the requested version.

class datasetinsights.datasets.unity_perception.AnnotationDefinitions(data_root, version='0.0.1')¶

Bases: object

Re-exported at the package level. This is the same class as datasetinsights.datasets.unity_perception.references.AnnotationDefinitions; see that entry above for attributes and methods.

class datasetinsights.datasets.unity_perception.Captures(data_root='/data', version='0.0.1')¶

Bases: object

Re-exported at the package level. This is the same class as datasetinsights.datasets.unity_perception.captures.Captures; see that entry above for attributes and methods.

class datasetinsights.datasets.unity_perception.Egos(data_root, version='0.0.1')¶

Bases: object

Re-exported at the package level. This is the same class as datasetinsights.datasets.unity_perception.references.Egos; see that entry above for attributes and methods.

class datasetinsights.datasets.unity_perception.MetricDefinitions(data_root, version='0.0.1')¶

Bases: object

Re-exported at the package level. This is the same class as datasetinsights.datasets.unity_perception.references.MetricDefinitions; see that entry above for attributes and methods.

class datasetinsights.datasets.unity_perception.Metrics(data_root='/data', version='0.0.1')¶

Bases: object

Re-exported at the package level. This is the same class as datasetinsights.datasets.unity_perception.metrics.Metrics; see that entry above for attributes and methods.

class datasetinsights.datasets.unity_perception.Sensors(data_root, version='0.0.1')¶

Bases: object

Re-exported at the package level. This is the same class as datasetinsights.datasets.unity_perception.references.Sensors; see that entry above for attributes and methods.