datasetinsights.datasets.unity_perception

datasetinsights.datasets.unity_perception.captures

Load Synthetic dataset captures and annotations tables

class datasetinsights.datasets.unity_perception.captures.Captures(data_root='/data', version='0.0.1')

Bases: object

Load captures table

A capture record stores the relationship between a captured file, a collection of annotations, and extra metadata that describes this capture. For more detail, see schema design here:

captures

Examples:

>>> captures = Captures(data_root="/data")
# The Captures class automatically loads the captures (e.g. lidar scan,
# image, depth map) and the annotations (e.g. semantic segmentation
# labels, bounding boxes, etc.)
>>> data = captures.filter(def_id="6716c783-1c0e-44ae-b1b5-7f068454b66e")
# Returns the captures and annotations filtered by the annotation
# definition id

captures

a collection of captures without annotations

Type

pd.DataFrame

annotations

a collection of annotations

Type

pd.DataFrame

FILE_PATTERN = '**/captures_*.json'
TABLE_NAME = 'captures'
filter(def_id)

Get captures and annotations filtered by annotation definition id.

Parameters

def_id (str) – annotation definition id used to filter results

Returns

A pandas dataframe with captures and annotations. Columns: ‘id’ (capture id), ‘sequence_id’, ‘step’, ‘timestamp’, ‘sensor’, ‘ego’, ‘filename’, ‘format’, ‘annotation.id’, ‘annotation.annotation_definition’, ‘annotation.values’

Raises

DefinitionIDError – Raised if none of the annotation records in the combined annotation and captures dataframe match the def_id specified as a parameter.

Example Returned Dataframe (first row, one column per line):

label_id (int): 2
sequence_id (str): None
step (int): 50
timestamp (float): 4.9
sensor (dict): {‘sensor_id’: ‘dDa873b…’, ‘ego_id’: ‘44ca9…’, ‘modality’: ‘camera’, ‘translation’: [0.0, 0.0, 0.0], ‘rotation’: [0.0, 0.0, 0.0, 1.0], ‘scale’: 0.344577253}
ego (dict):
filename (str): RGB3/asd.png
format (str): PNG
annotation.id (str): ace0
annotation.annotation_definition (str): 6716c
annotation.values: [{‘label_id’: 34, ‘label_name’: ‘snack_chips_pringles’, … ‘height’: 118.0}, {‘label_id’: 35, … ‘height’: 91.0}, …]
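
The filtering behaviour described for filter(def_id) can be sketched without the library. The following is a minimal illustration on plain dictionaries, not the implementation used by Captures: the function name filter_by_definition, the sample rows, and the local DefinitionIDError stand-in are all assumptions made for the example.

```python
class DefinitionIDError(Exception):
    """Local stand-in for the library exception of the same name."""


def filter_by_definition(rows, def_id):
    """Keep rows whose annotation definition equals def_id.

    Hypothetical helper: rows are plain dicts keyed by the column names
    listed above, standing in for dataframe rows.
    """
    matched = [
        row for row in rows
        if row["annotation.annotation_definition"] == def_id
    ]
    if not matched:
        # Mirrors the documented behaviour: no matching record is an error.
        raise DefinitionIDError(f"No annotations match definition id {def_id!r}.")
    return matched


# Illustrative rows only; the ids are made up for this sketch.
ROWS = [
    {"id": "c1", "annotation.id": "a1", "annotation.annotation_definition": "def-a"},
    {"id": "c1", "annotation.id": "a2", "annotation.annotation_definition": "def-b"},
    {"id": "c2", "annotation.id": "a3", "annotation.annotation_definition": "def-a"},
]
```

Calling filter_by_definition(ROWS, "def-a") keeps the first and third rows, while an unknown id raises the stand-in DefinitionIDError.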

datasetinsights.datasets.unity_perception.exceptions

exception datasetinsights.datasets.unity_perception.exceptions.DefinitionIDError

Bases: Exception

Raised when a given definition id can’t be found.

datasetinsights.datasets.unity_perception.metrics

Load Synthetic dataset Metrics

class datasetinsights.datasets.unity_perception.metrics.Metrics(data_root='/data', version='0.0.1')

Bases: object

Load metrics table

Metrics store extra metadata that can be used to describe a particular sequence, capture or annotation. Metric records are stored as an arbitrary number (M) of key-value pairs. For more detail, see schema design doc: metrics

metrics

a collection of metrics records

Type

dask.bag.core.Bag

Examples

>>> metrics = Metrics(data_root="/data")
>>> metrics_df = metrics.filter_metrics(def_id="my_definition_id")
# metrics_df now contains all the metrics data corresponding to
# "my_definition_id"

One example of metrics_df (first row, one column per line):

label_id (int): 2
instance_id (int): 2
visible_pixels (int): 2231

FILE_PATTERN = '**/metrics_*.json'
TABLE_NAME = 'metrics'
filter_metrics(def_id)

Get all metrics filtered by a given metric definition id

Parameters

def_id (str) – metric definition id used to filter results

Returns

pd.DataFrame with columns: “label_id”, “capture_id”, “annotation_id”, “sequence_id”, “step”

datasetinsights.datasets.unity_perception.references

Load Synthetic dataset references tables

class datasetinsights.datasets.unity_perception.references.AnnotationDefinitions(data_root, version='0.0.1')

Bases: object

Load annotation_definitions table

For more detail, see schema design here: annotation_definitions.json

table

a collection of annotation_definitions records

Type

pd.DataFrame

FILE_PATTERN = '**/annotation_definitions.json'
TABLE_NAME = 'annotation_definitions'
find_by_name(pattern)

Get the annotation definition by matching a pattern

This method tries to match the given pattern against the names of the annotation definitions.

Parameters

pattern (str) – the regex pattern

Returns

a dictionary containing the annotation definition

Return type

dict

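
Matching definitions by name with a regular expression can be sketched as follows. This is an illustration only: the helper find_first_by_name, the use of re.search, the sample names, and returning None when nothing matches are assumptions, since this reference does not state how find_by_name handles zero or several matches.

```python
import re


def find_first_by_name(definitions, pattern):
    """Return the first definition whose name matches the regex pattern.

    Hypothetical helper operating on a list of dicts; returns None when
    no name matches (an assumption made for this sketch).
    """
    regex = re.compile(pattern)
    for definition in definitions:
        # re.search matches anywhere in the name, not only at the start.
        if regex.search(definition["name"]):
            return definition
    return None


# Illustrative records; ids and names are made up for this sketch.
DEFINITIONS = [
    {"id": "def-a", "name": "semantic segmentation"},
    {"id": "def-b", "name": "bounding box"},
]
```

For example, find_first_by_name(DEFINITIONS, r"bounding\s+box") returns the second record.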
get_definition(def_id)

Get the annotation definition for a given definition id

Parameters

def_id (int) – annotation definition id used to filter results

Returns

a dictionary containing the annotation definition

Return type

dict

Raises

NoRecordError – If no annotation definition is found for the given definition id

load_annotation_definitions(data_root, version)

Load annotation definition files.

For more detail, see schema design here: annotation_definitions.json

Parameters

  • data_root (str) – the root directory of the dataset containing tables

  • version (str) – desired schema version

Returns

A Pandas dataframe with annotation definition records. Columns: ‘id’ (annotation id), ‘name’ (annotation name), ‘description’ (string description), ‘format’ (string describing format), ‘spec’ (format-specific specification for the annotation values)

class datasetinsights.datasets.unity_perception.references.Egos(data_root, version='0.0.1')

Bases: object

Load egos table

For more detail, see schema design here: egos.json

table

a collection of egos records

Type

pd.DataFrame

FILE_PATTERN = '**/egos.json'
TABLE_NAME = 'egos'
load_egos(data_root, version)

Load egos files. For more detail, see schema design here:

egos.json

Parameters
  • data_root (str) – the root directory of the dataset containing ego tables

  • version (str) – desired schema version

Returns

A pandas dataframe with all ego records with two columns: id (ego id) and description

class datasetinsights.datasets.unity_perception.references.MetricDefinitions(data_root, version='0.0.1')

Bases: object

Load metric_definitions table

For more detail, see schema design here:

metric_definitions.json

table

a collection of metric_definitions records with columns: id (id for metric definition), name, description, spec (definition specific spec)

Type

pd.DataFrame

FILE_PATTERN = '**/metric_definitions.json'
TABLE_NAME = 'metric_definitions'
get_definition(def_id)

Get the metric definition for a given definition id

Parameters

def_id (int) – metric definition id used to filter results

Returns

a dictionary containing metric definition

load_metric_definitions(data_root, version)

Load metric definition files.

For more detail, see schema design here: metric_definitions.json

Parameters
  • data_root (str) – the root directory of the dataset containing tables

  • version (str) – desired schema version

Returns

A Pandas dataframe with metric definition records. Columns: id (id for metric definition), name, description, spec (definition specific spec)

class datasetinsights.datasets.unity_perception.references.Sensors(data_root, version='0.0.1')

Bases: object

Load sensors table

For more detail, see schema design here:

sensors.json

table

a collection of sensors records with columns: ‘id’ (sensor id), ‘ego_id’, ‘modality’ ({camera, lidar, radar, sonar,…} – Sensor modality), ‘description’

Type

pd.DataFrame

FILE_PATTERN = '**/sensors.json'
TABLE_NAME = 'sensors'
load_sensors(data_root, version)

Load sensors files.

For more detail, see schema design here:

sensors.json

Parameters

  • data_root (str) – the root directory of the dataset containing tables

  • version (str) – desired schema version

Returns

A pandas dataframe with all sensors records with columns: ‘id’ (sensor id), ‘ego_id’, ‘modality’ ({camera, lidar, radar, sonar,…} – Sensor modality), ‘description’

datasetinsights.datasets.unity_perception.tables

class datasetinsights.datasets.unity_perception.tables.FileType(value)

Bases: enum.Enum

An enumeration.

BINARY = 'binary'
CAPTURE = 'capture'
METRIC = 'metric'
REFERENCE = 'reference'
class datasetinsights.datasets.unity_perception.tables.Table(file, pattern, filetype)

Bases: tuple

property file

Alias for field number 0

property filetype

Alias for field number 2

property pattern

Alias for field number 1
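
The field aliases above describe a named tuple whose positions are file (0), pattern (1) and filetype (2). A minimal sketch of that shape, rebuilt locally with the standard library, is shown here. The entry captures_table is hypothetical; only the pattern string and the enum values are taken from this reference.

```python
from collections import namedtuple
from enum import Enum


class FileType(Enum):
    """Local copy of the documented enumeration values."""

    BINARY = "binary"
    CAPTURE = "capture"
    METRIC = "metric"
    REFERENCE = "reference"


# Field order follows the documented aliases: file=0, pattern=1, filetype=2.
Table = namedtuple("Table", ["file", "pattern", "filetype"])

# Hypothetical entry; the file name "captures" is illustrative only.
captures_table = Table("captures", "**/captures_*.json", FileType.CAPTURE)
```

Fields can then be read by name (captures_table.pattern) or by position (captures_table[1]).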

datasetinsights.datasets.unity_perception.tables.glob(data_root, pattern)

Find all matching files in a directory.

Parameters
  • data_root (str) – directory containing capture files

  • pattern (str) – Unix file pattern

Yields

str – matched filenames in a directory
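
Patterns such as '**/captures_*.json' rely on recursive matching. A minimal sketch of a generator with the same shape, using only the standard library, could look like this. The names find_files and demo are assumptions for illustration, and this is not the library's own glob.

```python
import glob
import os
import tempfile


def find_files(data_root, pattern):
    """Yield files under data_root that match a Unix-style pattern.

    Hypothetical stand-in written for this example.
    """
    # recursive=True lets "**" match any number of nested directories.
    for path in glob.iglob(os.path.join(data_root, pattern), recursive=True):
        if os.path.isfile(path):
            yield path


def demo():
    """Build a throwaway dataset layout and return the matched file names."""
    with tempfile.TemporaryDirectory() as root:
        nested = os.path.join(root, "Dataset", "run0")
        os.makedirs(nested)
        for name in ("captures_000.json", "captures_001.json", "metrics_000.json"):
            with open(os.path.join(nested, name), "w") as handle:
                handle.write("{}")
        matches = find_files(root, "**/captures_*.json")
        # Consume the generator before the temporary directory is removed.
        return sorted(os.path.basename(path) for path in matches)
```

Here demo() returns only the two captures_*.json names, because the metrics file does not match the pattern.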

datasetinsights.datasets.unity_perception.tables.load_table(json_file, table_name, version, **kwargs)

Load records from json files into a pandas table

Parameters
  • json_file (str) – filename to json.

  • table_name (str) – table name in the json file to be loaded

  • version (str) – requested version of this table

  • **kwargs – arbitrary keyword arguments to be passed to pandas’ json_normalize method.

Returns

a pandas dataframe of the loaded table.

Raises

VersionError – If the version in the json file does not match the requested version.
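
The extra keyword arguments are forwarded to pandas’ json_normalize, which by default flattens nested dictionaries into column names joined with a dot. The flattening step alone can be approximated with the standard library, as in this sketch. The helper flatten is hypothetical and handles only nested dictionaries; it is not a replacement for json_normalize.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys.

    Hypothetical helper: lists and scalar values are kept as they are.
    """
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse, extending the dotted prefix with the current key.
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


# Illustrative record; the values are made up for this sketch.
RECORD = {"id": "c1", "annotation": {"id": "a1", "values": [1, 2]}}
```

With this input, flatten(RECORD) produces the keys id, annotation.id and annotation.values.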

datasetinsights.datasets.unity_perception.validation

Validate Simulation Data

exception datasetinsights.datasets.unity_perception.validation.DuplicateRecordError

Bases: Exception

Raised when the definition file has duplicate definition ids

exception datasetinsights.datasets.unity_perception.validation.NoRecordError

Bases: Exception

Raised when no record is found matching a given definition id

exception datasetinsights.datasets.unity_perception.validation.VersionError

Bases: Exception

Raised when the data file version does not match the requested version

datasetinsights.datasets.unity_perception.validation.check_duplicate_records(table, column, table_name)

Check if table has duplicate records for a given column

Parameters
  • table (pd.DataFrame) – a pandas dataframe

  • column (str) – the column where no duplication is allowed

  • table_name (str) – table name

Raises

DuplicateRecordError – If duplicate records are found in a column
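
The duplicate check can be illustrated on plain records. The sketch below uses a list of dictionaries in place of a pandas dataframe; the names find_duplicates and check_duplicates, the sample rows, and the local exception class are assumptions for the example.

```python
from collections import Counter


class DuplicateRecordError(Exception):
    """Local stand-in for the library exception of the same name."""


def find_duplicates(records, column):
    """Return the sorted values that appear more than once in a column."""
    counts = Counter(record[column] for record in records)
    return sorted(value for value, count in counts.items() if count > 1)


def check_duplicates(records, column, table_name):
    """Raise DuplicateRecordError if any value in column is repeated."""
    duplicates = find_duplicates(records, column)
    if duplicates:
        raise DuplicateRecordError(
            f"Duplicate {column!r} values in table {table_name!r}: {duplicates}"
        )


# Illustrative rows; the ids are made up for this sketch.
UNIQUE_ROWS = [{"id": "a"}, {"id": "b"}]
REPEATED_ROWS = [{"id": "a"}, {"id": "b"}, {"id": "a"}]
```

check_duplicates(UNIQUE_ROWS, "id", "egos") returns silently, while the same call on REPEATED_ROWS raises the stand-in error.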

datasetinsights.datasets.unity_perception.validation.verify_version(json_data, version)

Verify json schema version

Parameters
  • json_data (json) – a json object loaded from file.

  • version (str) – string of the requested version.

Raises

VersionError – If the version in the json file does not match the requested version.
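
A version check of this kind can be sketched as below. The top-level "version" key is an assumption about the json layout, and check_version and the local VersionError class are stand-ins written for this example.

```python
class VersionError(Exception):
    """Local stand-in for the library exception of the same name."""


def check_version(json_data, version):
    """Raise VersionError unless json_data declares the requested version.

    Assumption: the schema version is stored under a top-level "version" key.
    """
    found = json_data.get("version")
    if found != version:
        raise VersionError(
            f"Requested version {version!r}, but the file declares {found!r}."
        )
```

For example, check_version({"version": "0.0.1"}, "0.0.1") returns silently, while requesting "0.0.1" from data that declares "0.0.2" raises the stand-in error.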

class datasetinsights.datasets.unity_perception.AnnotationDefinitions(data_root, version='0.0.1')

Bases: object

Load annotation_definitions table

For more detail, see schema design here: annotation_definitions.json

table

a collection of annotation_definitions records

Type

pd.DataFrame

FILE_PATTERN = '**/annotation_definitions.json'
TABLE_NAME = 'annotation_definitions'
find_by_name(pattern)

Get the annotation definition by matching a pattern

This method tries to match the given pattern against the names of the annotation definitions.

Parameters

pattern (str) – the regex pattern

Returns

a dictionary containing the annotation definition

Return type

dict

get_definition(def_id)

Get the annotation definition for a given definition id

Parameters

def_id (int) – annotation definition id used to filter results

Returns

a dictionary containing the annotation definition

Return type

dict

Raises

NoRecordError – If no annotation definition is found for the given definition id

load_annotation_definitions(data_root, version)

Load annotation definition files.

For more detail, see schema design here: annotation_definitions.json

Parameters

  • data_root (str) – the root directory of the dataset containing tables

  • version (str) – desired schema version

Returns

A Pandas dataframe with annotation definition records. Columns: ‘id’ (annotation id), ‘name’ (annotation name), ‘description’ (string description), ‘format’ (string describing format), ‘spec’ (format-specific specification for the annotation values)

class datasetinsights.datasets.unity_perception.Captures(data_root='/data', version='0.0.1')

Bases: object

Load captures table

A capture record stores the relationship between a captured file, a collection of annotations, and extra metadata that describes this capture. For more detail, see schema design here:

captures

Examples:

>>> captures = Captures(data_root="/data")
# The Captures class automatically loads the captures (e.g. lidar scan,
# image, depth map) and the annotations (e.g. semantic segmentation
# labels, bounding boxes, etc.)
>>> data = captures.filter(def_id="6716c783-1c0e-44ae-b1b5-7f068454b66e")
# Returns the captures and annotations filtered by the annotation
# definition id

captures

a collection of captures without annotations

Type

pd.DataFrame

annotations

a collection of annotations

Type

pd.DataFrame

FILE_PATTERN = '**/captures_*.json'
TABLE_NAME = 'captures'
filter(def_id)

Get captures and annotations filtered by annotation definition id.

Parameters

def_id (str) – annotation definition id used to filter results

Returns

A pandas dataframe with captures and annotations. Columns: ‘id’ (capture id), ‘sequence_id’, ‘step’, ‘timestamp’, ‘sensor’, ‘ego’, ‘filename’, ‘format’, ‘annotation.id’, ‘annotation.annotation_definition’, ‘annotation.values’

Raises

DefinitionIDError – Raised if none of the annotation records in the combined annotation and captures dataframe match the def_id specified as a parameter.

Example Returned Dataframe (first row, one column per line):

label_id (int): 2
sequence_id (str): None
step (int): 50
timestamp (float): 4.9
sensor (dict): {‘sensor_id’: ‘dDa873b…’, ‘ego_id’: ‘44ca9…’, ‘modality’: ‘camera’, ‘translation’: [0.0, 0.0, 0.0], ‘rotation’: [0.0, 0.0, 0.0, 1.0], ‘scale’: 0.344577253}
ego (dict):
filename (str): RGB3/asd.png
format (str): PNG
annotation.id (str): ace0
annotation.annotation_definition (str): 6716c
annotation.values: [{‘label_id’: 34, ‘label_name’: ‘snack_chips_pringles’, … ‘height’: 118.0}, {‘label_id’: 35, … ‘height’: 91.0}, …]

class datasetinsights.datasets.unity_perception.Egos(data_root, version='0.0.1')

Bases: object

Load egos table

For more detail, see schema design here: egos.json

table

a collection of egos records

Type

pd.DataFrame

FILE_PATTERN = '**/egos.json'
TABLE_NAME = 'egos'
load_egos(data_root, version)

Load egos files. For more detail, see schema design here:

egos.json

Parameters
  • data_root (str) – the root directory of the dataset containing ego tables

  • version (str) – desired schema version

Returns

A pandas dataframe with all ego records with two columns: id (ego id) and description

class datasetinsights.datasets.unity_perception.MetricDefinitions(data_root, version='0.0.1')

Bases: object

Load metric_definitions table

For more detail, see schema design here:

metric_definitions.json

table

a collection of metric_definitions records with columns: id (id for metric definition), name, description, spec (definition specific spec)

Type

pd.DataFrame

FILE_PATTERN = '**/metric_definitions.json'
TABLE_NAME = 'metric_definitions'
get_definition(def_id)

Get the metric definition for a given definition id

Parameters

def_id (int) – metric definition id used to filter results

Returns

a dictionary containing metric definition

load_metric_definitions(data_root, version)

Load metric definition files.

For more detail, see schema design here: metric_definitions.json

Parameters
  • data_root (str) – the root directory of the dataset containing tables

  • version (str) – desired schema version

Returns

A Pandas dataframe with metric definition records. Columns: id (id for metric definition), name, description, spec (definition specific spec)

class datasetinsights.datasets.unity_perception.Metrics(data_root='/data', version='0.0.1')

Bases: object

Load metrics table

Metrics store extra metadata that can be used to describe a particular sequence, capture or annotation. Metric records are stored as an arbitrary number (M) of key-value pairs. For more detail, see schema design doc: metrics

metrics

a collection of metrics records

Type

dask.bag.core.Bag

Examples

>>> metrics = Metrics(data_root="/data")
>>> metrics_df = metrics.filter_metrics(def_id="my_definition_id")
# metrics_df now contains all the metrics data corresponding to
# "my_definition_id"

One example of metrics_df (first row, one column per line):

label_id (int): 2
instance_id (int): 2
visible_pixels (int): 2231

FILE_PATTERN = '**/metrics_*.json'
TABLE_NAME = 'metrics'
filter_metrics(def_id)

Get all metrics filtered by a given metric definition id

Parameters

def_id (str) – metric definition id used to filter results

Returns

pd.DataFrame with columns: “label_id”, “capture_id”, “annotation_id”, “sequence_id”, “step”

class datasetinsights.datasets.unity_perception.Sensors(data_root, version='0.0.1')

Bases: object

Load sensors table

For more detail, see schema design here:

sensors.json

table

a collection of sensors records with columns: ‘id’ (sensor id), ‘ego_id’, ‘modality’ ({camera, lidar, radar, sonar,…} – Sensor modality), ‘description’

Type

pd.DataFrame

FILE_PATTERN = '**/sensors.json'
TABLE_NAME = 'sensors'
load_sensors(data_root, version)

Load sensors files.

For more detail, see schema design here:

sensors.json

Parameters

  • data_root (str) – the root directory of the dataset containing tables

  • version (str) – desired schema version

Returns

A pandas dataframe with all sensors records with columns: ‘id’ (sensor id), ‘ego_id’, ‘modality’ ({camera, lidar, radar, sonar,…} – Sensor modality), ‘description’