datasetinsights.datasets.unity_perception¶
datasetinsights.datasets.unity_perception.captures¶
Load Synthetic dataset captures and annotations tables
-
class
datasetinsights.datasets.unity_perception.captures.
Captures
(data_root='/data', version='0.0.1')¶ Bases:
object
Load captures table
A capture record stores the relationship between a captured file, a collection of annotations, and extra metadata that describes this capture. For more detail, see schema design here:
Examples:
>>> captures = Captures(data_root="/data")
>>> # The Captures class automatically loads the captures (e.g. lidar scan, image, depth map)
>>> # and the annotations (e.g. semantic segmentation labels, bounding boxes, etc.).
>>> data = captures.filter(def_id="6716c783-1c0e-44ae-b1b5-7f068454b66e")
>>> # Returns the captures and annotations filtered by the annotation definition id.
-
captures
¶ a collection of captures without annotations
- Type
pd.DataFrame
-
annotations
¶ a collection of annotations
- Type
pd.DataFrame
-
FILE_PATTERN
= '**/captures_*.json'¶
-
TABLE_NAME
= 'captures'¶
-
filter
(def_id)¶ Get captures and annotations filtered by annotation definition id
- Parameters
def_id (str) – annotation definition id used to filter results
- Returns
A pandas dataframe with captures and annotations. Columns: ‘id’ (capture id), ‘sequence_id’, ‘step’, ‘timestamp’, ‘sensor’, ‘ego’, ‘filename’, ‘format’, ‘annotation.id’, ‘annotation.annotation_definition’, ‘annotation.values’
- Raises
DefinitionIDError – Raised if none of the annotation records in the combined annotation and captures dataframe match the def_id specified as a parameter.
Example returned dataframe (first row, shown as one column per line):
======================================  =====
Column (type)                           Value
======================================  =====
label_id (int)                          2
sequence_id (str)                       None
step (int)                              50
timestamp (float)                       4.9
sensor (dict)                           {'sensor_id': 'dDa873b…', 'ego_id': '44ca9…', 'modality': 'camera', 'translation': [0.0, 0.0, 0.0], 'rotation': [0.0, 0.0, 0.0, 1.0], 'scale': 0.344577253}
ego (dict)                              …
filename (str)                          RGB3/asd.png
format (str)                            PNG
annotation.id (str)                     ace0
annotation.annotation_definition (str)  6716c
annotation.values                       [{'label_id': 34, 'label_name': 'snack_chips_pringles', … 'height': 118.0}, {'label_id': 35, … 'height': 91.0} …]
======================================  =====
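The filtering behaviour described above can be illustrated with a minimal standard-library sketch. It is not the datasetinsights implementation: the nested record layout is a simplified assumption based on the columns listed above, and `filter_by_definition` and the local `DefinitionIDError` class are hypothetical stand-ins.

```python
class DefinitionIDError(Exception):
    """Local stand-in for the library exception of the same name."""


def filter_by_definition(captures, def_id):
    """Return one flat row per annotation whose definition id equals def_id."""
    rows = []
    for capture in captures:
        for annotation in capture.get("annotations", []):
            if annotation["annotation_definition"] == def_id:
                # Copy capture-level fields, then add prefixed annotation fields.
                row = {k: v for k, v in capture.items() if k != "annotations"}
                row.update({f"annotation.{k}": v for k, v in annotation.items()})
                rows.append(row)
    if not rows:
        raise DefinitionIDError(f"no annotations found for definition id {def_id!r}")
    return rows


# Toy records; the ids are shortened placeholders, not real dataset values.
captures = [
    {
        "id": "c1",
        "step": 50,
        "filename": "RGB3/asd.png",
        "annotations": [
            {"id": "ace0", "annotation_definition": "6716c", "values": []},
            {"id": "bd11", "annotation_definition": "other", "values": []},
        ],
    }
]
rows = filter_by_definition(captures, "6716c")
print(rows[0]["annotation.id"])
```

Each matching annotation becomes one row that carries the capture-level fields alongside the `annotation.`-prefixed fields, which mirrors the column naming in the returned dataframe.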
-
datasetinsights.datasets.unity_perception.exceptions¶
-
exception
datasetinsights.datasets.unity_perception.exceptions.
DefinitionIDError
¶ Bases:
Exception
Raised when a given definition id can’t be found.
datasetinsights.datasets.unity_perception.metrics¶
Load Synthetic dataset Metrics
-
class
datasetinsights.datasets.unity_perception.metrics.
Metrics
(data_root='/data', version='0.0.1')¶ Bases:
object
Load metrics table
Metrics store extra metadata that can be used to describe a particular sequence, capture or annotation. Metric records are stored as an arbitrary number (M) of key-value pairs. For more detail, see schema design doc: metrics
-
metrics
¶ a collection of metrics records
- Type
dask.bag.core.Bag
Examples
>>> metrics = Metrics(data_root="/data")
>>> metrics_df = metrics.filter_metrics(def_id="my_definition_id")
>>> # metrics_df now contains all the metrics data corresponding to "my_definition_id"
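One example of metrics_df (first row shown below):
=============  ================  ===================
label_id(int)  instance_id(int)  visible_pixels(int)
=============  ================  ===================
2              2                 2231
=============  ================  ===================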
-
FILE_PATTERN
= '**/metrics_*.json'¶
-
TABLE_NAME
= 'metrics'¶
-
filter_metrics
(def_id)¶ Get all metrics filtered by a given metric definition id
- Parameters
def_id (str) – metric definition id used to filter results
- Raises
DefinitionIDError – raised if no metrics records match the given def_id
- Returns
A pandas dataframe (pd.DataFrame) with columns: “label_id”, “capture_id”, “annotation_id”, “sequence_id”, “step”
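As a rough illustration of filtering metric records by definition id, the sketch below flattens nested key-value pairs into rows using only the standard library. The record layout (`metric_definition`, `values`) and the function name `filter_metric_rows` are assumptions made for this example, not the library's actual schema or API. Where the library raises DefinitionIDError for an unknown id, this sketch simply returns an empty list.

```python
def filter_metric_rows(records, def_id):
    """Return one row per metric value belonging to the given definition id."""
    rows = []
    for record in records:
        if record["metric_definition"] != def_id:
            continue
        # Fields shared by every value in this record.
        shared = {key: record[key] for key in ("capture_id", "sequence_id", "step")}
        for value in record["values"]:
            rows.append({**value, **shared})
    return rows


# Toy records with placeholder ids.
records = [
    {
        "capture_id": "c1",
        "sequence_id": "s1",
        "step": 50,
        "metric_definition": "my_definition_id",
        "values": [{"label_id": 2, "instance_id": 2, "visible_pixels": 2231}],
    },
    {
        "capture_id": "c2",
        "sequence_id": "s1",
        "step": 51,
        "metric_definition": "another_definition_id",
        "values": [{"label_id": 3, "instance_id": 1, "visible_pixels": 10}],
    },
]
metric_rows = filter_metric_rows(records, "my_definition_id")
print(metric_rows[0]["visible_pixels"])
```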
-
datasetinsights.datasets.unity_perception.references¶
Load Synthetic dataset references tables
-
class
datasetinsights.datasets.unity_perception.references.
AnnotationDefinitions
(data_root, version='0.0.1')¶ Bases:
object
Load annotation_definitions table
For more detail, see schema design here: annotation_definitions.json
-
table
¶ a collection of annotation_definitions records
- Type
pd.DataFrame
-
FILE_PATTERN
= '**/annotation_definitions.json'¶
-
TABLE_NAME
= 'annotation_definitions'¶
-
find_by_name
(pattern)¶ Get the annotation definition by matching patterns
This method matches the given pattern against annotation definition names to determine which definition to return.
- Parameters
pattern (str) – the regex pattern
- Returns
a dictionary containing the annotation definition
- Return type
dict
- Raises
NoRecordError – If no annotation definition is found matching the given pattern
DuplicateRecordError – If more than one record is found by the given pattern
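The lookup-by-name behaviour and its two error cases can be sketched as follows. This is an illustrative assumption rather than the library's code: the exception classes are local stand-ins, the definitions are plain dictionaries instead of a dataframe, and the use of `re.search` (rather than, say, `re.match`) is a guess.

```python
import re


class NoRecordError(Exception):
    """Local stand-in: no record matched."""


class DuplicateRecordError(Exception):
    """Local stand-in: more than one record matched."""


def find_by_name(definitions, pattern):
    """Return the single definition whose name matches the regex pattern."""
    matches = [d for d in definitions if re.search(pattern, d["name"])]
    if not matches:
        raise NoRecordError(f"no definition name matches {pattern!r}")
    if len(matches) > 1:
        raise DuplicateRecordError(f"{len(matches)} definition names match {pattern!r}")
    return matches[0]


# Toy definitions with placeholder ids.
definitions = [
    {"id": "6716c", "name": "bounding box"},
    {"id": "a1b2c", "name": "semantic segmentation"},
]
found = find_by_name(definitions, r"^bounding")
print(found["id"])
```

Requiring exactly one match keeps the return type a single dictionary; an ambiguous pattern surfaces as an error instead of silently picking the first hit.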
-
get_definition
(def_id)¶ Get the annotation definition for a given definition id
- Parameters
def_id (int) – annotation definition id used to filter results
- Returns
a dictionary containing the annotation definition
- Return type
dict
- Raises
NoRecordError – If no annotation definition is found for the given definition id
-
load_annotation_definitions
(data_root, version)¶ Load annotation definition files.
For more detail, see schema design here: annotation_definitions.json
- Parameters
data_root (str) – the root directory of the dataset containing tables
version (str) – desired schema version
- Returns
A Pandas dataframe with annotation definition records. Columns: ‘id’ (annotation id), ‘name’ (annotation name), ‘description’ (string description), ‘format’ (string describing format), ‘spec’ (format-specific specification for the annotation values)
-
-
class
datasetinsights.datasets.unity_perception.references.
Egos
(data_root, version='0.0.1')¶ Bases:
object
Load egos table
For more detail, see schema design here: egos.json
-
table
¶ a collection of egos records
- Type
pd.DataFrame
-
FILE_PATTERN
= '**/egos.json'¶
-
TABLE_NAME
= 'egos'¶
-
load_egos
(data_root, version)¶ Load egos files. For more detail, see schema design here:
- Parameters
data_root (str) – the root directory of the dataset containing ego tables
version (str) – desired schema version
- Returns
A pandas dataframe with all ego records with two columns: id (ego id) and description
-
-
class
datasetinsights.datasets.unity_perception.references.
MetricDefinitions
(data_root, version='0.0.1')¶ Bases:
object
Load metric_definitions table
For more detail, see schema design here:
-
table
¶ a collection of metric_definitions records with columns: id (id for metric definition), name, description, spec (definition specific spec)
- Type
pd.DataFrame
-
FILE_PATTERN
= '**/metric_definitions.json'¶
-
TABLE_NAME
= 'metric_definitions'¶
-
get_definition
(def_id)¶ Get the metric definition for a given definition id
- Parameters
def_id (int) – metric definition id used to filter results
- Returns
a dictionary containing metric definition
-
load_metric_definitions
(data_root, version)¶ Load metric definition files.
- Parameters
data_root (str) – the root directory of the dataset containing tables
version (str) – desired schema version
- Returns
A Pandas dataframe with metric definition records. Columns: id (id for metric definition), name, description, spec (definition specific spec)
-
-
class
datasetinsights.datasets.unity_perception.references.
Sensors
(data_root, version='0.0.1')¶ Bases:
object
Load sensors table
For more detail, see schema design here:
-
table
¶ a collection of sensors records with columns: 'id' (sensor id), 'ego_id', 'modality' ({camera, lidar, radar, sonar,...} -- Sensor modality), 'description'
- Type
pd.DataFrame
-
FILE_PATTERN
= '**/sensors.json'¶
-
TABLE_NAME
= 'sensors'¶
-
load_sensors
(data_root, version)¶ Load sensors files.
For more detail, see schema design here:
- Parameters
data_root (str) – the root directory of the dataset containing tables
version (str) – desired schema version
- Returns
A pandas dataframe with all sensors records with columns: ‘id’ (sensor id), ‘ego_id’, ‘modality’ ({camera, lidar, radar, sonar,…} – Sensor modality), ‘description’
-
datasetinsights.datasets.unity_perception.tables¶
-
class
datasetinsights.datasets.unity_perception.tables.
FileType
(value)¶ Bases:
enum.Enum
An enumeration.
-
BINARY
= 'binary'¶
-
CAPTURE
= 'capture'¶
-
METRIC
= 'metric'¶
-
REFERENCE
= 'reference'¶
-
-
class
datasetinsights.datasets.unity_perception.tables.
Table
(file, pattern, filetype)¶ Bases:
tuple
-
property
file
¶ Alias for field number 0
-
property
filetype
¶ Alias for field number 2
-
property
pattern
¶ Alias for field number 1
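The field numbering above (file = 0, pattern = 1, filetype = 2) is what a named tuple with that field order produces. The sketch below re-creates both types locally for illustration; the example `file` value is a hypothetical placeholder, and the real module may define these types differently.

```python
from collections import namedtuple
from enum import Enum


class FileType(Enum):
    BINARY = "binary"
    CAPTURE = "capture"
    METRIC = "metric"
    REFERENCE = "reference"


# Field order determines the positional aliases: file=0, pattern=1, filetype=2.
Table = namedtuple("Table", ["file", "pattern", "filetype"])

# Hypothetical entry describing where capture files live.
captures_table = Table("captures", "**/captures_*.json", FileType.CAPTURE)

print(captures_table.pattern == captures_table[1])
print(FileType("metric") is FileType.METRIC)
```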
-
datasetinsights.datasets.unity_perception.tables.
glob
(data_root, pattern)¶ Find all matching files in a directory.
- Parameters
data_root (str) – directory containing capture files
pattern (str) – Unix file pattern
- Yields
str – matched filenames in a directory
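A generator with this shape can be sketched with `pathlib`. The helper name `glob_files` is hypothetical and the library's own function may be implemented differently; the point is only to show how a recursive pattern such as `**/captures_*.json` selects files below a root directory.

```python
import tempfile
from pathlib import Path


def glob_files(data_root, pattern):
    """Yield paths under data_root that match a Unix-style pattern."""
    for path in Path(data_root).glob(pattern):
        yield str(path)


# Build a throwaway directory tree to search.
with tempfile.TemporaryDirectory() as root:
    nested = Path(root) / "Dataset"
    nested.mkdir()
    (nested / "captures_000.json").write_text("{}")
    (nested / "metrics_000.json").write_text("{}")
    matched = sorted(Path(p).name for p in glob_files(root, "**/captures_*.json"))

print(matched)
```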
-
datasetinsights.datasets.unity_perception.tables.
load_table
(json_file, table_name, version, **kwargs)¶ Load records from json files into a pandas table
- Parameters
json_file (str) – filename to json.
table_name (str) – table name in the json file to be loaded
version (str) – requested version of this table
**kwargs – arbitrary keyword arguments to be passed to pandas’ json_normalize method.
- Returns
a pandas dataframe of the loaded table.
- Raises
VersionError – If the version in json file does not match the requested version.
datasetinsights.datasets.unity_perception.validation¶
Validate Simulation Data
-
exception
datasetinsights.datasets.unity_perception.validation.
DuplicateRecordError
¶ Bases:
Exception
Raised when the definition file has duplicate definition ids
-
exception
datasetinsights.datasets.unity_perception.validation.
NoRecordError
¶ Bases:
Exception
Raised when no record is found matching a given definition id
-
exception
datasetinsights.datasets.unity_perception.validation.
VersionError
¶ Bases:
Exception
Raised when the data file version does not match the requested version
-
datasetinsights.datasets.unity_perception.validation.
check_duplicate_records
(table, column, table_name)¶ Check if table has duplicate records for a given column
- Parameters
table (pd.DataFrame) – a pandas dataframe
column (str) – the column where no duplication is allowed
table_name (str) – table name
- Raises
DuplicateRecordError – If duplicate records are found in a column
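A duplicate check of this kind can be sketched without pandas by counting values in the chosen column. The function name `check_duplicates`, the list-of-dicts input, and the local exception class are assumptions for illustration; the library version operates on a dataframe.

```python
from collections import Counter


class DuplicateRecordError(Exception):
    """Local stand-in for the library exception of the same name."""


def check_duplicates(records, column, table_name):
    """Raise DuplicateRecordError if any value in `column` occurs more than once."""
    counts = Counter(record[column] for record in records)
    duplicated = sorted(value for value, count in counts.items() if count > 1)
    if duplicated:
        raise DuplicateRecordError(
            f"table {table_name!r} has duplicate values in {column!r}: {duplicated}"
        )


unique = [{"id": "a"}, {"id": "b"}]
check_duplicates(unique, "id", "sensors")  # unique ids: passes silently

try:
    check_duplicates(unique + [{"id": "a"}], "id", "sensors")
    message = ""
except DuplicateRecordError as err:
    message = str(err)
    print(message)
```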
-
datasetinsights.datasets.unity_perception.validation.
verify_version
(json_data, version)¶ Verify json schema version
- Parameters
json_data (json) – a json object loaded from file.
version (str) – string of the requested version.
- Raises
VersionError – If the version in json file does not match the requested version.
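The version check can be pictured as a comparison between a version field in the loaded JSON and the requested string. The sketch below assumes the JSON object carries a top-level `"version"` key, which is an assumption about the file layout, and uses a local stand-in for the exception class.

```python
import json


class VersionError(Exception):
    """Local stand-in for the library exception of the same name."""


def verify_version(json_data, version):
    """Raise VersionError unless json_data declares the requested version."""
    found = json_data.get("version")  # assumed top-level key
    if found != version:
        raise VersionError(f"requested version {version!r}, but file has {found!r}")


json_data = json.loads('{"version": "0.0.1", "captures": []}')
verify_version(json_data, "0.0.1")  # matching version: no exception

try:
    verify_version(json_data, "0.0.2")
    mismatch_detected = False
except VersionError:
    mismatch_detected = True

print(mismatch_detected)
```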
-
class
datasetinsights.datasets.unity_perception.
AnnotationDefinitions
(data_root, version='0.0.1')¶ Bases:
object
Load annotation_definitions table
For more detail, see schema design here: annotation_definitions.json
-
table
¶ a collection of annotation_definitions records
- Type
pd.DataFrame
-
FILE_PATTERN
= '**/annotation_definitions.json'¶
-
TABLE_NAME
= 'annotation_definitions'¶
-
find_by_name
(pattern)¶ Get the annotation definition by matching patterns
This method matches the given pattern against annotation definition names to determine which definition to return.
- Parameters
pattern (str) – the regex pattern
- Returns
a dictionary containing the annotation definition
- Return type
dict
- Raises
NoRecordError – If no annotation definition is found matching the given pattern
DuplicateRecordError – If more than one record is found by the given pattern
-
get_definition
(def_id)¶ Get the annotation definition for a given definition id
- Parameters
def_id (int) – annotation definition id used to filter results
- Returns
a dictionary containing the annotation definition
- Return type
dict
- Raises
NoRecordError – If no annotation definition is found for the given definition id
-
load_annotation_definitions
(data_root, version)¶ Load annotation definition files.
For more detail, see schema design here: annotation_definitions.json
- Parameters
data_root (str) – the root directory of the dataset containing tables
version (str) – desired schema version
- Returns
A Pandas dataframe with annotation definition records. Columns: ‘id’ (annotation id), ‘name’ (annotation name), ‘description’ (string description), ‘format’ (string describing format), ‘spec’ (format-specific specification for the annotation values)
-
-
class
datasetinsights.datasets.unity_perception.
Captures
(data_root='/data', version='0.0.1')¶ Bases:
object
Load captures table
A capture record stores the relationship between a captured file, a collection of annotations, and extra metadata that describes this capture. For more detail, see schema design here:
Examples:
>>> captures = Captures(data_root="/data")
>>> # The Captures class automatically loads the captures (e.g. lidar scan, image, depth map)
>>> # and the annotations (e.g. semantic segmentation labels, bounding boxes, etc.).
>>> data = captures.filter(def_id="6716c783-1c0e-44ae-b1b5-7f068454b66e")
>>> # Returns the captures and annotations filtered by the annotation definition id.
-
captures
¶ a collection of captures without annotations
- Type
pd.DataFrame
-
annotations
¶ a collection of annotations
- Type
pd.DataFrame
-
FILE_PATTERN
= '**/captures_*.json'¶
-
TABLE_NAME
= 'captures'¶
-
filter
(def_id)¶ Get captures and annotations filtered by annotation definition id
- Parameters
def_id (str) – annotation definition id used to filter results
- Returns
A pandas dataframe with captures and annotations. Columns: ‘id’ (capture id), ‘sequence_id’, ‘step’, ‘timestamp’, ‘sensor’, ‘ego’, ‘filename’, ‘format’, ‘annotation.id’, ‘annotation.annotation_definition’, ‘annotation.values’
- Raises
DefinitionIDError – Raised if none of the annotation records in the combined annotation and captures dataframe match the def_id specified as a parameter.
Example returned dataframe (first row, shown as one column per line):
======================================  =====
Column (type)                           Value
======================================  =====
label_id (int)                          2
sequence_id (str)                       None
step (int)                              50
timestamp (float)                       4.9
sensor (dict)                           {'sensor_id': 'dDa873b…', 'ego_id': '44ca9…', 'modality': 'camera', 'translation': [0.0, 0.0, 0.0], 'rotation': [0.0, 0.0, 0.0, 1.0], 'scale': 0.344577253}
ego (dict)                              …
filename (str)                          RGB3/asd.png
format (str)                            PNG
annotation.id (str)                     ace0
annotation.annotation_definition (str)  6716c
annotation.values                       [{'label_id': 34, 'label_name': 'snack_chips_pringles', … 'height': 118.0}, {'label_id': 35, … 'height': 91.0} …]
======================================  =====
-
-
class
datasetinsights.datasets.unity_perception.
Egos
(data_root, version='0.0.1')¶ Bases:
object
Load egos table
For more detail, see schema design here: egos.json
-
table
¶ a collection of egos records
- Type
pd.DataFrame
-
FILE_PATTERN
= '**/egos.json'¶
-
TABLE_NAME
= 'egos'¶
-
load_egos
(data_root, version)¶ Load egos files. For more detail, see schema design here:
- Parameters
data_root (str) – the root directory of the dataset containing ego tables
version (str) – desired schema version
- Returns
A pandas dataframe with all ego records with two columns: id (ego id) and description
-
-
class
datasetinsights.datasets.unity_perception.
MetricDefinitions
(data_root, version='0.0.1')¶ Bases:
object
Load metric_definitions table
For more detail, see schema design here:
-
table
¶ a collection of metric_definitions records with columns: id (id for metric definition), name, description, spec (definition specific spec)
- Type
pd.DataFrame
-
FILE_PATTERN
= '**/metric_definitions.json'¶
-
TABLE_NAME
= 'metric_definitions'¶
-
get_definition
(def_id)¶ Get the metric definition for a given definition id
- Parameters
def_id (int) – metric definition id used to filter results
- Returns
a dictionary containing metric definition
-
load_metric_definitions
(data_root, version)¶ Load metric definition files.
- Parameters
data_root (str) – the root directory of the dataset containing tables
version (str) – desired schema version
- Returns
A Pandas dataframe with metric definition records. Columns: id (id for metric definition), name, description, spec (definition specific spec)
-
-
class
datasetinsights.datasets.unity_perception.
Metrics
(data_root='/data', version='0.0.1')¶ Bases:
object
Load metrics table
Metrics store extra metadata that can be used to describe a particular sequence, capture or annotation. Metric records are stored as an arbitrary number (M) of key-value pairs. For more detail, see schema design doc: metrics
-
metrics
¶ a collection of metrics records
- Type
dask.bag.core.Bag
Examples
>>> metrics = Metrics(data_root="/data")
>>> metrics_df = metrics.filter_metrics(def_id="my_definition_id")
>>> # metrics_df now contains all the metrics data corresponding to "my_definition_id"
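One example of metrics_df (first row shown below):
=============  ================  ===================
label_id(int)  instance_id(int)  visible_pixels(int)
=============  ================  ===================
2              2                 2231
=============  ================  ===================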
-
FILE_PATTERN
= '**/metrics_*.json'¶
-
TABLE_NAME
= 'metrics'¶
-
filter_metrics
(def_id)¶ Get all metrics filtered by a given metric definition id
- Parameters
def_id (str) – metric definition id used to filter results
- Raises
DefinitionIDError – raised if no metrics records match the given def_id
- Returns
A pandas dataframe (pd.DataFrame) with columns: “label_id”, “capture_id”, “annotation_id”, “sequence_id”, “step”
-
-
class
datasetinsights.datasets.unity_perception.
Sensors
(data_root, version='0.0.1')¶ Bases:
object
Load sensors table
For more detail, see schema design here:
-
table
¶ a collection of sensors records with columns: 'id' (sensor id), 'ego_id', 'modality' ({camera, lidar, radar, sonar,...} -- Sensor modality), 'description'
- Type
pd.DataFrame
-
FILE_PATTERN
= '**/sensors.json'¶
-
TABLE_NAME
= 'sensors'¶
-
load_sensors
(data_root, version)¶ Load sensors files.
For more detail, see schema design here:
- Parameters
data_root (str) – the root directory of the dataset containing tables
version (str) – desired schema version
- Returns
A pandas dataframe with all sensors records with columns: ‘id’ (sensor id), ‘ego_id’, ‘modality’ ({camera, lidar, radar, sonar,…} – Sensor modality), ‘description’
-