Pre-annotations
Pre-annotations have many uses in ground-truth production. The pre-annotations feature lets you specify information about objects that are already known to be present in an input. Please reach out to our Advisory Services team to see how pre-annotations can best be used for your use case.
The Kognic platform supports uploading pre-annotations in the OpenLabel format using the kognic-openlabel package.
Three steps are needed to create pre-annotations in the Kognic platform:
- Create a scene by uploading all the needed data
- Upload an OpenLabel annotation as a pre-annotation
- Create an input from the scene
Note that these steps can be performed in one call with the create_inputs function; see Creating Multiple Inputs With One Call.
Start by creating a scene:
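A minimal sketch, assuming a single-frame camera scene created with kognic-io; the external id, file name and sensor name are placeholders, and the model classes for other scene types (sequences, lidar) are documented in kognic-io:

```python
from kognic.io.client import KognicIOClient
import kognic.io.model.scene.cameras as CamerasModel
from kognic.io.model.scene.resources import Image

# Reads credentials from the KOGNIC_CREDENTIALS environment variable
client = KognicIOClient()

scene = CamerasModel.Cameras(
    external_id="my-scene",  # placeholder external id
    frame=CamerasModel.Frame(
        images=[Image(filename="image.jpg", sensor_name="CAM")]  # placeholders
    ),
)

# dryrun=True validates the request without creating anything
response = client.cameras.create(scene, dryrun=False)
scene_uuid = response.scene_uuid  # attribute name may differ between SDK versions
```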
Note that you now have to wait for the scene to be created before you can proceed to the next step. More information about this can be found in Waiting for Scene Creation.
The pre-annotation can be uploaded to the Kognic platform once the scene has been created successfully.
Load your OpenLabel annotation according to the documentation in kognic-openlabel and upload it to the Kognic platform as such:
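A sketch of this step, assuming the pre-annotation is stored as an OpenLabel JSON file on disk; the file name is a placeholder, and the parsing call may differ depending on your kognic-openlabel and pydantic versions:

```python
import json
from kognic.openlabel.models import OpenLabelAnnotation

# Parse the OpenLabel JSON into the kognic-openlabel model
# (parse_obj is pydantic v1 style; newer versions use model_validate)
with open("pre_annotation.json") as f:
    pre_annotation = OpenLabelAnnotation.parse_obj(json.load(f))

# Attach the pre-annotation to the scene created earlier
client.pre_annotation.create(
    scene_uuid=scene_uuid,
    pre_annotation=pre_annotation,
    dryrun=False,
)
```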
When the scene and pre-annotation have been successfully created, the input can be created. This will add it to the latest open batch in a project, or to a specific batch if one is specified, and it will be ready for annotation with the pre-annotation present.
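A sketch of this final step, assuming the scene was created via client.cameras as above; the project identifier is a placeholder, and the exact keyword arguments (e.g. for selecting a batch or annotation types) are documented in kognic-io:

```python
# Create an input from the scene; it ends up in the latest open batch
# of the project unless a specific batch is given
client.cameras.create_from_scene(
    scene_uuid=scene_uuid,
    project="my-project",  # placeholder project identifier
)
```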
Pre-annotations use the OpenLabel format and schema, but not all OpenLabel features are supported.
The following features, or combinations of features, are currently unsupported or only partially supported:
- Static geometries: not supported
- These are bounding boxes, cuboids, etc. declared in the OpenLabel under objects.*.object_data
- Geometry-specific attributes: not supported on 3D geometry
- These are attributes declared in the OpenLabel on a single geometric shape, in other words an attribute that only applies to the object as seen by one sensor; a common example is occlusion which is recorded separately for each camera.
- May also be referred to as source-, stream- or sensor-specific attributes.
- 3D geometry is anything that can be drawn when annotating a pointcloud, e.g. cuboids.
- Geometry-specific attributes are permitted on 2D geometry e.g. bounding boxes
- Note that the task definition must designate a property as source-specific before it may be used in this way.
- The stream attribute is a special case and is exempt from this rule.
- Objects cannot have multiple 3D geometries in the same frame
Name | OpenLabel field | Description | Attributes |
---|---|---|---|
Cuboid | cuboid | Cuboid in 3D | - |
Bounding box | bbox | Bounding box in 2D | - |
3D line | poly3d | Line in 3D. Append the first point at the end if you want it to be closed. | - |
Polygon | poly2d | Polygon in 2D | is_hole |
Multi-polygon | poly2d | Multi-polygon in 2D | is_hole & polygon_id |
Curve | poly2d | Curve or line in 2D | interpolation_method |
2D point | point2d | Point in 2D | - |
Group of 2D points | point2d | Group of points in 2D | point_class |
3D Semantic Segmentation | binary | Semantic segmentation of a point cloud | - |
3D Instance Segmentation | binary | Instance segmentation of a point cloud | - |
Note that all geometries should be specified under frames rather than in the root of the pre-annotation. 3D geometries should be expressed in the lidar coordinate system in the single-lidar case, but in the reference coordinate system in the multi-lidar case. The rotation of cuboids should be the same as that in exports. 2D geometries should be expressed in pixel coordinates. See Coordinate Systems for more information.
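As a sketch of this layout, here is a single object with one cuboid in frame 0, written as a Python dict; the UUID, object type, stream name and all numeric values are placeholders, and the rotation convention should match that of Kognic exports:

```python
pre_annotation_sketch = {
    "openlabel": {
        "metadata": {"schema_version": "1.0.0"},
        "objects": {
            # Object declared at the root, but without any geometry here
            "8e27a5a6-0000-0000-0000-000000000000": {"name": "object-1", "type": "Car"},
        },
        "frames": {
            "0": {  # frame key: a timestamp from the scene (0 in the static case)
                "objects": {
                    "8e27a5a6-0000-0000-0000-000000000000": {
                        "object_data": {
                            "cuboid": [{
                                "name": "the-cuboid",
                                # x, y, z, rotation (quaternion), sx, sy, sz in the
                                # lidar (single-lidar) or reference (multi-lidar) frame
                                "val": [10.0, 2.0, 0.5, 0.0, 0.0, 0.0, 1.0, 4.5, 1.9, 1.6],
                                "attributes": {
                                    "text": [{"name": "stream", "val": "lidar"}],
                                },
                            }],
                        },
                    },
                },
            },
        },
    },
}
```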
For attributes, the following types are supported:
- Text
- Num
- Boolean
For 2D geometry, attributes may be specified as geometry-specific (also known as source- or sensor-specific) or object-specific. Attributes can be static (specified under the objects key) or dynamic (specified in the object_data for the object in the frame) and must be allowed by the task definition, if one exists. Geometry-specific attributes (those which appear on a single shape within frames) must also be declared as such in the task definition; arbitrary properties cannot be used in a source-specific way.
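A sketch of the distinction, showing a static object-level attribute and a dynamic, geometry-specific attribute on a 2D bounding box; the attribute names "color" and "occlusion" are hypothetical and must match your task definition:

```python
# Would sit under the root "objects" key:
# a static, object-level attribute
openlabel_objects = {
    "1232b4f4-0000-0000-0000-000000000000": {  # placeholder UUID
        "name": "object-1",
        "type": "Car",
        "object_data": {
            "text": [{"name": "color", "val": "red"}]  # hypothetical static attribute
        },
    },
}

# Would sit under frames.<timestamp>.objects:
# a dynamic attribute on a single shape (geometry-specific)
openlabel_frame_objects = {
    "1232b4f4-0000-0000-0000-000000000000": {
        "object_data": {
            "bbox": [{
                "name": "the-bbox",
                "val": [100.0, 100.0, 50.0, 30.0],  # cx, cy, width, height in pixels
                "attributes": {
                    "text": [
                        {"name": "stream", "val": "CAM"},       # required stream property
                        {"name": "occlusion", "val": "light"},  # hypothetical source-specific attribute
                    ],
                },
            }],
        },
    },
}
```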
Context is used to define scene properties in the Kognic platform. Each context contains one property value.
Contexts come in four modes:
- Static global - The value is valid for all frames and sensors of the scene
- Dynamic global - The value is valid for all sensors, but may change across frames
- Static source specific - The value is valid for all frames, but only for one specific stream
- Dynamic source specific - The value is valid for only one specific stream and may change across frames
We will be referring to two types of context object:
- context object (or simply context) - defined under the root key contexts
- frame context object (or simply frame context) - defined in a frame
All contexts used in the OpenLabel have to be defined under the contexts key in the root of the OpenLabel. At Kognic, the key of a context is only used to reference it from other places in the OpenLabel (e.g. in a frame), and we set it to a string of incrementing numbers.
The context type corresponds to the property name in the Kognic app, and the value of the property is set in the attributes of the context.
If a source-specific property should exist in multiple sensors, e.g. each camera sensor has a boolean property "sees car", then there will be one context object per sensor, all with the same context type.
For contexts we support the following attribute types:
- text
- num
- boolean
- vec - only string vecs
Context Mode | Has stream attribute in context | Has non-stream attribute in context | Has attribute in frame context | Has more than 1 non-stream attribute in context | Has more than 1 attribute in frame context |
---|---|---|---|---|---|
Static Global | ❌ | ✅ | ❌ | ❌ | ❌ |
Dynamic Global | ❌ | ❌ | ✅ | ❌ | ❌ |
Static Source Specific | ✅ | ✅ | ❌ | ❌ | ❌ |
Dynamic Source Specific | ✅ | ❌ | ✅ | ❌ | ❌ |
See the examples below.
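A sketch of a static global context and a dynamic source-specific context, consistent with the table above; the property names, sensor name and the value attribute name ("value") are assumptions:

```python
contexts_sketch = {
    "openlabel": {
        "metadata": {"schema_version": "1.0.0"},
        "contexts": {
            # Static global: one non-stream attribute directly on the context
            "0": {
                "name": "context-0",
                "type": "weather",  # property name in the Kognic app (placeholder)
                "context_data": {
                    "text": [{"name": "value", "val": "sunny"}]  # attribute name assumed
                },
            },
            # Dynamic source specific: only the stream attribute on the context;
            # the value itself is set per frame below
            "1": {
                "name": "context-1",
                "type": "sees_car",  # placeholder property name
                "context_data": {
                    "text": [{"name": "stream", "val": "CAM"}]
                },
            },
        },
        "frames": {
            "0": {
                "contexts": {
                    "1": {  # frame context referencing context "1" above
                        "context_data": {
                            "boolean": [{"name": "value", "val": True}]
                        },
                    },
                },
            },
        },
    },
}
```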
Every pre-annotation must contain frames with unique timestamps that are among the ones specified in the scene. The reason for this is that the timestamps are used to map the frame in the pre-annotation to the correct frame in the scene. In the static case, one frame should be used with timestamp 0.
Currently not supported. Contact Kognic if you need support for this or use regular attributes instead.
Every geometry must have the stream property specified. This property determines which stream (or sensor) the geometry appears in. It is important that the stream is among the ones specified in the scene and of the same type, for example camera or lidar.
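For example (a sketch; the sensor name is a placeholder and must match a stream in the scene):

```python
# Every geometry carries a "stream" text attribute naming its sensor
bbox_with_stream = {
    "name": "the-bbox",
    "val": [652.0, 381.0, 84.0, 48.0],  # cx, cy, width, height in pixels
    "attributes": {"text": [{"name": "stream", "val": "CAM"}]},
}
```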
Pre-annotations can be sparse, meaning that their objects or geometries do not need to be present in every frame. Instead, they can be present in a subset of frames and be interpolated in the frames in between. Utilizing this feature can speed up the annotation process significantly for sequences. Sparseness can be accomplished in two different ways: either by using object data pointers or by using the boolean property interpolated. The former is recommended in most cases since it leads to a more compact pre-annotation. The latter is useful when the pre-annotation is created from annotations exported from the Kognic platform.
Interpolation is done by linearly interpolating the geometry values between key frames. For 2D geometries this is done in pixel coordinates. For 3D geometries, the interpolation can be done in either the frame-local coordinate system or the world coordinate system (see Coordinate Systems). This is configured in the annotation instruction, so reach out to the Kognic team if you are unsure. Note that interpolation in the world coordinate system is recommended, but it requires that the scene contains ego poses.
In OpenLabel, object data pointers are used to create a specification for objects. For example, you can specify which attributes and geometries are used for specific objects, and in which frames these are present. If a geometry is specified in the object data pointer, it will be present in all frames that the object data pointer points to. If the geometry is not provided in some of these frames, it will be interpolated. Note that geometries must be provided for the first and last frame in the object data pointer; otherwise, the pre-annotation will be rejected.
One limitation is that a geometry must be in the same stream for all frames when using object data pointers. This is because interpolation is done in the stream coordinate system. If you need to use geometries of the same type in different streams, you can simply use different names for the geometries in the different streams.
Sparseness with Object Data Pointers shows an example of how to use object data pointers.
The boolean property interpolated can be used to specify that a geometry should be interpolated. Geometries are still required to be present in interpolated frames but their geometry values will be ignored. Note that interpolated geometries must have corresponding geometries (interpolated or not) in the first and last frame of the pre-annotation. Otherwise, the pre-annotation will be rejected.
Using the interpolated property is the recommended approach when the pre-annotation is created from annotations exported from the Kognic platform.
Sparseness with Interpolated Property shows an example of how to use the interpolated property.
Attributes are handled differently compared to geometries. If an attribute is not present in a frame, its last value will simply be used if the object (or geometry if the property is source-specific) is present in the frame. If the object is not present in the frame, the attribute will be ignored. Dense attributes will be sparsified automatically when the pre-annotation is uploaded to the Kognic platform.
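A schematic sketch of this carry-over rule; the attribute name and values are placeholders:

```python
OBJ = "1232b4f4-0000-0000-0000-000000000000"  # placeholder UUID

sparse_attribute_frames = {
    "0": {
        "objects": {
            OBJ: {
                "object_data": {
                    # Attribute set once here...
                    "text": [{"name": "color", "val": "red"}]
                },
            },
        },
    },
    "1": {
        "objects": {
            # ...object still present, so "color" keeps the value "red" from frame 0
            OBJ: {"object_data": {}},
        },
    },
}
```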
There are certain properties that can be set on an object to toggle various behaviors in the Kognic platform.
If an object and its geometries in the pre-annotation are already of sufficient quality, or should remain unchanged during use of the pre-annotation, you can mark the object as locked. The lock is put on the object level and will affect all of the object's geometries.
A stationary object is something that can move, but doesn't. A good example of this is a parked car. This is different from a static object, which can't move, such as a landmark.
Objects can be marked as stationary to enable certain platform features.
Below are examples of supported pre-annotations.
In the example below the object 1232b4f4-e3ca-446a-91cb-d8d403703df7 has a bounding box called the-bbox-name that is provided in frames 0 and 3. In frames 1 and 2, the bounding box will be interpolated.
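A hedged reconstruction of what such a pre-annotation could look like, written as a Python dict; the object type, sensor name and coordinates are placeholders:

```python
OBJ = "1232b4f4-e3ca-446a-91cb-d8d403703df7"

def bbox(cx, cy):
    # 2D bounding box (center x, center y, width, height) in pixels,
    # in the placeholder camera stream "CAM"
    return {
        "name": "the-bbox-name",
        "val": [cx, cy, 50.0, 30.0],
        "attributes": {"text": [{"name": "stream", "val": "CAM"}]},
    }

pointer_sketch = {
    "openlabel": {
        "metadata": {"schema_version": "1.0.0"},
        "objects": {
            OBJ: {
                "name": "object-1",
                "type": "Car",  # placeholder object type
                "object_data_pointers": {
                    # The pointer covers frames 0-3, so the bbox exists in all
                    # four frames even though values are only given for 0 and 3
                    "the-bbox-name": {
                        "type": "bbox",
                        "frame_intervals": [{"frame_start": 0, "frame_end": 3}],
                    },
                },
            },
        },
        "frames": {
            "0": {"objects": {OBJ: {"object_data": {"bbox": [bbox(100.0, 100.0)]}}}},
            "3": {"objects": {OBJ: {"object_data": {"bbox": [bbox(160.0, 100.0)]}}}},
        },
    },
}
```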
In the example below sparseness is determined using the interpolated property. The object 1232b4f4-e3ca-446a-91cb-d8d403703df7 has a bounding box for which the interpolated property is set to true in frames 1 and 2 but not in frames 0 and 3. The geometry values in frames 1 and 2 are ignored and instead interpolated from the geometry values in frames 0 and 3.
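A hedged sketch of the same object using the interpolated property instead; again, the object type, sensor name and coordinates are placeholders:

```python
OBJ = "1232b4f4-e3ca-446a-91cb-d8d403703df7"

def bbox(cx, cy, interpolated):
    # 2D bounding box carrying the boolean "interpolated" property
    return {
        "name": "the-bbox-name",
        "val": [cx, cy, 50.0, 30.0],
        "attributes": {
            "text": [{"name": "stream", "val": "CAM"}],
            "boolean": [{"name": "interpolated", "val": interpolated}],
        },
    }

frames_sketch = {
    # Frames 0 and 3 carry real values; the values in frames 1 and 2
    # are ignored and recomputed by interpolation
    "0": {"objects": {OBJ: {"object_data": {"bbox": [bbox(100.0, 100.0, False)]}}}},
    "1": {"objects": {OBJ: {"object_data": {"bbox": [bbox(0.0, 0.0, True)]}}}},
    "2": {"objects": {OBJ: {"object_data": {"bbox": [bbox(0.0, 0.0, True)]}}}},
    "3": {"objects": {OBJ: {"object_data": {"bbox": [bbox(160.0, 100.0, False)]}}}},
}
```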
When uploading pre-annotations for 3D segmentation tasks, the OpenLabel contains both the classifications and any instances that are present in the scene. The classifications are contained in a special object with the object type 3DPointcloudSegmentation. This object must contain a binary object data entry with the classifications encoded using RLE. Any instances should have an object entry with a classification_id value.
Kognic imposes a classification numbering scheme as follows:
Range | Meaning |
---|---|
0 | Unclassified |
1-255 | Semantic classification, e.g. Road, Building |
256-65535 | Instance classification, e.g. Car1, Car2, Pedestrian1, Pedestrian2 |
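A sketch of the object layout for a 3D segmentation pre-annotation; the UUIDs, binary entry name, RLE payload and the exact placement of classification_id (shown here as a num attribute) are assumptions, so consult the Kognic documentation for the precise encoding:

```python
segmentation_sketch = {
    "objects": {
        # Special object holding the RLE-encoded per-point classifications
        "11111111-1111-1111-1111-111111111111": {
            "name": "segmentation",
            "type": "3DPointcloudSegmentation",
            "object_data": {
                "binary": [{
                    "name": "semseg",  # placeholder name
                    "val": "<rle-encoded classification values>",  # placeholder payload
                    "encoding": "rle",
                    "data_type": "",
                }],
            },
        },
        # An instance, linked to the RLE payload via its classification id
        # (a value in the 256-65535 instance range)
        "22222222-2222-2222-2222-222222222222": {
            "name": "Car1",
            "type": "Car",  # placeholder class
            "object_data": {
                "num": [{"name": "classification_id", "val": 256}],
            },
        },
    },
}
```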