Expert Verification
The Expert Verification (EV) phase is part of the Kognic Standard Workflow, which aims to enable the delivery of annotations of sufficient quality on time and at the expected cost. In this phase, only annotated inputs that were previously reviewed by QMs in the Quality Review phase are selected for review. This enables comparison of the errors found by the LQM against those originally identified by the QM.

Goal

The phase's main goal is to ensure that Lead Quality Managers (LQMs) can verify that Quality Managers (QMs) are performing their duties adequately.

Users and actions

The phase contains two actions, review and correct, which should be configured to include different user groups.

QMs
To ensure the phase works correctly, QMs who review labels in the Quality Review phase must be configured with the correct action in the EV phase. This configuration ensures they are available for monitoring and can receive correction tasks for inputs they previously reviewed.

LQMs
LQMs responsible for QM performance in the request should be configured with the review action. They will receive review tasks that allow them to assess the quality of labels following the Quality Review phase's review.

Monitoring

The primary monitoring tool for the EV phase is the Quality Manager Sampling table. It contains basic user information (name, e-mail, organization) as well as some phase-specific metrics and controls.

R1 (Round 1) acceptance
The basic measurement of QM label quality is the percentage of their sampled reviewed annotations that are accepted after an LQM's first look in this phase.

Found errors
The found errors metric compares the errors found by the QM as a reviewer in the previous phase with the errors found by the LQM in the EV phase. It is shown as the percentage of errors found by the QM compared to the total errors found by both the QM and the LQM (in the EV phase). For example, if the QM found 8 errors in the Quality Review phase and the LQM found 2 additional errors in the EV phase, the value is 8 / 10 = 80 %. Note that only inputs that have been reviewed in both phases are considered in this metric.

Current review sample
The percentage and absolute number of a QM's reviewed annotations that have been reviewed by an LQM and used to calculate the R1 acceptance ratio. The absolute numbers help you determine whether the sample size is large or small in the context of this specific request.

Tooling

Various controls and tools are available to reviewers in the EV phase. These are mainly concerned with modifying how labels are sampled for review, depending on the users' performance and the overall state of the request.

Sampled review
Label quality is primarily estimated based on feedback items generated in review tasks performed by the LQMs. These items are aggregated, and various metrics are calculated and provided for the LQMs to evaluate and coach QMs based on their performance.

Sampling rate
To ensure that all users' performance is continuously evaluated, a sample of tasks is generated for all users in the phase, regardless of performance and other settings. This is called "baseline sampling" and ensures that at least one task is sampled for every annotator in the phase, so that metrics can be calculated for them. In the Quality Manager Sampling table, it is possible to configure custom sampling rates for each user annotating in the request. This allows LQMs to more thoroughly evaluate QMs who may currently not be supporting their annotators in producing labels of sufficient quality. It both allows for more accurate performance measurements and lets the LQM provide direct feedback on issues missed by the QM, who will then get a correction task where they can learn from the feedback while improving the labels before they move forward in the pipeline.
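For illustration, here is a minimal sketch of how a per-user sampling rate could combine with baseline sampling, assuming a rate between 0 and 1 per QM. The function name and data shapes are hypothetical; this is not Kognic's actual sampling implementation.

```python
# Sketch only: per-user sampling rate plus baseline sampling (assumed logic).
import random

def select_for_ev_review(input_ids: list[str], sampling_rate: float,
                         rng: random.Random | None = None) -> list[str]:
    """Pick a subset of a QM's reviewed inputs for LQM review.

    sampling_rate is the custom rate configured for the QM (0.0-1.0).
    Baseline sampling guarantees at least one task per user, so that
    metrics can always be calculated.
    """
    rng = rng or random.Random()
    sampled = [i for i in input_ids if rng.random() < sampling_rate]
    if not sampled and input_ids:  # baseline sampling: never zero tasks
        sampled = [rng.choice(input_ids)]
    return sampled

# Example: a QM with a 20 % sampling rate and 10 reviewed inputs.
print(select_for_ev_review([f"input-{n}" for n in range(10)], 0.2))
```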
Send non-selected inputs for review
Depending on the sampling rate configured for a QM at different times, only a subset of their labels may have been selected for review. In cases where performance is sub-par, it is possible to later create review tasks for all labels created by a specific user. This feature is available in the Annotator Sampling table, under the menu on the right. It is also possible to send individual non-selected inputs for review from the Phase Inputs table.

Gate
To allow for manual review selection of inputs, the gate is a mechanism that prevents inputs from moving out of the phase before adequate assessments of label quality can be made. Two actions can be taken in the gate controls, each with a different effect on currently waiting inputs as well as on future inputs arriving in the phase.

Open gate for non-sampled inputs
Opening the gate sends all currently waiting inputs forward to the next phase and causes any future inputs to be sent forward to the next phase without review.

Review everything
If this option is selected, review tasks are created for all currently waiting, non-selected inputs. Any future inputs are automatically selected for review and have tasks created. Once this option has been selected, sampling levels for individual users are no longer relevant, and it is no longer possible to change them.

Phase inputs
In the Phase Inputs table, you can see all inputs that are currently inside the phase. For each input, you can see when it changed workflow stage, how many tasks have been done on it in the current phase, and the state and type of any current task. The actions quick accept and quick reject are available for inputs with unstarted review tasks; you can read more about them here.

Edit summary
The edit summary is still under development and is therefore only available to a limited number of organizations. If you would like us to enable it for your organization, contact Kognic.

The edit summary provides insight into how the annotations entering the phase had to be adjusted to meet the reviewers' quality expectations. This is done by comparing an input's annotation when entering the phase to when it leaves the phase after being accepted in a review task. Note that quick-accepted inputs are not used in this comparison, and any edits made in the phase for these inputs are not included as edits in the edit summary.

Currently, you can investigate the edits in three different tables: added and removed objects, edited objects, and edited scene properties.

Added and removed objects
This table helps you understand how often specific objects were missed or incorrectly included, thus impacting the recall and precision of the annotations. The table contains the following information per class:

Added objects: The percentage of added objects seen in the review-accepted inputs' annotations compared to their content when entering the phase. The numbers in this column enable insights such as:
💡 "To ensure all relevant vehicles were annotated, we had to add 20 % more objects during review phase 1. We need to understand which objects were missed and how we can prevent that from happening in the future."

Removed objects: The percentage of removed objects seen in the review-accepted inputs' annotations compared to their content when entering the phase. The numbers in this column enable insights such as:
💡 "We seem to have removed 50 % of all reviewed obstacle objects. Has the team misunderstood what defines an obstacle?"

Objects after phase: The percentage difference in object count between the review-accepted inputs' annotations and their content at the start of the phase.

Object review ratio: Compares the initial count of objects from review-accepted inputs (at phase entry) to the total number of objects from all inputs (including those currently in review correction and review) that have entered the phase. This helps you understand the current review sample of a specific class.
⚠️ Note that the review ratio is based on objects that have entered the phase; objects still in earlier phases are excluded. This means that the ratio does not represent the total object sample rate until all objects have completed the phase.
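To make these column definitions concrete, the sketch below derives the four columns for a single class by comparing the set of object IDs at phase entry with the set at review acceptance. The data model (object IDs as strings, one class at a time) and the function name are assumptions for illustration; the real computation happens inside the Kognic platform.

```python
# Sketch only: "Added and removed objects" columns for one class (assumed data model).

def added_removed_summary(entry_ids: set[str], accepted_ids: set[str],
                          total_objects_entered: int) -> dict[str, float]:
    added = len(accepted_ids - entry_ids)      # objects missing at phase entry
    removed = len(entry_ids - accepted_ids)    # objects incorrectly included at entry
    entry_count = max(len(entry_ids), 1)       # avoid division by zero
    return {
        "added_objects_pct": 100 * added / entry_count,
        "removed_objects_pct": 100 * removed / entry_count,
        "objects_after_phase_pct": 100 * (len(accepted_ids) - len(entry_ids)) / entry_count,
        # Share of all objects that entered the phase covered by the
        # review-accepted sample (the object review ratio).
        "object_review_ratio_pct": 100 * len(entry_ids) / max(total_objects_entered, 1),
    }

# Example: 10 vehicles at entry, 2 added and 1 removed during review,
# out of 40 vehicle objects that have entered the phase so far.
entry = {f"veh-{n}" for n in range(10)}
accepted = (entry - {"veh-3"}) | {"veh-10", "veh-11"}
print(added_removed_summary(entry, accepted, total_objects_entered=40))
```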
Edited objects
This table helps you understand how often properties and geometries had to be edited to meet the reviewers' quality expectations. Property edit and sample rates are presented per property, while geometry edits are presented per geometry type and type of edit (e.g. 2D box position).

Edited objects: The percentage of objects for which the attribute was edited between the annotated input entering the phase and being accepted in review.
💡 "15 % of the reviewed 3D box objects have been resized; were the changes in size significant or just minor adjustments?"
💡 "35 % of the reviewed objects with the property 'age' had their property value changed. Is the definition of the different values unclear to the team?"

Object review ratio: The percentage of objects that are review-accepted compared to all objects in this phase. The value updates when inputs are reviewed and accepted, or when new inputs enter the phase.
⚠️ Note that the review ratio is based on objects that have entered the phase; objects still in earlier phases are excluded. This means that the ratio does not represent the total object attribute sample rate until all objects have completed the phase.

Edited scene properties
This table helps you understand how often individual scene properties had to be edited to meet the reviewers' quality expectations.

Edited inputs: The percentage of inputs where the scene property was edited between initial phase entry and review acceptance.
💡 "For 23 % of the reviewed inputs, the scene property 'weather' was edited. Were there any particular property values that the team members had a hard time distinguishing between?"

Object review ratio: The percentage of inputs that are review-accepted compared to all inputs in this phase. The value updates when inputs are reviewed and accepted, or when new inputs enter the phase.
⚠️ Note that the review ratio is based on inputs that have entered the phase; inputs still in earlier phases are excluded. This means that the ratio does not represent the total input property sample rate until all inputs have completed the phase.
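The per-property numbers can be reasoned about in the same way. Below is a minimal sketch of the "edited objects" idea for one property: the share of review-accepted objects whose value changed between phase entry and review acceptance. The dictionaries and function name are assumptions for illustration only, not the product code.

```python
# Sketch only: edit rate for one property across review-accepted objects.

def property_edit_rate(entry: dict[str, str], accepted: dict[str, str]) -> float:
    """entry/accepted map object ID -> property value (e.g. "age")
    for the review-accepted objects only."""
    shared = entry.keys() & accepted.keys()
    if not shared:
        return 0.0
    edited = sum(1 for oid in shared if entry[oid] != accepted[oid])
    return 100 * edited / len(shared)

# Example: one of four objects had its "age" value changed during review.
entry_age = {"obj-1": "adult", "obj-2": "child", "obj-3": "adult", "obj-4": "adult"}
accepted_age = {"obj-1": "adult", "obj-2": "adult", "obj-3": "adult", "obj-4": "adult"}
print(property_edit_rate(entry_age, accepted_age))  # 25.0
```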
Error summary
The error summary gives you insight into which issues reviewers have found and commented on during the phase's review tasks. It helps you understand both the most common and the less frequent identified issues. The error summary insights are based on feedback items written by the phase's reviewers; absolute numbers represent actual feedback items, not the edits made in response to the feedback. If you are interested in understanding how the annotations were edited based on the feedback, use the edit summary described in the section above.

No. of correction requests: The number of feedback items of the type "correction request". This is the sum of all errors shown in the "error type distribution" to the right.

No. of feedback items: The number of feedback items of the type "advice". These are excluded from the "error type distribution" and "suggested properties" charts.

Error type distribution: Shows the absolute count and relative share of all feedback items categorized as "correction requests", grouped by their error type.

Suggested properties: For items with the error type "properties", this shows the distribution of properties that were affected. Each error indicated as "properties" has a single property connected to it.
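As a worked example of how these numbers relate, the sketch below aggregates a handful of hypothetical feedback items into the correction-request count, the advice count, and the error type distribution. Field names such as "type" and "error_type" are assumptions, not the actual Kognic data schema.

```python
# Sketch only: deriving the error summary figures from raw feedback items.
from collections import Counter

feedback_items = [
    {"type": "correction_request", "error_type": "MissingObject"},
    {"type": "correction_request", "error_type": "Properties"},
    {"type": "correction_request", "error_type": "Properties"},
    {"type": "advice", "error_type": None},
]

corrections = [i for i in feedback_items if i["type"] == "correction_request"]
error_type_distribution = Counter(i["error_type"] for i in corrections)

print("No. of correction requests:", len(corrections))            # 3
print("No. of feedback items (advice):",
      sum(i["type"] == "advice" for i in feedback_items))          # 1
print("Error type distribution:", dict(error_type_distribution))  # {'MissingObject': 1, 'Properties': 2}
```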
Individual feedback items
This section helps you get an overview of all given feedback, to answer questions such as:

How detailed and critical is the feedback of my colleague reviewers?
Are the reviewers giving valid feedback given the current guideline?
What does feedback where the reviewer and annotator are discussing in the comments look like?
How does feedback of the type "MissingObject" look?
What type of feedback is marked as "invalid"?

The items are split up by their feedback type.

Correction requests
In this section, you see feedback items of the type correction request. These are things that the reviewer wants to have corrected before accepting the review. You can filter the feedback items by their resolved status, the error type, whether a discussion thread exists, or whether the overall review of the input has been accepted yet. Below is a description of the information available for each correction request.

Status: One of the following values.
Unresolved: The item was created by the reviewer and has not yet been approved.
Corrected: The annotator has fixed the issue mentioned in the correction request.
Resolved: The reviewer has approved the annotator's fix.
Invalid: An item can be marked as "invalid" by the user if they think it is not accurate with respect to the guidelines of the task. This can be due to mistakes in machine-generated feedback or from human reviewers.
An item can be "unresolved" even if the overall review was accepted, or the other way around.

Error type: The type of error that was selected in the correction request.

Suggested property: If the error type is "properties", this column shows which property and value was suggested by the reviewer.

Comment: Shows the description that the reviewer might have given.

Thread exists: Says "yes" if there was any reply to the item, i.e. a discussion thread has been started in relation to the item.

External scene ID: The scene ID of the reviewed annotation.

Current round: The review round in which the input of this feedback item currently is. All inputs start in round 1; with each rejected review, they progress one round forward.

Accepted review: Whether the overall review was accepted or not.

Feedback
In this section you see feedback items of the type advice. As the underlying data has less structure, the table has fewer columns and filtering options, but otherwise it looks the same as the one above.
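To summarize the columns described above, here is a hypothetical record type showing what a single correction-request row could contain. The field names and types are assumptions chosen to mirror the documentation, not an actual Kognic API model.

```python
# Sketch only: a hypothetical record mirroring the correction-request columns.
from dataclasses import dataclass
from typing import Optional

@dataclass
class CorrectionRequest:
    status: str                        # "unresolved" | "corrected" | "resolved" | "invalid"
    error_type: str                    # e.g. "MissingObject", "Properties"
    suggested_property: Optional[str]  # only set when error_type is "Properties"
    comment: str                       # reviewer's description, may be empty
    thread_exists: bool                # True if anyone replied to the item
    external_scene_id: str             # scene ID of the reviewed annotation
    current_round: int                 # starts at 1, +1 per rejected review
    accepted_review: bool              # whether the overall review was accepted

item = CorrectionRequest(
    status="corrected",
    error_type="Properties",
    suggested_property="age=adult",
    comment="This pedestrian should be annotated as an adult.",
    thread_exists=False,
    external_scene_id="scene-0042",
    current_round=2,
    accepted_review=False,
)
print(item.status, item.error_type)
```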