National Institute of Standards and Technology
Manufacturing Engineering Laboratory
ISD Research Areas

 


System, Component, and Operationally-Relevant Evaluations (SCORE)

SCORE (System, Component, and Operationally-Relevant Evaluations) is a unified set of criteria and software tools for defining a performance evaluation approach for complex intelligent systems. It provides a comprehensive evaluation blueprint that assesses the technical performance of a system and its components by isolating and varying individual variables, and that captures the end-user utility of the system in realistic use-case environments.

SCORE is unique in that:

  • It is applicable to a wide range of technologies, from manufacturing to defense systems
  • Elements of SCORE can be decoupled and customized based upon evaluation goals
  • It has the ability to evaluate a technology at various stages of development, from conceptual to full maturation
  • It combines the results of targeted evaluations to produce an extensive picture of a system’s capabilities and utility

Intelligent systems tend to be complex and non-deterministic, involving numerous components that work together to accomplish an overall goal. Existing approaches to measuring such systems often focus on evaluating the system as a whole, or on evaluating some of its components individually under very controlled, but limited, conditions. These approaches do not comprehensively and quantitatively assess the impact of environmental variables (e.g., lighting, external distances) and system variables (e.g., processing power, memory size) on the system’s overall performance. Through its comprehensive evaluation criteria and software tools, the SCORE framework has greatly enhanced the ability to evaluate intelligent systems quantitatively and qualitatively, at both the component level and the system level, in operationally relevant environments.
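As an illustration of what isolating and varying a single variable can look like in practice, the following is a minimal sketch and not part of the SCORE software tools. The function `toy_system_score`, its scoring formula, the baseline values, and the helper names `sweep` and `effect_range` are all hypothetical; a real evaluation would replace the toy function with an actual run of the system under test.

```python
def toy_system_score(lighting_lux, distance_m, memory_mb):
    """Hypothetical stand-in for one evaluation run; returns a score in [0, 1]."""
    light_term = min(lighting_lux / 500.0, 1.0)        # environmental variable
    distance_term = max(0.0, 1.0 - distance_m / 20.0)  # environmental variable
    memory_term = min(memory_mb / 1024.0, 1.0)         # system variable
    return round(light_term * distance_term * memory_term, 3)


# Assumed baseline condition: every variable except the one under study stays here.
BASELINE = {"lighting_lux": 500, "distance_m": 5, "memory_mb": 1024}


def sweep(variable, values, run=toy_system_score, baseline=BASELINE):
    """Vary one variable while holding all others at the baseline."""
    results = {}
    for value in values:
        condition = dict(baseline)   # copy so the baseline is never mutated
        condition[variable] = value
        results[value] = run(**condition)
    return results


def effect_range(results):
    """Spread between the best and worst score observed in one sweep."""
    return round(max(results.values()) - min(results.values()), 3)


lighting_results = sweep("lighting_lux", [100, 250, 500])
print(lighting_results, effect_range(lighting_results))
```

Comparing `effect_range` across sweeps of different variables gives a rough, quantitative indication of which variable the overall score is most sensitive to under the assumed baseline.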

Applications
SCORE was initially applied to intelligent systems developed under the DARPA (Defense Advanced Research Projects Agency) ASSIST and TRANSTAC programs. The SCORE-based evaluations provided researchers and end users with the information they needed to determine if and when the technology would be ready to be put to use. SCORE allowed developers to identify the key components of a system and evaluate them both independently and as a whole, which helped determine the impact of the individual components on the performance of the overall system. This detailed analysis made it possible to target more accurately the aspects of the systems shown to provide the greatest benefit to the overall advancement of the technology, and therefore helped to identify where program funding should be applied to get the most “bang for the buck.”

Framework

Figure: SCORE Framework

Design Elements
Factors that must be considered when planning a technology evaluation, and that are driven by system development status and program goals, include:

  • Identification of the system or component to be assessed
  • Definition of the goal(s), objective(s), and metrics/measures
  • Specification of the testing environment (system maturity and intended use, physical environment factors, site suitability, and availability)
  • Identification of participants (system users and actors)
  • Specification of required participant training
  • Specification of data collection methods
  • Specification of the use-scenarios that will challenge the system under evaluation
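
One way to keep track of these design elements while planning an evaluation is to record them in a single structure and check which ones are still unspecified. The sketch below is only an illustration under that assumption; the class `EvaluationPlan`, its field names, and the example values are hypothetical and do not come from the SCORE tools.

```python
from dataclasses import dataclass, field, fields


@dataclass
class EvaluationPlan:
    """Hypothetical record with one field per design element listed above."""
    system_under_test: str = ""
    goals_and_metrics: list = field(default_factory=list)
    test_environment: str = ""
    participants: list = field(default_factory=list)
    participant_training: str = ""
    data_collection_methods: list = field(default_factory=list)
    use_scenarios: list = field(default_factory=list)

    def missing_elements(self):
        """Return the names of design elements that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example values are invented for illustration only.
plan = EvaluationPlan(
    system_under_test="speech-to-speech translation system",
    use_scenarios=["vehicle checkpoint dialogue"],
)
print(plan.missing_elements())
```

A plan would be considered ready for review once `missing_elements()` returns an empty list.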

 


 

 

 


The SCORE framework has proven to be widely applicable and equally relevant to technologies ranging from manufacturing to military systems. It has been applied to the evaluation of technologies in DARPA programs that range from soldier-worn sensor technology that enhances the battlefield awareness of soldiers on patrol to speech-to-speech translation systems that automatically translate between English and Arabic spoken utterances. It is also currently being applied to assessing the control of autonomous vehicles on a shop floor. The SCORE framework has been applied to eight week-long evaluations (involving over 60 personnel at each evaluation) assessing the performance of technologies developed by twelve independent research teams under these two DARPA programs, yielding results at the level of detail described throughout this write-up. Additionally, SCORE is being applied to the Virtual Manufacturing Automation Competition (VMAC).

Demonstrated Impact/Accomplishments
The impact of this work has been far-reaching and substantial, as shown by the following:

  • The SCORE framework has been adopted by multiple programs within DARPA, which has greatly enhanced their ability to evaluate intelligent systems quantitatively and qualitatively at multiple levels.
  • The approaches used in SCORE are starting to redefine the way that performance evaluation is carried out on intelligent systems. As a result of the DARPA evaluations, the SCORE Evaluation Team has been asked to advise other programs on how to apply the techniques for their purposes.
  • Research teams are starting to use the SCORE evaluation approach to evaluate their own systems. One researcher stated, “We switched to NIST’s evaluation procedures because we found them superior to our own.”

 

isd-webmaster@cme.nist.gov
Date created: 11/14/2008
Last updated: 02/26/2009