Figure 1 illustrates the general approach to test development and validation recommended by Alpine. Each task and its underlying subtasks are opportunities to evaluate validity. We will help you navigate the process by employing industry-supported methodologies and our considerable expertise. By recognizing how each task is related to and dependent on the others, and by deliberately considering the potential impact on validity during the execution of each task, Alpine can effectively guide and support organizations in the development and maintenance of their testing programs.

 

Figure 1. The recommended test development and validation cycle.

Validity

According to the Standards¹, "Validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of tests. Validity is, therefore, the most fundamental consideration in developing and evaluating tests."

Design Program (Purpose)

The initial and ongoing process to determine and document the testing program’s goals, audience, value propositions, architecture, and infrastructure.

Design Test

A structured process to determine and document a test’s defining characteristics such as intended and unintended interpretations and uses of the test scores, the examinee population, score reporting needs, test parameters, and the validation plan.

Analyze Domain

A structured process to define and document the knowledge, skills, and abilities relevant to the intended interpretation and use of the test scores.

Develop Blueprint/Test Specifications

The test content is defined and weighted based on information collected in the domain analysis and on professional judgment.
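
As a simple illustration of how blueprint weights become item counts, the Python sketch below uses hypothetical domain names, weights, and form length, with largest-remainder rounding so the per-domain counts always sum to the intended form length.

    # Hypothetical sketch: convert blueprint weights into per-domain item counts
    # using largest-remainder rounding so counts sum exactly to the form length.
    blueprint = {"Domain A": 0.40, "Domain B": 0.35, "Domain C": 0.25}  # assumed weights
    form_length = 120  # assumed total item count

    raw = {d: w * form_length for d, w in blueprint.items()}
    counts = {d: int(r) for d, r in raw.items()}
    shortfall = form_length - sum(counts.values())
    # Award leftover items to the domains with the largest fractional remainders.
    for d in sorted(raw, key=lambda d: raw[d] - counts[d], reverse=True)[:shortfall]:
        counts[d] += 1

    for domain, n in counts.items():
        print(f"{domain}: {n} items ({n / form_length:.0%})")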

Develop Content/Review Content

Test items and/or tasks are drafted, reviewed and revised, and ultimately approved for pre-testing, flagged for further review and revision, or rejected.

Pre-Test & Analyze

Items and/or tasks are administered on beta/pilot forms, or as pilot items on operational forms, to collect response data and evaluate the usefulness of the items/tasks based on statistical characteristics such as model fit, difficulty, and discrimination.
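
For example, classical item analysis reduces two of these characteristics to simple statistics: difficulty is the proportion of examinees answering an item correctly (its p-value), and discrimination can be estimated with a corrected point-biserial correlation between the item and the rest of the test score. The Python sketch below illustrates both on simulated 0/1 response data.

    import numpy as np

    # Sketch of classical item analysis on simulated dichotomous (0/1) responses.
    # Rows are examinees, columns are items; true difficulties are randomized.
    rng = np.random.default_rng(0)
    responses = (rng.random((500, 10)) < rng.uniform(0.3, 0.9, 10)).astype(int)

    total = responses.sum(axis=1)
    for i in range(responses.shape[1]):
        item = responses[:, i]
        difficulty = item.mean()                        # p-value: proportion correct
        rest = total - item                             # total score excluding this item
        discrimination = np.corrcoef(item, rest)[0, 1]  # corrected point-biserial
        print(f"item {i + 1}: p = {difficulty:.2f}, r_pb = {discrimination:.2f}")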

Assemble Operational Test

Items and/or tasks are assembled into one or more test forms against which test takers will be scored. The forms meet the blueprint specifications and are balanced for content and statistical characteristics such as difficulty, discrimination, test time, reliability, and standard error.
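
As an illustration of checking a form's statistical balance, the Python sketch below computes two of the characteristics named above on simulated 0/1 response data: internal-consistency reliability via Cronbach's alpha (equivalent to KR-20 for dichotomous items) and the classical standard error of measurement, SEM = SD * sqrt(1 - alpha).

    import numpy as np

    # Sketch: reliability and standard error for an assembled form,
    # using simulated dichotomous responses (rows = examinees, columns = items).
    rng = np.random.default_rng(1)
    form = (rng.random((500, 60)) < rng.uniform(0.4, 0.9, 60)).astype(int)

    total = form.sum(axis=1)
    k = form.shape[1]
    alpha = (k / (k - 1)) * (1 - form.var(axis=0, ddof=1).sum() / total.var(ddof=1))
    sem = total.std(ddof=1) * np.sqrt(1 - alpha)  # classical SEM in raw-score points

    print(f"mean item difficulty = {form.mean():.2f}")
    print(f"alpha = {alpha:.2f}, SEM = {sem:.2f}")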

Conduct Standard Setting

Performance standards are translated into one or more cut points on a test form.
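
One widely used approach is the modified Angoff method: each panelist estimates, for each item, the probability that a minimally qualified candidate would answer it correctly; averaging those ratings across panelists and summing across items yields a recommended cut score. A minimal Python sketch with hypothetical ratings:

    import numpy as np

    # Hypothetical modified-Angoff ratings: each value is a panelist's estimate
    # of the probability that a minimally qualified candidate answers correctly.
    # Rows are panelists, columns are items.
    ratings = np.array([
        [0.70, 0.55, 0.80, 0.60],
        [0.65, 0.60, 0.75, 0.55],
        [0.75, 0.50, 0.85, 0.65],
    ])

    cut_score = ratings.mean(axis=0).sum()  # mean rating per item, summed over items
    print(f"recommended cut score: {cut_score:.2f} of {ratings.shape[1]} points")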

Maintain Test

Once a test is developed and put into operational use, it requires ongoing care and attention to improve upon, or at a minimum maintain, its validity evidence.