In the current unprecedented times, credentialing programs have been challenged to quickly evaluate, prioritize, and adapt elements of their testing policies and procedures to allow for continued candidate engagement. The viability of alternate exam delivery approaches has been a particularly pressing question for programs in recent months, with many considering a rapid shift to remote proctoring to prevent large-scale interruptions to exam administration. This potential transition away from traditional testing centers to remote proctored delivery often raises psychometric, practical, and policy concerns for program stakeholders.

In the past few weeks, Alpine has conducted two interactive webinars led by Alpine psychometricians Brett Foley, Ph.D., and Corina Owens, Ph.D. These sessions addressed common psychometric topics that certification programs consider before implementing remote proctoring as a test delivery modality, along with frequently expressed apprehensions about it, including: fairness of the testing experience (e.g., access, administration approaches, accommodations), security of exam content and control over the testing environment, scalability, accreditation considerations, and candidate perceptions.

The majority of attendees were considering supplementing (or had already supplemented) their existing delivery mode with remote proctoring (54%); an additional 30% of attendees were considering both options. Only a small fraction of attendees (1%) indicated a complete shift to a solely remote-proctored solution in response to the pandemic. The decision to offer multiple modes including remote proctoring, or a single-vendor solution, should be driven by each certification program's particular business goals, constraints and requirements, and candidate population. Note that if a multi-delivery mode solution is selected, the certification program may need to establish different policies than those in place for existing delivery channels. It is essential that certification programs provide equivalent testing experiences regardless of delivery mode. This means the exam should be fair across modes, with similar expectations and functioning of the exam. The delivery procedures do not need to be identical; however, the exam experience should be as consistent and standardized as possible and should not be biased against any particular group of test takers (e.g., ADA accommodations may need adjustment so that all test takers have the option of remote proctoring).

A few key questions about their candidate population that certification programs need to address when deciding whether to move to remote proctoring are as follows:

  • What proportion of the candidate population has access to technology that meets the system requirements for a remote proctored exam?
  • Are there known bandwidth and/or connectivity issues for any subgroup of the candidate population?
  • Are there regional or cultural differences/factors that need to be considered and accounted for to ensure a fair testing experience?

For many programs, a multi-mode delivery solution offers the most flexibility for candidates while ensuring that all candidates have access to exam administration. If there is any concern within the certification program that certain candidates may be excluded from, or denied a fair and equivalent testing experience by, an offering limited to remote proctored delivery, then a single-delivery channel solution is not viable and multiple delivery modes should be available. Once a remote proctoring vendor is selected after thorough vetting and requirements for exam administration are set, those requirements should be openly shared with candidates so that individuals can make informed decisions about which delivery option best suits their needs.

In advance of the recent webinars, attendees indicated their biggest concern about implementing remote proctoring for their program. Security was overwhelmingly top-of-mind, with 78% of attendees citing it as their greatest apprehension about introducing the delivery modality. Other frequent sources of unease included consistency of experience, equivalence, and fairness (22%); access, capacity, and ease of use (17%); and the test delivery vendor's capacity for alternate item types (7%). Some programs were focused on how remote proctoring may affect their program's reputation and how to garner stakeholder buy-in (5%), proctor concerns (4%), cost (4%), and accreditation considerations (3%).

From a psychometric perspective, cutting across all of these topics is certification programs' most critical validity claim: that the intended uses and interpretations of the resulting test scores can be supported. The validity evidence underlying proper use of exam results should not be weakened by switching delivery modes. It is the responsibility of the certification program to assess remote proctoring options to ensure that equivalence of the candidate testing experience is possible and that the move to remote proctored delivery does not introduce factors irrelevant to the exam's ability to measure a candidate's knowledge, skills, and abilities. Existing exam administration policies may need to be evaluated and modified to ensure equivalence of the exam experience across delivery modes.

With regard to security in particular, test taker authentication and content theft are consistent concerns surrounding the switch to remote proctoring; these risks are just as systemic and pervasive in traditional testing modalities. As always, regardless of delivery mode, certification programs should engage in rapid and robust content development to build deep item banks that support the publication of multiple forms and regular form republication for content refresh. Programs should also conduct data forensic analyses at a routine cadence to gauge the extent of potential form piracy, item exposure problems, and unexpected candidate testing trends. Certification programs implementing remote proctoring could extend these analyses by test delivery modality, test center, or proctor to examine more specifically how delivery mode affects security within the program.
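As a concrete illustration, one simple data-forensic screen flags candidates who finish unusually quickly yet score very high, a common indicator of possible item pre-knowledge. The sketch below is a minimal, hypothetical example (the record fields, cutoffs, and logic are illustrative assumptions, not any program's actual methodology); operational forensics would add richer indices such as response-similarity and answer-change analyses.

```python
from statistics import mean, stdev

def flag_anomalies(records, z_cutoff=2.0, score_cutoff=0.85):
    """Flag candidates whose testing time is unusually short for a high score.

    records: list of dicts with 'candidate', 'mode', 'score' (proportion
    correct), and 'minutes' (total testing time).  A very fast completion
    paired with a high score is one common screen for possible
    pre-knowledge of exam content.
    """
    times = [r["minutes"] for r in records]
    mu, sigma = mean(times), stdev(times)
    flagged = []
    for r in records:
        z = (r["minutes"] - mu) / sigma  # how unusual is this testing time?
        if z < -z_cutoff and r["score"] >= score_cutoff:
            flagged.append(r["candidate"])
    return flagged
```

Because each record carries a `mode` field, the same screen can be run separately by delivery modality, test center, or proctor to compare anomaly rates across channels, as suggested above.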

When vetting potential remote proctoring vendors, it is important for certification programs to determine the requisite level of security for their exams and seek out a vendor whose prevention solutions align well with those needs. The techniques employed to reduce content theft, impede candidate fraud, and maintain proctor consistency should parallel those used in the program's previous or concurrently offered traditional delivery modalities. Additional elements may need to be controlled in the remote proctored environment, including balancing a candidate's privacy against regulation of the testing atmosphere. If logistical constraints require differences in mitigation approaches, programs should work to ensure the equivalence of the candidate experience to the extent possible; this may necessitate procedural adjustments at traditional testing centers to better mirror those used in the program's remote proctored delivery solution. As for the prevalence and detection of anomalous candidate behavior, Alpine has engaged in research and comparative studies with a few certification programs investigating differences by delivery mode; this research has shown that remote proctoring is no better or worse than traditional testing centers with regard to security issues.

Certification programs must consider many aspects of the fairness of the testing experience, from access to appropriate technology to regional differences (e.g., privacy, firewalls) to bandwidth capacity. Coordinating breaks within the test administration time allotment is another decision point many programs face when establishing remote proctoring policies that are equivalent to, but perhaps not exactly the same as, those at traditional testing centers. Considerations include, but are not limited to, whether to offer breaks, how many to offer, how long they should last, and what guidelines should restrict candidate behavior during them. If multiple delivery options are available to candidates, the break schedule should be consistent between delivery modes; however, break events do not need to be identical to be considered equivalent.

As with all other logistical decisions and policies related to implementing remote proctoring, communication with candidates should be clear and transparent about expectations for breaks and the requirements during those timeframes (e.g., continued proctoring and monitoring of both keystrokes and webcam during the break period). Candidates should also be informed of exam guidelines before their break through built-in exam warnings (e.g., that they will not be able to return to review items submitted prior to the break). Certification programs can employ multiple publication strategies to mitigate practical and security concerns about allowing breaks in remote proctored exams, including different options for randomly administering items. Two examples are administering items within sections and randomizing section blocks around breaks, or administering half of the items from a fixed form, allowing a break, and then administering the remaining half. In both scenarios, item and option order should be randomized for individual candidates where possible. These publication approaches mitigate order effects and help deter nefarious behavior during the break time (e.g., reviewing materials to answer previously viewed items).
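The first publication strategy above, randomizing section blocks around a break, can be sketched as follows. This is a hypothetical illustration (the section structure, break marker, and placement rule are assumptions, not a particular vendor's publication feature): each candidate receives a shuffled section order and shuffled item order, with the break boundary placed between blocks so items seen before the break are never re-presented after it.

```python
import random

def build_form(sections, break_after_blocks=1, seed=None):
    """Assemble a candidate-specific form from item sections.

    sections: dict mapping section name -> list of item ids.
    Section order and item order within sections are shuffled per
    candidate, and a 'BREAK' marker is placed between section blocks so
    that no item administered before the break reappears after it.
    """
    rng = random.Random(seed)       # seed per candidate for reproducibility
    names = list(sections)
    rng.shuffle(names)              # randomize the order of section blocks
    form = []
    for i, name in enumerate(names):
        items = sections[name][:]
        rng.shuffle(items)          # randomize item order within the section
        form.extend(items)
        if i + 1 == break_after_blocks:
            form.append("BREAK")    # break boundary: no review across it
    return form
```

Option order within each item would be randomized analogously at rendering time; the key property is that the break always falls on a block boundary.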

Certification programs also need to determine the best overall publication strategy for their forms in a multi-delivery mode solution, not just the presentation of individual items within forms. Again, a variety of techniques exist that promote equivalence and fairness for all test takers, depending on the specific scenarios facing each certification program. For example, if there are no differences in security incidents between delivery modes, a certification program could administer multiple forms concurrently across modes. In contrast, if a program experiences differential security problems by delivery mode, a publication cycle could be established in which new content is initially launched in the more secure delivery channel and then moved to the less secure channel once a new form is launched to replace it. This introduces a time lag in exam availability between the modalities, but it helps control for both item exposure and loss while contributing to an equivalent candidate experience.
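That rotating publication cycle can be sketched as a simple state update, in which each new release launches in the more secure channel and pushes the form it replaces into the less secure one. The channel names and data structure below are illustrative assumptions only.

```python
def rotate_forms(live, new_form):
    """Advance a two-channel publication cycle by one release.

    live: dict with keys 'secure' and 'open' holding the form currently
    published in each channel (None if nothing is live yet).  The form
    retired from the secure channel becomes the open channel's form,
    creating the intentional time lag between modalities.
    """
    live["open"] = live["secure"]   # retired form moves to the less secure channel
    live["secure"] = new_form       # new content launches in the secure channel
    return live
```

In practice the "channels" might be test centers versus remote proctoring, with the direction chosen based on where the program's security incidents concentrate.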

In both situations, the certification program should conduct item- and form-level analyses at a regular cadence to monitor and evaluate form performance for equivalence across delivery modes (e.g., average score, average time, reliability, pass rates). Programs should also evaluate items for equivalence across delivery modes through analyses such as Differential Item Functioning (DIF). If differences in form- and item-level performance emerge and persist, the certification program should reconsider the viability of a multi-mode delivery solution leading to the same credential, as candidates could legitimately question the validity and fairness of such an arrangement.
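As an illustration of the item-level check, one standard DIF procedure, Mantel-Haenszel, compares the odds of a correct response between delivery modes within matched ability strata (e.g., bands of total score). The sketch below is a minimal, self-contained version of that screen (the field names and group labels are illustrative assumptions); an operational DIF analysis would add significance testing and effect-size classification on top of this statistic.

```python
def mh_odds_ratio(responses):
    """Mantel-Haenszel common odds ratio for one item across two modes.

    responses: iterable of (mode, stratum, correct) tuples, where mode is
    'center' (reference group) or 'remote' (focal group), stratum is a
    matching level such as a banded total score, and correct is 0/1.
    Returns the common odds ratio; values far from 1.0 suggest the item
    functions differently by delivery mode.
    """
    strata = {}
    for mode, s, correct in responses:
        g = strata.setdefault(s, {"center": [0, 0], "remote": [0, 0]})
        g[mode][0 if correct else 1] += 1   # tally [right, wrong] per group
    num = den = 0.0
    for g in strata.values():
        rc, wc = g["center"]                # reference: right, wrong
        rr, wr = g["remote"]                # focal: right, wrong
        n = rc + wc + rr + wr
        if n == 0:
            continue
        num += rc * wr / n                  # MH numerator term for this stratum
        den += wc * rr / n                  # MH denominator term
    return num / den if den else float("nan")
```

An odds ratio near 1.0 indicates comparable item functioning across modes after matching on ability; persistent departures would trigger the review described above.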

In terms of specific item-type performance, some certification programs are particularly interested in remote proctoring vendors' ability to handle alternate item types. The feasibility of administering alternate item types should be discussed with test delivery providers before selecting a remote proctoring solution, and functionality should then be tested in advance of exam administration to ensure consistency and equivalence across delivery options. The technical review of alternate item types should assess functionality on different systems and how item presentation and processing might differ with bandwidth capacity, monitor size, availability of plug-ins, and the like. Any differences introduced in the appearance or processing of items could represent fairness issues across candidates; therefore, technical dependencies should be minimized, and known system requirements should again be communicated clearly to candidates.

While remote proctoring does present potential psychometric, logistical, and policy challenges to certification programs, the risks can be mitigated with advance planning and continued monitoring, and the modality offers programs unique benefits. Each certification program should carefully weigh the opportunities and threats that remote proctoring presents as a delivery option and work in conjunction with legal, marketing, and stakeholder groups to establish policies and communications that safeguard the equivalence of the test administration and candidate exam experience regardless of delivery mode.

Links to the Remote Proctoring Webinar series are included below: