A different set of essay and performance test (PT) questions is used for each administration of the California Bar Exam. The questions asked on one administration may, as a group, be easier or more difficult than those asked on some other administration. Similarly, the graders who grade the answers to the questions on one administration may, as a group, be more lenient or stringent than the graders who grade the answers on another administration. This potential variation in question difficulty and reader leniency between administrations could unduly affect an applicant’s chances of passing.
Scaling is a statistical procedure that adjusts the scores assigned by the graders. This adjustment ensures that an applicant’s likelihood of passing is not affected by any variation in the difficulty of the written section (which includes both the essay and PT questions) across administrations. Almost all United States jurisdictions scale their written scores.
Scaling involves converting the sum of all scores assigned by the graders to the same units of measurement as those used for the multiple-choice question (MCQ) portion of the exam. This is analogous to converting degrees of temperature measured in Fahrenheit to Celsius. On the bar exam, this conversion (or "scaling") essentially involves assigning the highest total written score earned by any applicant the same value as the highest MCQ score earned by any applicant. The second highest written score is assigned the same value as the second highest MCQ score, and so on. This process is done separately for each administration of the exam. An applicant’s written score is not affected by that applicant’s own MCQ score.
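The rank-matching idea described above can be sketched in Python. This is only an illustration of the idea, not the State Bar's procedure: the function name is hypothetical, and the handling of tied scores (left to the sort order) is an assumption, since the text does not say how ties are treated.

```python
def rank_match(written_totals, mcq_scale_scores):
    """Give each written total the MCQ score that holds the same rank.

    Hypothetical helper for illustration only. Tied written totals are
    ordered by their position in the input, which is an assumption.
    """
    if len(written_totals) != len(mcq_scale_scores):
        raise ValueError("both lists must cover the same group of applicants")

    # MCQ scores from highest to lowest, ignoring who earned them.
    mcq_descending = sorted(mcq_scale_scores, reverse=True)

    # Applicant positions ordered from highest to lowest written total.
    by_written = sorted(
        range(len(written_totals)),
        key=lambda i: written_totals[i],
        reverse=True,
    )

    scaled = [0.0] * len(written_totals)
    for rank, applicant in enumerate(by_written):
        # The applicant ranked first on the written section receives the
        # top MCQ score, the second-ranked applicant the second, and so on.
        scaled[applicant] = mcq_descending[rank]
    return scaled
```

Because the MCQ scores are sorted before they are assigned, the result for each applicant depends only on that applicant's written rank, not on the MCQ score that applicant earned.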
To improve the accuracy of the scaling process, a formula is used to make the conversion from the grader-assigned written scores to scale scores. The formula is shown below, where A = the sum of the applicant’s grader-assigned scores across all the essay and PT questions in the test, B = the mean of these scores across all applicants, C = the standard deviation (i.e., spread) of these scores for all applicants, D = the standard deviation of all applicants’ MCQ scores, and E = the mean of their MCQ scores. A simpler, but algebraically equivalent, version of this formula is used to report results.
Written Scale Score = [((A - B) / C) × D] + E
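The formula can be sketched in Python as follows. This is a minimal illustration, not the State Bar's actual implementation: the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form is used. (When the written and MCQ lists cover the same number of applicants, the ratio D/C is the same for the population and sample forms, so the choice does not change the result.) The second function shows one slope-and-intercept rearrangement that is algebraically equivalent to the formula; whether it matches the simpler version used to report results is not stated.

```python
from statistics import mean, pstdev


def scale_written_scores(written_totals, mcq_scale_scores):
    """Apply Written Scale Score = [((A - B) / C) x D] + E to every applicant.

    Hypothetical helper for illustration only.
    """
    b = mean(written_totals)        # B: mean of written totals
    c = pstdev(written_totals)      # C: spread of written totals (assumed population SD)
    d = pstdev(mcq_scale_scores)    # D: spread of MCQ scores (assumed population SD)
    e = mean(mcq_scale_scores)      # E: mean of MCQ scores
    if c == 0:
        raise ValueError("written totals have no spread, so they cannot be scaled")
    # A is each applicant's own written total.
    return [((a - b) / c) * d + e for a in written_totals]


def slope_intercept(written_totals, mcq_scale_scores):
    """Return (slope, intercept) so that slope * A + intercept equals the formula."""
    b = mean(written_totals)
    c = pstdev(written_totals)
    d = pstdev(mcq_scale_scores)
    e = mean(mcq_scale_scores)
    if c == 0:
        raise ValueError("written totals have no spread, so they cannot be scaled")
    slope = d / c
    intercept = e - slope * b
    return slope, intercept
```

After this conversion, the written scale scores have the same mean and spread as the MCQ scores for that administration. An applicant whose written total equals the mean B receives a written scale score equal to the MCQ mean E.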
The MCQ portion of the exam has 200 multiple-choice questions, of which 175 are scored. Different MCQs are used for each administration of the exam. An applicant’s MCQ “raw” score is the number of questions answered correctly. Because the questions asked on one administration may, as a group, be easier or more difficult than those asked on another administration, a raw score earned on one occasion may not signify the same degree of proficiency as that same score earned on another administration. This problem is addressed by a process called "equating"; equated scores are often labeled as MCQ "scale" scores in reports of the results.
Equating adjusts for differences in the difficulty of the MCQs across administrations. It does this by calibrating the scores on each new version of the exam so that a given equated MCQ score indicates the same degree of proficiency regardless of the exam on which it was earned. The equating process involves inserting into each new version of the MCQ exam a set of questions that have been used before and whose difficulty is known. Equating adjusts the raw scores on the basis of whether the applicants taking the current version of the exam earn higher or lower scores on the repeated questions than did the applicants who answered these same questions on a previous administration. As a result of this adjustment, which is based on essentially all bar exam MCQ takers, an equated MCQ score earned on one administration corresponds to the same level of proficiency as that same score earned on another administration. The State Bar will conduct a Standard Setting study after the February bar exam to establish a new baseline, and will then use a method of equating known as Item Response Theory (IRT) equating for all exams thereafter. Equating is used on virtually all large-scale tests (such as the LSAT, ACT, SAT, USMLE, and NCLEX).
Equating cannot be used directly for the bar exam’s written section because essay and PT questions are published following each administration and therefore are not reused. However, there is a strong underlying relationship between written and MCQ scores. As a result of this relationship, an increase or decrease in average MCQ scores between administrations signals a corresponding change in average applicant ability. Because of this relationship and the equating process, MCQ “scale” scores provide the best way to monitor differences in average applicant ability across administrations of the exam. Consequently, scaling written scores to the MCQ results in a given written scale score indicating about the same level of proficiency regardless of the exam on which that written scale score was earned.