Scores of .70 or higher indicate that the instrument has high reliability when the stakes are moderate.

The rubric can capture the type of target behaviors, qualities, or products that professors are interested in evaluating.

Reliability means that individual scores from an instrument should be the same or nearly the same from one administration of the instrument to another.

Reliability estimates for Form Z range from .49 to .87 across the 42 groups who have been tested.

Measures of validity were computed in standard conditions, roughly defined as conditions that do not adversely affect test performance.

This test measures the critical thinking skills (CTS) of health science undergraduate and graduate students.

Although test items are set in health sciences and clinical practice contexts, test takers are not required to have discipline-specific health sciences knowledge.

If more than one rater is used, then inter-rater reliability must be established among the raters to yield meaningful results.

While the PJRF can be used to assess the effectiveness of training programs for individuals or groups, participants' actual skills are best measured by an objective tool such as the California Critical Thinking Skills Test.

In a study of instructional strategies and their influence on the development of critical thinking among undergraduate nursing students, Tiwari, Lai, and Yuen found that, compared with lecture students, PBL students showed significantly greater improvement in overall CCTDI (p = .0048), Truth seeking (p = .0008), Analyticity (p = .0368), and Critical Thinking Self-confidence (p = .0342) subscales from the first to the second time points; in overall CCTDI (p = .0083), Truth seeking (p = .0090), and Analyticity (p = .0354) subscales from the second to the third time points; and in Truth seeking (p = .0173) and Systematicity (p = .0440) subscale scores from the first to the fourth time points (76).

Studies have shown that the California Critical Thinking Skills Test captured gain scores in students' critical thinking over one quarter or one semester.

Validity means that individual scores from a particular instrument are meaningful, make sense, and allow researchers to draw conclusions from the sample to the population that is being studied (69).

Researchers often refer to "content" or "face" validity.

Content validity or face validity is the extent to which questions on an instrument are representative of the possible questions that a researcher could ask about that particular content or skills.

The instrument can be assumed to be free of bias and measurement error (68).

Alpha coefficients are often used to report an estimate of internal consistency.

Course evaluations typically ask for responses of "agree" or "disagree" to items focusing on teacher behavior.

Typically the questions do not solicit information about student learning.
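
The alpha coefficient mentioned above as an estimate of internal consistency is commonly computed as Cronbach's alpha. The following is a minimal sketch of that calculation, not the procedure used by any of the instruments discussed here. The function name `cronbach_alpha` and the response data are hypothetical and serve only as illustration; the .70 cutoff is the one quoted in the text for moderate-stakes use.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Estimate internal consistency (Cronbach's alpha).

    scores: list of respondents, each a list of item scores of equal length.
    Uses sample variances throughout.
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer every item")

    # Variance of each item (column) across respondents, summed.
    item_variance_sum = sum(variance(col) for col in zip(*scores))
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: 5 respondents answering 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]

alpha = cronbach_alpha(responses)
# .70 is the moderate-stakes threshold quoted in the text.
print(f"alpha = {alpha:.2f}; meets .70 threshold: {alpha >= 0.70}")
```

The sketch uses sample variances for both the items and the total scores; using population variances for both instead gives the same alpha, because the scaling factor cancels in the ratio.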


