Taking tests is often connected to some kind of
(dichotomous) decision.
Do I pass or fail the statistics exam?
Am I pregnant or not?
Am I qualified or unqualified for the position I'm applying to?
Do I or don't I suffer from a certain illness?
The real-life criterion that the test result reflects
can often be dichotomised as well (qualified versus unqualified,
pregnant versus not pregnant ...).
Criterion and test result can agree (Sarah passes the
statistics exam and is actually a good student in statistics)
or diverge (Jon passes the exam, but only through
sheer luck, since he knows nothing about statistics).
The first
example is called a hit (being selected by the test and
fulfilling the criterion), the second a false alarm
(being selected but not fulfilling the criterion).
There are two more possible outcomes: a correct rejection
(being rejected by the test and not meeting the criterion) and a
miss (being rejected but in fact meeting the
criterion).
The four different possibilities are visualised in the following
table:
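|  | Criterion: met | Criterion: not met |
|---|---|---|
| Selected by the test | Hit | False alarm |
| Rejected by the test | Miss | Correct rejection |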
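The four outcomes described above can be sketched as a small function. This is only an illustration: the function name `classify_outcome` and its two boolean parameters are assumptions made for this example, not part of the original material.

```python
def classify_outcome(selected: bool, meets_criterion: bool) -> str:
    """Return the cell of the 2x2 decision table a person falls into.

    `selected` is the test decision, `meets_criterion` is the real-life
    criterion. Both names are hypothetical and chosen for this sketch.
    """
    if selected and meets_criterion:
        return "hit"
    if selected and not meets_criterion:
        return "false alarm"
    if not selected and meets_criterion:
        return "miss"
    return "correct rejection"


# Sarah passes the exam and really is good at statistics.
print(classify_outcome(selected=True, meets_criterion=True))   # hit
# Jon passes the exam through sheer luck.
print(classify_outcome(selected=True, meets_criterion=False))  # false alarm
```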
While talking about "getting selected" makes sense in a school or job context, it is inappropriate in a medical context, for example when testing for HIV. In a medical context, we therefore talk about a test being positive or negative and about a person being ill or not ill. This results in the following table:
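|  | Illness: ill | Illness: not ill |
|---|---|---|
| Test positive | Hit (true positive) | False alarm (false positive) |
| Test negative | Miss (false negative) | Correct rejection (true negative) |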
Validity of Selection Decisions
The validity of the selection decision is determined by several factors which can also influence each other:
- Validity of the test
  What proportion of the variance in the criterion is explained by the test result?
- Natural success rate (prevalence)
  What proportion of all participants would meet the criterion?
- Selection quota
  What proportion of participants is selected?
- Success rate (positive predictive value)
  What proportion of the selected participants is successful, i.e. fulfills the criterion?
- Sensitivity
  What is the probability of being selected when one would meet the criterion?
- Specificity
  What is the probability of being rejected when one would not meet the criterion?
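As a minimal sketch, the measures listed above (apart from test validity, which is not a function of the four counts) can be computed from the four cells of the decision table. The function name `decision_measures`, the dictionary keys, and the example counts are all made up for illustration. The sketch assumes that every denominator is non-zero.

```python
def decision_measures(hits: int, false_alarms: int,
                      misses: int, correct_rejections: int) -> dict:
    """Compute decision measures from the four cells of the 2x2 table.

    Assumes at least one selected person, one person meeting the
    criterion, and one person not meeting it (no zero denominators).
    """
    total = hits + false_alarms + misses + correct_rejections
    return {
        # Proportion of all participants who meet the criterion.
        "natural_success_rate": (hits + misses) / total,
        # Proportion of all participants who are selected.
        "selection_quota": (hits + false_alarms) / total,
        # Proportion of selected participants who meet the criterion.
        "success_rate": hits / (hits + false_alarms),
        # Probability of selection given that the criterion is met.
        "sensitivity": hits / (hits + misses),
        # Probability of rejection given that the criterion is not met.
        "specificity": correct_rejections / (correct_rejections + false_alarms),
    }


# Invented example: 100 participants.
measures = decision_measures(hits=30, false_alarms=10,
                             misses=20, correct_rejections=40)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these invented counts, half of the participants meet the criterion, 40% are selected, and the success rate among the selected is 0.75, while sensitivity is 0.60 and specificity is 0.80.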
Which measures should be taken into consideration to assess the validity of a decision depends on the kind of test and on the criterion the test refers to. In a medical context, sensitivity is important, as detecting an illness can be essential for patient survival. In job screening, specificity should be given greater weight, since here the goal is to reject all inadequate applicants.
To put these concepts into practice use the Try it! tab and then Test yourself! :)
Notice how changing the critical values and the test validity affects the outcome of the test decision (visualized in the scatterplot) and the different measures that assess the validity of the decision! You can change the critical criterion and test values without changing the data. However, when you change the validity of the test, the participants have to retake it, which leads to slightly different values.
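The behaviour described here can be imitated with a small simulation. This is a sketch under stated assumptions, not the code behind the interactive tab: test and criterion values are assumed to be standardized (mean 0, standard deviation 1) and normally distributed, `validity` is taken to be the correlation between the two (so its square is the proportion of explained variance), and all function and parameter names are hypothetical.

```python
import random


def simulate_participants(n: int, validity: float, seed=None) -> list:
    """Draw n (test, criterion) pairs whose correlation is about `validity`.

    Assumption: both values are standard normal. Calling this again with a
    different seed corresponds to the participants retaking the test.
    """
    rng = random.Random(seed)
    participants = []
    for _ in range(n):
        criterion = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        # Mix criterion and independent noise so that the test keeps
        # variance 1 and correlates with the criterion at `validity`.
        test = validity * criterion + (1 - validity ** 2) ** 0.5 * noise
        participants.append((test, criterion))
    return participants


def count_outcomes(participants: list, test_cutoff: float,
                   criterion_cutoff: float) -> dict:
    """Count the four outcomes for the given critical values."""
    counts = {"hit": 0, "false alarm": 0, "miss": 0, "correct rejection": 0}
    for test, criterion in participants:
        selected = test >= test_cutoff
        suitable = criterion >= criterion_cutoff
        if selected and suitable:
            counts["hit"] += 1
        elif selected:
            counts["false alarm"] += 1
        elif suitable:
            counts["miss"] += 1
        else:
            counts["correct rejection"] += 1
    return counts


# Same data, two different critical test values: only the decision changes.
data = simulate_participants(n=1000, validity=0.6, seed=1)
lenient = count_outcomes(data, test_cutoff=0.0, criterion_cutoff=0.0)
strict = count_outcomes(data, test_cutoff=1.0, criterion_cutoff=0.0)
print("lenient:", lenient)
print("strict: ", strict)
```

Raising the critical test value on the same data can only move people from "selected" to "rejected", so hits and false alarms cannot increase, while misses and correct rejections cannot decrease. Changing `validity` requires a fresh call to `simulate_participants`, which produces new data.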
Sensitivity: Probability of selection when meeting the criterion
(in a medical context: probability of a positive test when ill)
Specificity: Probability of rejection when not meeting the criterion
Success Rate: Conditional probability of being a suitable candidate,
given membership in the selected sample
Scatterplot of Participants' Test and Criterion Values
What are the definitions of the following quotas?
Effects of different variables
The smaller the selection quota, the ...
The higher the natural success rate, the ...