A-Level Statistics

STATS - test and measures midterm 2

57 Terms

Validity

Defined as the agreement between a test score or measure and the quality it is believed to measure

Standards of psychological testing

Organized into three parts: foundations, operations, and applications

Face validity

Not an actual category of validity evidence. The measure looks valid on its surface, but that impression rests on judgment rather than systematic evidence

Content related evidence for validity

Evidence that shows the test adequately covers the content it is supposed to measure

Construct underrepresentation

Failure to capture important components of a construct

Construct irrelevant variance

Occurs when scores are influenced by factors irrelevant to the construct

Criterion related evidence for validity

Evidence that supports the test's ability to predict or correlate with external criteria

Construct related evidence for validity

Evidence that the test measures the underlying theoretical construct it is intended to measure

Predictive evidence

How well a test predicts a future outcome; for example, the SAT is used to predict college performance

Effect of restricted range

Most of the data points fall within a small or limited range of values, which lowers the observed correlation between the test and the criterion

Concurrent validity

Evaluating whether a new test or questionnaire provides results that are consistent with an existing measure taken at the same time

Convergent evidence

The measure correlates well with other tests believed to measure the same construct

Discriminant evidence

A test should have a low correlation with measures of unrelated constructs

Criterion-referenced tests

Have items that are designed to match specific instructional objectives

Relationship between reliability and validity

We can have reliability without validity but we can't have validity without reliability

Item format

The way in which questions or statements are presented in a test, such as dichotomous (true-false) or polytomous (multiple-choice) formats

Dichotomous format

A format in which each item offers two alternatives, such as true or false, only one of which is correct

Polytomous format

A format in which each item has more than two alternatives, as in multiple-choice tests

Distractors

Incorrect choices in multiple-choice items that test takers can select

Correction for guessing

A formula used to adjust test scores on multiple-choice exams to account for the likelihood of obtaining correct answers by random guessing
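
As a rough illustration, the sketch below uses the common textbook form of the correction, corrected score = R - W / (n - 1), where R is the number right, W the number wrong, and n the number of choices per item. The function and parameter names are hypothetical, and omitted items are assumed to count as neither right nor wrong.

```python
def corrected_score(num_right: int, num_wrong: int, num_choices: int) -> float:
    """Correction for guessing: R - W / (n - 1).

    Assumption: omitted items are excluded from both R and W.
    """
    if num_choices < 2:
        raise ValueError("an item needs at least two choices")
    # Each wrong answer is taken as evidence of guessing; on average,
    # n - 1 wrong guesses accompany one lucky correct guess.
    return num_right - num_wrong / (num_choices - 1)


# 40 right, 12 wrong, 8 omitted, four choices per item
print(corrected_score(40, 12, 4))  # 36.0
```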

Omitted responses

Answers left blank or not attempted by test takers, which are typically not counted in correction-for-guessing formulas

Random guessing

Selecting answers on multiple-choice items without any knowledge of the correct answer, which may or may not be advantageous

Speeded tests

Tests with time constraints, where the correction-for-guessing formula may include only the items attempted, so that random guessing and leaving items blank have the same expected effect

Elimination method

A strategy where test takers eliminate obviously incorrect alternatives in multiple-choice items, increasing their chances of getting the right answer

Likert format

A scale on which respondents rate a statement from strongly disagree to strongly agree

Reverse scoring

Reversing the original scoring of negatively worded items to maintain consistency in scale construction
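
A minimal sketch of reverse scoring, assuming a numeric response scale with known endpoints (1 to 5 by default). The function name and defaults are illustrative only.

```python
def reverse_score(response: int, scale_min: int = 1, scale_max: int = 5) -> int:
    """Flip a response so that high and low ends of the scale swap places."""
    if not scale_min <= response <= scale_max:
        raise ValueError("response is outside the scale")
    # On a 1-5 scale this maps 1->5, 2->4, 3->3, 4->2, 5->1.
    return scale_min + scale_max - response


print(reverse_score(4))        # 2
print(reverse_score(3, 1, 7))  # 5
```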

Category format

Similar to the Likert format, but with a greater number of choices

Endpoints

The extreme values or labels of the category scale which should be avoided to minimize potential response bias

Context effect

The phenomenon where ratings on a category-format scale may change based on the context or grouping of the people being rated

Optimal number of categories

The number of response categories in a category-format scale that is considered sufficient for most rating tasks; it varies depending on the respondent's level of involvement

Visual analogue scale

A scale presented as a line on which respondents mark a point between two endpoints

Confidence intervals

A statistical method used to calculate a range of values that is likely to contain a population parameter

Adjective checklist

A method commonly used in personality measurement in which subjects receive a list of adjectives and indicate whether each is characteristic of them

Q sort

A technique that increases the number of response categories by having subjects sort statements into nine piles to describe themselves

Forced choice format

Item formats that require subjects to choose from the given alternatives

Checklists

A format that has become less popular in recent years, in which subjects respond to a list of items

Item writing

The process of creating test items, including selecting an appropriate format, wording, and response choices

All of the above

A response option commonly advised against in item writing as it can be problematic and lead to confusion in multiple choice questions

Item analysis techniques

Methods used to evaluate the effectiveness and quality of test items after they have been administered, including measures of reliability, difficulty, and discrimination

Precise language

The use of clear, specific, unambiguous wording in test items to ensure they accurately assess the intended trait or knowledge

Subject matter knowledge

A deep understanding of the content and concepts being tested, needed to create accurate, effective test items

Item difficulty

In the context of a test that measures achievement or ability, item difficulty is defined by the number of people who answer a particular item correctly, usually expressed as a proportion of all test takers
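
For illustration, item difficulty can be computed as the proportion of correct responses. The sketch assumes each response is coded 1 (correct) or 0 (incorrect); the function name is hypothetical.

```python
def item_difficulty(item_responses: list[int]) -> float:
    """Proportion of test takers who answered the item correctly."""
    if not item_responses:
        raise ValueError("no responses given")
    return sum(item_responses) / len(item_responses)


# 7 of 10 test takers answered correctly
print(item_difficulty([1, 1, 0, 1, 0, 1, 1, 1, 0, 1]))  # 0.7
```

Note that a higher value means an easier item, since more people got it right.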

Optimal difficulty level

The ideal level of difficulty for test items, usually halfway between 100% correct responses and the level of success expected by chance
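
A small worked sketch of the halfway rule, assuming chance success on an item equals 1 divided by the number of choices. The function name is illustrative.

```python
def optimal_difficulty(num_choices: int) -> float:
    """Midpoint between 100% correct and the chance success rate."""
    if num_choices < 2:
        raise ValueError("an item needs at least two choices")
    chance = 1 / num_choices  # assumed chance level for blind guessing
    return (1.0 + chance) / 2


print(optimal_difficulty(4))  # 0.625 for four-choice items
print(optimal_difficulty(2))  # 0.75 for true-false items
```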

Discriminability

A measure of the value of test items, assessing the extent to which individuals who perform well on specific items also perform well on the entire test

Extreme group method

A method to assess an item's discriminability by comparing the performance of individuals who have done well on the test with that of individuals who have not done well
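
One way this comparison might be sketched: take the top and bottom groups by total score and subtract the proportions answering the item correctly. The one-third cutoff, the function name, and the 0/1 coding are assumptions for illustration, not part of the card.

```python
def extreme_group_index(item_scores: list[int], total_scores: list[float],
                        fraction: float = 1 / 3) -> float:
    """Proportion correct in the top group minus that in the bottom group."""
    n = len(total_scores)
    if n != len(item_scores) or n < 2:
        raise ValueError("need matching lists with at least two test takers")
    k = max(1, round(n * fraction))  # size of each extreme group (assumed cutoff)
    order = sorted(range(n), key=lambda i: total_scores[i])
    bottom, top = order[:k], order[-k:]
    p_top = sum(item_scores[i] for i in top) / k
    p_bottom = sum(item_scores[i] for i in bottom) / k
    return p_top - p_bottom


item = [1, 1, 0, 0, 1, 0, 1, 0, 0]
total = [9, 8, 7, 6, 5, 4, 3, 2, 1]
index = extreme_group_index(item, total)  # 2/3 - 1/3
```

Values near 1 suggest the item separates high and low scorers well; values near 0 or below suggest it does not.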

Point biserial method

An approach to evaluating the discriminability of test items by finding the correlation between performance on a specific item and overall test performance
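
A hedged sketch of the point biserial correlation using the standard formula r = ((M1 - M) / S) * sqrt(p / (1 - p)), where M1 is the mean total score of those who got the item right, M and S are the mean and population standard deviation of all total scores, and p is the proportion answering correctly. Names are hypothetical, and whether the item itself is included in the total score is left as an assumption.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(item_scores: list[int], total_scores: list[float]) -> float:
    """Correlation between a 0/1 item and the total test score."""
    p = mean(item_scores)  # proportion answering the item correctly
    if p in (0, 1):
        raise ValueError("item has no variance; correlation is undefined")
    mean_correct = mean([t for x, t in zip(item_scores, total_scores) if x == 1])
    return (mean_correct - mean(total_scores)) / pstdev(total_scores) * sqrt(p / (1 - p))


r = point_biserial([1, 1, 1, 0, 0], [10, 8, 6, 4, 2])
print(round(r, 3))  # 0.866
```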

Item characteristic curve

Represents the relationship between examinees' ability (for example, total test score) and the proportion of examinees who answer the item correctly

Item discriminability

The extent to which individuals who perform well on a specific item also perform well on the entire test

Difficulty and discriminability

An item's difficulty level is essential, and items should ideally have a difficulty between 30% and 70%

Item selection

Items for the final version of the test should be chosen by considering both difficulty and discriminability

Item response theory

A newer approach to test construction that considers the probability of getting specific items correct based on the individual's ability level.
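
The card does not name a specific model. As one possible illustration, the sketch below assumes a two-parameter logistic model, in which the probability of a correct answer depends on the gap between ability and item difficulty, scaled by item discrimination. All names and parameter values are hypothetical.

```python
import math


def prob_correct(ability: float, difficulty: float,
                 discrimination: float = 1.0) -> float:
    """Two-parameter logistic sketch: P = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


# A person whose ability equals the item's difficulty has a 50% chance.
print(prob_correct(0.0, 0.0))  # 0.5
# Higher ability raises the probability for the same item.
print(prob_correct(1.0, 0.0) > prob_correct(-1.0, 0.0))  # True
```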

Computer adaptive testing

A significant advantage of IRT: it allows personalized assessments in which items are matched to the test taker's ability level

Measurement precision

The choice of test design affects measurement precision across ability levels. Computer adaptive testing offers the advantage of maintaining consistent measurement precision across defined ability levels
