SAGE Publications Inc: Educational and Psychological Measurement: Table of Contents

Improving the Use of Parallel Analysis by Accounting for Sampling Variability of the Observed Correlation Matrix

Educational and Psychological Measurement, Volume 85, Issue 1, Page 114-133, February 2025.
Parallel analysis has been considered one of the most accurate methods for determining the number of factors in factor analysis. One major advantage of parallel analysis over traditional factor retention methods (e.g., Kaiser’s rule) is that it addresses ...

Linear and Nonlinear Indices of Score Accuracy and Item Effectiveness for Measures That Contain Locally Dependent Items

Educational and Psychological Measurement, Volume 85, Issue 1, Page 60-81, February 2025.
The problem of local item dependencies (LIDs) is very common in personality and attitude measures, particularly in those that measure narrow-bandwidth dimensions. At the structural level, these dependencies can be modeled by using extended factor analytic ...

Examining the Dynamic of Clustering Effects in Multilevel Designs: A Latent Variable Method Application

Educational and Psychological Measurement, Volume 85, Issue 1, Page 156-177, February 2025.
This note is concerned with the study of temporal development in several indices reflecting clustering effects in multilevel designs that are frequently utilized in educational and behavioral research. A latent variable method-based approach is outlined, ...

Is Effort Moderated Scoring Robust to Multidimensional Rapid Guessing?

Educational and Psychological Measurement, Volume 85, Issue 1, Page 134-155, February 2025.
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these ...

Added Value of Subscores for Tests With Polytomous Items

Educational and Psychological Measurement, Volume 85, Issue 1, Page 38-59, February 2025.
Test-takers, policymakers, teachers, and institutions are increasingly demanding that testing programs provide more detailed feedback regarding test performance. As a result, there has been a growing interest in the reporting of subscores that potentially ...

Evaluating The Predictive Reliability of Neural Networks in Psychological Research With Random Datasets

Educational and Psychological Measurement, Volume 85, Issue 1, Page 5-37, February 2025.
Psychologists are emphasizing the importance of predictive conclusions. Machine learning methods, such as supervised neural networks, have been used in psychological studies as they naturally fit prediction tasks. However, we are concerned about whether ...

Exploring the Influence of Response Styles on Continuous Scale Assessments: Insights From a Novel Modeling Approach

Educational and Psychological Measurement, Volume 85, Issue 1, Page 178-214, February 2025.
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater ...

Evaluating Imputation-Based Fit Statistics in Structural Equation Modeling With Ordinal Data: The MI2S Approach

Educational and Psychological Measurement, Volume 85, Issue 1, Page 82-113, February 2025.
The multiple imputation two-stage (MI2S) approach holds promise for evaluating the model fit of structural equation models for ordinal variables with multiply imputed data. However, previous studies only examined the performance of MI2S-based residual-...

Field-Testing Multiple-Choice Questions With AI Examinees: English Grammar Items

Educational and Psychological Measurement, Ahead of Print.
Field-testing is an essential yet often resource-intensive step in the development of high-quality educational assessments. I introduce an innovative method for field-testing newly written exam items by substituting human examinees with artificially ...

Interpretation of the Standardized Mean Difference Effect Size When Distributions Are Not Normal or Homoscedastic

Educational and Psychological Measurement, Ahead of Print.
The standardized mean difference (sometimes called Cohen’s d) is an effect size measure widely used to describe the outcomes of experiments. It is mathematically natural to describe differences between groups of data that are normally distributed with ...

On Latent Structure Examination of Behavioral Measuring Instruments in Complex Empirical Settings

Educational and Psychological Measurement, Ahead of Print.
A multiple-step procedure is outlined that can be used for examining the latent structure of behavior measurement instruments in complex empirical settings. The method permits one to study their latent structure after assessing the need to account for ...

A Note on Evaluation of Polytomous Item Locations With the Rating Scale Model and Testing Its Fit

Educational and Psychological Measurement, Ahead of Print.
A procedure is outlined for point and interval estimation of location parameters associated with polytomous items, or raters assessing studied subjects or cases, which follow the rating scale model. The method is developed within the framework of latent ...

Studying Factorial Invariance With Nominal Items: A Note on a Latent Variable Modeling Procedure

Educational and Psychological Measurement, Ahead of Print.
A latent variable modeling procedure for studying factorial invariance and differential item functioning for multi-component measuring instruments with nominal items is discussed. The method is based on a multiple testing approach utilizing the false ...

Can One Pool Over Site in a Multi-Site Study With Categorical Item Measuring Instruments?: A Multiple Testing Procedure

Educational and Psychological Measurement, Ahead of Print.
We outline a procedure for examining collapsibility over site in multiple-location settings that are frequently utilized in contemporary educational and behavioral research. The method is based on a test of cross-site identity of the response ...

Investigating the Ordering Structure of Clustered Items Using Nonparametric Item Response Theory

Educational and Psychological Measurement, Ahead of Print.
Educational and psychological tests with an ordered item structure enable efficient test administration procedures and allow for intuitive score interpretation and monitoring. The effectiveness of the measurement instrument relies to a large extent on the ...

Using ROC Analysis to Refine Cut Scores Following a Standard Setting Process

Educational and Psychological Measurement, Ahead of Print.
In educational assessment, cut scores are often defined through standard setting by a group of subject matter experts. This study aims to investigate the impact of several factors on classification accuracy using the receiver operating characteristic (ROC)...

Detecting Differential Item Functioning Using Response Time

Educational and Psychological Measurement, Ahead of Print.
This study investigated uniform differential item functioning (DIF) detection in response times. We proposed a regression analysis approach with both the working speed and the group membership as independent variables, and logarithm transformed response ...

Differential Item Functioning Effect Size Use for Validity Information

Educational and Psychological Measurement, Ahead of Print.
There has been an emphasis on effect sizes for differential item functioning (DIF) with the purpose of understanding the magnitude of the differences that are detected through statistical significance testing. Several different effect sizes have been ...

Discriminant Validity of Interval Response Formats: Investigating the Dimensional Structure of Interval Widths

Educational and Psychological Measurement, Ahead of Print.
In psychological research, respondents are usually asked to answer questions with a single response value. Useful alternatives are interval response formats like the dual-range slider (DRS), where respondents provide an interval with a lower and an upper ...

Novick Meets Bayes: Improving the Assessment of Individual Students in Educational Practice and Research by Capitalizing on Assessors’ Prior Beliefs

Educational and Psychological Measurement, Ahead of Print.
The assessment of individual students is not only crucial in the school setting but also at the core of educational research. Although classical test theory focuses on maximizing insights from student responses, the Bayesian perspective incorporates the ...

Treating Noneffortful Responses as Missing

Educational and Psychological Measurement, Ahead of Print.
This study investigates the treatment of rapid-guess (RG) responses as missing data within the context of the effort-moderated model. Through a series of illustrations, this study demonstrates that the effort-moderated model assumes missing at random (MAR)...

Examination of ChatGPT’s Performance as a Data Analysis Tool

Educational and Psychological Measurement, Ahead of Print.
This study examines the performance of ChatGPT, developed by OpenAI and widely used as an AI-based conversational tool, as a data analysis tool through exploratory factor analysis (EFA). To this end, simulated data were generated under various data ...

Item Classification by Difficulty Using Functional Principal Component Clustering and Neural Networks

Educational and Psychological Measurement, Ahead of Print.
Maintaining consistent item difficulty across test forms is crucial for accurately and fairly classifying examinees into pass or fail categories. This article presents a practical procedure for classifying items based on difficulty levels using functional ...

The Impact of Missing Data on Parameter Estimation: Three Examples in Computerized Adaptive Testing

Educational and Psychological Measurement, Ahead of Print.
In computerized adaptive testing (CAT), examinees see items targeted to their ability level. Postoperational data have a high degree of missing information relative to designs where everyone answers all questions. Item responses are observed over a ...

“What If Applicants Fake Their Responses?”: Modeling Faking and Response Styles in High-Stakes Assessments Using the Multidimensional Nominal Response Model

Educational and Psychological Measurement, Ahead of Print.
Self-report personality tests used in high-stakes assessments hold the risk that test-takers engage in faking. In this article, we demonstrate an extension of the multidimensional nominal response model (MNRM) to account for the response bias of faking. ...

The Impact of Attentiveness Interventions on Survey Data

Educational and Psychological Measurement, Ahead of Print.
Social and behavioral science researchers who use survey data are vigilant about data quality, with an increasing emphasis on avoiding common method variance (CMV) and insufficient effort responding (IER). Each of these errors can inflate and deflate ...

Examining the Instructional Sensitivity of Constructed-Response Achievement Test Item Scores

Educational and Psychological Measurement, Ahead of Print.
Inferences about student learning from large-scale achievement test scores are fundamental in education. For achievement test scores to provide useful information about student learning progress, differences in the content of instruction (i.e., the ...

Overestimation of Internal Consistency by Coefficient Omega in Data Giving Rise to a Centroid-Like Factor Solution

Educational and Psychological Measurement, Ahead of Print.
Coefficient Omega measuring internal consistency is investigated for its deviations from expected outcomes when applied to correlational patterns that produce variable-focused factor solutions in confirmatory factor analysis. In these solutions, the ...

Shortening Psychological Scales: Semantic Similarity Matters

Educational and Psychological Measurement, Ahead of Print.
In this study, we proposed a novel scale abbreviation method based on sentence embeddings and compared it to two established automatic scale abbreviation techniques. Scale abbreviation methods typically rely on administering the full scale to a large ...

Enhancing Effort-Moderated Item Response Theory Models by Evaluating a Two-Step Estimation Method and Multidimensional Variations on the Model

Educational and Psychological Measurement, Ahead of Print.
Rapid-guessing behavior in data can compromise our ability to estimate item and person parameters accurately. Consequently, it is crucial to model data with rapid-guessing patterns in a way that can produce unbiased ability estimates. This study proposes ...

Invariance: What Does Measurement Invariance Allow Us to Claim?

Educational and Psychological Measurement, Ahead of Print.
Measurement involves numerous theoretical and empirical steps—ensuring our measures are operating the same in different groups is one step. Measurement invariance occurs when the factor loadings and item intercepts or thresholds of a scale operate ...

Assessing the Speed–Accuracy Tradeoff in Psychological Testing Using Experimental Manipulations

Educational and Psychological Measurement, Ahead of Print.
The speed–accuracy tradeoff (SAT), where increased response speed often leads to decreased accuracy, is well established in experimental psychology. However, its implications for psychological assessments, especially in high-stakes settings, remain less ...

Enhancing Precision in Predicting Magnitude of Differential Item Functioning: An M-DIF Pretrained Model Approach

Educational and Psychological Measurement, Ahead of Print.
Despite numerous studies on the magnitude of differential item functioning (DIF), different DIF detection methods often define effect sizes inconsistently and fail to adequately account for testing conditions. To address these limitations, this study ...

Optimal Number of Replications for Obtaining Stable Dynamic Fit Index Cutoffs

Educational and Psychological Measurement, Ahead of Print.
Factor analysis is commonly used in behavioral sciences to measure latent constructs, and researchers routinely consider approximate fit indices to ensure adequate model fit and to provide important validity evidence. Due to a lack of generalizable fit ...

Exploring the Evidence to Interpret Differential Item Functioning via Response Process Data

Educational and Psychological Measurement, Ahead of Print.
Evaluating differential item functioning (DIF) in assessments plays an important role in achieving measurement fairness across different subgroups, such as gender and native language. However, relying solely on the item response scores among traditional ...

The Effect of Modeling Missing Data With IRTree Approach on Parameter Estimates Under Different Simulation Conditions

Educational and Psychological Measurement, Ahead of Print.
This study explores the performance of the item response tree (IRTree) approach in modeling missing data, comparing its performance to the expectation–maximization (EM) algorithm and multiple imputation (MI) methods. Both simulation and empirical data ...

Factor Retention in Exploratory Multidimensional Item Response Theory

Educational and Psychological Measurement, Ahead of Print.
Multidimensional Item Response Theory (MIRT) is applied routinely in developing educational and psychological assessment tools, for instance, for exploring multidimensional structures of items using exploratory MIRT. A critical decision in exploratory ...

An Omega-Hierarchical Extension Index for Second-Order Constructs With Hierarchical Measuring Instruments

Educational and Psychological Measurement, Ahead of Print.
An index extending the widely used omega-hierarchical coefficient is discussed, which can be used for evaluating the influence of a second-order factor on the interrelationships among the components of a hierarchical measuring instrument. The index ...

A Comparison of the Next Eigenvalue Sufficiency Test to Other Stopping Rules for the Number of Factors in Factor Analysis

Educational and Psychological Measurement, Ahead of Print.
A plethora of techniques exist to determine the number of factors to retain in exploratory factor analysis. A recent and promising technique is the Next Eigenvalue Sufficiency Test (NEST), but it has not been systematically compared with well-established ...

Obtaining a Bayesian Estimate of Coefficient Alpha Using a Posterior Normal Distribution

Educational and Psychological Measurement, Ahead of Print.
A new alternative to obtain a Bayesian estimate of coefficient alpha through a posterior normal distribution is proposed and assessed through percentile, normal-theory-based, and highest probability density credible intervals in a simulation study. The ...