共通教育心理学の採点手続き

この記事は,RStudioで作成され,knitrやRWordPressでRStudioからポストされたものです。
何事も実験してみたい年頃!w


準備段階

データを読み込む。

data <- read.csv("test2014.csv")
items <- data[3:50]
summary(items)
##        q1              q2              q3              q4      
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.00  
##  1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.00  
##  Median :1.000   Median :1.000   Median :1.000   Median :1.00  
##  Mean   :0.979   Mean   :0.979   Mean   :0.979   Mean   :0.85  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.00  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.00  
##        q5              q6              q7              q8       
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:1.000   1st Qu.:0.000   1st Qu.:1.000   1st Qu.:1.000  
##  Median :1.000   Median :1.000   Median :1.000   Median :1.000  
##  Mean   :0.914   Mean   :0.667   Mean   :0.946   Mean   :0.892  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##        q9             q10             q11             q12       
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:0.000   1st Qu.:0.000   1st Qu.:1.000   1st Qu.:1.000  
##  Median :1.000   Median :1.000   Median :1.000   Median :1.000  
##  Mean   :0.581   Mean   :0.677   Mean   :0.935   Mean   :0.892  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##       q13             q14             q15             q16       
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:1.000  
##  Median :0.000   Median :1.000   Median :1.000   Median :1.000  
##  Mean   :0.484   Mean   :0.742   Mean   :0.602   Mean   :0.914  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##       q17             q18             q19             q20       
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.000  
##  Median :1.000   Median :1.000   Median :0.000   Median :0.000  
##  Mean   :0.699   Mean   :0.688   Mean   :0.226   Mean   :0.269  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:0.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##       q21             q22             q23             q24       
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:0.000   1st Qu.:0.000   1st Qu.:1.000   1st Qu.:1.000  
##  Median :0.000   Median :1.000   Median :1.000   Median :1.000  
##  Mean   :0.495   Mean   :0.538   Mean   :0.817   Mean   :0.839  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##       q25              q26             q27             q28       
##  Min.   :0.0000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:0.0000   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000  
##  Median :0.0000   Median :1.000   Median :1.000   Median :1.000  
##  Mean   :0.0108   Mean   :0.892   Mean   :0.979   Mean   :0.882  
##  3rd Qu.:0.0000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.0000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##       q29             q30             q31             q32       
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:1.000   1st Qu.:0.000   1st Qu.:1.000   1st Qu.:0.000  
##  Median :1.000   Median :1.000   Median :1.000   Median :1.000  
##  Mean   :0.914   Mean   :0.581   Mean   :0.839   Mean   :0.742  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##       q33             q34             q35             q36       
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000  
##  Median :1.000   Median :1.000   Median :1.000   Median :1.000  
##  Mean   :0.796   Mean   :0.871   Mean   :0.796   Mean   :0.914  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##       q37             q38             q39             q40      
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.00  
##  1st Qu.:1.000   1st Qu.:1.000   1st Qu.:0.000   1st Qu.:1.00  
##  Median :1.000   Median :1.000   Median :1.000   Median :1.00  
##  Mean   :0.946   Mean   :0.796   Mean   :0.677   Mean   :0.85  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.00  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.00  
##       q41             q42             q43             q44   
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :1  
##  1st Qu.:1.000   1st Qu.:0.000   1st Qu.:1.000   1st Qu.:1  
##  Median :1.000   Median :1.000   Median :1.000   Median :1  
##  Mean   :0.935   Mean   :0.538   Mean   :0.989   Mean   :1  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1  
##       q45             q46             q47             q48       
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:1.000   1st Qu.:1.000   1st Qu.:0.000   1st Qu.:0.000  
##  Median :1.000   Median :1.000   Median :1.000   Median :0.000  
##  Mean   :0.796   Mean   :0.892   Mean   :0.731   Mean   :0.323  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000

通過率は平均でわかる。高い。

apply(items,2,mean)
##      q1      q2      q3      q4      q5      q6      q7      q8      q9 
## 0.97849 0.97849 0.97849 0.84946 0.91398 0.66667 0.94624 0.89247 0.58065 
##     q10     q11     q12     q13     q14     q15     q16     q17     q18 
## 0.67742 0.93548 0.89247 0.48387 0.74194 0.60215 0.91398 0.69892 0.68817 
##     q19     q20     q21     q22     q23     q24     q25     q26     q27 
## 0.22581 0.26882 0.49462 0.53763 0.81720 0.83871 0.01075 0.89247 0.97849 
##     q28     q29     q30     q31     q32     q33     q34     q35     q36 
## 0.88172 0.91398 0.58065 0.83871 0.74194 0.79570 0.87097 0.79570 0.91398 
##     q37     q38     q39     q40     q41     q42     q43     q44     q45 
## 0.94624 0.79570 0.67742 0.84946 0.93548 0.53763 0.98925 1.00000 0.79570 
##     q46     q47     q48 
## 0.89247 0.73118 0.32258

得点の分布を見る。まあまあ正規分布かしら?

sum <- rowSums(items)
hist(sum,breaks=50)

plot of chunk unnamed-chunk-3

項目反応理論をベイズ推定して採点

さて,では項目反応理論のMCMCといきましょうか。MCMCpackを使うとすぐにできる。

library(MCMCpack)
posteriori <- MCMCirt1d(items,store.item=TRUE,burnin=1000,mcmc=10000,thin=1)

rstanは趣味人の世界

けど,あえてrstanを使うのがロマンってもんよね。

library(rstan)
stancode <-'
data{
  int<lower=0> N; // sample size
  int<lower=0> M; // number of items
  int<lower=0> r[M,N] ; // data matrix <- transposed!
}
transformed data{
  vector[N] ones;
  for(i in 1:N)
    ones[i] <- 1.0;
}
parameters{
  real alpha[M];
  real<lower=0> beta;
  vector[N] theta;
}
model{
  alpha ~ normal(0,100);
  beta ~ normal(0,100);
  theta ~ normal(0,1);
  for(m in 1:M)
    r[m] ~ bernoulli_logit(beta * theta - alpha[m] * ones);
}
generated quantities{
  real  mean_alpha;
  real  a[M];
  mean_alpha <- mean(alpha);
  for(m in 1:M)
    a[m] <- alpha[m] - mean_alpha;
}
'
datastan <- list(N=nrow(items),M=ncol(items),r=t(items))
fit <- stan(model_code=stancode,data=datastan,iter=2000,chain=2)
## 
## TRANSLATING MODEL 'stancode' FROM Stan CODE TO C++ CODE NOW.
## COMPILING THE C++ CODE FOR MODEL 'stancode' NOW.
## 
## SAMPLING FOR MODEL 'stancode' NOW (CHAIN 1).
## 
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
## #  Elapsed Time: 92.9324 seconds (Warm-up)
## #                70.2261 seconds (Sampling)
## #                163.158 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'stancode' NOW (CHAIN 2).
## 
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
## #  Elapsed Time: 78.7089 seconds (Warm-up)
## #                92.3773 seconds (Sampling)
## #                171.086 seconds (Total)
# print(fit,digit=3,par=c("theta"))

ここで中央値を推定されたスコアだとして抽出。素点との相関関係を見てみよう。

MCMCscore <- extract(fit,pars=c("theta"))
estimated.score <- apply(MCMCscore$theta,2,median)
plot(estimated.score,sum)
abline(lm(sum~estimated.score))

plot of chunk unnamed-chunk-6

cor(estimated.score,sum)
## [1] 0.9831

ほぼ相関している・・・。まあ分析しなくてもおんなじってことですね。それをあえてするのは、ロマンだからですね。
一応ヒストグラムと正規分布の当てはめを書き加えてみましょうか。

hist(estimated.score,breaks=50)
par(new=T)
plot(dnorm,-4,4,col="red")

plot of chunk unnamed-chunk-7

最後の処理

これで平均を調製して、段階に書き換えれば採点は終了です。

final.score <- round(estimated.score*10+70,0)
final.grade <- ifelse(final.score>=90,"S",ifelse(final.score>=80,"A",ifelse(final.score>=70,"B",ifelse(final.score>=60,"C","D"))))
table(final.grade)
## final.grade
##  A  B  C  D  S 
##  9 41 28 12  3

あとはファイルに書き出したり、名簿ファイルにマージしたりする作業がありますね。余談的ではありますが、書いておきます。

export.dat <- cbind(data[,1:2],final.score,final.grade)
write.table(export.dat,"final2014.csv",sep=",",quote=FALSE)

はい、おしまい。
本学の場合,システムから名簿.csvを落として来れるんだけど,Shift-JISのおかしなゴミのいっぱい入ったファイルなので、実はこの後のマージが大変だったりするのですが,まあそれは本稿と趣旨がずれますので。

投稿日:
カテゴリー: Rstan