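A Study of Methods for Handling Missing Data in Factor Analysis
                    Authors: Hong-Long Wang (Department of Statistics, National Taipei University), Meng-Li Yang (Research Center for Humanities and Social Sciences, Academia Sinica), Chun-Ju Chen (Department of Statistics, National Taipei University), Ting-Hsiang Lin (Department of Statistics, National Taipei University)
                
                
                    Volume and issue: Vol. 57, No. 1
                    Date: March 2012
                    Pages: 29-50
                    DOI:10.3966/2073753X2012035701002
                
                Abstract:
Factor analysis is commonly used to study questionnaires and scales. When too much data is missing, or when the missingness mechanism is not completely at random, the estimated number of common factors and the factor loadings are often biased. This study used the Taiwan Education Panel Survey, treating its complete data as the baseline and, following the original missingness structure, constructing data sets with one to five times the original missing rate in order to examine how sensitive factor analysis is to the treatment of missing data. The researchers compared four treatments: the available case method, the complete case method, logistic regression imputation, and Markov chain Monte Carlo (MCMC) imputation. The results show that the higher the missing rate, the larger the difference between the estimated covariance matrix and that of the baseline data. At higher missing rates, the available case method extracted more common factors than the baseline data. For factor loadings, the available case method had the most severe error; the complete case method had errors close to those of the two imputation methods, but its error relative to the baseline grew as the missing rate increased. The researchers suggest that when the missing rate is 20% to 30% or higher, applying logistic regression imputation or MCMC imputation before factor analysis yields smaller errors.
                Keywords: Taiwan Education Panel Survey, missing data, exploratory factor analysis, Markov chain Monte Carlo (MCMC) imputation, logistic regression imputation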
                
Journal of Research in Education Sciences, Volume 57 (2012), No. 1, March
                
                
            
            
                
				    Missing Data Techniques for Factor Analysis
                    Authors: Hong-Long Wang (Department of Statistics, National Taipei University), Meng-Li Yang (Research Center for Humanities and Social Sciences, Academia Sinica), Chun-Ju Chen (Department of Statistics, National Taipei University), Ting-Hsiang Lin (Department of Statistics, National Taipei University)
                
                
                    Vol. & No.: Vol. 57, No. 1
                    Date: March 2012
                    Pages: 29-50
                    DOI:10.3966/2073753X2012035701002
                
                Abstract:
Factor analysis is frequently employed to analyze scales and questionnaires. However, when the proportion of missing data is high or the data are not missing completely at random, the number of factors extracted and the factor loadings can be biased. We used the Taiwan Education Panel Survey (TEPS) and constructed 5 data sets with different missing proportions to assess how sensitive factor analysis is to the treatment of missing data. The completely observed data were used as a baseline for comparison. We compared 4 treatments: the available case method (AC), the complete case method (CC), Markov chain Monte Carlo single imputation (MCMC), and step-wise logistic regression single imputation (LR). The results show that the higher the missing proportion, the greater the discrepancy between the covariance matrix of the constructed data set and that of the baseline. For the AC method, the higher the proportion of missing data, the more the number of extracted factors exceeded that of the baseline. The AC method also had the largest bias in factor loadings. The bias in the factor loadings of the CC method increased as the missing proportion increased. Thus, when the missing proportion is 20% or more, we recommend against using list-wise deletion (the CC method) for factor analysis.
                Keywords: TEPS, missing data, exploratory factor analysis, MCMC imputation, logistic regression imputation
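
The difference between the available case (pair-wise) and complete case (list-wise) treatments described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than TEPS, the four items and their loadings are invented, cells are deleted completely at random rather than following the original missingness structure, and the Frobenius norm is only one possible way to measure the discrepancy from the baseline covariance matrix. It is not the authors' procedure.

```python
import math
import random


def make_baseline(n=1000, seed=0):
    """Simulate 4 items driven by one common factor (synthetic, not TEPS)."""
    rng = random.Random(seed)
    loadings = [0.8, 0.7, 0.6, 0.5]  # hypothetical loadings
    rows = []
    for _ in range(n):
        f = rng.gauss(0, 1)
        rows.append([l * f + rng.gauss(0, 0.6) for l in loadings])
    return rows


def add_missing(rows, rate, seed=1):
    """Blank out each cell independently with probability `rate`."""
    rng = random.Random(seed)
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def cov_pair(rows, i, j):
    """Sample covariance of items i and j over rows where both are observed."""
    pairs = [(r[i], r[j]) for r in rows if r[i] is not None and r[j] is not None]
    n = len(pairs)
    mean_i = sum(a for a, _ in pairs) / n
    mean_j = sum(b for _, b in pairs) / n
    return sum((a - mean_i) * (b - mean_j) for a, b in pairs) / (n - 1)


def cov_matrix(rows):
    """Covariance matrix built entry by entry from pair-wise observed rows."""
    k = len(rows[0])
    return [[cov_pair(rows, i, j) for j in range(k)] for i in range(k)]


def frobenius(a, b):
    """Frobenius norm of the difference between two square matrices."""
    return math.sqrt(
        sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    )


baseline = make_baseline()
baseline_cov = cov_matrix(baseline)

results = {}
for rate in (0.1, 0.3, 0.5):
    incomplete = add_missing(baseline, rate)
    # Complete case: keep only rows with no missing item (list-wise deletion).
    complete_rows = [r for r in incomplete if None not in r]
    results[rate] = {
        "n_complete": len(complete_rows),
        # Available case: each covariance uses every row observed for that pair.
        "ac": frobenius(cov_matrix(incomplete), baseline_cov),
        "cc": frobenius(cov_matrix(complete_rows), baseline_cov),
    }

for rate, summary in results.items():
    print(rate, summary)
```

Running the sketch prints, for each missing rate, the number of fully observed rows and the two discrepancies. It shows how quickly list-wise deletion shrinks the usable sample as the missing rate grows; the discrepancy values themselves depend on the random seed and on the simplifying assumptions above.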