[1] A. Abadie and G. W. Imbens, Large sample properties of matching
estimators for average treatment effects, Econometrica 74(1) (2006),
235-267.
[2] A. Abadie and G. W. Imbens, Bias corrected matching estimators for
average treatment effects, (2007).
http://ksghome.harvard.edu/aabadie/research.html
[3] J. D. Angrist and J. S. Pischke, Mostly Harmless Econometrics: An
Empiricist’s Companion, New Jersey: Princeton University Press,
2009.
[4] P. C. Austin, P. Grootendorst and G. M. Anderson, A comparison of
the ability of different propensity score models to balance measured
variables between treated and untreated subjects: A Monte Carlo study,
Statistics in Medicine 26(4) (2007), 734-753.
[5] P. C. Austin, Optimal caliper widths for propensity-score matching
when estimating differences in means and differences in proportions in
observational studies, Pharmaceutical Statistics 10(2) (2011),
150-161.
[6] P. C. Austin, A comparison of 12 algorithms for matching on the
propensity score, Statistics in Medicine 33(6) (2014), 1057-1069.
[7] J. Bafumi and A. Gelman, Fitting multilevel models when predictors
and group effects correlate, Paper presented at the 2006 Annual
Meeting of the Midwest Political Science Association, Chicago, IL,
2006. Retrieved from:
http://www.stat.columbia.edu/~gelman/research/unpublished/Bafumi_Gelman_Midwest06.pdf
[8] M. D. Bates, K. E. Castellano, S. Rabe-Hesketh and A. Skrondal,
Handling correlations between covariates and random slopes in
multilevel models, Journal of Educational and Behavioral Statistics
39(6) (2014), 524-549.
[9] A. Bell and K. Jones, Explaining fixed effects: Random effects
modeling of time-series cross-sectional and panel data, Political
Science Research and Methods 3(1) (2015), 133-153.
[10] M. T. Berg, E. A. Stewart, E. Stewart and R. L. Simons, A
multilevel examination of neighborhood social processes and college
enrollment, Social Problems 60(4) (2013), 513-534.
[11] K. A. Bollen, Structural Equations with Latent Variables, Wiley,
New York, 1989.
[12] D. T. Campbell and J. C. Stanley, Experimental and
Quasi-experimental Designs for Research, Rand McNally College
Publishing, Chicago, 1966.
[13] G. Chamberlain, Omitted variable bias in panel data: Estimating
the returns to schooling, In Annales de l'INSEE (pp.
49-82). Institut national de la statistique et des études
économiques, 1978.
[14] Y. Chung, S. Rabe-Hesketh, V. Dorie, A. Gelman and J. Liu, A
non-degenerate penalized likelihood estimator for variance parameters
in multilevel models, Psychometrika 78(4) (2013), 685-709.
[15] W. G. Cochran, Matching in analytical studies, American Journal
of Public Health 43 (1953), 684-691.
[16] W. G. Cochran, Analysis of covariance: Its nature and uses,
Biometrics 13(3) (1957), 261-281.
[17] W. G. Cochran, The effectiveness of adjustment by
sub-classification in removing bias in observational studies,
Biometrics 24(2) (1968), 295-313.
[18] W. G. Cochran, The use of covariance in observational studies,
Applied Statistics 18(3) (1969), 270-275.
[19] W. G. Cochran, Observational studies. In T. A. Bancroft (Ed.),
Statistical papers in honor of George W. Snedecor (pp. 71-90). Ames:
Iowa State University Press, 1972.
[20] W. G. Cochran, The planning of observational studies of human
populations (with discussion), Journal of the Royal Statistical
Society, Series A (General) 128(2) (1965), 234-255.
[21] W. G. Cochran and D. B. Rubin, Controlling bias in observational
studies: A review, Sankhyā: The Indian Journal of Statistics, Series A
35 (1973), 417-446.
[22] P. Ebbes, U. Bockenholt, M. Wedel and H. Nam, Accounting for
regressor-error dependencies in educational data: A Bayesian mixture
approach (Robert H. Smith School Research Paper No. RHS, 2466533),
(2014). Retrieved from:
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2466533
[23] M. Fairbrother, Two multilevel modeling techniques for analyzing
comparative longitudinal survey datasets, Political Science Research
and Methods 2(01) (2014), 119-140.
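[24] S. Greenland, An overview of methods for causal inference from
observational studies, In A. Gelman and X.-L. Meng (Eds.), Applied
Bayesian Modeling and Causal Inference from Incomplete-Data
Perspectives (pp. 3-14), Wiley, New York, 2004.
[25] A. Gelman and X.-L. Meng (Eds.), Applied Bayesian Modeling and
Causal Inference from Incomplete-Data Perspectives, Wiley, New York,
2004.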
[26] Z. Griliches and W. M. Mason, Education, income, and ability, The
Journal of Political Economy 80(3) (1972), S74-S103.
[27] L. V. Hedges and E. C. Hedberg, Intraclass correlation values for
planning group randomized trials in education, Educational Evaluation
and Policy Analysis 29(1) (2007), 60.
[28] L. V. Hedges, Correcting a significance test for clustering,
Journal of Educational and Behavioral Statistics 32(2) (2007),
151-179.
[29] N. E. Helwig and C. J. Anderson, Book review, [Review of the book
Handbook of Advanced Multilevel Analysis, by J. J. Hox & J. K.
Roberts]. Psychometrika 79(1) (2014), 175-177.
[30] D. E. Ho, K. Imai, G. King and E. A. Stuart, MatchIt:
Nonparametric preprocessing for parametric causal inference (version
2.211) [software], Journal of Statistical Software 42(8) (2011).
Available at: http://imai.princeton.edu/research/files/matchit.pdf
[31] G. Hong and S. W. Raudenbush, Evaluating kindergarten retention
policy, Journal of the American Statistical Association 101(475)
(2006), 901-910.
[32] D. G. Horvitz and D. J. Thompson, A generalization of sampling
without replacement from a finite universe, Journal of the American
Statistical Association 47(260) (1952), 663-685.
[33] K. Imai, L. Keele and D. Tingley, A general approach to causal
mediation analysis, Psychological Methods 15(4) (2010), 309-334.
[34] K. Imai, L. Keele and T. Yamamoto, Identification, inference and
sensitivity analysis for causal mediation effects, Statistical Science
25(1) (2010), 51-71.
[35] International Association for the Evaluation of Educational
Achievement, The Second International Mathematics Study, Amsterdam,
Netherlands, 1977. Retrieved from
http://www.iea.nl/sims.html
[36] K. G. Jöreskog and D. Sörbom, LISREL8: User’s Reference Guide,
Lincolnwood, Illinois: Scientific Software International, 1996.
[37] J. D. Y. Kang and J. L. Schafer, Demystifying double robustness:
A comparison of alternative strategies for estimating a population
mean from incomplete data, Statistical Science 22(4) (2007),
523-539.
[38] D. A. Kenny, J. D. Korchmaros and N. Bolger, Lower level
mediation in multilevel models, Psychological Methods 8 (2003),
115-128.
DOI:10.1037/1082-989X.8.2.115
[39] J. S. Kim and E. W. Frees, Omitted variables in multilevel
models, Psychometrika 71(4) (2006), 659-690.
[40] D. M. LaHuis, M. J. Hartman, S. Hakoyama and P. C. Clark,
Explained variance measures for multilevel models, Organizational
Research Methods 17(4) (2014), 433-451.
[41] G. Leckie, Book review. [Review of the book Handbook of Advanced
Multilevel Analysis, by J. J. Hox & J. K. Roberts], Journal of the
Royal Statistical Society: Series A (Statistics in Society) 174(3)
(2011), 844-845.
[42] G. Leckie, R. French, C. Charlton and W. Browne, Modeling
heterogeneous variance-covariance components in two-level models,
Journal of Educational and Behavioral Statistics 39(5) (2014),
307-332.
[43] M. Lunt, Selecting an appropriate caliper can be essential for
achieving good balance with propensity score matching, American
Journal of Epidemiology 179(2) (2014), 226-235.
[44] R. C. MacCallum, M. Roznowski and L. B. Necowitz, Model
modifications in covariance structure analysis: The problem of
capitalization on chance, Psychological Bulletin 111(3) (1992),
490-504.
[45] D. C. Martin, P. Diehr, E. B. Perrin and T. D. Koepsell, The
effect of matching on the power of randomized community intervention
studies, Statistics in Medicine 12(3-4) (1993), 329-338.
[46] D. McCaffrey and L. Hamilton, Value-Added Assessment in Practice:
Lessons from the Pennsylvania Value-Added Assessment System Pilot
Project, Santa Monica, CA: Rand Corporation, 2007.
[47] B. O. Muthén, Multilevel covariance structure analysis,
Sociological Methods & Research 22(3) (1994), 376-398.
[48] L. K. Muthén and B. O. Muthén, Mplus User’s Guide,
Los Angeles: Muthén & Muthén, 1998-2012.
[49] J. Neyman, On the application of probability theory to
agricultural experiments: Essay on principles, section 9, (translated
in 1990), Statistical Science 5 (1923), 465-480.
[50] A. Pokropek, Phantom effects in multilevel compositional analysis:
Problems and solutions, Sociological Methods & Research 44 (2015),
677-705.
DOI: 10.1177/0049124114553801
[51] K. J. Preacher, Advances in mediation analysis: A survey and
synthesis of new developments, Annual Review of Psychology 66 (2015),
825-852.
[52] R Development Core Team, R: A Language and Environment for
Statistical Computing, R Foundation for Statistical Computing,
Vienna, Austria, 2007. Retrieved from
http://www.R-project.org
[53] G. M. Raab and I. Butcher, Balance in cluster randomized trials,
Statistics in Medicine 20(3) (2001), 351-365.
[54] T. Raykov, T. Patelis, G. A. Marcoulides and C. L. Lee, Examining
intermediate omitted levels in hierarchical designs via latent
variable modeling, Structural Equation Modeling: A Multidisciplinary
Journal, (ahead-of-print), (2015) 1-5. Retrieved from
http://dx.doi.org/10.1080/10705511.2014.938186
[55] S. W. Raudenbush, Statistical analysis and optimal design for
cluster randomized trials, Psychological Methods 2(2) (1997),
173-185.
[56] S. W. Raudenbush and A. S. Bryk, Hierarchical Linear Models:
Applications and Data Analysis Methods, Thousand Oaks, CA: Sage,
2002.
[57] P. R. Rosenbaum, Observational Studies, Springer-Verlag, New York,
2002.
[58] P. R. Rosenbaum and D. B. Rubin, The central role of the
propensity score in observational studies for causal effects,
Biometrika 70(1) (1983), 41-55.
[59] P. R. Rosenbaum and D. B. Rubin, Constructing a control group
using multivariate matched sampling methods that incorporate the
propensity score, American Statistician 39(1) (1985), 33-38.
[60] D. B. Rubin, Matching to remove bias in observational studies,
Biometrics 29(1) (1973a), 159-183.
[61] D. B. Rubin, The use of matched sampling and regression
adjustment to remove bias in observational studies, Biometrics 29(1)
(1973b), 185-203.
[62] D. B. Rubin, Multivariate matching methods that are equal
percent bias reducing, II: Maximums on bias reduction for fixed sample
sizes, Biometrics 32(1) (1976a), 121-132.
[63] D. B. Rubin, Multivariate matching methods that are equal percent
bias reducing, I: Some examples, Biometrics 32(1) (1976b), 109-120.
[64] D. B. Rubin, Using multivariate matched sampling and regression
adjustment to control bias in observational studies, Journal of the
American Statistical Association 74(366) (1979), 318-328.
[65] D. B. Rubin, Bias reduction using Mahalanobis-metric
matching, Biometrics 36(2) (1980), 293-298.
[66] D. B. Rubin, The use of propensity scores in applied Bayesian
inference, Bayesian Statistics 2 (1985), 463-472.
[67] D. B. Rubin, Formal modes of statistical inference for causal
effects, Journal of Statistical Planning and Inference 25(3) (1990),
279-292.
[68] D. B. Rubin, Matched Sampling for Causal Effects, Cambridge
University Press, New York, 2006.
[69] D. B. Rubin and R. P. Waterman, Estimating the causal effects of
marketing interventions using propensity score methodology,
Statistical Science 21(2) (2006), 206-222.
[70] W. H. Schmidt and L. Burstein, Concomitants of growth in
mathematics achievement during the Population A school year, In L.
Burstein (Ed.), The IEA Study of Mathematics III: Student growth and
classroom processes (pp. 309-327), Pergamon Press, Oxford, UK,
1992.
[71] J. S. Sekhon, Multivariate and propensity score matching software
with automated balance optimization: The Matching package for R,
Journal of Statistical Software 10(2) (2007), 1-51.
[72] J. S. Sekhon and A. Diamond, Genetic matching for estimating
causal effects: A general multivariate matching method for achieving
balance in observational studies, (2008). Retrieved July 18, 2009,
from
http://sekhon.berkeley.edu/papers/GenMatch.pdf
[73] E. Schlueter, B. Meuleman and E. Davidov, Immigrant integration
policies and perceived group threat: A multilevel study of 27 Western
and Eastern European countries, Social Science Research 42(3) (2013),
670-682.
[74] W. R. Shadish, T. D. Cook and D. T. Campbell, Experimental and
Quasi-Experimental Designs for Generalized Causal Inference, Boston:
Houghton-Mifflin, 2002.
[75] R. L. Solomon, An extension of control group design,
Psychological Bulletin 46(2) (1949), 137-150.
[76] S. Sun and W. Pan, Investigating the accuracy of three estimation
methods for regression discontinuity design, The Journal of
Experimental Education 81(1) (2013), 1-21.
[77] E. A. Stuart and D. B. Rubin, Matching with multiple control
groups with adjustment for group differences, Journal of Educational
and Behavioral Statistics 33(3) (2008), 279-306.
[78] I. Televantou, H. W. Marsh, L. Kyriakides, B. Nagengast, J.
Fletcher and L. E. Malmberg, Phantom effects in school composition
research: Consequences of failure to control biases due to measurement
error in traditional multilevel models, School Effectiveness and
School Improvement 26(1) (2015), 75-101.
[79] D. Tofighi and F. Thoemmes, Single-level and multilevel mediation
analysis, The Journal of Early Adolescence 34(1) (2014), 93-119.
[80] D. Tofighi, S. G. West and D. P. MacKinnon, Multilevel mediation
analysis: The effects of omitted variables in the 1-1-1 model, British
Journal of Mathematical and Statistical Psychology 66(2) (2013),
290-307.
[81] Q. Wang, Propensity Score Matching on Multilevel Data, In W. Pan
and H. Bai (Eds.), Propensity Score Analysis: Fundamentals and
Developments, Guilford, New York, NY, 2015.
[82] Q. Wang, Matching for Bias Reduction in Treatment Effect
Estimation of Hierarchically Structured Synthetic Cohort Design Data,
Unpublished Doctoral Dissertation, Michigan State University, East
Lansing, MI, 2010.
[83] D. E. Wiley and R. G. Wolfe, Major survey design issues for the
IEA third international mathematics and science study, Prospects 22(3)
(1992), 297-304.
[84] R. G. Wolfe, Second international mathematics study: Training
manual for use of the databank of the longitudinal, classroom process
surveys for Population A in the IEA second international mathematics
study, (Contractor’s Report), Washington, DC: Center for
Education Statistics, 1987.
[85] J. M. Wooldridge, Econometric Analysis of Cross Section and Panel
Data, Cambridge, Massachusetts: MIT Press, 2002.
[86] S. Wu, Y. Ding, F. Wu, J. Hou and P. Mao, Application of
propensity-score matching in four leading medical journals,
Epidemiology 26(2) (2015), e19-e20.