The concept paper, Improving State Evaluation of Principal Preparation Programs, and the SEP3 Toolkit Guide describe five research-based core design principles for strong evaluation of principal preparation programs. Strong evaluation systems:

  1. Promote continuous program improvement.
     Effective program review systems encourage improvement and innovation in program design and implementation by doing two things: providing programs with specific and actionable feedback about their practices and outcomes, and allowing adequate time for programs to make changes and assess their impact. To provide this level of accurate and actionable feedback, systems must employ program reviewers who have relevant expertise for making appropriate judgments. The reviewers should possess content expertise in leadership, an understanding of adult learning theory and practices, knowledge of current research about effective leadership preparation, and the ability to accurately assess curriculum and pedagogy.

  2. Support states in holding programs accountable for effective practices and outcomes.
     An evaluation system is a key way for states to hold preparation programs accountable for delivering high-quality preparation for aspiring principals. With approximately 1,000 programs currently in operation and new ones emerging regularly, states need to be able to confidently make consequential decisions such as whether to approve a program, when to put a program on an improvement plan, and, in the most serious circumstances, when to rescind program approval. States need to understand the limitations of the indicators they track and ensure they have sufficient and valid information for making consequential decisions. States also need sound program ratings, based on a sufficient number of indicators to meaningfully capture performance and improvement over time. Finally, states need a clear process and timeline for intervening when programs demonstrate unacceptable performance.

  3. Provide key stakeholders with accurate and useful information.
     When key consumers and partners—especially aspiring school leaders and school districts—have good information about key program indicators, they can use that information to make more informed choices. For aspirants, a state evaluation system can provide concrete information about program features and outcomes (e.g., candidate learning and career outcomes)—including side-by-side, apples-to-apples comparisons—thus helping them choose high-quality programs. For districts, the same information can guide decisions concerning formal partnerships with programs and the hiring of graduates. To meet these purposes, effective evaluation systems make high-quality, easily understandable program data available to the public. (See the sidebar describing important considerations about making data publicly available.)

  4. Are sophisticated and nuanced in their approach to data collection and use.
     This nuanced approach is guided by the following five precepts:
    • Evaluate what matters. The data system includes the indicators that are most germane to preparation. We define program effectiveness in terms of inputs (especially the rigor of selection into a program), processes (especially the ways in which a program increases aspirants’ leadership knowledge and skills), outputs (especially aspirants’ successful placement in roles as principals), and contributions to important graduate outcomes (especially outcomes for students, including academic achievement measures, attainment measures such as graduation, and non-cognitive measures such as engagement and social/emotional growth).
    • Evaluate accurately. The data system uses the most accurate data available, and interpretations are made cautiously, with awareness of data limitations. Valid and reliable measures of leadership effectiveness are still in the early stages of development but, once confidence in their accuracy is established, could be a part of the review process. The system takes into account limitations related to reliability and validity in determining whether and how much to weigh particular data sources in evaluation.
    • Include data that can be realistically gathered and shared. Data are feasible to gather, efficient to report, and possible to corroborate with other sources of information. Further, data collection is ongoing and conducted according to a consistent schedule.
    • Consider contextual factors. Data are means, not ends. In order to make appropriate judgments based on accurate results, states gather additional contextual information. Basic indicator results can be difficult to interpret on their own, but they can serve as the basis for productive investigation and conversation about program quality and improvement. Analyses of program-related data inform judgments about program status and the need for continued program development.
    • Clearly and transparently communicate how results will be used. Programs understand which data will be made public, including how and when this will occur. Programs also understand how component parts of the program evaluation will be used to make substantive judgments and decisions about program status.
  5. Adhere to characteristics of high-quality program evaluation.
     An effective state system of program evaluation reflects what we know about best practices in program evaluation in education. We recommend the Standards for Educational Evaluation as a basis for judging best practices. These standards focus on utility (i.e., the extent to which stakeholders find processes and results valuable), feasibility (i.e., the effectiveness and efficiency of evaluation processes), propriety (i.e., the fairness and appropriateness of evaluation processes and results), accuracy (i.e., the dependability of evaluation results, especially judgments of quality), and accountability (i.e., having adequate documentation to justify results).