ADDIE Explained: Evaluation
By: Matthew Wilson, Shilpa Sahay and Cheryl Calhoun
Objectives
At the end of this chapter, you will be able to:
- Summarize the theoretical foundations of evaluation and its application.
- Define evaluation.
- Categorize tasks and roles within the three types of evaluation.
- Explain the process of evaluation and its role within instructional design.
- Recognize leaders in the domain of evaluation.
- Develop a plan for ongoing evaluation of an instructional design project.
Why Do We Evaluate?
Evaluation helps us determine whether our instructional implementation was effective in meeting our goals. Evaluation sits at the center of the ADDIE model, and it provides feedback to all stages of the process to continually improve our instructional design. Evaluation can answer questions such as: Have the learners obtained the knowledge and skills they need? Are our instructional goals effective for the requirements of the instructional program? Are our learners able to transfer their learning into the desired contextual setting? Do our lesson plans, instructional materials, media, assessments, and so on meet the learning needs? Does the implementation provide effective instruction and carry out the intended lesson plan and instructional objectives? Do we need to make any changes to our design to improve the effectiveness of, and overall satisfaction with, the instruction? These questions help shape the instruction, confirm what and to what extent the learner is learning, and validate the learning over time to support the choices made in the instructional design, as well as how the program holds up over time.
What is Evaluation?
To get started with evaluation, it is crucial to understand the overall picture of the process. The use of varied and multiple forms of evaluation throughout the design cycle is one of the most important processes for an instructional designer to employ. To that end, this first section of the evaluation chapter attempts to explain what evaluation is in terms of the varying types of evaluation; evaluation’s overarching relationship throughout the ADDIE model; the need for both validity and reliability; how to develop standards of measurement; and evaluation’s application in both education and training.
What Does Evaluation Look Like?
The first and most important step to understanding evaluation is to develop a knowledge base about what the process of evaluation entails. In other words, a designer must understand evaluation in terms of its three components: formative, summative, and confirmative. Each of these forms of evaluation is examined in detail here, both through the definition of the form itself and an explanation of some of the key tools within each.
Formative
Historically speaking, formative evaluation was not the first of the evaluation processes to have been developed, but it is addressed first in this chapter because of its role within the design process. Yet, it is important to place the development of the theory behind formative evaluation in context. Reiser (2001) summarizes the history of formative evaluation by explaining that the training materials developed by the U.S. government in response to Sputnik were implemented without verifying their effectiveness. These training programs were then later demonstrated to be lacking by Michael Scriven, who developed a procedure for testing and revision that became known as formative evaluation (Reiser, 2001).
Formative evaluation is the process of ongoing evaluation throughout the design process for the betterment of design and procedure within each stage. One way to think about this is to liken it to a chef tasting the food before sending it out to the customer. Morrison, Ross, Kalman, and Kemp (2013) explain that the formative evaluation process utilizes data from media, instruction, and learner engagement to formulate a picture of learning from which the designer can make changes to the product before the final implementation. Boston (2002, p. 2) states the purpose of formative evaluation as “all activities that teachers and students undertake to get information that can be used diagnostically to alter teaching and learning.” Regardless of whether instructional designers or classroom practitioners conduct the practice, formative evaluation results in the improvement of instructional processes for the betterment of the learner.
To conduct formative evaluation effectively, instructional designers must consider a variety of data sources to create a full picture of the effectiveness of their design. Morrison et al. (2013) propose that connoisseur-based, decision-oriented, objective-based, public relations, and constructivist evaluations are each appropriate data points within the formative process. As such, an examination of each format in turn will provide a framework for moving forward with formative evaluation.
Connoisseur-Based
Subject-matter experts (SMEs) and design experts are the primary resources in connoisseur-based evaluations. These experts review the instructional analysis, performance objectives, instruction, tests, and other assessments in order to verify the accuracy of the objectives and instructional analysis, the accuracy of the context, the appropriateness of the materials, the validity of test items, and the sequencing of instruction. Each of these points allows the designer to improve the organization and flow of instruction, the accuracy of content, the readability of materials, the instructional practices, and the total effectiveness (Morrison et al., 2013). In short, SMEs analyze the instruction and make suggestions for its improvement.
Decision-Oriented
As instructional designers, we must often make choices within the program of study being developed that require reflective thought and consideration. Morrison et al. (2013) describe this type of formative evaluation as decision-oriented. The questions asked during decision-oriented evaluations may develop out of the professional knowledge of an instructional designer or design team. These questions subsequently require the designer to develop further tools to assess the question, and as such should be addressed at a time when change is still an option and financially prudent (Morrison et al., 2013).
Objective-Based
If a program of study is not delivering the desired results, a provision for possible change should be considered. Through an examination of the goals of a course of instruction, the success of a learner's performance may be analyzed; this is the primary focus of objective-based evaluation. While formative changes are best made during the earlier stages of the ADDIE cycle, they may come later if the situation dictates, and objective-based evaluations may generate such results. According to Morrison et al. (2013), when summative and confirmative evaluations demonstrate undesirable effects, the results may be used as a formative evaluation tool to make improvements. Morrison et al. (2013) recommend combining the results of objective-based evaluations with connoisseur-based evaluations because of the limited ability to make changes from the pretest/posttest data that objective-based assessment typically employs. However, Dimitrov and Rumrill (2003) suggest that analysis of variance and covariance can be used to improve test design. The application of statistical analyses improves the validity and reliability of the design, which suggests that similar comparisons may also be useful in improving the overall instruction.
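To make this concrete, here is a minimal sketch in Python of the kind of pretest/posttest comparison Dimitrov and Rumrill (2003) discuss. It is illustrative only: the scores, group labels, and the choice of a paired t-test plus a one-way ANOVA on gain scores are assumptions made for the example, not a procedure prescribed by the chapter's sources.

```python
# Hypothetical pretest/posttest scores from a small pilot group (illustrative data only).
import numpy as np
from scipy import stats

pretest = np.array([52, 61, 48, 70, 55, 63, 58, 66])
posttest = np.array([68, 75, 60, 84, 71, 70, 69, 80])

# Paired t-test: did scores change significantly from pretest to posttest?
paired = stats.ttest_rel(posttest, pretest)
print(f"paired t = {paired.statistic:.2f}, p = {paired.pvalue:.4f}")

# One-way ANOVA on gain scores from two hypothetical versions of the
# instruction (e.g., original vs. revised materials), asking whether the
# revision produced larger learning gains.
gains_version_a = posttest - pretest
gains_version_b = np.array([9, 14, 7, 11, 13, 8, 12, 10])
anova = stats.f_oneway(gains_version_a, gains_version_b)
print(f"ANOVA F = {anova.statistic:.2f}, p = {anova.pvalue:.4f}")
```

In practice, a designer would feed results like these back into the connoisseur-based review described above rather than treat the statistics as the final word.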
Public Relations
Occasionally the formative process for a program may call for showing off the value of the project as it is being developed. Morrison et al. (2013, p. 325) refer to this form of formative data as “public-relations-inspired studies.” Borrowing from components of the other formats discussed above, this type of evaluation is a complementary process that combines data from various sources to generate funding and support for the program (Morrison et al., 2013). However, this process should happen during the later stages of development, because the presentation of underdeveloped programs may do more harm than good (e.g., the cancellation of pilot programs due to underwhelming results).
Constructivist Methods
Some models of evaluation are described as being behavior driven and biased. In response to those methods, multiple educational theorists have proposed the use of open-ended assessments that allow for multiple perspectives which can be defended by the learner (Richey, Klein, & Tracey, 2011). Such assessments draw deeply from constructivist learning theory. Duffy and Cunningham (1996) make the analogy that “an intelligence test measures intelligence but is not itself intelligence; an achievement tests measures a sample of a learned domain but is not itself that domain. Like micrometers and rulers, intelligence and achievement tests are tools (metrics) applied to the variables but somehow distinct from them” (p. 17). How does this impact the formative nature of assessment? Constructivist methods are applicable within the development of instruction: the feedback of the learner shapes the nature of learning and how it is evaluated.
Summative
Dick et al. (2009) claim the ultimate summative evaluation question is “Did it solve the problem?” (p. 320). That is the essence of summative evaluation. Continuing with the chef analogy from above, one asks, “Did the customer enjoy the food?” The parties involved in the evaluation take the data and draw a conclusion about the effectiveness of the designed instruction. Over time, however, summative evaluation has developed into a process that is more complex than the initial question may let on. In modern instructional design, practitioners investigate multiple questions through testing to assess the learning that ideally happens. This differs from the formative evaluation above in that summative assessments are typically used to assess not the program but the learner. Nonetheless, summative evaluations can also be used to assess the effectiveness of learning; the efficiency and cost-effectiveness of instruction; and attitudes and reactions to learning (Morrison et al., 2013).
Learning Effectiveness
Just as the overall process of summative evaluation is summarized above with one simple question, so can its effectiveness: How well did the student learn? Perhaps even, did we teach the learner the right thing? “Measurement of effectiveness can be ascertained from test scores, ratings of projects and performance, and records of observations of learners’ behavior” (Morrison et al., 2013, p. 328). However, maybe the single question is not enough. Dick et al. (2009) outline a comprehensive plan for summative evaluation throughout the design process, including collecting data from SMEs and during field trials for feedback. This shifts the focus from the learner to the final form of the instruction. Either way, the data collected tests the success of the instruction and the learning.
Learning Efficiency and Cost-Effectiveness
While learning efficiency and cost-effectiveness of the instruction are certainly distinct constructs, the success of the former impacts the latter. Learning efficiency is a matter of resources (e.g., time, instructors, facilities) and how those resources are used within the instruction to reach the goal of successful instruction (Morrison et al., 2013). Dick et al. (2009) recommend comparing the materials against an organization's needs, target group, and resources. The end result is the analysis of the data to draw a final conclusion about cost-effectiveness based on any number of prescribed formulas. Morrison et al. (2013) acknowledge the relationship between this form of summative evaluation and confirmative evaluation, and set the difference at the time it takes to implement the evaluation.
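The cost-effectiveness comparison can be illustrated with a short sketch. The figures and the choice of “cost per learner reaching mastery” as the index are hypothetical; Morrison et al. (2013) and Dick et al. (2009) describe several such formulas, and an organization would substitute its own cost categories.

```python
# Illustrative cost-effectiveness calculation with hypothetical figures.

development_cost = 40000.0       # one-time design and development cost
delivery_cost_per_run = 3000.0   # instructors, facilities, and materials per offering
runs = 4                         # number of times the course was delivered
learners = 120                   # total learners served across all runs
learners_reaching_mastery = 96   # learners who met the performance standard

total_cost = development_cost + delivery_cost_per_run * runs
cost_per_learner = total_cost / learners
cost_per_mastery = total_cost / learners_reaching_mastery

print(f"Total cost: ${total_cost:,.0f}")
print(f"Cost per learner: ${cost_per_learner:,.2f}")
print(f"Cost per learner reaching mastery: ${cost_per_mastery:,.2f}")
```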
Attitudes and Reactions to Learning
The attitudes and reactions to the learning, while integral to formative evaluation, can be summatively evaluated, as well. Morrison et al. (2013) explain there are two uses for attitudinal evaluation: evaluating the instruction and evaluating outcomes within the learning. While a majority of objectives within learning are cognitive, psychomotor and affective objectives may also be goals of learning. Summative evaluations often center on measuring achievement of objectives. As a result, there is a natural connection between attitudes and the assessment of affective objectives. Conversely, designers may utilize summative assessments that collect data on the final versions of their learning product. This summative assessment measures the reactions to the learning.
Confirmative
The customer ate the food and enjoyed it. But, did they come back? The ongoing value of learning is the driving question behind confirmative evaluation. Confirmative evaluation methods may not differ much from formative and summative outside of the element of time. Confirmative evaluation seeks to answer questions about the learner and the context for learning. Moseley and Solomon (1997) describe confirmative evaluation as falling on a continuum between a customer’s or learner’s expectation and assessments.
Evaluation’s Relationship within the ID Process
When examining the premises of evaluation, one must look at its connections to the other areas of the instructional design process. Each form of evaluation (formative, summative, and confirmative) can make a significant difference to the quality of instruction when applied throughout the stages of the ADDIE model. Each of these is examined in depth later in this chapter, so they are only briefly summarized here. During analysis, evaluation tools can be developed to assist in breaking down the instruction into its content and tasks. Formative and summative assessment of the instructional design conducted by the SMEs involved in the project shapes and finalizes these lists prior to the design stage. Alternately, confirmative analysis may be used as a new form of learner analysis from which redesign may develop. Design decisions, such as sequencing, the strategies used, and the instructional message, are once again areas in which to use formative assessment; feedback from SMEs and focus groups frames the decisions made by designers in this area. Design also looks at the objectives and the assessment of those objectives before moving forward to develop instructional materials. Development of materials requires designers to examine best practices within the research, and designers also examine the overall program cost; as referenced above, both of these are key factors gauged with summative, and more likely confirmative, evaluations. As instruction is implemented, the picture of evaluation is painted with all three brushes: learners are assessed using all three methods to shape learning and pacing, to measure performance, and to make long-term determinations about the change in student practice. Clearly, evaluation is a deep and ongoing process throughout the design of instruction.
Validity and Reliability
Who evaluates the evaluators? This is the question of validity and reliability. Morrison et al. (2013) define validity as the degree to which the evaluation measures the intended learning, and reliability as the degree to which the evaluation measures it consistently. In Figure 1, Trochim (2006) illustrates a common analogy for validity and reliability.
Figure 1. Validity vs. reliability scatterplots.
As can be seen with the third portion of the figure, validity and reliability do not always go together. Each must be considered separately to achieve both within the design process.
Both validity and reliability require evaluation in and of themselves in order to evaluate the instructional design and the learning that is taking place. Shepard (1993) states that the four types of validity are content validity, construct validity, concurrent validity, and predictive validity, with the first three representing the primary methodology. Each of these forms of validity has a preferred methodology unique to the type. For example, Morrison et al. (2013) suggest that one method for achieving content validity is the use of a performance-content matrix to verify test question content. Instructional designers can evaluate reliability through a number of methods drawn directly from quantitative research, such as a test-retest analysis, split-half correlations within a single test administration, or examination of reliability coefficients (Morrison et al., 2013).
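As a concrete illustration, the following Python sketch computes two of the reliability estimates named above: a split-half correlation with the Spearman-Brown correction, and a test-retest correlation. The item responses and retest scores are randomly generated stand-ins for real assessment data.

```python
import numpy as np

# Hypothetical item-level scores: rows are 30 learners, columns are 10 test items (0/1).
rng = np.random.default_rng(42)
items = (rng.random((30, 10)) > 0.4).astype(int)

# Split-half reliability: correlate odd-item totals with even-item totals,
# then apply the Spearman-Brown correction to estimate full-test reliability.
odd_total = items[:, 0::2].sum(axis=1)
even_total = items[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd_total, even_total)[0, 1]
r_full = (2 * r_half) / (1 + r_half)
print(f"split-half r = {r_half:.2f}, Spearman-Brown corrected = {r_full:.2f}")

# Test-retest reliability: correlate total scores from two administrations
# (the second administration here is simulated with small random changes).
scores_time1 = items.sum(axis=1)
scores_time2 = scores_time1 + rng.integers(-1, 2, size=30)
r_retest = np.corrcoef(scores_time1, scores_time2)[0, 1]
print(f"test-retest r = {r_retest:.2f}")
```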
When and Who Does It?
When a new program has been endorsed and accepted by an institution or organization, the need for evaluation emerges, and an instructional designer is approached to undertake the task. It is important for the implementing agency to confirm that the achievements of the new program are aligned with the goals and purposes of the program. To this end, the instructional designer conducts the three types of evaluation described above: formative, summative, and confirmative.
When a teacher or a designer develops a lesson plan, one must keep in mind that what seems plausible as an idea, or in the initial stages, might not work out the same way when scaled up to a fully functional program. Here, formative assessment takes a significant role in the instructional design of the lesson plan. As defined above, formative evaluation is conducted from the inception of the idea and concept stage through the development and tryout of the instructional materials and lesson plans. This assessment by the instructional designer is crucial because it can reveal that the design is not working as expected before a great deal of valuable time and resources is wasted (Morrison, Ross, Kalman, & Kemp, 2013).
When it has been tested and established that the initial concepts and design of the program are sound, the program is implemented with the target audience over a period of time. The instructional designer then prepares a summative evaluation that aims to test the success of the program at the end of the training. (The different tools used for evaluation are explained in the next section of this chapter.) The summative evaluation again plays a significant role in assessing the success or failure of the training program. Usually the designer of the program conducts posttests and a final examination to determine where the learners stand in gaining the newly trained knowledge and to assess how closely the training adhered to the objectives of the program.
The third type of evaluation conducted by program evaluators is confirmative evaluation, originally introduced by Misanchuk (1978) on the logic that evaluation needs to move beyond summative evaluation in order to produce an improved, large-scale, replicable training model.
Evaluating Reaction
It has been said that effective teaching requires frequent feedback from the learners to check learning progress and to monitor the efficacy of the pedagogical process selected for teaching (Heritage, 2007). An instructional designer can evaluate both the teacher's and the learner's initial reaction to a new pedagogical instruction. Hollingworth (2012) maintains that formative assessments are metacognitive tools designed to analyze the initial teaching and learning profile for both teachers and students in order to track their reaction and progress over a period of time.
Evaluating reactions is the first step toward forming a sound assessment base. It is beneficial for an evaluator or instructional designer to evaluate initial reactions toward the newly introduced training program. Once it is established that learners are not resistant to the new program, one may assume that they will not drop out of the program for reasons such as non-acceptance or an inability to understand the ideas and concepts of the training in the first place. Evaluating reactions also helps the evaluator adjust the pace of the program as the training moves ahead, and it leaves less frustration and vagueness in the evaluator's mind to know that all the learners are positively oriented toward undertaking the training.
Evaluating Learning
An evaluator or the instructional designer of the program continues the process of evaluation as teaching and learning unfold with the implementation of the training program. Several studies in the field of educational measurement have suggested that assessments and evaluations lead to higher quality learning. Popham (2008) calls this aspect of assessment in the evaluation process “transformative assessment,” in which an evaluator identifies the learning progression of the learners by analyzing the sequence of skills learned over the course of the study program. This also helps the evaluator or instructional designer devise ways to assess how much of the learning material the learners have mastered, which brings the process back to formative assessment.
Evaluating learning is an ongoing process in the phase of instructional assessment. It is important to evaluate whether the evaluator remained focused on the original problem and whether the training materials developed actually solved the problems that were identified. These objectives can be tested by evaluating the knowledge gained by the learners through ongoing formative evaluation as well as summative evaluation techniques (discussed in the next section). This is one of the most important aspects of the success of the program. When a trainee masters the content of the training or exhibits proper learning through test outcomes, one can infer the effectiveness of the program; if the learning outcomes show adverse results, one can also draw out what did not work.
Evaluating Behavior
Attitudes and behavior are important indicators of the acceptance and success of a training program. Dick et al. (2009) note that an evaluator needs to write directions to guide the learner's activities and construct a rubric (a checklist, a rating scale, etc.) in order to evaluate and measure performance, products, and attitudes. A learner develops several intellectual and behavioral skills, and an evaluation can suggest what changes have occurred in the attitudes and behavior of the learners. Testing behavior is among the newer approaches evaluators take today, alongside the traditional tests used to evaluate learning results.
Evaluating Results
With every training program, evaluating results is the evaluator's most significant task, for it determines how closely the implementation of the program has achieved its intended success. An evaluator conducts a summative evaluation in order to test the effectiveness of the learners' learning. The evaluator also measures several other factors while evaluating the results of the program. As mentioned by Morrison et al. (2013), an evaluator will also measure the efficiency of learning in terms of materials mastered and time taken, the cost of program development, continuing expenses, reactions toward the program, and the long-term benefits of the program. Apart from the summative evaluation at this stage, an evaluator might also conduct a confirmative evaluation with the results as a step toward confirming what worked and what did not work in the program, for its large-scale replication if needed.
How Do We Do It?
Formative Evaluation
Formative evaluation occurs during instructional design and is the process of evaluating instruction and instructional materials to obtain feedback that in turn drives revisions to make instruction more efficient and effective. It is an iterative process that includes at least three phases. Begin with one-to-one evaluation, then small group evaluation, and finally a field trial. Results from each phase of evaluation are fed back to the instructional designers to be used in the process of improving design.
Figure 2. The cycle of formative evaluation.
After each phase, the ID should consider the results of the evaluation and meet with project stakeholders to make decisions about instructional changes or whether to move to the next stage of evaluation. Data and information are collected and summarized in the formative evaluation and used to make decisions about whether or not the instruction is meeting its intended goals. Instructional elements can then be revised to improve instruction. The development and implementation of formative evaluation requires the involvement of Instructional Designers, Subject Matter Experts, Target Learners, and Target Instructors.
One-to-One
The purpose of the one-to-one evaluation is to identify and remove the most obvious errors and to obtain initial feedback on the effectiveness of the instruction. During this evaluation IDs should be looking for clarity, impact and feasibility (Dick, Carey, & Carey, 2009, p. 262). Results from one-to-one evaluation can be used to improve instructional components and materials before a pilot implementation.
Select a few learners who are representative of the target learners. Choose learners that represent the variety of learners that will participate in instructions. Don’t choose learners that represent the extremes of the population. It would be good to have one average, one above average and one below average learner. This will ensure the instructional materials are accessible to a variety of learners.
The one-to-one evaluation is much like a usability study. Assure the learner that what is being evaluated is the instruction and instructional materials, not the learner. The learner should be presented with the instructional materials that will be provided during the instruction. Encourage the learner to discuss what they see, write on materials as appropriate, note any errors, and so on. The ID can engage the learner in dialog to solicit feedback on the materials and the clarity of instruction.
There are many technological tools that can facilitate a one-to-one evaluation. In Don't Make Me Think (Krug, 2014), Steve Krug describes a process for performing a usability study for web site development. The steps he provides are a good guide for performing a one-to-one evaluation. Krug recommends video recording the session for later analysis. If instruction is computer based, there are also tools available that can record the learner's interactions as well as video record the learner's responses. Morae from TechSmith is a tool that allows you to record user interactions and efficiently analyze the results.
Small group
Small group evaluation is used to determine the effectiveness of changes made to the instruction following the one-to-one evaluation and to identify any additional problems learners may be experiencing. Additionally, the question of whether or not learners can use the instruction without interaction from the instructor is evaluated here.
In the small group evaluation, the instructor administers the instruction and materials in the manner in which they are designed. The small-group participants complete the lesson(s) as described. The instructional designer observes but does not intervene. After the instructional lesson is complete, participants should be asked to complete a post-assessment designed to provide feedback about the instruction.
Field Trial
After the recommendations from the small group evaluation have been implemented, it is time for a field trial. A field trial is conducted exactly as you would conduct instruction. The selected instruction should be delivered as close as possible to the way it is designed to be implemented in the final instructional setting, and instruction should occur in a setting as close to the targeted setting as possible. Learners should be selected who closely match the characteristics of the intended learners. All instructional materials for the selected instructional section, including the instructor manual, should be complete and ready to use.
Data should be gathered on learner performance and attitudes. In addition, data should be gathered about the time required to use the materials in the instructional context and the effectiveness of the instructional management plan. During the field trial the ID does not participate in delivery of instruction; the ID and the review team observe the process and record data about their observations.
Summative Evaluation
The purpose of a summative evaluation is to evaluate instruction and/or instructional materials after they are finalized. It is conducted during or immediately after implementation. This evaluation can be used to document the strengths and weaknesses of the instruction or instructional materials, to decide whether to continue the instruction, or to determine whether to adopt it. Summative evaluation is usually conducted by external evaluators on behalf of decision makers. Subject matter experts may be needed to ensure the integrity of the instruction and/or instructional materials.
The summative evaluation has two phases: the expert judgment phase and the field trial phase. The expert judgment phase tries to answer the question of whether the developed instruction meets the organizational needs. The purpose of the field trial is to answer the question of whether the developed instruction is successful in producing the intended learning gains. Table 1 gives examples of the questions asked in each of the two phases of the summative evaluation.
Table 1. Summative evaluation by question and phase. (Dick, Carey, & Carey, 2009, p. 321)
| | Expert Judgment Phase | Field Trial Phase |
|---|---|---|
| Overall decisions | Do the materials have the potential for meeting this organization's needs? | Are the materials effective with target learners in the prescribed setting? |
| Specific decisions | Congruence analysis: Are the needs and goals of the organization congruent with those in the instruction?<br>Content analysis: Are the materials complete, accurate, and current?<br>Design analysis: Are the principles of learning, instruction, and motivation clearly evident in the materials?<br>Feasibility analysis: Are the materials convenient, durable, cost-effective, and satisfactory for current users? | Outcomes analysis:<br>Impact on learners: Are the achievement and motivation levels of learners satisfactory following instruction?<br>Impact on the job: Are learners able to transfer the information, skills, and attitudes from the instructional setting to the job setting or to subsequent units of related instruction?<br>Impact on the organization: Are learners' changed behaviors (performance, attitudes) making positive differences in the achievement of the organization's mission and goals (e.g., reduced dropouts and resignations, improved attendance and achievement, increased productivity, better grades)?<br>Management analysis: (1) Are instructor and manager attitudes satisfactory? (2) Are recommended implementation procedures feasible? (3) Are costs related to time, personnel, equipment, and resources reasonable? |
Expert Judgment (Congruence Analysis)
The expert judgment phase consists of a congruence analysis, a content analysis, a design analysis, a utility and feasibility analysis, and a current-user analysis. You will need copies of the organizational goals and the instructional materials for the evaluation. See Table 1 for a list of questions you are trying to answer during this phase. As each analysis is conducted, a go/no-go decision is made; if the decision is no-go, the materials are sent back with feedback for further design and development. This phase is conducted by the instructional designer, the subject matter experts, and often an external reviewer. Target learners are not involved in this stage of evaluation.
Figure 3. Sequence of the stages of expert judgment.
Quality Matters Peer Review
One example of an expert judgment summative review is the Quality Matters (QM) peer review process. The MarylandOnline (MOL) consortium initiated the QM review process, established in 1999 to leverage the efforts of, and increase the collaboration among, institutions involved in online learning. The QM Rubric was developed under a grant from the U.S. Department of Education's Fund for the Improvement of Postsecondary Education (FIPSE). The goal of the QM project is to improve student learning, engagement, and satisfaction in online courses through better design. The QM peer review is a faculty-driven initiative in which online learning experts develop standards of quality course design and online faculty carry out a peer review in conjunction with the course designer. Rubrics are now available for Higher Education, K-12, and Continuing and Professional Development (Introduction to the Quality Matters Program, 2013).
The QM Rubric is based on eight general standards of quality course design. While this rubric was developed for online environments, it is easy to see how it could be adapted for use in evaluating face-to-face instruction, or even other delivery methods. The general standard categories are:
- The Course Overview and Introduction
- Learning Objectives and Competencies
- Assessment and Measurement
- Instructional Materials
- Learner Interaction and Engagement
- Course Technology
- Learner Support
- Accessibility
During the peer review process, a team of three reviewers reviews the course. The peer review team may consist of a team leader, a subject matter expert, and at least one external member who is not associated with the subject or area of instruction. This allows for a balanced view of the instructional design and materials. The ID prepares a course overview sheet for the review team that includes information about supplementary materials, such as the course textbook, or any other information the review team may need in performing the review.
Figure 4. QM review cycle.
As the review team reviews the course, they evaluate the instructional materials and design against the standards in the rubric. Each standard includes annotations that give examples of effective design elements. Reviewers decide whether a standard is met or not met, and then give thoughtful, constructive feedback to the ID to help improve the course. A standard passes if a majority of reviewers (two out of three) agree that it is met. The purpose of the review is to improve the course design, and the goal is for all courses to pass the review by meeting the 85% standard. The review process is iterative, allowing the ID to make changes as necessary to ensure the course passes the review (see Figure 4).
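The tally of a three-reviewer decision can be sketched as follows. Note that the actual QM rubric assigns point values to specific review standards and requires certain essential standards to be met; this simplified sketch only counts met/not-met votes against the 85% threshold described above, and the standard-level votes shown are hypothetical.

```python
# Simplified three-reviewer tally for a QM-style peer review (hypothetical votes).
reviews = {
    "Course Overview and Introduction":     ["met", "met", "not met"],
    "Learning Objectives and Competencies": ["met", "met", "met"],
    "Assessment and Measurement":           ["met", "not met", "not met"],
    "Instructional Materials":              ["met", "met", "met"],
    "Learner Interaction and Engagement":   ["met", "met", "met"],
    "Course Technology":                    ["met", "met", "not met"],
    "Learner Support":                      ["met", "met", "met"],
    "Accessibility":                        ["met", "met", "met"],
}

# A standard is met when at least two of the three reviewers mark it met.
met = [name for name, votes in reviews.items() if votes.count("met") >= 2]
share_met = len(met) / len(reviews)

print(f"Standards met: {len(met)}/{len(reviews)} ({share_met:.0%})")
print("Course passes review" if share_met >= 0.85 else "Revise and resubmit for another review cycle")
```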
Field Trial
Once you have completed the expert judgment and determined that the instructional materials meet the goals of the instruction and are appropriate for the learners, you are ready to begin planning the implementation of a field trial. The field trial helps determine whether the instruction is effective with the target learners in their normal context. It also looks at whether the implementation plan is feasible and whether time, costs, and resource allocations are on target.
The field trial should be held in a context that closely matches the intended context for instruction. Learners should be selected that closely match the intended learner population. The number of learners selected should also closely match that of the intended implementation. At this phase, the field trial could be for a select section of the instruction, or could be a pilot implementation of the full instructional design.
Confirmative Evaluation
The purpose of a confirmative evaluation is to determine whether the instruction is effective and whether it met the organization's defined instructional needs. In effect, did it solve the problem? Confirmative evaluation goes beyond the scope of formative and summative evaluation and looks at whether the long-term effects of instruction are what we were hoping to achieve. Is the instruction affecting behavior or providing learners with the skills needed, as determined by the original goals of the instruction?
Confirmative evaluation should be conducted on a regular basis. The interval of evaluation should be based on the needs of the organization and the instructional context. The focus of confirmative evaluation should be on the transfer of knowledge or skill into a long-term context. To conduct a confirmative evaluation, you may want to use observations with verification by expert review. You may also develop or use checklists, interviews, observations, rating scales, assessments, and a review of organizational productivity data.
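One simple way to organize confirmative data over time is sketched below: average observation-checklist ratings collected at recurring follow-up intervals are compared against a target, flagging intervals where transfer appears to be fading. The interval labels, ratings, and 3.5 target are hypothetical choices for the example, not values drawn from the chapter's sources.

```python
# Illustrative confirmative-evaluation tracker: mean observation-checklist
# ratings (1-5 scale) gathered at recurring follow-up intervals, used to
# judge whether on-the-job transfer of the trained skills holds up over time.
followups = {
    "3 months":  [4, 5, 4, 4, 3, 5, 4],
    "6 months":  [4, 4, 4, 3, 3, 4, 4],
    "12 months": [3, 4, 3, 3, 3, 4, 3],
}

TARGET = 3.5  # hypothetical minimum acceptable mean rating

for interval, ratings in followups.items():
    mean = sum(ratings) / len(ratings)
    flag = "" if mean >= TARGET else "  <- consider refresher training or redesign"
    print(f"{interval:>9}: mean rating {mean:.2f}{flag}")
```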
Who Are The Leaders?
Tyler’s Model
An early and popular planning model was first developed by Tyler in his famous book, Basic Principles of Curriculum and Instruction (1949). It has been claimed that the strength of this model and its taxonomies lies in the fact that it emphasized accountability. May (1986) explains that Tyler's model considers three primary sources of curriculum (students, society, and subject matter) in formulating tentative general objectives of the program that reflect the philosophy of education and the psychology of learning. It is a linear model in which the sequence is followed strictly. A unit plan is prepared around precise instructional objectives, which lead to the selection and organization of content and learning experiences for the learners. The model stresses an in-depth evaluation of learner performance.
CIPP Model
Stufflebeam (1971) describes the CIPP model as a sound framework for both proactive evaluation to serve decision making and retroactive evaluation to serve accountability. The model defines evaluation as the process of delineating, obtaining, and providing useful information for judging decision alternatives.
Figure 5. CIPP Model of evaluation.
The model includes three steps in the evaluation process (delineating, obtaining, and providing) and four kinds of evaluation: context, input, process, and product, whose first letters give the acronym CIPP. The model describes how the steps in the evaluation process interact with these different kinds of evaluation.
Stake’s Model
In 1969, Stake created an evaluation framework to assist an evaluator in collecting, organizing, and interpreting data for the two major operations (or countenances) of any evaluation: (a) complete description and (b) judgment of the program. Popham (1993) explains that Stake's scheme draws attention to the differences between the descriptive and judgmental acts according to their phase in an educational program; these phases are antecedent, transaction, and outcome. This is a comprehensive model for completely thinking through the procedures of an evaluation.
Scriven’s Model
Scriven provides a transdisciplinary model of evaluation that draws from an objectivist view of evaluation. Scriven ascribes three characteristics to this model: epistemological, political, and disciplinary. Some of the important features of Scriven's goal-free evaluation are its stress on validity, reliability, objectivity/credibility, importance/timeliness, relevance, scope, and efficiency in the whole process of teaching and learning.
Kirkpatrick’s Model
Kirkpatrick's model of training evaluation proposes four levels of evaluation criteria: reactions, learning, behavior, and results. Alliger and Janak (1989) identified three problematic assumptions of the model: that the levels are arranged in ascending order from reactions to results, that the levels are causally linked, and that the levels are positively intercorrelated.
Figure 6. Kirkpatrick’s model of evaluation.
Nevertheless, Kirkpatrick's model of training evaluation has provided a significant taxonomy of evaluation to the field of instructional design.
Chapter Summary
Evaluation is the process of determining whether the designed instruction meets its intended goals. In addition, evaluation helps us determine whether learners are able to transfer the skills and knowledge they have learned into long-term changes in behavior and skills required for the target context. Evaluation also provides the opportunity for instructional designers to ensure that all stakeholders agree that the developed instruction is meeting the organizational goals.
In this chapter we reviewed what evaluation looks like and its relationship within the ADDIE ID process. We looked at several models of evaluation, including Kirkpatrick's model and its four levels of evaluation: evaluating reaction, evaluating learning, evaluating behavior, and evaluating results. We also looked at the three phases of evaluation: formative, summative, and confirmative. And finally, we reviewed the leaders in the field and their primary contributions to evaluation, including Tyler's model, the CIPP model, Stake's model, Scriven's model, and Kirkpatrick's model.
Discussions
- Where does evaluation stand in the ADDIE model? How will your flow chart look when you describe evaluation in relation to the other stages of the ADDIE model?
- Describe the three stages of evaluation. Give an example to explain how an instructional designer will use these three stages in any particular situation.
- What are the five types of formative evaluation methods mentioned in the chapter that assist in collecting data points for the initial evaluation? Which two of these methods would you prefer for your formative evaluation, and why?
- What parameters will you use to evaluate the success of the instructional training?
- Validity and reliability are important concepts to be kept in mind while conducting evaluation. How can an instructional designer ensure that these are well covered in one’s evaluation process?
- How would you place evaluating reaction, learning, behavior, and results in your pyramid for evaluating the success or failure of the instructional training? Explain the reasons for your choices.
- There are different ways of assessment discussed in the chapter. Name all of them and discuss any two in detail.
- What are some of the techniques to conduct formative and summative evaluation?
- Several models of evaluation have been discussed in the chapter. Discuss any two of these models in detail and explain how you will apply them in your evaluation process.
Evaluation Practice Assessment
The end-of-chapter practice assessment retrieves 10 items from a database and scores the quiz, with response correctness provided to the learner. You should score above 80% on the quiz; otherwise, consider re-reading some of the material from this chapter. The quiz is not time-limited; however, it will record your time to complete it. Scores are stored on the website, and a learner can optionally submit their scores to the leaderboard. You can take the quiz as many times as you want.
Assignment Exercises
For the following exercises, you may use an instructional module that you are familiar with from early childhood, K-12, Higher Ed, Career and Technical, Corporate or other implementation where instructional design is needed. Be creative and use something from an educational setting that you are interested in. Be sure to describe your selected instructional module as it relates to each of these exercises. You may need to do some additional online research to answer these questions. Be sure to include your references in your responses.
- Describe how you would conduct the three phases of the formative evaluation. Define your strategies, populations, and methodologies for each stage within the process.
- Draw a diagram of the iterative formative evaluation process. What specific pieces of the instructional intervention are considered within each stage of the process? How is the data gathered during this process employed to improve the design of instruction?
- Describe the context and learner selection process you would use for setting up a formative evaluation field trial. What special considerations need to be made to conduct this stage of evaluation effectively?
- What materials should the designer include in a field trial? How do the materials used for field trials contrast with the one-to-one and small group evaluations?
- You have been asked to serve as an external evaluator on a summative evaluation of a training model designed by one of your colleagues. Explain the phases of the summative evaluation that you may be asked to participate in as an external reviewer. Imagine you have created a rubric to help you evaluate the instructional intervention. What items might that rubric contain to help you effectively and efficiently conduct a review?
Group Assignment
Conduct an evaluation study in order to understand how successful an instructional intervention has been in achieving the goals of the designed instruction. Keep in mind the group project conducted in the previous development and implementation chapters and conduct an evaluation study in order to assess the success of achieving the goals and objectives of the instruction. To achieve these goals, you should conduct several rounds of evaluation:
- Conduct a one-to-one evaluation with a student from the target population. Make observations of the student's actions within the instruction and reactions to the materials, content, and overall design. Propose changes to the instructional intervention based on the sample student's feedback.
- Conduct a small group evaluation with a group of 3 to 5 learners. This evaluation should reflect the changes you made after the one-to-one stage, and evaluate nearly operational materials and instruction. You must recruit an instructor to deliver instruction. Make observations and have the instructor administer a summative assessment after instruction. Analyze the data gathered, and create a one-page report on the results.
- Implement a field test with an instructor and a student population of at least 15 people. Instructional materials, including the instructor manual, should be complete and ready to use. Gather data on learner performance and attitudes, as well as time required for instruction and the effectiveness of the instructional management plan. Observe the process and record data, and create a final report detailing the full evaluation cycle.
References
Alliger, G. M., & Janak, E. A. (1989). Kirkpatrick's levels of training criteria: Thirty years later. Personnel Psychology.
Boston, C. (2002). The concept of formative assessment. ERIC Digest.
Dick, W., Carey, L., & Carey, J. O. (2009). The Systematic Design of Instruction. Upper Saddle River, New Jersey: Pearson.
Dimitrov, D. M., & Rumrill, P. D. (2003). Pretest-posttest designs and measurement of change. Work: A Journal of Prevention, Assessment and Rehabilitation, 20(2), 159-165.
Duffy, T., & Cunningham, D. (1996). Constructivism: Implications for the design and delivery of instruction. Handbook of Research for Educational Communications and Technology, 170-198.
Fav203. (2012). ADDIE_Model_of_Design.jpg. Retrieved from http://commons.wikimedia.org/wiki/File:ADDIE_Model_of_Design.jpg#filelinks
Heritage, M. (2007). Formative assessment: What do teachers need to know and do? Phi Delta Kappan.
Hollingworth, L. (2012). Why leadership matters: Empowering teachers to implement formative assessment. Journal of Educational Administration, 50(3), 365-379.
Quality Matters Program. (2013). Introduction to the Quality Matters Program.
Krug, S. (2014). Don’t Make Me Think. New Riders.
May, W. (1986). Teaching students how to plan: The dominant model and alternatives. Journal of Teacher Education, 37(6), 6-12.
Misanchuk, E. (1978). Beyond the formative-summative distinction. Journal of Instructional Development, 2, 15-19.
Morrison, G. R., Ross, S. M., Kalman, H. K., & Kemp, J. E. (2013). Designing Effective Instruction (7th ed.). Hoboken, NJ: John Wiley & Sons.
Moseley, J., & Solomon, D. (1997). Confirmative evaluation: A new paradigm for continuous improvement. Performance Improvement, 36(5), 12-16.
Popham, W. (1993). Educational Evaluation (3rd ed.). Boston, MA: Allyn and Bacon.
Popham, W. (2008). Transformative Assessment. Alexandria, VA: Association for Supervision and Curriculum Development.
Reiser, R. (2001). A history of instructional design and technology: Part II: A history of instructional design. Educational Technology Research and Development, 49(2), 57-67.
Reiser, R. A. (2012). Trends and issues in instructional design and technology. Boston, MA: Pearson.
Richey, R. C., Klein, J. D., & Tracey, M. W. (2011). The Instructional Design Knowledge Base. New York and London: Taylor & Francis.
Shepard, L. (1993). Evaluating test validity. Review of Research in Education, 405-450.
Stufflebeam, D. (1971). The relevance of the CIPP evaluation model for educational accountability. Retrieved from http://search.proquest.com/docview/64252742?accountid=10920
Trochim, W. (2006). Reliability and validity. Retrieved from http://www.socialresearchmethods.net/kb/relandval.php
Wood, B. (2001). Stake’s countenance model: Evaluating an environmental education professional development course. Journal of Environmental Education, 32(2), 18-27.