ELN122: June 2011

Friday, June 24, 2011

Advantages and Disadvantages of Constructed-Response Tests

According to Oosterhof, Conrad, & Ely (2008), constructed-response assessments include short and long essay tests and fill-in-the-blank tests (also called completion items). They list and discuss the advantages and disadvantages of these types of assessments, which range from the test itself to the making and grading of the test. I will discuss completion items first, and then I will cover essay tests.

Completion tests have three advantages: the test is easy to make, students must give an answer instead of choosing one, and the test can include many questions. These tests are relatively easy to make because they usually measure recall of information instead of procedural knowledge, and because they do not require scoring plans such as those that essay tests require. Since students must supply an answer instead of picking one (as they would on a multiple-choice test), scores are not negatively affected by guessing, as they are on multiple-choice tests. Therefore, completion tests are generally more reliable than selected-response tests. Because these tests can include many items, a better, more complete sampling of the content can be achieved.

Completion tests have two limitations: they generally measure the recall of information, and they have a higher probability of scoring error than other formats. Completion tests measure knowledge of facts and thus do not generally require higher-level thinking skills, the development of which is a goal of a good educational program. Since these questions can often have several correct responses, there is a real risk that mechanical scoring will mark a correct answer wrong. For example, if a student gives the plural form of the answer instead of the singular, it could be counted wrong (even though it is correct) if the instructor/designer didn’t include the plural as an acceptable response when the test was designed and put online.
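To make the plural/singular problem concrete, here is a minimal sketch of how an online completion item might be scored mechanically. This is only an illustration, not how any particular testing platform works: the function names (normalize, is_correct) and the answer key are hypothetical.

```python
from typing import Iterable


def normalize(response: str) -> str:
    """Lowercase the response and collapse extra whitespace."""
    return " ".join(response.lower().split())


def is_correct(response: str, accepted: Iterable[str]) -> bool:
    """Return True if the response matches any accepted answer after normalizing."""
    return normalize(response) in {normalize(answer) for answer in accepted}


# Hypothetical answer key: the designer lists both singular and plural forms.
accepted_answers = ["mammal", "mammals"]

print(is_correct("  Mammals ", accepted_answers))  # True
print(is_correct("Mammals", ["mammal"]))           # False: plural was left off the key
```

The second call shows the scoring error described above: the student's answer is right, but it is marked wrong because the key did not anticipate it.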

When designing these types of tests, take several steps to ensure reliability and validity. Always ensure that the items measure the behavior required in the instructional objective, that the reading level of the test is below the reading level of the students so that a student’s reading proficiency does not affect his/her test results (unless the test is an actual reading test), and that the items are written in very direct language. Also, be sure that the blanks represent key words from the learning. Otherwise, the test will measure reading comprehension more than achievement of learning objectives. As was mentioned already, ensure that only a single response or a very homogeneous set of responses represents a correct answer. Do not use sentences from the actual readings of the class. This can be problematic because it encourages students to memorize instead of reading and comprehending. Also, sentences taken out of a paragraph lose their contextual references/clues and therefore can be misinterpreted. Place blanks at the end of the item instead of the beginning or middle to make it easier and faster for students to read the item and supply an answer. Use only one blank per item so that students can quickly understand what the question is asking and so that the set of correct responses stays small. Finally, if you are using a question that requires a numerical answer, be sure to state the units expected in the response; otherwise, a student may give a correct answer in a different unit and have it marked wrong. For example, if the answer is 36 inches, make sure you specify that the answer must be given in inches; otherwise, the student might answer 3, as in 3 feet.
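The inches/feet example can be worked through in a few lines. This is a small sketch under my own assumptions (the conversion table and the to_inches helper are made up for illustration); it simply shows why a bare number compared against the key fails when the item does not name the unit.

```python
# Assumed conversion table; only the two units from the example are included.
INCHES_PER_UNIT = {"inches": 1.0, "feet": 12.0}


def to_inches(value: float, unit: str) -> float:
    """Convert a length in the given unit to inches."""
    return value * INCHES_PER_UNIT[unit]


key_in_inches = 36.0   # the answer the test designer had in mind
student_value = 3.0    # the student was thinking in feet

print(student_value == key_in_inches)                     # False
print(to_inches(student_value, "feet") == key_in_inches)  # True
```

Since an online test usually sees only the number, stating the unit in the item is the simpler fix.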

Essay tests have three advantages and three disadvantages. The advantages are that they measure more directly the behaviors specified by the instructional objectives, they examine the learner’s ability to communicate ideas in writing, and they give instructors insight into the thinking behind students’ answers, which can reveal a student’s logic. Since instructional objectives can often be rewritten as essay test questions, essay items can measure the behavior more directly. Therefore, it is important to take care when writing performance objectives. Even though essay tests can help instructors measure a student’s ability to express his/her thoughts in writing, the goal of the test should be to measure how well a student met the instructional objectives. Therefore, two scores should be used if the instructor wants to evaluate the student’s writing ability: one to measure the proficiency with which the student met the objective(s) and one to measure the writing ability. Another important note here is not to use the essay test as a means to teach students to write. It is not an effective teaching method because it is a testing situation, not a learning situation. The last advantage involves ensuring that the student is not using the wrong logic to reach an answer. For example, with a multiple-choice test, even though a student chooses a correct answer, it isn’t possible to see why he/she chose it. With an essay question, the student has to answer correctly and explain his/her reasoning.

The disadvantages are that essay tests sample less of the content than other formats, their scoring can be very subjective, and they take more time to score than other formats. Essay tests can’t sample as much of the content as other formats because of the time it takes a student to respond to each question. Because of this, teachers must take time to write well-developed, strong essay test items. They should also create a scoring plan for each essay question that defines a correct answer and states how many points will be given for each critical element of the essay. This leads to the second disadvantage of the format: the scoring can be very subjective and can even include bias. Finally, the third disadvantage is that essay tests take longer to score than other formats.
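A scoring plan of this kind can be written down as a simple list of critical elements with points attached. The sketch below is one possible layout, not the format Oosterhof, Conrad, & Ely prescribe; the Criterion class, the score_essay function, and the sample criteria are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Criterion:
    """One critical element of a complete answer and the points it is worth."""
    description: str
    points: int


def score_essay(plan: List[Criterion], criteria_met: Set[int]) -> int:
    """Add up the points for each criterion (by position) the scorer marked as met."""
    return sum(c.points for i, c in enumerate(plan) if i in criteria_met)


# Hypothetical scoring plan for one short essay item.
plan = [
    Criterion("Defines declarative knowledge", 2),
    Criterion("Defines procedural knowledge", 2),
    Criterion("Gives an example that distinguishes the two", 3),
]

max_points = sum(c.points for c in plan)
print(score_essay(plan, {0, 2}), "out of", max_points)  # 5 out of 7
```

Writing the plan out before grading means every reader awards points for the same elements, which is the point of the plan: it reduces the subjectivity that essay scoring is prone to.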

When designing essay tests, teachers should try to follow certain guidelines. The response required by the essay item may be brief or extended. Extended items may not be appropriate for online settings if the question asks learners to demonstrate more than one skill. Instead, it would be better to break the long question down into shorter questions. This will allow the scoring to be more consistent and will allow broader skills to be assessed. Of course, this is only possible if the questions ask students to demonstrate declarative or procedural knowledge. If the question asks online students to solve a problem, a test format other than an essay test should be used. Therefore, avoid asking online students to problem solve in an essay item.

In order to develop high quality essay items, teachers should follow six criteria. First, always ensure that the item measures the specific skill/instructional objective. One way to accomplish this is to be sure to NOT allow students to choose the items they will answer. If you do, be sure that all choices assess the same capability. Second, the reading level should be below that of the learner in order to ensure we are measuring the skill and not a student’s reading level. Third, the question should take only ten minutes to answer. Otherwise, it is an extended response item, which should be avoided in online settings. Fourth, a good scoring plan must be devised to ensure the validity of the test. Fifth, the scoring plan should describe a correct and complete response so scorers will be able to identify correct responses more accurately. Last, the item should be written in such a way that the knowledgeable learner will be able to determine the characteristics of a correct answer.

When grading online essay tests, there are certain things that can be done to ensure more consistent scoring. Teachers should read all of the students’ answers to one item before reading the responses to the next item. This helps instructors complete the scoring more quickly and maintain a clear idea of the expectations for that item, instead of going through one paper at a time and trying to recall all the expectations for all of the items. Teachers should always read the students’ responses in a random order, and the papers should be reordered after each item is read and scored. Research has shown that the quality of a previous paper can affect the scoring of the next paper. Another precaution is to conceal the identity of the student while grading so that bias is reduced. Using multiple readers is another way to make scoring more consistent. Finally, items that lead students to answer the same question in very different ways should be revised before being used again, as they make it difficult to substantiate whether students have met the instructional objective.
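The reading order described above (one item at a time, papers reshuffled for each item, names hidden) is easy to generate ahead of time. Here is a rough sketch using only Python's standard random module; the function names, the "R01"-style codes, and the sample roster are my own assumptions for illustration.

```python
import random
from typing import Dict, List, Optional, Tuple


def anonymize(names: List[str], seed: Optional[int] = None) -> Dict[str, str]:
    """Assign each student a code such as 'R01' so the grader does not see names."""
    rng = random.Random(seed)
    shuffled = list(names)
    rng.shuffle(shuffled)
    return {name: f"R{i:02d}" for i, name in enumerate(shuffled, start=1)}


def grading_order(codes: List[str], items: List[str],
                  seed: Optional[int] = None) -> List[Tuple[str, str]]:
    """List (item, student code) pairs: all papers for one item, reshuffled per item."""
    rng = random.Random(seed)
    order = []
    for item in items:
        papers = list(codes)
        rng.shuffle(papers)  # a fresh random order for every item
        order.extend((item, code) for code in papers)
    return order


# Hypothetical roster and item labels.
codes_by_name = anonymize(["Ana", "Ben", "Cho"], seed=1)
for item, code in grading_order(sorted(codes_by_name.values()), ["Q1", "Q2"], seed=2):
    print(item, code)
```

The teacher keeps the name-to-code table aside until all items are scored, then uses it to record the grades.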

References

Oosterhof, A., Conrad, R., & Ely, D. (2008). Assessing Learners Online. Upper Saddle River, NJ: Pearson Education, Inc.

Thursday, June 16, 2011

Best Type of Referenced Assessment for Online Learning

The four types of referenced assessments are ability-, growth-, criterion-, and norm-referenced tests. Each type of reference provides specific information that is useful to educators. Therefore, the type of reference used will depend on the type of information the educator needs. This depends on the purpose of the educator, the program, or the course, which in turn depends on the grade level of the course (primary or secondary).

Ability-referenced assessments compare a learner’s performance with his/her potential performance. Growth-referenced assessments compare a learner’s performance with past performance to determine growth. Criterion-referenced assessments compare a learner’s performance with specific criteria such as goals, outcomes, or objectives. Norm-referenced assessments compare a learner’s performance with that of a similar group of students.

In my opinion, criterion-referenced assessments should be used in online assessments to grade students. The goal of most online learning is for students to walk away with information and knowledge for the purpose of applying it in their profession. According to The Centre for Learning and Professional Development at the University of Adelaide (2011), “…in higher education the aim is to also use the subject matter to teach students to think, to develop higher-level cognitive skills including metacognition (think about their thinking). Higher-level cognitive skills include solving problems, analyzing arguments, synthesizing information from different sources and applying what they are learning to new and unfamiliar contexts. To be effective, assessment needs to be an integral part of the learning environment and embedded into the design of the course which involves aligning learning objectives with assessment.” In order to do this, educators often use formative assessments, that is, testing done to diagnose what students haven’t grasped yet and still need in order to reach the objectives. Again, the Centre (2011) says, “The purpose of student assessment is to provide support and feedback to enhance ongoing learning and identify what students have already achieved.” Criterion-referenced assessment helps us know whether students have met the objectives, and it drives instruction based on students’ needs as determined throughout the course. According to Oosterhof, Conrad, and Ely (2008), formative assessments work well with criterion-referenced interpretations because formative assessments cover specific content and show what a learner can or can’t do. Therefore, criterion-referenced assessments help us to substantiate that students are well prepared for their profession and serve online purposes the best.

Growth-referenced assessments can provide information about how much a student has learned compared to what he/she knew at the start, but they will not inform us about the student’s overall grasp of the content domain being taught in the course, unless the pre- and post-tests are a good sampling of that content domain. Norm-referenced assessments would rank the students in the class (and only the students in the class), but that will not reveal whether or not the students learned what they need to know, nor will it allow us to compare them to a larger group of students in order to get a better indication of their performance. Ability-referenced assessments might tell us which students are most likely to succeed, but not how much a particular student is capable of accomplishing.

When we talk about K-12 online learners, we still need to establish that our learners have accomplished what we have set out for them to accomplish. Therefore, criterion-referenced assessments are best used here as well. At both levels, primary and secondary, there are also needs for the other referenced assessments: entrance, placement, and diagnostic purposes, to name a few.

References

Centre for Learning and Professional Development. (2011). Effective Learning. Retrieved June 16, 2011, from http://www.adelaide.edu.au/clpd/online/assessonline/effectivelrng/

Oosterhof, A., Conrad, R., & Ely, D. (2008). Assessing Learners Online. Upper Saddle River, NJ: Pearson Education, Inc.

Saturday, June 11, 2011

The Difference Between Education and Training

Training and education are not the same thing. When designing assessments, it is important for teachers to understand the difference between the two. Training is teaching a skill, which usually involves motor skills and consists of a complex set of actions or activities. For example, making pottery is a skill. It can be learned, but it takes time to hone and perfect. Education, on the other hand, involves more of the process of thinking and reasoning. Students in an educational program will learn about broad topics that will allow them to problem solve and apply their knowledge in new situations. When in an educational program, students study only a small portion of the whole idea in order to form a foundation from which to draw when placed in real-world situations that require them to use the knowledge gained. In real life and in the classroom, the line between the two is sometimes blurred. An excellent example can be found in a letter to the editor of National Forum: The Phi Kappa Phi Journal, Spring 2000, p. 46, from Robert H. Essenhigh of the Ohio State University, which is reproduced at this website: http://www.uamont.edu/facultyweb/gulledge/Articles/Education%20versus%20Training%20.pdf. In the letter Mr. Essenhigh states: “The difference? It's the difference between know how and know why. It's the difference between, say, being trained as a pilot to fly a plane and being educated as an aeronautical engineer and knowing why the plane flies, and then being able to improve its design so that it will fly better. Clearly both are necessary, so this is not putting down the Know-How person; if I am flying from here to there I want to be in the plane with a trained pilot (though if the pilot knows the Why as well, then all the better, particularly in an emergency).”

An excellent example of how the two can be confused occurs in the classroom with the teaching of reading. Many have asked whether reading is a skill. If it is a skill, students can be trained to read. If it involves education, then a different approach is required. Sounding out words (known as decoding in educational circles) is a skill that can be taught, even though it involves abstract concepts such as arbitrary symbols being attached to specific sounds. However, being able to sound out and read words doesn’t guarantee that someone will understand the words he/she is reading. Since reading involves both decoding and comprehension, we can say that it also involves education. Students must be taught the skill of reading and the more complex concepts behind the skill in order to read effectively.

Training and education each require a different approach to assessment. If we are training, we will need to make sure we assess all critical skills for the complex performance the training is supposed to teach. Otherwise, we can’t be sure the person being trained will be successful when performing the skill. This would be very bad in certain situations, such as in the pilot example above. A task analysis must be done to determine all of the parts of the skill in order to be able to teach and assess them. If the content represents education instead of training, a broader base of knowledge will be the topic of our lessons. Basically, training and education differ in that training teaches us to perform a complex task that usually involves motor skills, whereas education helps us to know how to think about things, how to problem solve, and how to understand the why behind things. Since education is so broad, only a portion of the whole can be learned, and then only a small part of what is learned can be assessed. Reading is a good example here, again. We can educate students to think about what they have read and to make connections and evaluations that they can then apply in future readings outside of the classroom. However, we could never imagine all the different types of situations they might encounter when reading outside the classroom or the many different connections or evaluations they may make in those future activities. Neither can we assess those many and varied future situations. Instead, we must narrow our focus and decide specifically what to teach and assess based on that focus. For example, we might narrow our focus to understanding and comparing poetry. Then, we would decide which important concepts about poetry need to be taught and assessed.

Saturday, June 4, 2011

ELN 122 Lesson 1 Blog 1: Learning Outcomes and Performance Objectives

Learning outcomes and performance objectives are not the same thing. Learning outcomes are what we want students to be able to do/know at the end of instruction (lesson, unit, course, etc). They are more general in nature than performance objectives, but the two are connected to each other. Performance objectives tell us whether or not the student has achieved the specified learning outcome. The performance objective is the description of the exact behavior within a prescribed situation that the student will exhibit in order for the instructor/examiner to know that learning outcomes have been achieved. Performance objectives must be designed carefully to ensure that the behavior elicited will correctly indicate that the student has achieved the type of knowledge (learning outcomes) we want them to have learned as a result of the instruction. Without performance objectives, we can’t show that students have mastered the learning outcomes.

Learning outcomes and performance objectives are related to each other. Outcomes can be categorized based on the type of capability they require of the learner, and the performance objective must measure only that capability in order to be valid. The categories of learning outcomes are declarative knowledge, procedural knowledge, and problem-solving knowledge. Declarative knowledge is being able to state information or knowing that something is the case. It includes more than just memorization, however. Stating a definition of a word can be done as a result of memorization, but it can also be done as a result of experience with the concept of the word. Therefore, being able to state the definition of a word as a result of experience with it is also declarative knowledge. Procedural knowledge means knowing how to do something. The difference between declarative knowledge and procedural knowledge can be demonstrated with an example. Let’s say that someone states that a poodle is a dog. That is declarative knowledge. Then, let’s say that someone is able to take previously unseen breeds of dogs and categorize them as dogs. That is an example of procedural knowledge, which requires the ability to discriminate, apply rules, and understand concepts. Problem-solving knowledge involves using strategies in order to find a solution. It means using what we know and what we can do in order to come to a solution. However, to be a good problem solver, we have to be able to choose among all the information we have and all that we can do in order to find just the right information and abilities needed to solve the specific problem at hand. Problem solving is not about applying rules. So, solving math problems by applying a known rule doesn’t count as this category of learning outcome.

We must know what type of outcome we are going to require of students because the performance objective we use must align with that category in order for us to show/prove that students have mastered the outcome and not some other outcome. For example, if we want students to be able to categorize animals as mammals (procedural knowledge), we must be sure that the test measures the student’s ability to apply the rule of what determines whether or not an animal fits into the category of mammal and not that he/she is simply recalling (declarative knowledge) that a whale is a mammal because he/she saw an example of a whale as a mammal during instruction. It should be noted that in an assessment, students should be provided with unique examples to categorize instead of previous examples seen during instruction.

As stated above, the types of performance objectives used must match with the category of the capability indicated in the learning outcome. Declarative knowledge is simply the stating of information. A performance objective that measures declarative knowledge would require someone to recall information (match terms to definitions). However, procedural knowledge involves discrimination, concept understanding, and rule applying. Therefore, performance assessments can be designed to measure each of these three types of procedural knowledge, and they would each require a behavior of the learner that would measure that exact capability. Finally, problem solving involves using and choosing between all of one’s declarative and procedural knowledge to select the right ones needed to find a solution. The performance objective must ask a student to find a solution to a relevant and previously unused problem in order to measure problem solving ability.

As you can see, both learning outcomes and performance objectives are necessary to determine if students have achieved what we set out for them to accomplish. You must have an idea in mind of what you want students to be able to do at the end of instruction (learning outcome) in order to design assessments that will measure whether or not they have learned what they were supposed to learn. Learning outcomes specify the exact nature of the learning, and performance objectives control the measuring tool used in order to ensure we are getting a correct reading.