Why Do Assessment at All?

Grading and Assessment

We already do assessment, all the time. We call it grading.

Much of the literature on assessment stresses the difference between grading and assessment. I believe, however, that this is part of what alienates faculty and makes assessment feel like bureaucratic excess.

Assessment is Grading, Sort of

Assessment is really not brain surgery. It isn't even particularly innovative. It has been around as long as teaching has. It's called grading.

There isn't a single soul on this campus who thinks that students should not be judged for the work that they produce, that they should not have to prove that they have learned something. Yet assessment feels to us like a task imposed from outside the institution, while grading is our definition of what we do. As a faculty, we can spend months, even years, arguing about grade inflation, but we don't want to spend an hour talking about what we expect our students to learn and how we might measure it. So why not work from our strengths? Why not try to see the connections between grading and assessment and use them to build an assessment program that has a chance of working? We might even improve our grading at the same time, obviating the need to discuss grade inflation.

Those in the assessment movement have done us a disservice by constantly telling us that assessment isn't grading. What repels most faculty about assessment is a sense that this is something new and different, that it will be complicated, that it will take a lot of time, and that no one will ever look at it or use it. Fair enough; there are a lot of days I feel the same way. But let's ask: in what ways is grading assessment? When we grade, we offer a direct assessment of student learning outcomes. We give a test, and the test covers those things we want students to know (the outcomes). We have them write an essay, and we evaluate that essay based on the extent to which it demonstrates that the students understand what we want them to understand (more outcomes). Even if we don't articulate a set of "learning objectives" for our assignments, we still have them in mind. When we evaluate a test, a quiz, a paper, an art project, or a musical composition, we assess student learning. If I assign a paper and most of the students do poorly on it, you can bet I am asking myself what went wrong. I am looking to see what I need to change (either the assignment or the teaching) so that the students have a better chance of succeeding. I am sure there isn't one of us who doesn't do the same. In the assessment literature, that's "closing the feedback loop." I have taken something I've learned from the assessment (the test, paper, or whatever) and used it to improve my teaching and hence the students' chances of learning. So assessment really isn't all that alien; what's new is the mandate to make these steps explicit rather than implicit in the grade.

Assessment is Also Not Like Grading

The assessment literature stresses how different assessment is from grading. And to some extent that's also true. But the basic principles are the same. The difference is the scale. First of all, a grade really isn't an evaluation of student learning; it's a reification of the evaluation that is going on, which must remain implicit in the grade. What I mean is this: an instructor takes all the things she wants students to learn from an assignment (and there might be dozens of things, ranging from content to skills) and crams them willy-nilly into one letter.

Let's take the case of three students, all receiving grades of C+ on a paper. The first student receives a C+ because, although well researched and with a strong thesis, the paper is poorly executed, with ill-conceived paragraphs, numerous grammatical errors, and a significant number of typos. The second student receives a C+ because, although well researched, the paper fails to move from a summary to a thesis that exhibits a creative and analytical response to the research. The third student receives a C+ because, although the paper is well argued and demonstrates true creative flair, the student fails to do enough research (a stated component of the assignment). The grades reflect an assessment, but the grade alone cannot tell us which of the learning objectives embedded in it have been achieved and which have not. I count something like six different learning objectives in the example above, all of which are incorporated into that C+. I really think that is a weakness of grading and one that should give us pause. The students who receive that C+ may never figure out which of the learning objectives they failed to demonstrate (at least the grade can't tell them).

Figure 1: Breakout of Learning Objectives for Hypothetical Example

Objective            Student A    Student B    Student C
Research             Good         Good         Poor
Thesis               Good         Poor         Adequate
Organization         Poor         Adequate     Adequate
Creativity           Adequate     Poor         Good
Analysis             Adequate     Poor         Good
Grammar and Usage    Poor         Adequate     Adequate

Assessment asks us to be able to differentiate among those three C+ grades. That is, it focuses on teasing out and evaluating the objectives that underlie our assignments (the things we want students to learn). Figure 1 shows how an assessment of the assignment might break out the learning objectives implicit in the letter grade.

Another critical difference between grading and assessment that the above example suggests is that grading is primarily directed toward the individual student. We rarely consider the collective performance of all students in a class, department, or division. The one place where we do tend to look at student grades collectively is in examining grade inflation. So our grade inflation discussion is also connected to questions of assessment. But just as an individual grade cannot tell us about an individual student's learning, our grade inflation statistics cannot tell us collectively what students are actually learning, because those statistics are as reified as the original grades upon which they were based. A single number, say an average GPA of 3.3, again crams all sorts of learning outcomes for many, many students into one figure and so cannot really tell us much about the effectiveness of our programs.

So what information might we take out of the example above (remember, it is entirely made up and bears no relationship to any actual student or students or classes they might take)? It might enable us to make an argument that moves beyond the judgment that "our students write badly." A reading of the data enables us to tease out which writing skills are weakest. In this example, I'd say the biggest problem areas seem to be grammar and organization, the only two objectives on which no student was rated Good. Other areas seem at least adequate. Research skills seem to be the strongest. This information might be more useful to faculty in thinking about possible remedies than simply saying students write badly (of course, in a real situation, information from more students would make the picture more nuanced).
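For anyone who wants to see that reading of Figure 1 done mechanically, here is a minimal sketch in Python. It simply tallies the ratings for each objective across the three hypothetical students. The 0-to-2 point scale (Poor = 0, Adequate = 1, Good = 2) and the function names are my own assumptions for illustration; they are not part of the figure, and a real assessment program might well weight things differently.

```python
from collections import Counter

# Ratings transcribed from Figure 1 (hypothetical Students A, B, C, in order).
RATINGS = {
    "Research": ["Good", "Good", "Poor"],
    "Thesis": ["Good", "Poor", "Adequate"],
    "Organization": ["Poor", "Adequate", "Adequate"],
    "Creativity": ["Adequate", "Poor", "Good"],
    "Analysis": ["Adequate", "Poor", "Good"],
    "Grammar and Usage": ["Poor", "Adequate", "Adequate"],
}

# Assumption: an illustrative point scale, not something Figure 1 specifies.
POINTS = {"Poor": 0, "Adequate": 1, "Good": 2}


def summarize(ratings):
    """Return {objective: (total_points, Counter of rating labels)}."""
    return {
        objective: (sum(POINTS[mark] for mark in marks), Counter(marks))
        for objective, marks in ratings.items()
    }


def rank_objectives(ratings):
    """List objectives from weakest to strongest by total points."""
    summary = summarize(ratings)
    return sorted(summary, key=lambda objective: summary[objective][0])


if __name__ == "__main__":
    summary = summarize(RATINGS)
    for objective in rank_objectives(RATINGS):
        total, counts = summary[objective]
        print(f"{objective:<18} total={total}  {dict(counts)}")
```

Under that assumed scale, Organization and Grammar and Usage come out at the bottom and Research at the top, which matches the reading above. With only three made-up students the ranking is no more than suggestive; the point is that the per-objective tally carries information the three identical C+ grades cannot.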