Tuesday, January 14, 2020

Physics Departments Finally Take Steps To Curb Student Evaluations

College physics teaching staff brainstorm ideas at a meeting in Boulder, Colorado, two months ago on how to improve teaching and teacher evaluations.

Yours truly, giving an advanced astrophysics course to college students, ca. 1978. There was no provision for any student evaluations.
Finally, in the most recent issue of Physics Today (January 2020, p. 10), we learn that (some) physics departments, at least, are realizing how woefully inadequate student evaluations of their university teachers really are. We read, for example, putting the situation in a nutshell:
"For decades student evaluations have been the mainstay of attempts to measure the quality of teaching at colleges and universities across the US and beyond. Now, as part of a growing focus on teaching in higher education, and because of mounting evidence of student biases, those evaluations are increasingly in the crosshairs. A smattering of institutions have begun revamping their approaches to student evaluations of teaching (SETs), and those independent efforts are fueling momentum on a national scale.
SETs have become the norm in higher education because they are convenient and cheap. The questions and scoring vary by discipline and institution, but typically before they see their final grade, students are asked to fill out a survey about the course and the instructor. Department heads or other campus officials calculate averages and often compare a given teacher’s ratings to others’ in the department and across the institution. The ratings inform promotion and tenure decisions and are often the deciding factor in renewing teaching contracts for instructors who are not on the tenure track (see Physics Today, November 2018, page 22)."
I saw this firsthand while teaching at two U.S. universities, with too many students more preoccupied with "revenge" (say, for low grades on tests, labs or homework) than with delivering an objective evaluation. I realized early that these students were incapable of unbiased judgment, most likely because their adolescent (and barely post-adolescent) brains were still developing. This was also borne out in the Physics Today article, as we read (ibid.):
"Study after study has shown that SET responses are biased. In physics, female instructors are often rated 7–13% lower than males, notes physicist Noah Finkelstein, codirector of the Center for STEM Learning at the University of Colorado Boulder (CU). Similar patterns are observed in other STEM fields. The degree of disparity varies by discipline, course, level, institution, and other factors, but across the board SETs penalize women, underrepresented minorities, nonnative English speakers, and older and physically less attractive instructors of both sexes. SET ratings are affected by the condition of the classroom, the time of day a course takes place, and other things that are outside the instructor’s control, says Berkeley statistics professor Philip Stark. The strongest correlation with high ratings is expectations, he adds. “If students go in thinking they will get a good grade, they give higher evaluations.”
Most traditional SETs include broad questions like, “How would you rate the quality of the course overall?” and “How would you rate the quality of the instructor overall?” Such questions are coming under increasing criticism because the responses are frequently biased and unactionable—instructors don’t glean ideas about how to improve their teaching. Some responses are even abusive. “That type of question offers up a vacuum to fill,” says Richard Taylor, physics chair at the University of Oregon, “and encourages whatever biases students have, implicit or explicit.”
Students have written, for example, “the teacher is a crybaby,” and “I would rather watch my mother’s head be cut off and her hair used to mop up the blood than take another class with [instructor’s name].”
Such comments take an emotional toll, the instructors who received them say. They also note that instructors can feel pressured to inflate grades in a bid for better ratings."
Now, some 'news flashes'.
During the relatively brief time of my exposure to American higher education, I found that most of the students (at the freshman level) didn't belong in a college setting. They came to university unprepared from top to bottom, lacking basic skills in numeracy as well as literacy. It was a wonder they scored enough on the SAT even to be accepted. No surprise that these students were also more likely to be given low grades and to take out their grievances through the student evaluations.

By "numeracy" I don't even mean facility with calculus. I mean skills such as:

- Obtaining fractions from decimals and vice versa, e.g. 0.33 ≈ 1/3 and 1/4 = 0.25

- Using ratios and proportions, e.g.

If x/y = a/b and b = 3a/4 then x/y = 4/3

- Adding and dividing fractions, e.g.

1/3 + 3/4 = (4 + 9)/12 = 13/12

5/6 divided by 2/3 = 5/6 x 3/2 = 15/12 = 5/4
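
For anyone who wants to check the examples above by machine, here is a minimal sketch using Python's standard-library fractions module. The variable names and the value a = 4 are arbitrary choices made for illustration (any nonzero a gives the same ratio).

```python
from fractions import Fraction

# Decimals to fractions and back.
quarter = Fraction(1, 4)
quarter_as_decimal = float(quarter)                    # 1/4 as a decimal
third_approx = Fraction("0.33").limit_denominator(10)  # nearest fraction with denominator <= 10

# Ratios and proportions: if x/y = a/b and b = 3a/4, then x/y = 4/3.
a = Fraction(4)          # arbitrary nonzero value (assumption for illustration)
b = Fraction(3, 4) * a
x_over_y = a / b

# Adding and dividing fractions.
total = Fraction(1, 3) + Fraction(3, 4)
quotient = Fraction(5, 6) / Fraction(2, 3)

print(quarter_as_decimal, third_approx, x_over_y, total, quotient)
# prints: 0.25 1/3 4/3 13/12 5/4
```

Note that 0.33 is only an approximation to 1/3, which is why the sketch rounds to the nearest simple fraction rather than converting exactly.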

I realized how far below basic competence some of these purported "college" students were during a Space Physics lab I conducted at the University of Alaska Fairbanks. During an experiment on Snell's law of refraction, which involved working from a sketch layout, one student after another asked how to obtain the ratio of the angles:

Θ2 /  Θ1

And thence:

n1/ n2

I realized then that I had to set up remedial math classes on the side just to ensure the students would be able to make their way through the remaining labs!
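
For reference, Snell's law in its standard form is n1 sin Θ1 = n2 sin Θ2, so the index ratio n1/n2 follows from the sines of the two angles, and the plain angle ratio Θ2/Θ1 matches it only approximately, for small angles. The short Python sketch below illustrates this. The function name, the 10-degree angle of incidence, and the indices 1.00 (air) and 1.50 (glass) are hypothetical values chosen for illustration, not data from the lab described above.

```python
import math

def index_ratio_from_angles(theta1_deg, theta2_deg):
    """Return n1/n2 from Snell's law, n1*sin(theta1) = n2*sin(theta2)."""
    t1 = math.radians(theta1_deg)
    t2 = math.radians(theta2_deg)
    return math.sin(t2) / math.sin(t1)

# Hypothetical setup (assumed values): light passing from air into glass.
n1, n2 = 1.00, 1.50
theta1 = 10.0  # angle of incidence in degrees

# Angle of refraction predicted by Snell's law for these assumed values.
theta2 = math.degrees(math.asin(n1 / n2 * math.sin(math.radians(theta1))))

exact_ratio = index_ratio_from_angles(theta1, theta2)  # recovers n1/n2
small_angle_ratio = theta2 / theta1                    # approximation only

print(f"n1/n2 from sines:  {exact_ratio:.4f}")
print(f"theta2/theta1:     {small_angle_ratio:.4f}")
```

At small angles the two printed numbers nearly agree; at larger angles of incidence the plain angle ratio drifts away from n1/n2, so the sines are needed.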

I suspect, though I can't be absolutely sure, that other physics instructors have experienced similar problems, and that these problems could have lowered the student evaluations of their own performance.
Fortunately, I left full-time teaching at the cusp of the grade inflation and teacher evaluation infection, and before social media devices invaded the classroom, along with the noxious site 'Rate My Professors'. All of these in tandem, I believe, have contributed to an atmosphere of disrespect and casual inattention in the classroom and outside it. Professors, once held in some esteem, are now belittled on sites like 'Rate My Professors' as well as on Facebook and Twitter, not to mention in the actual student evaluations described above in the Physics Today piece.

To make matters worse, profs today face not only unruly students but also helicopter parents who are convinced their charges can do no wrong, and a social media atmosphere that turns professors into caricatures. Truthfully, the only profs remaining who have any fun are those whose work is based 90 percent or more on research, not teaching. They needn't worry about having to seize a cellphone and hurl it against the chalkboard in frustration (as one angry prof did at Caltech). Did the Caltech prof get a miserable evaluation from the imp whose device he treated as an illustration in ballistics? Probably.
We also learn from the article that students "aren't the right people to ask about the effectiveness of a course", for example about whether "an instructor fosters an atmosphere consistent with campus goals for inclusion." In the opinion of Prof. Noah Finkelstein of CU:
“They can’t judge that. I’ve seen questions on whether the instructor has mastery of the material. How on Earth would a student know that? We are asking students the wrong questions and using the data badly.”

In a "smattering" of physics departments it now all appears to be coming to a head: a realization that this biased 'game' has to change - for the good of departments, instructors and students.

In 2009 the faculty union at Ryerson University in Toronto finally had their fill and  filed a grievance with the university over SETs being an unfair measure of teaching effectiveness. Last year, an arbitrator ruled in the faculty’s favor: Student evaluations at Ryerson can no longer be used to assess teaching effectiveness for high-stakes decisions such as tenure and promotion.
The case could prove to be a harbinger. Traditional SETs will become illegal, predicts Carl Wieman, a physics Nobel laureate at Stanford University and a leader in science, technology, engineering, and math (STEM) education studies. “It will be hard for an institution to say they are still collecting SETs but not using them in tenure and promotion decisions,” he says.

University of California, Berkeley, statistics professor Stark, who was an expert witness in the Ryerson case, says class-action suits are already in the works, noting:

"SETs don’t measure teaching effectiveness; you can’t make a course better with the information that comes in. They are biased. There are all sorts of problems."
The University of Oregon introduced a campus-wide overhaul to teacher evaluations this past fall. It replaced traditional SETs with self-reflection, peer review, and student feedback. As is the case at other universities at the vanguard of revamping their teacher assessments, the questions are now designed to reflect student experience, and students fill out surveys a few weeks into a term and again at the end. The midterm feedback is seen only by instructors, says physics chair Taylor, and it can be helpful for adjusting one’s teaching. The survey responses are no longer numerical ratings, and students are asked to single out something that was especially helpful and something that they would like to see changed. In the words of Sierra Dawson, the university’s associate vice provost for academic affairs:

"We’ve made a complete mental model shift"

Which is just as well, and long overdue. Language that asks students to assess their experience in the classroom, rather than to evaluate the professor personally, serves a more objective and rational focus. Indeed, psychology studies indicate that with such structure and language, students (people in general, really) tend to give more thought to questions and to their responses.

This is what we need in the university setting: encouragement for more thought and more thoughtful responses, as opposed to knee-jerk reactions and emotional responses. At the end of it we should find more objective and rational assessment of college teaching, as opposed to the current paradigm of "garbage in, garbage out". Plus there is the added benefit of less grade inflation and its corrosive effects on educational quality, at least in physics and astronomy.
