Monday, December 12, 2011

RMA is back!

After an extended hiatus, Re-Mediating Assessment is back.  In the meantime, lots has happened.  Michelle Honeyford completed her PhD and joined the faculty at the University of Manitoba in Winnipeg.  Jenna McWilliam has moved on to Joshua Danish's lab and is focusing more directly on critical theory in new media contexts.  She renamed her blog too.

Lots of other things have happened that my student and I will be writing about.  I promise to write shorter posts and focus more on commentary regarding assessment-related events.  I have a bunch of awesome new doctoral students and collaborators who are lined up to start posting regularly about assessment-related issues.


For now I want to let everybody know that today is the official release day of a new volume on formative assessment that Penny Noyce and I edited.  It has some great chapters.  On the Harvard Education Press website announcing the book, my assessment hero Dylan Wiliam said:
"This is an extraordinary book. The chapters cover practical applications of formative assessment in mathematics, science, and language arts, including the roles of technology and teachers’ professional learning. I found my own thinking about formative assessment constantly being stretched and challenged. Anyone who is involved in education will find something of value in this book."
Lorrie Shepard's foreword is a nice update on the state of assessment.  David Foster writes about using the tools from the Mathematics Assessment Resource Service in the Silicon Valley Mathematics Initiative.  Dan Damelin and Kimberle Koile from the Concord Consortium write about using formative assessment with cutting-edge technology.  (And we appreciate that the Concord Consortium is featuring the book on their website.)

For me the best part was the chapter from Paul Horwitz of the Concord Consortium.  Paul wrote a nice review of his work with Thinker Tools and GenScope and the implications of that work for assessment.  Paul's chapter provided a nice context for me to summarize my ten-year collaboration with him around GenScope.  That chapter is perhaps the most readable description of participatory assessment that I have managed to write.  A much more detailed account of our collaboration was just accepted for publication by the Journal of the Learning Sciences and will appear in 2011.

I promise you will be hearing from us regularly starting in the new year.  We hope you will comment and share this with others.  And if you have posts or links that you think we should comment on, please let us know.  I will let the rest of the team introduce themselves and add their bios to the blog as they start posting.

Thursday, April 15, 2010

short-sighted and socially destructive: thoughts on Ning's decision to cut free services

Lord knows I'm not a huge fan of Ning, the social networking tool that allows users to create and manage online networks. I find the design bulky and fairly counterintuitive, and modifying a network to meet your group's needs is extremely challenging. Ning has also made it extremely difficult or impossible for users to control, modify, or move network content. And despite the popularity of Ning's free, ad-supported social networks among K-16 educators, the ads that go along with the free service have tended toward the racy or age-inappropriate.

But given the Ning trifecta--it's free, getting students signed up is fast and fairly easy, and lots of teachers are using it--I've been using Ning with researchers and teachers for the last two years. So the recent news that Ning will be switching to paid-only membership is obnoxious for two reasons.

The first reason is the obvious: I don't want to pay--and I don't want the teachers who use Ning to have to pay, either. One of the neat things about Ning is the ability to build multiple social networks--maybe a separate one for each class, or a new one each semester, or even multiple networks for a single group of students. In the future, each network will require a monthly payment, which means that most teachers who do decide to pay will stick to a much smaller number of networks. This means they'll probably erase content and delete members, starting fresh each time. The enormous professional development potential of having persistent networks filled with content, conversations, and student work suddenly disappears.

Which brings me to my second point: anyone who's currently using Ning's free services will be forced to either pay for an upgrade or move all of their material off of Ning. This is tough for teachers who have layers upon layers of material posted on various Ning sites, and it's incredibly problematic for any researcher who's working with Ning's free resources. If we decide to leave Ning for another free network, we'll have to figure out some systematic way of capturing every single thing that currently lives on Ning, lest it disappear forever.

Ning's decision to phase out free services amounts to a paywall, pure and simple. Instead of putting limits on information, as paywalls for news services do, this paywall puts limits on participation. In many ways, this is potentially far worse, far more disruptive and destructive, far more short-sighted than any information paywall could be.

If Ning were smart, it would think a little more creatively about payment structures. What about offering unlimited access to all members of a school district, for a set fee paid at the district level? What about offering an educator account that provides unlimited network creation for a set (and much lower) fee? What about improving the services Ning provides to make it feel like you'd be getting what you paid for?

More information on Ning's decision to go paid-only will be released tomorrow. For now, I'm working up a list of free social networking tools for use by educators. If you have any suggestions, I'd love to hear them.

Update, 4/15/10, 6:48 p.m.: Never one to sit on the sidelines in the first place, Alec Couros has spearheaded a gigantic, collaborative Google Doc called "Alternatives to Ning." As of this update, the doc keeps crashing because of the number of collaborators trying to help build this thing (the last time I got into it, I was one of 303 collaborators), so if it doesn't load right away, keep trying.

Friday, April 2, 2010

Diane Ravitch Editorial on the Failure of NCLB

I have long admired Diane Ravitch. While I have disagreed with her on fundamental philosophical grounds, her arguments have always been grounded in the realities of schooling--even if those were the realities of conservative parents and stakeholders.

Now the evidence has shown what some of us predicted and what many of us have known for years: that external tests of basic skills and punitive sanctions were just going to lead to illusory gains (if any) and undermine other valued outcomes. Her editorial in today's (April 2) Washington Post is very direct. While I disagree with her on where to go from here, I applaud her for using her audience and her reputation to help convince a lot of stakeholders who have found one reason or another to ignore the considerable evidence against continuing NCLB. As Jim Popham has been saying for years, all of the improvement schools could make with test scores already happened between 1990 and 2000, once newspapers began publishing test scores.



Certainly this will factor into the pending NCLB reauthorization. Perhaps Indiana's Republican leadership will read this and think twice about going forward with their two core ideas for their Race to the Top reform proposal, even though it was not funded. The twin shells in their reform shotgun are "Pay for Performance" merit pay for Indiana teachers based on basic skills test scores, and "Value Added" growth modeling that ranks teachers based on how much "achievement" they instilled in their kids. For reasons Ravitch summarizes and other concerns outlined in a recent letter and report by the National Academy, the recoil from pulling these two triggers at once might be just enough to blow our schools and our children pretty far back into the 20th century.

Tuesday, March 9, 2010

Video of Barry McGaw on Assessment Strategies for 21st Century Skills (Measurement Working Group)

I just came across a video of a keynote by Barry McGaw at last month’s Learning and World Technology Forum. McGaw heads the Intel/Microsoft/Cisco initiative known as Assessment and Teaching of 21st Century Skills. This high-powered group is aiming to transform the tests used in large-scale national comparisons and education more broadly. Their recent white papers are a must-read for anyone interested in assessment or new proficiencies. McGaw’s video highlights aspects of this effort that challenge conventional wisdom about assessment. In this post I focus on McGaw’s comments on the efforts of the Measurement working group. In particular, they point to (1) the need to iteratively define assessments and the constructs they aim to capture, and (2) the challenge of defining developmental models of these new skills.


Iterative Design of Assessments and Constructs
McGaw highlighted that the Measurement Working Group (led by Mark Wilson) emphasized the need for iterative refinement in the development of new measures. Various groups spent much of the first decade of the 21st century debating how these proficiencies should be defined and organized. In this abstract context, this definition process could easily consume the second decade as well. Wilson’s group argues that the underlying constructs being assessed must be defined and redefined in the context of the assessment development process. Of this, McGaw said:

You think about it first, you have a theory about what you want those performances to measure. You then begin to develop ways of capturing information about that skill. But the data themselves give you information about the definition, and you refine the definition. This is the important point of pilot work with these assessment devices. And not just giving the tests to students, but giving them to students and seeing what their responses are, and discovering why they gave that response. And not just in the case where it is the wrong response but in the case where it is the correct response, so that you get a better sense of the cognitive processes underlying the solution to the task.

In other words, you can’t just have one group define standards and definitions and then pitch them to the measurement group when dealing with these new proficiencies. Because of their highly contextualized nature, we can’t just pitch standards to testing companies as has been the case with hard skills for years. This has always nagged at me about previous efforts to define these proficiencies (e.g., the Partnership for 21st Century Skills), in that they seemed to overlook both the issue and the challenge that it presents. Maybe now we can officially decide to stop trying to define what assessment scholar Lorrie Shepard so aptly labeled “21st Century Bla Bla Bla.”

The Lack of Learning Progression Models
McGaw also reiterated the concerns of the Measurement Working Group over the lack of consensus about the way these new proficiencies develop. There is a strong consensus about the development of many of the hard skills in math, science, and literacy, and these insights are crucial for developing worthwhile assessments. I learned about this firsthand developing a performance assessment for introductory genetics working with Ann Kindfield at ETS. Ann taught me the difference between the easier cause-to-effect reasoning (e.g., completing the Punnett square) and the more challenging effect-to-cause reasoning (e.g., using a pedigree chart to infer mode of inheritance). We used these and other distinctions she uncovered in her doctoral studies to create a tool that supported tons of useful studies on teaching inheritance in biology classes. Other better-known work on “learning progressions” includes Ravit Duncan’s work in molecular genetics and Doug Clements’ work in algebra. In each case it took multiple research teams many years to reach consensus about the way that knowledge typically developed.

Wilson and McGaw are to be commended for reminding us how difficult it is going to be to agree on the development of these much softer 21st century proficiencies. They are by their very definition situated in more elusive social and technological contexts. And those contexts are evolving. Quickly. Take for example judging credibility of information on the Internet. In the 90s this meant websites. In the past decade it came to mean blogs. Now I guess it includes Twitter. (There is a great post about this at MacArthur’s Spotlight Blog, as well as a recent CBC interview about fostering new media literacies, featuring my student Jenna McWilliams.)

Consider that I taught my 11-year-old son to look at the history page on Wikipedia to help distinguish between contested and uncontested information in a given entry. He figured out on his own how to verify the credibility of suggestions for modding his Nerf guns at nerfhaven.com and YouTube. Now imagine you are ETS, where it inevitably takes a long time and buckets of money to produce each new test. They already had to replace their original iSkills test with the iCritical Thinking test. From what I can tell, it is still a straightforward test of information from a website. Lots of firms are starting to market such tests. Some places (like Scholastic’s Expert21) will also sell you curriculum and classroom assessments that will teach students to pass the test, without ever actually going on the Internet. Of course ETS knows that they can’t sell curriculum if they want to maintain their credibility. But I am confident that as soon as organizations start attaching meaningful consequences to the test, social networks will spring up telling students exactly how to answer the questions.

There is lots of other great stuff in the Measurement white paper. Much of it is quite technical. But I applaud their sobering recognition of the many challenges that these new proficiencies pose for large-scale measurement. And they only get harder when these new tests are used for accountability purposes.

Next up: McGaw’s comments about the Classroom Environments and Formative Evaluation working group.

Friday, January 22, 2010

Can We Really Measure "21st Century" Skills?

The members of the 21st Century Assessment Project were asked a while ago to respond to four pressing questions regarding assessment of “21st Century Skills.” These questions had come via program officers at leading foundations, including Connie Yowell at MacArthur’s Digital Media and Learning Initiative, which funds our Project. I am going to launch my efforts to blog more during my much-needed sabbatical by answering the first question, with some help from my doctoral student Jenna McWilliams.

Question One: Can critical thinking, problem solving, collaboration, communication and "learning to learn" be reliably and validly measured?

As Dan Koretz nicely illustrated in the introduction to his 2008 book, Measuring Up: What Educational Testing Really Tells Us, the answers to questions about educational testing are never simple. We embrace a strongly situative and participatory view of knowing and learning, which is complicated to explain to those who do not embrace it. But I have training in psychometrics (and completed a postdoc at ETS) and have spent most of my career refining a more pragmatic stance that treats educational accountability as inevitable. When it comes to assessment, I am sort of a born-again situativity theorist. Like folks who have newly found religion and want to tell everybody how Jesus helped them solve all of the problems they used to struggle with, I am on a mission to tell everyone how situative approaches to measurement can solve some nagging problems that they have long struggled with.

In short, no, we don’t believe we can measure these things in ways that are reliable and yield scores that are valid evidence of what individuals are capable of in this regard. These are actually “practices” that can most accurately be interpreted using methods accounting for the social and technological contexts in which they occur. In this sense, we agree with skeptics like Jim Greeno and Melissa Gresalfi who argued that we can never really know what students know. This point riffs on the title of the widely cited National Research Council report of the same name that Jim Pellegrino (my doctoral advisor) led. And as Val Shute just reminded me, Messick has reminded us forever that measurement never really gets directly at what somebody knows, but instead provides evidence about what they seem to know. My larger point here is my concern about what happens with these new proficiencies in schools and in tests when we treat them as individual skills rather than social practices. In particular I worry what happens to both education and evidence when students, teachers, and schools are judged according to tests of these new skills.

However, there are lots of really smart folks who have a lot of resources at their disposal who think you can measure them. This includes most of my colleagues in the 21st Century Assessment Project. For example, check out Val Shute’s great article in the International Journal of Learning and Media. Shute also has an edited volume on 21st Century Assessment coming out shortly. Likewise Dan Schwartz has a tremendous program of research building on his earlier work with John Bransford on assessments as preparation for future learning. Perhaps the most far-reaching is Bob Mislevy’s work on evidence-centered design. And of course there is the new Intel-Microsoft-Cisco partnership which is out to change the face of national assessments and therefore the basis of international comparisons. I will elaborate on these examples in my next post, as that is actually the second question we were asked to answer. But first let me elaborate on why I believe that the assessment (of what individuals understand) and the measurement (of what groups of individuals have achieved) of 21st Century skills are improved if we assume that we can never really know what students know.

To reiterate, from the perspective of contemporary situated views of cognition, all knowledge and skills are primarily located in the social context. This is easy to ignore when focusing on traditional skills like reading and math that can be more meaningfully represented as skills that individuals carry from context to context. This assumption is harder to ignore with these newer ones that everyone is so concerned with. This is especially the case with explicitly social practices like collaborating and communicating, since these can't even be practiced in isolated contexts. As we argued in our chapter in Val’s book, we believe it is dangerously misleading to even use the term skills in this regard. We elected to use the term proficiencies because that term is broad enough to capture the different ways that we think about them. As 21st Century Assessment project leader Jim Gee once put it:
Abstract representations of knowledge, if they exist at all, reside at the end of long chains of situated activity.
However, we also are confident that some of the mental “residue” that gets left behind when people engage meaningfully in socially situated practices can certainly be assessed reliably and used to make valid interpretations about what individuals know. While we think these proficiencies are primarily social practices, that does not exclude recognizing the secondary “echoes” of participating in these practices. This can be done with performance assessments and other extended activities that provide some of that context and then ask individuals or groups to reason, collaborate, communicate, and learn. If such assessments are created carefully, and individuals have not been directly trained to solve the problems on the assessments, it is possible to obtain reliable scores that are valid predictions of how well individuals can solve problems, communicate, collaborate, and learn in new social and technological contexts. But this continues to be difficult and the actual use of such measures raises serious validity issues. Because of these issues (as elaborated below), we think this work might best be characterized as “guessing what students know.”

More to the point of the question, we believe that only a tiny fraction of the residue from these practices can be measured using conventional standardized multiple-choice tests that provide little or no context. For reasons of economy and reliability, such tests are likely to remain the mainstay of educational accountability for years to come. Of course, when coupled with modern psychometrics, such tests can be extremely reliable, with little score variation across testing time or version. But there are serious limitations in what sorts of interpretations can be validly drawn from the resulting scores. In our opinion, scores on any standardized test of these new skills are only valid evidence of proficiency when they are
a) used to make claims about aggregated proficiencies across groups of individuals;
b) used to make claims about changes over longer time scales, such as comparing the consequences of large-scale policy decisions over years; and
c) isolated from the educational environment which they are being used to evaluate.
Hence, we are pretty sure that national and international assessments like NAEP and PISA should start incorporating such proficiencies. But we have serious concerns about using these measures to evaluate individual proficiencies in any high-stakes sort of way. If such tests are going to continue to be used for any high-stakes decisions, they may well best be left to more conventional literacies, numeracies, and knowledge of conventional content domains, which are less likely to be compromised.

I will say that I am less skeptical about standardized measures of writing. But they are about the only standardized assessments left in wide use that actually require students to produce something. Such tests will continue to be expensive, and standardized scoring (by humans or machines) requires very peculiar writing formats. But I think the scores that result are valid for making inferences about individual proficiency in written communication more broadly, as was implied by the original question. They are actually performance assessments and as such can bring in elements of different contexts. This is particularly true if we can relax some of the needs for reliability (which requires very narrowly defined prompts and typically gets compromised when writers get creative and original). As my response to the fourth question will elaborate, I believe that written communication is probably the single most important “new proficiency” needed for economic, civic, and intellectual engagement, so I think that improved testing of written communication will be the one focus of assessment research that yields the most impact on learning and equity.

To elaborate on the issue of validity, it is worth reiterating that validity is a property of the way the scores are interpreted. Unlike reliability, validity is never a property of the measure. In other words, validity always references the claims that are being supported by the evidence. As Messick argued in the 90s, the validity of any interpretation of scores also depends on the similarity between prior education and training contexts and the assessment/measurement context. This is where things get messy very quickly. As Kate Anderson and I argued in a chapter in an NSSE Yearbook on Evidence and Decision Making edited by Pam Moss, once we attach serious consequences to assessments or tests for teachers or students, the validity of the resulting scores will get compromised very quickly. This is actually less of a risk with traditional proficiencies and traditional multiple choice tests. This is because these tests can draw from massive pools of items that are aligned to targeted standards. In these cases, the test can be isolated from any preparation empirically, by randomly sampling from a huge pool of items. As we move to newer performance measures of more extended problem solving and collaboration, there necessarily are fewer and fewer items and the items become more and more expensive to develop and validate. If teachers are directly teaching students to solve the problems, then it becomes harder and harder to determine how much of an individual score is real proficiency and how much is familiarity with the assessment format (what Messick called construct-irrelevant variance). The problem is that it is impossible to ever know how much of the proficiency is “real.” Even in closely studied contexts, different observers are sure to differ in their judgments of validity, a point made most cogently in Michael Kane’s discussions of validity as interpretive argument.

Because of these validity concerns, we are terrified that the publishers of these tests of “21st Century Skills” are starting to market curricula and test preparation materials for those same proficiencies. Because of the nature of these new proficiencies, these new “integrated” systems raise even more validity issues than the ones that emerged under NCLB for traditional skills. Another big validity issue we raised in our chapter concerns the emergence of socially networked cheating. Once these new tests are used for high-stakes decisions (especially for college entrance), social networks will emerge to tell students how to solve the kinds of problems that are included on the tests. (This has already begun to happen, as in the "This is SPARTA!" prank on the English Advanced Placement test that we wrote about in our chapter and in a more recent "topic breach" wherein students in Winnipeg leaked the essay topic for the school's 12th grade English exam.)

Of course, proponents of these new tests will argue that learning how to solve the kinds of problems that appear on their tests is precisely what they want students to be doing. And as long as you adopt a relatively narrow view of cognition and learning, there is some truth to that assumption. Our real concern is that this unbalanced focus in addition to new standards and new tests will distract from the more important challenge of fostering equitable, ethical, and consequential participation in these new skills in schools.

That is it for now. We will be posting my responses to the three remaining questions over the next week or so. We would love to hear back from folks about their responses to the first question.


Questions remaining:
2) Which are the most promising research initiatives?
3) Is it or will it be possible to measure these things in ways that they can be scored by computer? If so, how long would it take and what sorts of resources would be needed?
4) If we had to narrow our focus to the proficiencies most associated with economic opportunity and civic participation, which ones do we recommend? Is there any evidence/research specifically linking these proficiencies to these two outcomes? If we further narrowed our focus to only students from underserved communities, would this be the same list?

Monday, November 16, 2009

Join this discussion on Grading 2.0

Over at the HASTAC forum, a conversation has begun around the role of assessment in 21st-century classrooms.

The hosts of this discussion, HASTAC scholars John Jones, Dixie Ching, and Matt Straus, explain the impetus for this conversation as follows:
As the educational and cultural climate changes in response to new technologies for creating and sharing information, educators have begun to ask if the current framework for assessing student work, standardized testing, and grading is incompatible with the way these students should be learning and the skills they need to acquire to compete in the information age. Many would agree that it's time to expand the current notion of assessment and create new metrics, rubrics, and methods of measurement in order to ensure that all elements of the learning process are keeping pace with the ever-evolving world in which we live. This new framework for assessment might build off of currently accepted strategies and pedagogy, but also take into account new ideas about what learners should know to be successful and confident in all of their endeavors.

Topics within this forum conversation include:
  • Technology & Assessment ("How can educators leverage the affordances of digital media to create more time-efficient, intelligent, and effective assessment models?");
  • Assignments & Pedagogy ("How can we develop assignments, projects, classroom experiences, and syllabi that reflect these changes in technology and skills?");
  • Can everything be graded? ("How important is creativity, and how do we deal with subjective concepts in an objective way, in evaluation?"); and
  • Assessing the assessment strategies ("How do we evaluate the new assessment models that we create?").

The conversation has only just started, but it's already generated hundreds of visits and a dozen or so solid, interesting comments. If you're into technology, assessment and participatory culture, you should take a look. It's worth the gander.

Here's the link again: Grading 2.0: Assessment in the Digital Age.

Tuesday, October 27, 2009

The Void Between Colleges of Education and University Teaching and Learning

In this post, I consider the tremendous advances in educational research I am seeing outside of colleges of education and ponder the relevance of mainstream educational research in light of the transformation of learning made possible by new digital social networks.

This weekend, the annual conference of the International Society for the Scholarship of Teaching and Learning took place at Indiana University. ISSOTL is the home of folks who are committed to studying and advancing teaching and learning in university settings. I saw several presentations that are directly relevant to what we care about here at Re-Mediating Assessment. These included a workshop on social pedagogies organized by Randy Bass, the Assistant Provost for Teaching and Learning at Georgetown, and several sessions on open education, including one by Randy and Toru Iiyoshi, who heads the Knowledge Media Lab at the Carnegie Foundation. Toru co-edited the groundbreaking volume Opening up Education, of which we here at RMA are huge fans. (I liked it so much I bought the book, but you can download all of the articles for free; ignore the line at the MIT Press about sample chapters.)

I presented at a session about e-Portfolios with John Gosney (Faculty Liaison for Learning Technologies at IUPUI) and Stacy Morrone (Associate Dean for Learning Technologies at IU). John talked about the e-Portfolio efforts within the Sakai open source collaboration and courseware platform; Stacy talked about e-Portfolio as it has been implemented in Oncourse, IU’s instantiation of the Sakai open source course collaboration platform. I presented about our efforts to advance participatory assessment in my classroom assessment course using newly available wikis and e-Portfolio tools in Oncourse (earlier deliberations on those efforts are here; more will be posted here soon). I was flattered that Maggie Ricci of IU’s Office of Instructional Consulting interviewed me about my post on positioning assessment for participation and promised to post the video this week (I will update here when I find out).

I am going to post about these presentations and how they intersect with participatory assessment as time permits over the next week or so. In the meantime, I want to stir up some overdue discussion over the void between the SOTL community and my colleagues in colleges of education at IU and elsewhere. In an unabashed effort to direct traffic to RMA and build interest in past and forthcoming posts, I am going to first write about this issue. I think it raises issues about the relevance of colleges of education and suggests a need for more interdisciplinary approaches to education research.

I should point out that I am new to the SOTL community. I have focused on technology-supported K-12 education for most of my career (most recently within the Quest Atlantis videogaming environment). I have only recently begun studying my own teaching in the context of developing new core courses for the doctoral program in Learning Sciences and in trying to develop online courses that take full advantage of new digital social networking practices (initial deliberations over my classroom assessment course are here). I feel sheepish about my late arrival because I am embarrassed about the tremendous innovations I found in the SOTL community that have mostly been ignored by educational researchers. My departmental colleagues Tom Duffy, who has long been active in SOTL here at IU, and Melissa Gresalfi have recently gotten seriously involved as well. The conference was awash with IU faculty, but I only saw a few colleagues from the School of Education. One notable exception was Melissa’s involvement on a panel on IU’s Interdisciplinary Teagle Colloquium on Inquiry in Action. I could not go because it conflicted with my own session, but this panel described just the sort of cross-campus collaboration I am aiming to promote here. I also ran into Luise McCarty from the Educational Policy program, who heads the school’s Carnegie Initiative on the Doctorate.

My search of the program for other folks from colleges of education revealed another session that was scheduled against mine and that focused on the issue I am raising in this post. Karen Swanson of Mercer University and Mary Kayler of George Mason reported on the findings of their meta-analysis of the literature on the tensions between colleges of education and SOTL. The fact that there is enough literature on this topic to meta-analyze shows that this issue has been around for a while (and suggests that I should probably read up before doing anything more than blogging about it). From the abstract, it looks like they focused on the issue of tenure, which I presume refers to a core issue in the broader SOTL community: that SOTL researchers outside of schools of education risk being treated as interlopers by educational researchers, while being treated as dilettantes by their own disciplinary communities. This same issue was mentioned in other sessions I attended as well. But significantly from my perspective, it looks like Swanson and Kayler looked at this issue from the perspective of Education faculty, which is what I want to focus on here. I have tenure, but I certainly wonder how my increased foray into the SOTL community will be viewed when I try to get promoted to full professor.

I will start by exploring my own observations about educational researchers who study their own university teaching practices. I am not in teacher education, but I know of a lot of respected education faculty who seem to be conducting high quality, published research about their teacher education practices. However, there is clearly a good deal of pretty mediocre self-study taking place as well. I review for a number of educational research journals and conferences. When I am asked to review manuscripts or proposals for educational research carried out in classrooms in the college of education, I am quite skeptical. Because I have expertise in motivation and in formative assessment, I get stacks of submissions of studies of college of education teaching that seem utterly pointless to me. For example, folks love to study whether self______ is correlated with some other education-relevant variables. The answer is always yes (unless their measures are unreliable), and then there is some post hoc explanation of the relationships with some tenuous suggestions for practice. Likewise, I review lots of submissions that examine whether students who get feedback on learning to solve some class of problems learn to solve those problems better than students whose feedback is withheld. Here the answer should be yes, since this is essentially a test of educational malpractice. But the studies often ignore the assessment maxim that feedback must be useful and used, and instead focus on complex random assignment so that their study can be more “scientific.” I understand the appeal, because such studies are so easy to conduct and there are enough examples of them actually getting published to provide some inspiration (while dragging down the overall effect size of feedback in meta-analytic studies). While it is sometimes hard to tell, these “convenience” studies usually appear to be conducted in the author’s own course or academic program.
So, yes, I admit that when that looks to be the case, I do not expect to be impressed. I wonder if other folks feel the same way or if perhaps I am being overly harsh.

Much of my interest in SOTL follows from my efforts to help my college take better advantage of new online instructional tools and to take advantage of social networking tools in my K-12 research. While my colleagues in IU Bloomington and IUPUI are making progress, I am afraid that we are well behind the curve. While I managed to attend only a few SOTL sessions, I saw tremendous evidence of success that I will write about in subsequent posts. Randy Bass and Heidi Elmendorf (also of Georgetown) showed evidence of deep engagement on live discussion forums that simply can’t be faked; here at IU, Phillip Quirk showed some very convincing self-report data about student engagement in our new interdisciplinary Human Biology Program, which looks like a great model of practice for team-teaching courses. These initial observations reminded me of the opinion of James Paul Gee, who leads the MacArthur Foundation’s 21st Century Assessment Project (which partly sponsors my work as well). He has stated on several occasions that “the best educational research is no longer being conducted in colleges of education.” That is a pretty bold statement, and my education colleagues and I initially took offense at it. Obviously, it depends on your perspective; but in terms of taking advantage of new digital social networking tools and the movement towards open education and open-source curriculum, it seems like it may already be true.

One concern I had with SOTL was the sense that the excesses of “evidence-based practice” that have infected educational research were occurring in SOTL. But I did not see many of the randomized experimental studies that set out to “prove” that new instructional technology “works.” I have some very strong opinions about this that I will elaborate on in future posts; for now I will just say that I worry that SOTL researchers might get so caught up in doing controlled comparison studies of conventional and online courses that they completely miss the point that online courses offer an entirely new realm of possibilities for teaching and learning. The “objective” measures of learning normally used in such studies are often biased in favor of traditional lecture/text/practice models that train students to memorize numerous specific associations; as long as enough of those associations appear on a targeted multiple-choice exam, scores will go up. The problem is that such designs can’t capture the important aspects of individual learning or any aspects of the social learning that is possible in these new educational contexts. Educational researchers seem unwilling to seriously begin looking at the potential of these new environments that they have “proven” to work. So, networked computers and online courses end up being used for very expensive test preparation…and that is a shame.

Here at RMA, we are exploring how participatory assessment models can foster and document all of the tremendous new opportunities for teaching and learning made possible by new digital social networks, while also producing convincing evidence on these “scientific” measures. I will close this post with a comment that Heidi Elmendorf made in the social pedagogies workshop. I asked her why she and the other presenters were embracing the distinction between “process” and “product.” In my opinion, this distinction is based on outdated individual models of learning; it dismisses the relevance of substantive communal engagement in powerful forms of learning, while privileging individual tests as the only “scientific” evidence of learning. I don’t recall Heidi’s exact response, but she immediately pointed out that her disciplinary colleagues in Biology leave her no choice. I was struck by the vigorous nods of agreement from her colleagues and the audience. Her response really brought me back down to earth and reminded me how much work we have to do in this regard. In my subsequent posts, I will try to illustrate how participatory assessment can address precisely the issue that Heidi raised.

Thursday, October 1, 2009

Positioning Portfolios for Participation

Much of our work in our 21st Century Assessment project this year has focused on communicating participatory assessment to broader audiences whose practices we are trying to inform. This includes:

  • classroom teachers whose practices we are helping reshape to include more participation (like those we are working with in Monroe County right now);

  • other assessment researchers who seem to dismiss participatory accounts of learning as “anecdotal” (like my doctoral mentor Jim Pellegrino who chaired the NRC panel on student assessment);

  • instructional innovators who are trying to support participation while also providing broadly convincing accounts of learning (like my colleagues Sasha Barab and Melissa Gresalfi whose Quest Atlantis immersive environment has been a testbed for many of our ideas about assessment);

  • faculty in teacher education who are struggling to help pre-service teachers build professional portfolios while knowing that their score on the Praxis will count for much more (and whose jobs are being threatened by efforts in Indiana to phase out teacher education programs and replace them with more discipline-based instruction);

  • teachers in my graduate-level classroom assessment course who are learning how to do a better job assessing students in their classrooms, as part of their MA degree in educational leadership.


It turns out that participatory approaches to assessment are quite complicated, because they must bridge the void between the socially-defined views of knowing and learning that define participation, and the individually-defined models of knowing and learning that have traditionally been taken for granted by the assessment and measurement communities. As our project sponsor Jim Gee has quite succinctly put it: Your challenge is clarity.

As I have come to see most recently, clarity is about entry. Where do we start introducing this comprehensive new approach? Our approach itself is not really that complicated. We have it boiled down to a more participatory version of Wiggins' well-known Understanding by Design. In fact we have taken to calling our approach Participation by Design (or if he sues us, Designing for Participation). But the theory behind our approach is maddeningly complex, because it has to span the entire range of activity timescales (from moment-to-moment classroom activity to long-term policy change) and characterizations of learning (from communal discourse to individual understanding to aggregated achievement).

Portfolios and Positioning
Now it is clear to me that the best entry point is the familiar notion of the portfolio. Portfolios consist of any artifacts that learners create. Thanks to Melissa Gresalfi, I have come to realize that the portfolio, and the artifacts that it contains, are ideal for explaining participatory assessment. This is because portfolios position (where position is used as a verb). Before I get to the clarity part, let me first elaborate on what this means.

It turns out that portfolios can be used to position learners and domain content in ways that bridge this void between communal activity and aggregated attainment. In a paper with Caro Williams about the math project that Melissa and I worked on together, Melissa wrote that

“positioning, as a mechanism, helps bridge the space between the opportunities that are available for participation in particular ways and what individual participants do”

Building on the ideas of her doctoral advisor Jim Greeno (e.g., Greeno and Hull, 2002), Melissa explained that positioning refers to how students are positioned relative to content (called disciplinary positioning) and how they are positioned relative to others (called interpersonal positioning). As I will add below, positioning also refers to how instructors are positioned relative to the students and the content (perhaps called professorial positioning). This post will explore how portfolios can support all three types of positioning in more effective and in less effective ways.

Melissa further explained that positioning occurs at two levels. At the more immediate level, positioning concerns the moment-to-moment process in which students take up opportunities that they are presented with. Over the longer term, students become associated with particular ways of participating in classroom settings (these ideas are elaborated by scholars like Dorothy Holland and Stanton Wortham). This post will focus on identifying two complementary functions for portfolios that help them support both types of positioning.

Portfolios and Artifacts
Portfolios are collections of artifacts that students create. Artifacts support participation because they are where students apply what they are learning in class to something personally meaningful. In this way they make new meanings. In our various participatory assessment projects, artifacts have included

  • the “Quests” that students complete and revise in Quest Atlantis’ Taiga world where they explain, for example, their hypothesis for why the fish in the Taiga river are in decline;
  • the remixes of Moby Dick and Huck Finn that students in Becky Rupert’s class at Aurora Alternative High School create in their work with the participatory reading curricula that Jenna McWilliams is creating and refining;
  • the various writing assignments that the English teachers in Monroe and Greene County have their students complete in both their introductory and advanced writing classes;
  • the wikifolio entries that my students in my graduate classroom assessment course complete where they draft examples of different assessment items for a lesson in their own classrooms, and state which of the several item writing guidelines in the textbook they found most useful.

In each case, various activities scaffold the student learning as they create their artifacts and make new meanings in the process. As a caveat, this means that participatory assessment is not really much use in classrooms where students are not asked to create anything. More specifically, if your students are merely being asked to memorize associations and understand concepts in order to pass a test, stop reading now. Participatory assessment won’t help you. [I learned this the hard way trying to do participatory assessment with the Everyday Mathematics curriculum. Just do drill and practice. It works.]


Problematically Positioned Portfolios
Probably the most important aspect of participatory assessment has to do with the way portfolios are positioned in the classroom. We position them so they serve as a bridge between the communal activities of participatory classroom and the individual accountability associated with compulsory schooling. If portfolios are to serve as a bridge, they must be firmly anchored. On one side they must be anchored to the enactment of classroom activities that support students’ creation of worthwhile portfolios. On the other side they must be anchored to the broader accountability associated with any formal schooling.



To keep portfolio practices from falling apart (as they often do) it is crucial that they rest on these two anchors. If accountability is placed on the portfolio, the portfolio practice will collapse. In other words, don’t use the quality of the actual portfolio artifacts for accountability. Attaching consequences to the actual artifacts means that learners will expect precise specifications regarding those artifacts, and then demand exhausting feedback on whether the artifacts meet particular criteria. And if an instructor’s success is based on the quality of the artifacts, that instructor will comply. Such classrooms are defined by an incessant clamor from learners asking “Is this what you want???”

When portfolios are positioned this way (and they often are), they may or may not represent what students actually learned and are capable of. When positioned this way, the portfolio is more representative of (a) the specificity of the guidelines, (b) the students’ ability to follow those guidelines, and (c) the amount of feedback they get from the instructor. Accountability-oriented portfolios position disciplinary knowledge as something to be competitively displayed rather than something to be learned and shared, and they position students as competitors rather than supporters. Perhaps most tragically, attaching consequences to artifacts positions instructors (awkwardly) as both piano tuners and gatekeepers. As many instructors (and ex-instructors) know, doing so generates massive amounts of work. This is why it seems that many portfolio-based teacher education programs rely so heavily on doctoral students and adjuncts who may or may not be qualified to teach courses. The more knowledgeable faculty members simply don’t have the time to help students with revision after revision of their artifacts as students struggle to create the perfect portfolio. This is the result of positioning portfolios for production.

Productive Positioning Within Portfolios
Portfolios are more useful when they are positioned to support reflection. Instead of grading the actual artifacts that students create, any accountability should be associated with student reflection on those artifacts. Rather than giving students guidelines for producing their artifact, students need guidelines for reflecting on how that artifact illustrates their use of the “big ideas” of the course. We call these relevant big ideas, or RBIs. The rubrics we provide students for their artifacts essentially ask them to explain how their artifact illustrates (a) the concept behind the RBI, (b) the consequences of the RBI for practice, and (c) what critiques others might have of this characterization of the RBI. For example:

  • Students in my classroom assessment course never actually “submit” their wikifolios of example assessments. Rather, three times a semester they submit a reflection that asks them to explain how they applied the RBIs of the corresponding chapter.
  • Students in Taiga world in Quest Atlantis submit their quests for review by the Park Ranger (actually their teacher but they don’t know that). But the quest instructions (the artifact guidelines) also include a separate reflection section that asks students to reflect on their artifact. The reflection prompts are designed to indirectly cue them what their quest was supposed to address.
  • Students in Becky Rupert’s English class are provided a rubric for their remixes that asks them to explain how that artifact illustrates how an understanding of genre allows a remix to be more meaningful to particular audiences.
Assessing the resulting reflections positions portfolios, students, and teachers in ways that strongly support participation. For example, if the particular student’s artifact actually does not lend itself to applying the RBIs, my classroom assessment students can simply indicate that in their assignment. This is important for at least three reasons:

  1. It allows full individualization for students and avoids a single ersatz assignment that is only half-meaningful to some students and mostly meaningless to the rest.
  2. Understanding if and how ideas from a course do not apply is a crucially important part of that expertise.
  3. The reflection itself provides more valid evidence of learning, precisely because it can include very specific guidelines. We give students very specific guidelines asking them to reflect on the RBIs conceptually, consequentially, and critically.

For example, the mathematics teachers in the classroom assessment course are going to discover that it is very difficult to create portfolio assessments for their existing mathematical practices. Rather than being forced to do so anyway (and given a good grade for an absurd example), they can instead reflect on what it is about mathematics that makes it so difficult, and gain some insights into how they might more readily incorporate project-based instruction into their classes. The actual guidelines for creating good portfolios are in the book when they need them; reflecting on those guidelines more generally will set them up to use them more effectively and meaningfully in the future.

Another huge advantage of this way of positioning portfolios is that it eliminates a lot of the grading busywork and allows more broadly useful feedback. In the Quest Atlantis example, our research teacher Jake Summers of Binford Elementary discovered that whenever the reflections were well written and complete, the actual quest submission would also be well done. In the inevitable press for time, he just started looking at the artifacts. Similarly in my classroom assessment course, I will only need to go back and look at the actual wikifolio entries when a reflection is incomplete or confusing. Given that the 30 students each have 8 entries, it is impossible to carefully review all 240 entries and provide meaningful feedback. Rather, throughout the semester, each of the students has been getting feedback from their group members and from me (as they specifically request and as time permits). Because the artifacts are not graded, students understand the feedback they get as more formative than summative, and not as instructions for revision. While some of the groups in class are still getting the hang of it, many of the entries are getting eight or nine comments along with comments on comments. Because the entries are wikis, it is simple for the originator to go in and revise as appropriate. These students are starting to send me messages that, for me, suggest that the portfolio has indeed been positioned for participation: “Is this what you meant?” (emphasis added). This focus on meaning gets at the essence of participatory culture.

In a subsequent post, I will elaborate on how carefully positioning portfolios relative to (a) the enactment of classroom activities and (b) external accountability can further foster participation.

Participation versus Compulsion

In Sleeping Alone and Starting Out Early, Jenna McWilliams offers up a concise summary of the value of blogging for schools. Her post got me reflecting on the complex intersection of participation (in public persistent discourse as she has described) and compulsion (as in the inevitable way that compulsory attendance compels students to attend but not necessarily participate). I am thinking today in the context of the graduate-level education course we are teaching. We are trying to coax some busy teachers who are getting graduate degrees in educational leadership to participate in meaningful semi-public discourse around improving their classroom assessment practices. We have students building and sharing wikifolios where they apply what they are learning about assessment to their own classroom practice, and using forums to discuss the big ideas in the text. The resistance to the participatory aspects from some students is remarkably strong. We have agonized over the various design features that will compel all students to participate more than they would otherwise. While we are finding success, we are in part doing so by linking their participation to a grade. It seems effective but bizarre, for example, to motivate students to engage consequentially and critically in a discussion forum by pointing out that doing so will prepare them to engage conceptually on an exam at the end of the semester.

When I say we are finding success, it is because I see a strong level of engagement emerging across the class, and that the most reluctant participants are indeed engaging in ways that for me meet the level of accountability associated with this required course. If somebody is going to have a graduate degree in educational leadership, then they need to be able to engage in meaningful discussions around that aspect of practice. And in case I drop the ball, there are faculty members in charge of the graduate programs looking over my shoulder (and to some extent watching my back); if we drop the ball there is an accreditation agency out there looking over our collective shoulders (but probably not watching our backs).

So in this regard I do believe we are finding success. But I worry that we are not supporting the handful of students who are disposed (as in have the disposition, a carefully chosen term) to becoming 21st century educational leaders. For example, am I sacrificing the chance to help a couple of these students who might end up keeping a blog that critiques and unpacks local and state accountability practices that are buffeting teachers and administrators in their district in exchange for passing exam scores and adequate teaching evaluations? [Anybody who teaches required courses knows that the way to get great evaluations is to go really easy on students, emphasize what they already know, and make them think they have learned a lot.]

So this is what it boils down to: Compulsory attendance versus scaffolded participation. For me, this is the major issue facing education today. I do think that blogging is a bridge too far for many novices. But I do think that well-structured discussion forums can give beginners the opportunity to try on the identities and try out the discourses of participatory culture. I also want to second Jenna’s shoutout for the collection of participatory media activities that Sam Rose and Howard Rheingold have provided at socialmediaclassroom. It is the best collection out there, and a great starting place for anybody looking to refine a more specific set for particular educational contexts. We are putting some together for the teachers in our project in Monroe and Eastern Greene Counties, and they should be available on a site at ning.com soon.

And finally, what I would not give to be able to put everything else aside to blog. I have four or five posts that wake me up every morning. But then I remember I still have an overdue annual report for the National Science Foundation that I have been working on for a week. I will be lucky if my project officer reads it. But I have to crank it out to keep the grant money coming in so my graduate students can eat and have a place to sleep.

Wednesday, September 9, 2009

Q & A with Henry Jenkins' New Media Literacies Seminar

New media scholar Henry Jenkins is teaching a graduate seminar on new media literacies at the University of Southern California's Annenberg School for Communication. The participants had raised the issues of assessment and evaluation, especially related to educational applications of new media. Henry invited Dan Hickey to Skype into their class to field questions about this topic. They perused some of the previous posts here at Re-Mediating Assessment and proceeded to ask some great questions. Over the next few weeks, Dan and other members of the participatory assessment team will respond to these and seek input and feedback from others.


The first question was one they should have answered months ago:



Your blog post on what is not participatory assessment critiqued prevailing assessment and testing practices. So what is participatory assessment?

The answer to this question has both theoretical and practical elements. Theoretically, participatory assessment is about reframing all assessment and testing practices as different forms of communal participation, embracing the views of knowledgeable activity outlined by media scholars like Henry Jenkins, linguists like Jim Gee, and cognitive scientists like Jim Greeno. We will elaborate on that in subsequent posts, hopefully in response to questions about this post. But this first post will focus more on the practical answer.

Our work in participatory assessment takes inspiration from the definition of participatory culture in the 2006 white paper by Project New Media Literacies:
not every member must contribute, but all must believe they are free to contribute when ready and that what they contribute will be appropriately valued.

As Henry, Mimi Ito, and others have pointed out, such cultures define the friendship-driven and interest-driven digital social networks that most of our youth are now immersed in. This culture fosters tremendous levels of individual and communal engagement and learning. Schools have long dreamed of attaining such levels but have never even come close. Of course, creating (or even allowing) such a culture in compulsory school settings requires new kinds of collaborative activities for students. Students like those in Henry’s class, and students in our Learning Sciences graduate program are at the forefront of creating such activities. Participatory assessment is about creating guidelines to help students and teachers use those activities to foster both conventional and new literacy practices. Importantly, these guidelines are also intended to produce more conventional evidence of the impact of these practices on understanding and achievement that will always be necessary in any formal educational context. Such evidence will also always be necessary if there is to be any sort of credentialing offered for learning that takes place in less formal contexts.


Because successful engagement with participatory cultures depends as much on ethical participation (knowing how) as it does on information proficiency (knowing what), at the most basic practical level participatory assessment is intended to foster both types of know-how. More specifically, participatory assessment involves creating and refining informal discourse guidelines that students and teachers use to foster productive communal participation in collaborative educational activities, and then in the artifacts that are produced in those activities. Our basic idea is that before we assess whether or not individual students understand X (whatever we are trying to teach them), they must first be invited to collectively “try on” the identities of the knowledge practices associated with X. We do this by giving ample opportunities to “try out” discourse about X, by aggressively focusing classroom discourse towards communal engagement in X, and by discouraging a premature focus on individual students’ understanding of X (or even their ability to articulate the concept of X). Premature focus on individual understanding leaves the students who are struggling (or have perhaps not even been trying) self-conscious and resistant to engagement. This will make them resist talking about X. Even more problematically, they will resist even listening to their classmates talk about X. Whatever the reason the individual is not engaging, educators must help all students engage with increased meaningfulness.
To do participatory assessment for activity A, we first define the relevant big ideas (RBIs) of the activity (i.e., X, Y, and perhaps Z). We then create two simple sets of Discourse Guidelines to ensure that all students enlist (i.e., use) X, Y, and Z in the discourse that defines the enactment of that activity. Event Reflections encourage classrooms to reflect on and critique their particular enactment of the activity. These are informal prompts that are seamlessly embedded in the activities. A paper we just wrote for the recent meeting of the European Association for Research on Learning and Instruction in Amsterdam discussed examples from our implementation of Reading in a Participatory Culture, developed by Project New Media Literacies. That activity, Remixing and Appropriation, used new media contexts to teach conventional literary notions like genre and allusion. One of the Event Reflection prompts was

How is the way we are doing this activity helping reveal the role of genre in the practice of appropriation?


Given that the students had just begun to see how this notion related to this practice, they struggled to make sense of such questions. But the prompt set the classroom up to better appreciate how genre was just as crucial to Melville’s appropriation of the Old Testament in Moby-Dick as it was to the music video "Ahab" by nerdcore pioneer MC Lars. The questions are also worded to introduce important nuances that will help foster more sophisticated discourse (such as the subtle distinction between a concept like genre and a practice like appropriation).
Crucially, the Event Reflections were aligned to slightly more formal Activity Reflections. These come at the end of the activity, and ask students to reflect on and critique the way the particular activities were designed, in light of the RBIs:

How did the way that the designers at Project New Media Literacies made this activity help reveal the role of genre in the practice of appropriation?


Note that the focus of the reflection and critique has shifted from the highly contextualized enactment of the activity to the more fixed design of the activity. But we are still resisting the quite natural tendency to begin asking ourselves whether each student can articulate the role of genre in appropriation. Rather than ramping up individual accountability, we first ramp up the level of communal discourse by moving from the rather routine conceptual engagement in the question above to more sophisticated consequential and critical engagement. While these are not the exact questions we used, these examples capture the idea nicely:

Consequential Reflection: How did the decision to focus on both genre and appropriation impact the way this activity was designed?

Critical Reflection: Can you think of a different or better activity than Moby-Dick or Ahab to illustrate genre and appropriation?


We are still struggling to clarify the nature of these prompts, but have found a lot of inspiration in the work of our IU Learning Sciences colleagues Melissa Gresalfi and Sasha Barab, who have been writing about consequential engagement relative to educational video games.


The discourse fostered by these reflections should leave even the most ill-prepared (or recalcitrant) participant ready to meaningfully reflect on their own understanding of the RBIs. And yet, we still resist directly interrogating that understanding, in order to continue fostering discourse. Before jumping to assess the individual, we first focus on the artifacts that the individual is producing in the activity. This is done with Reflective Rubrics that ask the students to elaborate on how the artifact they are creating in the activity (or activities) reflects consequential and critical engagement with the RBIs. As will be elaborated in a subsequent post, these are aligned to formal Assessment Rubrics of the sort that teachers would use to formally assess and (typically) grade the artifacts.

Ultimately, participatory assessment is not about the specific reflections or rubrics, but the alignment across these increasingly formal assessments. By asking increasingly sophisticated versions of the same questions, we can set remarkably high standards for the level of classroom discourse and the quality of student artifacts. In contrast to conventional ways of thinking about how assessment drives curriculum, former doctoral student Steven Zuiker helped us realize that we have to think about the impact of these practices using the anthropological notion of prolepsis. It helps us realize that anticipation of the more formal assessments motivates communal engagement in the less formal reflective process. By carefully refining the prompts and rubrics over time, we can attain such high standards for both that any sort of conventional assessment of individual understanding or measure of aggregated achievement just seems… well… ridiculously trivial.
So the relevant big idea here is that we should first focus away from individual understanding and achievement if we want to confidently attain it with the kinds of participatory collaborative activities that so many of us are busily trying to bring into classrooms.