The Law of Small Numbers

From an early Daniel Kahneman article referenced in Thinking, Fast and Slow about how the poor instincts of researchers selecting samples can lead to undersampling:

“The law of large numbers guarantees that very large samples will indeed be highly representative of the population from which they are drawn. If, in addition, a self-corrective tendency is at work, then small samples should also be highly representative and similar to one another. People’s intuitions about random sampling appear to satisfy the law of small numbers, which asserts that the law of large numbers applies to small numbers as well.”*

Hint: it doesn’t apply. Based on the experience of reading numerous LIS studies and surveys, a lot of librarians implicitly believe in the law of small numbers. Of course, I might have just read an unrepresentative sample and they really don’t.

*Tversky, Amos, and Daniel Kahneman. “Belief in the Law of Small Numbers.” Psychological Bulletin 76, no. 2 (August 1971): 106.

Part 2 in P2P Review: an Elaboration

Today the Library Journal published the follow-up to my previous column about information literacy as an unnatural state: Education is no Salvation. In that one I’m trying to explore the motivated reasoning and cognitive bias literature a little more, with the goal of showing what we’re up against when educating people to be “information literate.” Definitely still a work in progress.

The question I’m now asking myself is why. What difference does it make if we’re more aware of cognitive bias, motivated reasoning, and all the tricks the mind plays? For the most part I’m content with Aristotle’s maxim that humans by nature desire to know, but librarians tend to be a practical breed, and the question I’ve often gotten when doing anything theoretical is what difference it will make in practice. Right now, I don’t know, but every practice is based on some theoretical construct, usually one we apply unawares.

In providing some context to the last column, I used the phrase “scholarly habitude” to describe what I think is one of the aims of higher education, at least in the traditional arts and sciences. It’s not a list of things we can do, but a state of being, a frame of mind, something along those lines. In some ways I’m going back to Aristotle and the notion of virtue ethics. Scholarly habitude captures better than “information literacy” the sense that being a scholar or academic researcher isn’t just about having a set of rules to follow. It’s also about being a certain kind of person: intellectually curious, skeptical, requiring evidence for at least some beliefs, etc. These traits aren’t necessarily abundant in people.

I’m thinking about this mostly in terms of teaching undergraduates how to research and write scholarly essays, which most of them are expected to do at some point. An example of one approach I mentioned in today’s column: the student who has something to say and wants some scholarly sources to support it. That’s the exact opposite of the way people should approach research, but it’s the way that is most natural and most in accord with how the human mind seems to work. We make snap judgments and then try to justify them. As Daniel Kahneman puts it in Thinking, Fast and Slow, our “System 1” comes to a conclusion very quickly, while our slower and more thorough “System 2” is usually happy just to accommodate System 1 without further prodding because it’s also lazy. As Michael Shermer puts it in Why People Believe Weird Things, “smart people believe weird things because they are skilled at defending beliefs they arrived at for non-smart reasons.”

The wrong way to approach research on an unfamiliar topic is to have an opinion and then look for sources to justify it. The right way is to look for evidence and follow the evidence where it leads. There’s an academic analogy for this, but I’m not sure how far I want to pursue it. It’s similar to the distinction between theology and religious studies. I don’t want to say theology isn’t scholarly, just that it’s not really in accord with current information literacy standards in some ways. Theology can be defined as faith seeking understanding, meaning the theologian believes something to be true and then seeks to understand and justify that belief. Although I’ll leave open the possibility that some people do, in general people don’t hold religious beliefs for rational reasons based on evidence that could withstand public scrutiny. That’s why most religious people tend to practice the religion they grew up with and few convert to a different religion. Children growing up in a religion didn’t rationally choose to follow that religion, although later on many of them seek to understand their faith in a rational way. Hence, theology.

Religious studies, on the other hand, takes a different approach. We can use the insider/outsider distinction. Theologians study a religion from the inside, while scholars of religion often come at religions from the outside, trying to understand those religions without necessarily practicing them. They approach the available evidence and try to make sense of practices that might seem bizarre to outsiders, and to outsiders all religions have their bizarre practices. Understanding a religion as an outsider partly means explaining why strange practices don’t just exist because they’re practiced by crazy people. “They eat the body and drink the blood of who again?” “What kind of loving god would forbid bacon!?” “Your religion says I can’t publish a picture of this guy? WTF?” I’ve noticed lots of people like to make fun of Scientology without considering what their own religion looks like to people who don’t practice it.

Or a slightly different analogy, a bit broader. The traditional foil of theology is philosophy, and during the European Middle Ages philosophy was considered the handmaiden of theology, at least by the Catholic Church, and they were the intellectual standard that mattered. During the 17th and 18th centuries, philosophy broadly conceived came into its own again, and philosophy became the queen of the sciences. Every study that wasn’t motivated by religion could be considered philosophy, and indeed what we now call natural science was called natural philosophy in the 18th century. That’s why we now have PhDs, doctorates of philosophy, for disciplines that we don’t consider to be philosophy by contemporary standards, because they’re all involved in the same Enlightenment-driven enterprise: to discover and disseminate knowledge about the world. The way to do that is to approach the world with as few preconceptions as possible and see what you find. That approach explains why we (or perhaps “we”) no longer believe that demons cause epilepsy or the earth is the center of the universe. Academics follow the evidence, unless they’re economists or philosophers, because those people just make stuff up.

If we use the theology/philosophy analogy, what we’re trying to do when we try to teach students about academic research is move them from a theological mindset to a philosophical one, where the preconceptions, uninformed beliefs, and cognitive biases don’t motivate all of their reasoning. Writing what they know isn’t a good idea, because they don’t know very much, their experience of the world is limited, and their experience of scholarship even more so. Those preconceptions and biases instead should become objects of investigation themselves. That boundary has to be crossed before they can begin to examine evidence in the way information literacy standards suggest. Part of a good liberal education is about breaking down your past self to prepare to develop a better self.

So where does this leave library instruction? If all these cognitive biases and preconceptions are completely natural, extremely difficult to overcome, and probably impossible ever to completely overcome, how does this affect us practically? For one thing, it should lower high expectations. If you were unaware of all the ways the mind obscures and distorts reality for our benefit and how difficult making the philosophical leap really is, and you were already frustrated by how little you could get done in the hour you might spend with a class, this news should lower your expectations and perhaps explain your frustrations. If you thought a little library research instruction was going to have a remarkable effect, you should probably change your opinion.

Then there’s the question, what the heck do librarians do instead, or in addition? I don’t have any ideas on that yet, but I’m convinced so far that librarians play much more of a support role in this enterprise than some think we do.

Plagiarism and Library Research Guides

A couple of weeks ago I had an unusual request. A librarian wanted to use one of my Libguide pages as an example of citing sources in research guides. It seems the dean of the library or someone had expressed concern that the librarians weren’t paying enough attention to plagiarism within Libguides and wanted a presentation to raise consciousness.

I have to say, it’s not a subject I usually think about. As far as I can tell, librarians have always had a culture of sharing about research guides. It’s not like we’re doing original research here. There are only so many ways to describe the research process or annotate a database. And though we seem to have become the citation police, librarians aren’t the plagiarism police, at least not on my campus. There are other academic units for that. While I’ve tried to assign credit when I blatantly copy or adapt something, I’ve given permission to everyone who’s wanted to use some of my Libguide material to do whatever they like with it, and I’ve never bothered to check whether people were citing or linking back to me. Now that I’m thinking about it, I wish Libguides could be published with some sort of Creative Commons license.

Eventually, I tried to find some examples of plagiarism in Libguides to see if this was widespread. It wasn’t hard. All I had to do was search Google for PLAGIARISM LIBGUIDES. The first guide that came up was this one with a page on avoiding plagiarism. That one has a section beginning, “Each day we take ideas from others without acknowledging the original source.” That’s probably true. In this case, there’s also a sidebar with a warning that begins, “Changing the words of an original source is not sufficient to prevent plagiarism.” That’s an unattributed warning, I might add, although based on the 73 results that come up in Google for that phrase, the source seems to be a document from Turnitin. Ooops.

And we get some interesting results if we search for the phrase “Each day we take ideas from others without acknowledging the original source.” That phrase, along with an entire section explaining plagiarism, shows up on at least three other Libguides, none attributed. Looking at the four, it’s impossible to tell who was first, or if all four are plagiarizing some third document.

So, plagiarism in Libguides definitely happens, and it’s ironically amusing that guides are plagiarizing each other to warn about plagiarism. Should we worry about it or try to do anything about it? I’m thinking probably not. While it might bother me to have an article or blog post blatantly plagiarized, I just don’t have the same strong feelings about library research guides. Unlike with other types of writing, with research guides we’re all in this together, and using stuff that works for research guides helps everyone. It’s important for scholars to attribute ideas and phrasing to their sources so they don’t pass someone else’s ideas off as their own. But with library research guides, there just aren’t that many original ideas. The Libguides platform itself is built on the assumption that we want to easily borrow stuff from other guides, especially within our own institution. But perhaps I’m missing something and this is somehow a big deal.

Some Context for the Latest P2P Review Column

My latest Peer to Peer Review column in the Library Journal came out today, Information Literacy as an Unnatural State. This is my first effort to pull together ideas I’ve been writing and thinking about information literacy, the persistence of pseudoscience, and cognitive bias for the past year and a half. Possibly there will be some ancient philosophy in there eventually as well (e.g., Stoicism and philosophical Daoism), but I’m not sure yet. My tentative claim is that what we think of as information literacy, and indeed the entire academic enterprise, is deeply unnatural, and that instead of thinking about IL as a set of competencies, we should think about it some other way. I’m not sure what way yet, but the idea I’m playing around with I’m calling “scholarly habitude,” meaning roughly that what distinguishes the information literate/scholarly person isn’t just the ability to do certain things, but a set of habits or frames of mind relative to the world, and that it’s much harder to achieve than reading through a set of competencies might indicate. I’m also not sure yet what specific role librarians would play in developing those habits.

Anyway, the LJ column is a tentative first step to something that might grow larger over time, so if anyone has any questions or criticisms, I’d appreciate them. The more and earlier the better.

On the “Sting”

The latest buzz in the OA community seems to be the story of the so-called sting of OA journals, large numbers of which accepted a bogus paper with little to no peer review. The Chronicle article captures the story well. The journal Science, which published the “sting,” claims it exposes the “dark side of open access publishing.” I guess the dark side of subscription publishing has been well known for so long it’s good other dark sides are exposed. Critics have complained about the quality of the study/sting itself and the fact that it targeted only open access journals, even though (shockingly!) subscription science journals can be just as susceptible to flawed peer review, including Science itself.

I’m still trying to figure out what all the hubbub’s about. Okay, so only open access journals were targeted (including several owned by Elsevier and other subscription science publishers). Okay, a whole bunch of the publishers on Beall’s List of Predatory Publishers turn out to be predatory publishers. All you have to do is start exploring some of those publishers to figure out they’re hardly reputable.

Putting aside the potential bias of the subscription journal Science trying to spin this as a sting that shows how subscription journals are more trustworthy than open access journals, isn’t it beneficial to know just which dubious OA journals are in fact little more than scams? Beall himself might have an anti-OA bias and might believe that the subscription Big Deals have been a big success for libraries (although I still don’t believe the numbers back him up on that), but that doesn’t mean he’s not doing the world a service by identifying suspicious publishers. Identifying suspicious OA publishers is good for the OA movement.

The only way this could be harmful to the OA movement in general is if someone claimed that this “sting” somehow proved that the OA process is inherently flawed. That would be a stupid and unsupportable claim based on the evidence at hand. In fact, despite the fact that every other Indian citizen seems to be creating a dubious OA journal, numerous OA journals didn’t fall victim to the bogus article. Is anyone making that claim?

What we can learn from this episode is that there are a lot of shady publishers trying to make money. We live in a world where Elsevier published fake medical journals for profit. Does it really come as a surprise that lots of enterprising people want to find a way to make a profit from a flawed system of scholarly communications? But just as the mission of science isn’t to support Elsevier’s bottom line, neither is it to support questionable OA publishers around the world. They should be outed and avoided. Maybe the bigger lesson is that wherever money is involved in scholarly communication, someone’s going to try to make a profit, whether it’s Elsevier or some desperate guy in India with access to the Internet.

Radical Collaboration

For an ACRL committee producing a report, I’m investigating a category called “radical collaboration.” That basically means collaboration among academic libraries in relatively new ways, with collection development or public services or anything else.

If anyone knows of any examples of new types of collaboration among academic libraries, I would greatly appreciate it if you’d let me know, either in the comment section or via email at rbivens@princeton.edu.

Thanks very much.

Review: Jesse Shera, Librarianship, and Information Science

If you’re not familiar with the thought of Jesse Shera, you should be, and an easy place to begin that familiarity is Jesse Shera, Librarianship, and Information Science by H. Curtis Wright. This was originally published as Occasional Research Paper no. 5 by the School of Library and Information Science, Brigham Young University in 1988, and is now reprinted with a new introduction and index by the Library Juice Press.* Since the library school at BYU has been closed for 20 years, I’m assuming this has been out of print for a long time. Welcome back.

Some might call it a biography, and a review of the first edition in 1988 criticized it as a “run in attempt” at a biography. However, biography is the wrong word to describe the book. Yes, we find out a little bit about Shera’s childhood history and early manhood and a little bit more about his early career in libraries. However, the bulk of the study isn’t about Shera’s life, but his thought, specifically his intellectual journey from believing information science provided the theoretical foundation of librarianship to his belief that “symbolic interactionism” instead provides that foundation. This is combined with an extensive, possibly exhaustive, bibliography of Shera’s 57 years of publications. Of the 120-or-so page book, roughly half is the lengthy essay on Shera’s thought and half the bibliography. The combination makes this an indispensable volume to begin a serious study of Shera.

Early exposure to librarianship in the 1920s convinced Shera that librarianship as it had traditionally been practiced was a cramped and overly practical affair, and he spent the rest of his career trying to reform the profession, at first from the inside, later as a professor of library science at Chicago, and finally as the Dean of the library school at Case Western Reserve. During the 1940s and 1950s, Shera came to believe that the theoretical salvation lay with information science and technology. He was a cofounder of the reorganized American Documentation Institute, and cheered on the impressive gains of information science during the period. Eventually he changed his mind, saying much later that “twenty years ago, I thought of what is now called information science as providing the intellectual and theoretical foundations of librarianship, but I am now convinced that I was wrong” (41).

He changed his mind because he came to believe that librarianship is a humanistic affair involved with human communication, knowledge, and ideas. Information science is no such thing. While information science can provide useful tools and improve processes, it can never be the theoretical foundation of a field primarily involved with humans communicating ideas. “Information science . . . deals with only a part of what the librarian does” (45). Regardless of the prevalence of information science and technology useful to librarians, Shera believed that “the social purpose of the library remains unchanged–to bring the human mind and the graphic record together in a fruitful relation” (44). Thus, while librarianship might make use of science, it isn’t itself a science, and it has little to do with the information in information science.

At this point in the argument it might be useful to define terms for those unfamiliar with the debate. Most librarians believe we’re in the information business. We even have desks that say “information” on them, so that everyone knows what we do. And, in a sense, we are in the information business. However, the “information” in information science isn’t the same thing as the “information” that librarians trade in. (For a lengthy discussion of what “information” means to information scientists, I recommend James Gleick’s The Information. For a totally unrelated adventure story about a woman who trades in information in the sense librarians deal with, you might try Taylor Stevens’ The Informationist.) Here’s a key paragraph from Wright:

It was librarians, Shera reminds us, who “eagerly seized information science as potential supports to their . . . professionalism.” But information science, he says, has “misinterpreted [Claude] Shannon and [Warren] Weaver’s specialized use of the noun information and assumed that it related to the communication of knowledge rather than the transmission of signals.” This has created a genuine problem for librarianship, because Shannon was interested solely in creating a theory of physical signals for describing “the message-carrying capacity of a symbol, a telephone wire, or any other medium or channel of communication.” (47)

Information science is concerned, according to Shera, purely with the transmission of signals, while librarianship is founded in human interactions and is concerned with ideas and knowledge as well as information. While the efficient transmission of signals or the storage of information in the IS sense is a necessary part of librarianship, it’s not a sufficient part.

Shera finally believed that “symbolic interactionism” should provide the theoretical foundation of librarianship. Symbolic interactionism is a theory borrowed from George Herbert Mead. Supposedly, unlike information science or systems theory, symbolic interactionism “investigates the psychophysical interaction of the empirical order and the ideative order in human beings by studying the relationship between the physical symbol and its symbolic referent” (55). While I accept the humanistic nature of librarianship, I wasn’t convinced that symbolic interactionism as such provides a theoretical foundation of the profession, and there wasn’t sufficient argument in the book to persuade me. It is perhaps the one flaw in the book that Wright, a friend and former student, provides little critical distance from Shera, because precisely at this point I would have preferred a little critical analysis in addition to the clear explanation of Shera’s thought.

However, that wasn’t the purpose of the book. There was enough to explain what Shera believed and to some extent why, and ample resources in the bibliography to follow Shera further if I cared to argue with him. So, overall, a satisfying volume, a quick read, and a passionate introduction to Shera’s thinking. Anyone concerned with what librarianship is or should be would profit from reading the book.

*[Disclosure: Library Juice Press published my book Libraries and the Enlightenment.]

Opting In

Back from a long vacation, caught up with work that piled up while I was gone, and ready to catch up on my library lit reading. So I started reading, backwards from this to this to this to this. I can say one thing for Rick Anderson, he knows how to get a debate going.

The debate concerns an Ithaka “issue brief” by Anderson called Can’t Buy Us Love. The basic thesis, as I understand it, is that research libraries should devote more resources to digitizing their special collections and making them discoverable. I don’t think anyone disagrees with that claim, which is probably why there’s not much discussion of it. This increased emphasis on special collections will require a shift of resources away from something, and for Anderson that something is “commodity documents,” by which he means documents easily available cheaply elsewhere, especially “trade books that are produced in large print runs.” The recurring example is a 1975 printing of East of Eden. If I’m reading it right, he’s saying that research libraries should maybe buy fewer popular books published in America, devote fewer resources to housing them indefinitely, and devote more of that money to special collections processing and digitization. That seems to me a plausible interpretation of the basic argument, which isn’t especially provocative even if one disagrees with it.

The controversy seems to be about two issues: the question of what constitutes commodity documents and their relationship to the mission of research libraries, and the claim that focusing on special collections and moving away from “commodity documents” somehow opts out of the so-called scholarly communications wars, because digitizing our own special collections “is neither undermining the existing scholarly communication system (except to the extent that it pulls collections money away from commercial purchases) nor supporting it.”

Anderson claims that, “With the advent of such internet-based outlets as Amazon Marketplace and Bookfinder.com, however, every home with an internet connection has direct access to the holdings of thousands and thousands of bookstores around the world, and the likelihood of finding a remaindered or used copy—often at a price of literally pennies, plus a few dollars in shipping—is very high.” It seems to me that the scope of “commodity documents” is pretty small compared to the breadth of research library collections. Anderson already eliminates the budget busting scholarly journals. University press publications aren’t nearly as cheap and readily available as old bestselling novels. Foreign publications aren’t so accessible after a while. Trade books in large print runs aren’t a huge percentage of a lot of research libraries’ expenditures, but possibly buying fewer of them, or perhaps keeping fewer of them as they get older and less used, would provide some savings that could be devoted to special collections. So what if it might be true, as Anderson claims, that “the library’s role as a broker, curator, and organizer of commodity documents is fading,” if commodity documents as such are a relatively small part of research library collections, which I believe to be the case? On this one, I could agree with his basic claim without thinking it particularly radical or controversial.

The other controversy about “opting out” of the scholarly communications wars could be puzzling, because as it’s framed the proposal has nothing to do with the scholarly communication wars. Whatever wars there are concern commercial scholarly journals, almost all STEM titles, and these are deliberately left out of the scope of discussion. That claim is simply irrelevant to the main argument about special collections versus commodity documents. Reread Anderson’s article without the “Opting out of the scholarly communications wars” section, and see if that harms the piece at all. The key, though, is that the argument is framed to avoid problems in scholarly communications, except that can’t really be done.

Instead of being an unnecessary diversion, the section about scholarly communications wars is more a sleight of hand. It’s pulling a rabbit out of a hat while ignoring the elephant in the room, if I can mix my cliches. The basic claim is that libraries should digitize and make available more of their unique content, which, of course, lots of libraries are already doing. The resources to do more of that have to come from somewhere. Libraries could buy even fewer popular books than they already do. Or, maybe, they could opt into the scholarly communication wars, do their best to promote green OA, and reduce the stranglehold of commercial STEM publishers, because that’s where most of the money goes. When it comes to discussing where resources go within libraries, nothing escapes the scholarly communications wars. You can simply refuse to talk about it. You can claim that librarians doing so are putting politics above patrons. You can pretend that budgets for books and other resources just gut themselves. But you can’t have an honest discussion about where scarce resources in libraries should go without talking about problems in scholarly communication, whichever side of the issue you’re on.

Like It Was Written Yesterday

Another part of historical change rhetoric I’m looking at is the persistence of themes. Read these quotes and see if you could tell when they were written purely based on the language:

Applying technology is not a “one time” event, it is a continuing activity, since technology, whatever form it takes, is constantly changing. This reality is a key aspect of librarianship today and helps explain why our profession is clearly in transition….

Even without our deliberate choice, changes are being imposed on our working environment by technology as well as by other pressures external to and within our profession. In the past decade librarians have discovered that we must either initiate change or adapt to it. We simply can’t ignore new developments and hope they will leave us untouched. An ostrich-like attitude is downright dangerous….

Perhaps, as is already true in some specialized library service assignments, advanced degree studies in addition to MLS/information science type training will be frequently expected. Position ads in current library journals already show quite a variety of preferred background qualities….

Those are from an article in the Library Journal from 1985, “Managing Change: Technology and the Profession” by Karen L. Horny. The “past decade” in which librarians discovered they had to change was a big decade for library automation, although Horny also does a pretty good job of predicting how the rise of personal computers, the digitization of content, and the ability to do things “online” will change libraries and information. (“Perhaps electronic readers will become so compact and legible that it will be possible to curl up with a good online novel!”)

Statements almost identical to these appear in the current library literature all the time. The age of such themes–combined with the significant changes that have occurred in libraries over the last 30 years–seems to me an indication that angry or frustrated attacks on current librarians as hopelessly resistant to change don’t have much evidence to support them. The elder librarians around today were the very ones implementing all the significant technological changes that resulted not from the Internet or the rise of social media, but from the initial automation of catalogs and indexes starting in the 1960s. It seems to me that wave of technological change was much more shocking for librarians of the time than our current situation, which is more or less a steady development building upon the drastic and rapid change that really happened 30 or so years ago. As people get older, perhaps they get more resistant to change, or perhaps not, but the retiring generation of librarians certainly lived through and implemented significant and rapid change.

Here’s another quote I left out because of the date:

It has been especially rewarding to see that some of a library’s longer-term employees have the greatest sense of the new technology’s benefits, since they can recall, often quite vividly, the limitations of former manual operations. It is also true that for most people who have entered the library field since the early 1970s, change is the accepted norm.

I’ve encountered plenty of examples of the exact same sentiment among librarians writing today, except the time frame is since the beginning of the 21st century or some such. Librarians without a historical knowledge of how technology has affected librarianship for the past 45 years or so are always in danger of making foolish claims about the current state of the profession.

30 Years of Change and Hype

For a possible research project, I’m reading around in the historical library literature about change in libraries. Here’s a great quote from John Berry in a Library Journal editorial from 10/15/83 about the first LITA conference:

The usual band of cheerleaders delivered typical, often condescending, pleas for everyone to get on this or that automation bandwagon, and the usual “experts” delivered typical indictments of working librarians who offered any resistance to the cosmic imperatives of the new age.

I’m trying to get an idea of just how long hyperbolic change rhetoric in librarianship has generated a specific kind of criticism, not of the change, but of the rhetoric. Now I know it’s been at least 30 years.