“Where Ideas Are Free:” Scientific Knowledge in the Algorithm Economy

Patrick Vonderau




Close to the end of its first season’s penultimate episode, the HBO comedy series Silicon Valley shows its gawky programmer heroes joining TechCrunch Disrupt SF, an actual annual convention in San Francisco where technology startups launch their ideas on stage. “Pied Piper,” a fictive “multi-platform technology based on a proprietary universal compression algorithm” according to the company’s equally fictive website, here competes with “Bitflenser,” “YogaMasta,” and “ImmediBug” for investors. Parodying the clichéd language of innovation spouted by real IT entrepreneurs, the episode culminates in a montage of various startup CEOs pitching their inventions as “making the world a better place” (see figure below).

“We hope in this way to make the world a better place by providing a suite of compression services across diversified market segments.” [YouTube]


Media scholars are familiar with promotional claims of how information technologies would “revolutionize” the distribution or production of content, and how the digital would make media more “social,” “mobile” or “participatory,” with “infinite shelf-space” for content, “more direct contact” between consumers and digital providers, and “real-time, in-depth analytics” for “tracking success” in digital markets built on the “wisdom of crowds.”[1] Earlier research on media industry transformation often adopted such positive industry claims,[2] while a more recent wave of studies has begun to critically detail the complex historical, local, material, and socio-technical architecture of digital delivery systems.[3] Yet while changing distribution patterns for music, film, and television have been subjected to such studies that even involve critical industry dialogue and other forms of qualitative social research,[4] the distribution of academic research itself has not been the object of any media industry study so far.

One reason for this is that until recently, academic networking and publishing were not part of social media as a set of industries whose markets are organized via the Internet and digital platforms. What is more, the traditional standards and norms of academic communication seemed to run counter to the practices of consumer social media in boyd and Ellison’s sense of “web-based services that allow individuals to (1) construct a public or semi-public profile within a bounded system, (2) articulate a list of other users with whom they share a connection, and (3) view and traverse their list of connections and those made by others within the system.”[5] As TechCrunch spun it in a 2013 article, academia is “one of the very last bastions to be affected by the free, unfettered flow of information on the web.”[6]

Conjuring up a situation of crisis in scholarly publishing, web IT startups have over the past years joined proponents of the Open Science movement and tech journalists in a concerted effort to make the academic world a better place, too. Building on earlier non-commercial initiatives such as the Social Science Research Network (established 1994) or H-Net (since 1995), a flurry of startups now attempt to match the “Web 2.0” idea of social networking with the alleged primal need of scholars to constantly publish and find fresh research. These include services such as Philica (“Where Ideas Are Free”), Epernicus (“Increasing Research Productivity”), Academici (“Tapping into Untapped Potential”), and most importantly, Academia.edu (“Accelerating the World’s Research”).

With reportedly more than 16 million users and 15.7 million unique visitors each month, Academia.edu currently surpasses the competing services ResearchGate, ScienceExchange, and Mendeley in capitalizing on the “free flow of information” among scholars. Founded in 2008, the platform had, as of January 2015, raised 17.7 million U.S. dollars in venture capital.[7] TechCrunch repeatedly promoted the service as an alternative to “traditional, less transparent and more expensive publishing models,”[8] exposing the “slow-moving world of academia” as an “older, more opaque world in which researchers vied to get into elite journals,”[9] and as one marked by an outdated, inefficient, low-quality peer review system resulting in occasional fraud.[10] Asked in a commissioned interview how they were “making the world a better place,” Academia.edu’s CEO Richard Price and software engineer Nate Sullivan responded:

Nate: I think freedom of information is a powerful equalizing force in the world. If people in the first world, who have the privilege of being at wealthy institutions, are the only ones able to access research, that’s inegalitarian. And yet, the way we encourage people to make their papers open-access is not by making emotional or ethical appeals, it’s by creating compelling products. That’s why we think a lot about how to align individual incentives with stuff that’s good for the research community as a whole. We like that academics join for career reasons, and then as a positive side effect they end up making their papers freely available to everybody.

Richard: It’s very difficult to get a job in science and research these days. If we can make that easier, that’s also a great thing. Users frequently report sharing their papers’ analytics with their tenure committees to make their case more compelling.[11]

In what follows, I aim to briefly contextualize these claims and the visions or business plans of platforms such as Academia.edu. Taking a socio-epistemological perspective, this essay suggests a closer look at Academia.edu’s system of knowledge and reputation management in order to promote transparency about the epistemological status of tools such as rankings, recommender systems, and peer-to-peer review that are currently being heavily promoted to become new standards for academic practice, especially in the United States. As Origgi and Simon pointed out five years ago, scholars need to discuss whether their trust in these systems is “based on uncertain heuristics and optimistic biases” or indeed on “reliable procedures and mechanisms we are able to monitor” as scholars ourselves.[12] In this line of thinking, my short intervention will conclude by suggesting an engagement with a “collaborative audit” of Academia.edu’s metrics, following Sandvig et al.’s proposal for scrutinizing algorithms that so far remain opaque to public understanding.[13] The proposal consists of reverse engineering Academia.edu’s reputation management system by publishing both specifically designed papers and the resulting metrics on Academia.edu. The guiding question is simple: If the academic world indeed needs to be made a better place, in what ways are academics invited into this process? And how untransparent, uncertain, slow-moving, unfree, inaccessible, old-fashioned, inefficient, unsocial, biased, unequal, or closed are traditional academic communities in terms of their socio-epistemological organization?

As a platform for sharing research-related publications, Academia.edu offers academics a number of advantages. Fully searchable via Google even for non-registered users, the platform has made scientific knowledge accessible to a broader public. It facilitates the discovery of new research, promotes scholarship in languages other than English, and offers an alternative to the expensive paywalls of academic journals and the fees required for open access publishing. Academia.edu minimizes the time lag between final draft and first response to peer-reviewed articles, facilitates the spread of unorthodox formats and grey literature, and allows the free creation of tags for existing or emerging research fields. The platform thus contributes to debates among digital humanists on how to establish, produce, organize, and disseminate scientific knowledge, as well as to debates about the larger ecosystems for academia today. Leaving practical drawbacks aside, it is nevertheless hard to overlook the discrepancy between Academia.edu’s mission statement and its business model.

While Academia.edu is widely presented to academics as a profession-oriented social networking service that will speed up the sharing of research papers and come to replace traditional forms of peer review, the monetization plan justifying the platform’s venture capital funding is less often revealed. In an interview conducted by Radio Berkman in 2013, CEO Richard Price explained that the site’s gradually optimized algorithm for tracking readings would at some point allow the site to “sell the fruit of that peer review system” to “pharmaceutical companies” and other corporations in need of real-time “peer review signals.” Describing Academia.edu’s service as an algorithm for identifying which papers have been highly consumed by influencers, Price suggested the possibility of a “trending papers dashboard” that would help “people who deploy capital” to turn science into patents or products, implying that instant qualified crowd feedback could be converted into competitive advantage.[14] Although this business model is hardly unique—it resembles that of Elsevier’s platform Mendeley—there is a discrepancy between a service that sells usage-based metrics to corporations and a service promoting Open Science, as inscribed in the .edu top-level domain name, which since 2001 has been reserved for United States-affiliated institutions of higher education. This discrepancy is not coincidental, but tactical. As Mendeley founder Victor Henning put it:

The fifth and perhaps biggest problem in scaling usage-based metrics lies in overcoming social hurdles rather than technical ones. How might we convince scholars to take part in generating and accepting usage-based metrics? In our opinion, the answer is that doing so must confer some type of utility to scholars beyond the idea of contributing to a fuzzy ‘greater good of science.’ More specifically, the tool that does the measuring . . . should collect this data only as a secondary purpose and must have some other primary usage value.[15]

One might argue that a fuzzy primary usage value—such as “making the world a better place” or designing the “future of peer review”[16]—is exactly what has made Academia.edu more successful in creating trust among scholars, if only because of the user experience of participating in an infinite beta test for something that still might become a game changer. Although often overpitched, tactical coinages such as “crowd review” may indeed be necessary to get things done in the first place; as I have argued elsewhere, information technology is rife with visions set in motion in order to shape the future, and promises to reinvent the academic world online are no exception, as they help to mobilize the future to marshal resources, coordinate activities, and manage uncertainty.[17] Nor is Academia.edu’s present shape the simple result of corporate strategizing; it emerges at the intersection of interests practiced by scholars, programmers, journalists, and Google users and is also, to paraphrase Andrew Pickering, constantly “mangled in practice”—that is, transformed in a dialectic of resistance and accommodation.[18]

Still, an insistence on “algorithm transparency”[19] here remains founded in the fact that academia’s digital “politics of standards”[20] emerge in a way that remains largely untransparent, as new norms and standards are established by “passing through the mangle,” sneaking in silently in the wake of algorithmic processes, and affecting the organization of actual academic practice even where no obvious relation exists to the services offered by private companies such as Academia.edu or Mendeley. Let me briefly illustrate this point by focusing on Academia.edu’s reputation and knowledge management system.

In 2014, Academia.edu introduced what it calls “percentiles,” a reward system identifying users who have progressed into the top 5% of the site, based on the total views across a given profile and of all the papers posted in the last month. In addition, percentile rankings for papers compare a paper’s view count to all other papers’ view counts in a thirty-day period, according to the platform.

Academia.edu’s personalized statistics
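Academia.edu does not disclose its formula, but the mechanic it describes amounts to a simple percentile calculation: the share of papers whose thirty-day view count falls below one’s own. A minimal sketch under that assumption (the function name and the data are hypothetical, not the platform’s actual code):

```python
def view_percentile(paper_views: int, all_views: list[int]) -> float:
    """Share (in percent) of papers whose thirty-day view count
    is lower than this paper's count."""
    if not all_views:
        return 0.0
    below = sum(1 for v in all_views if v < paper_views)
    return 100.0 * below / len(all_views)

# Seven papers' monthly view counts; "our" paper has 120 views.
population = [5, 12, 30, 80, 120, 450, 900]
print(round(view_percentile(120, population), 1))  # prints 57.1
```

Even this toy version makes the epistemic stakes visible: the score says nothing about a paper’s quality, only about its position in a distribution of clicks.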

Such percentiles are part and parcel of reputation management systems that have preceded Academia.edu and Mendeley by quite some time. Citation-based reputation metrics such as those produced by the Institute for Scientific Information (since the 1950s), the Science Citation Index (since the 1960s), or the Arts and Humanities Citation Index (since the 1970s) have long provided seemingly objective measures of academic impact and performance.[21] In a similar vein, the display of percentile trophies for research or research profiles on Academia.edu is meant to positively influence decision-makers in determining career progression, post-doc positions, tenure, or grant funding.[22] At the same time, services like Academia.edu and Mendeley attempt to overcome the perceived shortcomings of pre-digital reputation management systems by proposing alternative, usage-based impact metrics facilitated through the scaling of online networks. The models for this are “social music services” such as Last.fm and Spotify that have “managed to create the largest ontological classification (and the largest open database) of music in the world, by aggregating the musical tastes of its . . . users and then data-mining it for similar music genres, artists, and songs.”[23] Following this and other models, Mendeley and Academia.edu have combined search ranking and social filtering, or aggregation and selection, in order to engineer the reputation of scholars. In order to measure paper pervasiveness, Academia.edu aggregates and weighs profile views, document views, unique visitors, followers, and other data for a pre-defined time period.
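The weighting itself is proprietary, but the aggregation described here can be pictured as a weighted sum of usage signals over a fixed time window. The weights below are purely illustrative assumptions for the sake of the sketch, not Academia.edu’s actual values:

```python
# Illustrative weights only; Academia.edu's real weighting is not public.
WEIGHTS = {
    "profile_views": 0.125,
    "document_views": 0.5,
    "unique_visitors": 0.25,
    "followers": 0.125,
}

def reputation_score(signals: dict) -> float:
    """Aggregate usage signals from a pre-defined time period
    into a single weighted score."""
    return sum(weight * signals.get(name, 0) for name, weight in WEIGHTS.items())

print(reputation_score({"profile_views": 50, "document_views": 200,
                        "unique_visitors": 90, "followers": 30}))  # prints 132.5
```

Whoever sets the weights in such a scheme decides which kind of attention “counts,” which is precisely the closure mechanism discussed below.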

While many academics have come to accept, by way of their daily practice, such usage-based metrics as a useful “Nielsen TV ratings for science,”[24] the question remains how handing over the rating of scientific quality to private Web 2.0 entrepreneurs and their proprietary algorithms will affect scholarly communication and notions of scientific knowledge in the long run. As Judith Simon has noted in her insightful analysis of epistemic sociality (or the social dimension of the construction of knowledge) and of what she calls “epistemic social software,” aggregation and selection are “mechanisms of closure” employed to terminate socio-epistemic processes in which multiple agents are involved.[25] For instance, while Academia.edu suggests that it adheres, in its revamping of the traditional peer review system, to the “wisdom of the crowd” or “some form of epistemic democracy where everybody on the Web has the same rights and weights,”[26] the platform of course relies precisely on mechanisms for creating “differences that matter,”[27] that is, for weighing epistemic agents and their judgments algorithmically. What is more, research suggests that Academia.edu often replicates existing power structures between disciplines and ranks, with professors occupying privileged positions in the network even where they rarely participate actively.[28]

Concurrent with introducing their own standards for weighing judgments about scholarship, platforms such as Academia.edu have also regularly cast doubt on traditional standards of academic practice. The peer review system is at the very heart of this attack. Conjuring up the imminent risk of academic research becoming “irrelevant to contemporary culture’s dominant ways of knowing,”[29] proponents of Open Science and trade journalists have joined IT entrepreneurs in declaring that such standards will soon become obsolete. While there is still little evidence to back up such a claim, the claim in itself provides evidence for the changing notion of knowledge in what Richard Münch has described, borrowing from Jürgen Habermas, as an economic “colonization of the academic life-world” engineered through rankings, ratings, and evaluations of all kinds—or in short: “academic capitalism.”[30] In this context, knowledge no longer gains its legitimacy for culture in relation to truth, plausibility, or curiosity, but primarily in relation to its character as a resource. Historians of science have traced this semantic shift back to the 1990s Internet boom, which required a concept more open than the techno-cybernetic notion of information to justify its expansion into all sectors of society.[31] Today, knowledge is often validated in utilitarian terms by determining its functions for decision-making, risk assessment, or entertainment. The attacks on peer review are consequential in this respect, since peer review and professional organizations are foundational for the traditional socio-epistemic self-organization of academia, which aimed to protect the autonomy of the sciences against the interests of the state, the church, and the market. Sowing distrust in traditional ways of academic self-organization is part of a claim to authority over peer review signals and is the precondition for managing them as resources.

At the same time, knowledge management systems like Academia.edu foster the very lack of transparency that then creates demand for their services. For a media industries scholar like myself, there has so far been no vital need to manage a reputation online. Scholars within this small and specialized field regularly meet at conferences and share research and interests through joint projects, publications, or by testing each other’s work in teaching. However, allowing users to freely attach research tags to a paper—tags that are then used to channel paper upload information into newsfeeds addressing like-minded scholars (whose attention in turn will determine whether trophies are awarded or not)—and to infinitely enter research fields into Academia.edu’s database has resulted in a multitude of equally labeled fields that show little overlap with accepted subfields among the actual community of scholars. While the proliferation of tagging may lead to the occasional discovery of interesting new research, the fact is that, more often than not, academics simply add as many tags as possible to appear in the results for a wide variety of searches. Gaming the system is easy and possibly a further reason for Academia.edu’s current wide adoption, as users may influence their online reputation by massively increasing the quantity of tags, uploaded papers, and followers, by using Google Instant searches for efficient title keywords, or by creating external links to their own uploaded documents.

The metrics used for building such a reputation thus remain “tantalizingly suggestive and wholly inconclusive” at best.[32] This is also because of Google Analytics’ recent refusal to pass keyword data on to site owners.[33] Yet even in cases where Academia.edu still provides keywords, such keywords seldom back up the platform’s claim to deliver deep analytics about searches related to a given profile or text, since most searches seem rather coincidentally linked to a scholar’s activity, as testified by many of Academia.edu’s users who have published their metrics online. Yet the inconclusiveness of the metrics presented does not prevent academics from using them as if the value generated around their profiles and publications could be turned into a real-world currency. That is, similar to the Hollywood Stock Exchange, or HSX, an online multiplayer game in which players use simulated money to buy and sell shares of actual directors, actors, or upcoming films, Academia.edu works as a fictive market that asks us to suspend disbelief in its simulation of an integrated global market for scientific knowledge and to simply enjoy the game. Drawbacks include, apart from those mentioned already, the lack of valorization of non-numerical interaction and exchange; the disregard of tacit and implicit knowledge; the simplification of the complex social relations between peers; the reduction of various aspects of publishing, such as copy-editing work; the creation of real competition between platforms such as Academia.edu, online journals, university websites (and their bibliometrical databases), and professional social networking integrated into the websites of scholarly organizations such as the Society for Cinema and Media Studies and the European Network for Cinema and Media Studies; and finally, the confusion of online rankings with long-term accomplishment in new fields of research.

Since this fictive market established by reputation systems remains unregulated and opaque, I would like to conclude by suggesting a collaborative audit of Academia.edu’s reputation management algorithm. While developing and coordinating this idea into a full-fledged field experiment is beyond the ambition of this essay, here is at least a suggestion that follows Sandvig et al.’s idea of developing audit study designs that allow researchers to “look inside the black box of algorithm to pursue knowledge about pressing public problems.”[34] I would suggest a reverse engineering of Academia.edu’s aggregation and selection mechanics by uploading web-searchable documents, testing variables for document labels, document titles, full-text keywords, and research tags, as well as the uploader’s number of followers, language, and geographical location. Documents could relate to Academia.edu in their content—as does the text you have just read—and invite other users clicking on the document to share their search histories. Incoming responses (as well as any lack of such responses) could in turn be integrated into the text within the document. Using a paper posted on Academia.edu as research on Academia.edu, with every view (or lack of views) contributing to the data set gathered, may eventually result in an analysis that helps to create some transparency around the standards currently set for the organizing of academic knowledge. If we indeed have come to make the world a better place by monitoring the impact of our research, let’s at least do this monitoring in a transparent, self-reflexive, and collaborative way.[35]



[1] See, for instance, Chris Anderson, The Long Tail: Why the Future of Business Is Selling Less of More (New York: Hyperion, 2006) or Erik Brynjolfsson and Andrew McAfee, The Second Machine Age: Work, Progress, and Prosperity In A Time of Brilliant Technologies (New York/London: W.W. Norton & Company, 2014).

[2] Yochai Benkler, The Wealth of Networks: How Social Production Transforms Markets and Freedom (New Haven, CT: Yale University Press, 2006); Henry Jenkins, Convergence Culture: Where Old and New Media Collide (New York: New York University Press, 2006).

[3] See, for instance, Jonathan Sterne, MP3: The Meaning of a Format (Durham, NC: Duke University Press, 2012); Ramon Lobato, Shadow Economies of Cinema (London: Palgrave Macmillan, 2012); Jennifer Holt and Kevin Sanson, eds., Connected Viewing: Selling, Streaming, & Sharing Media in the Digital Age (New York: Routledge, 2014); Tarleton Gillespie, Pablo J. Boczkowski, and Kirsten A. Foot, eds., Media Technologies: Essays on Communication, Materiality, and Society (Cambridge, MA: MIT Press, 2014); Eric Hoyt, Hollywood Vault: Film Libraries Before Home Video (Berkeley: University of California Press, 2014); and Lisa Parks and Nicole Starosielski, eds., Signal Traffic: Critical Studies of Media Infrastructures (Champaign: University of Illinois Press, 2015).

[4] Michael Curtin, Jennifer Holt, and Kevin Sanson, eds., Distribution Revolution: Conversations about the Digital Future of Film and Television (Berkeley: University of California Press, 2014).

[5] danah m. boyd and Nicole B. Ellison, “Social Network Sites: Definition, History, and Scholarship,” Journal of Computer-Mediated Communication 13, no. 1 (2007): 211.

[6] Kim-Mai Cutler, “Academia.edu Crosses 5M Users,” TechCrunch, October 2013.

[7] See, for instance, Richard Price, “Guerilla Tips for Raising Venture Capital,” 5 March 2014.

[8] Kim-Mai Cutler, “Academia.edu Crosses 5M Users,” TechCrunch, October 2013.

[9] Kim-Mai Cutler, “Academia.Edu Overhauls Profiles,” TechCrunch, October 2012.

[10] Richard Price, “The Future of Peer Review,” TechCrunch, February 2012.

[11] “Academia.edu is reimagining research,” January 2015.

[12] Gloria Origgi and Judith Simon, “Scientific Publications 2.0: The End of the Scientific Paper?,” Social Epistemology 24, no. 3 (2010): 146.

[13] Christian Sandvig et al., “Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms” (paper presented at the 64th Annual Meeting of the International Communication Association, 2014).

[14] Richard Price, interview by David Weinberger, Radio Berkman podcast, Universities and the Web, 18 September 2013.

[15] Victor Henning, Jason J. Hoyt, and Jan Reichelt, “Usage-Based Reputation Metrics in Science,” in The Reputation Society: How Online Opinions are Reshaping the Offline World, ed. Hassan Masum and Mark Tovey (MIT Press, 2011), 126.

[16] Price, “The Future of Peer Review.”

[17] Patrick Vonderau, “Beyond Piracy: Understanding Digital Markets,” in Connected Viewing, 99–123.

[18] Andrew Pickering, The Mangle of Practice (Chicago: University of Chicago Press, 1995), 23.

[19] Sandvig et al., “Auditing Algorithms.”

[20] Sterne, MP3, 131.

[21] Johannes Angermüller, “Wissenschaft zählen. Regieren im digitalen Panoptikum,” Leviathan: Berliner Zeitschrift für Sozialwissenschaft no. 25 (2010), 174–90.

[22] Victor Henning, Jason J. Hoyt, and Jan Reichelt, “Crowdsourcing Real-Time Research Trend Data” (paper presented at The Future of the Web for Collaborative Science, 2010).

[23] Victor Henning and Jan Reichelt, “Mendeley: A Last.fm for Research?” (paper presented at the IEEE International Conference on eScience, 2008), 327.

[24] Henning, Hoyt, and Reichelt, “Usage-Based Reputation Metrics in Science,” 120.

[25] Judith Simon, “A Socio-epistemological Framework for Scientific Publishing,” Social Epistemology 24, no. 3 (2010): 201–2.

[26] Ibid., 210.

[27] Karen Barad, Meeting the Universe Halfway: Quantum Physics and the Entanglement of Matter and Meaning (Durham, NC: Duke University Press, 2007).

[28] Katy Jordan, “Academics and their Online Networks: Exploring the Role of Academic Social Networking Sites,” First Monday 19, no. 11 (2014).

[29] Kathleen Fitzpatrick, Planned Obsolescence: Publishing, Technology, and the Future of the Academy (New York: New York University Press, 2011), 17.

[30] Richard Münch, Akademischer Kapitalismus: Über die politische Ökonomie der Hochschulreform (Frankfurt am Main: Suhrkamp, 2011), 63. See also Rosalind Gill, “Academics, Cultural Workers and Critical Labour Studies,” Journal of Cultural Economy 17, no. 1 (2014): 12–30.

[31] Hermann Kocyba, “Wissen,” in Glossar der Gegenwart, ed. Ulrich Bröckling and Susanne Krasmann (Frankfurt am Main: Suhrkamp, 2004), 300–6. Cf. Mercedes Bunz, “Knowledge,” in Critical Keywords for the Digital Humanities.

[32] Christopher Phelps, “Someone Searched for You: Academia.edu and Me,” Chronicle of Higher Education, 18 March 2013.

[33] Thom Craver, “Goodbye, Keyword Data,” Search Engine Watch, September 2013.

[34] Sandvig et al., “Auditing Algorithms,” 8.

[35] Somewhat ironically, after I had posted a draft of this essay on my Academia.edu site, Academia.edu posted a link to a “recent study” on its start page, claiming that papers uploaded to Academia.edu “receive an 83% boost in citations over 5 years.” This study was conducted by six Academia.edu employees, including CEO Richard Price, and several data strategists from Academia.edu’s associate, Polynumeral. Even more ironically, this article, as featured on its co-author Maxwell Shron’s Academia.edu site, garnered more than 75,000 views in the very same period in which my own article got 149—the irony here being the fact that Maxwell has only 11 followers, while I have more than 300. What, then, makes this recommendation algorithm tick—promotional over-exposure or the actual community the platform advertises its services to? See Maxwell Shron, Richard Price, Ben Lund et al., “Open Access Meets Discoverability: Citations to Articles Posted to Academia.edu,” May 2015.


Patrick Vonderau is a Professor of Cinema Studies at the Department of Media Studies at Stockholm University. He is interested in media and cultural history, technology and social theory, and especially in understanding the multi-faceted relations between “media” and “industries” in their broad historical, aesthetic, theoretical, and technological contexts. Recent book publications include Behind the Screen: Inside European Production Cultures (2013, with P. Szczepanik), Moving Data: The iPhone and the Future of Media (New York: Columbia University Press, 2013), and The YouTube Reader (London/Stockholm, 2009) (both with P. Snickars). He is also an editor of the German scholarly journal Montage AV. Zeitschrift für Theorie & Geschichte audiovisueller Kommunikation and a co-founder of NECS - European Network for Cinema and Media Studies.