Monthly Archives: August 2017

New Texts Out Now: Nader Hashemi and Danny Postel, eds. Sectarianization: Mapping the New Politics of the Middle East

[This post originally appeared on Jadaliyya —Ed.]


Jadaliyya (J): What made you write this book?

Danny Postel and Nader Hashemi (DP and NH): Over the last several years, a narrative has taken root in Western media and policy circles that attributes the turmoil and violence engulfing the Middle East to supposedly ancient sectarian hatreds. “Sectarianism” has become a catchall explanation for virtually all of the region’s problems. Thomas Friedman, for instance, claims that in Yemen today “the main issue is the seventh century struggle over who is the rightful heir to the Prophet Muhammad — Shiites or Sunnis.” Barack Obama has been one of the biggest proponents of this thesis. On several occasions, he has invoked “ancient sectarian differences” to explain the turmoil in the region. In his final State of the Union address, he asserted that the issues plaguing the Middle East today are “rooted in conflicts that date back millennia.” A more vulgar version of this view prevails among right-wing commentators. But in one form or another, this new sectarian essentialism, which is lazy and convenient — and deeply Orientalist — has become the new conventional wisdom in the West.

Our book forcefully challenges this narrative and offers an alternative set of explanations for the rise in sectarian conflict in the Middle East in recent years. Emphasis on recent: the book demonstrates that the sharp sectarian turn in the region’s politics is largely a phenomenon of the last few decades — really since 1979 — and that pundits who imagine it as an eternal or fixed feature of the Middle East are reading history backwards. So the book is an exercise in refutation and ideology critique on the one hand, while also offering a set of rigorous social scientific arguments about what exactly is driving the intensification of sectarian conflict in the Middle East today. Our contributors come from political science, history, anthropology, and religious studies, and it is from this range of disciplines that we present a social and political theory as well as a critical history of sectarianism.

J: What particular topics, issues, and literatures does the book address?

(DP and NH): The first section of the book offers big-picture historical, theoretical, and geopolitical perspectives on the sectarianization process — that is, the escalation of sectarian conflict in recent years. The second section dives into a series of case studies, examining how the sectarianization process has played out in Syria, Iraq, Saudi Arabia, Iran, Lebanon, Yemen, Bahrain, and Kuwait. The concluding chapter explores the prospects of reversing the sectarianization process.

The book addresses a range of literatures: in the introduction, we draw on the literature on ethno-nationalist mobilization and evaluate the primordialist, instrumentalist, and constructivist schools of thought; in his chapter, Adam Gaiser revisits debates among sociologists of religion about the nature of sects and engages with theories of narrative identity; Fanar Haddad applies critical race theory to the politics of sectarianism in Iraq; Paulo Gabriel Hilu Pinto draws on the anthropologist Robert Weller’s concepts of saturation and precipitation to illuminate the sectarianization of the Syrian conflict; Eskandar Sadeghi-Boroujerdi draws on international relations theory — specifically Anoushiravan Ehteshami and Raymond A. Hinnebusch’s concept of “middle powers” in a “penetrated regional system” — to make sense of Iran’s role in the sectarianization process; drawing from the literature on republicanism, Islamism, and post-Islamism, Stacey Philbrick Yadav develops her original concept of “Islamist republicanism” and explores what she calls “convergent republicanism” among adversarial Islamists in Yemen; Toby Matthiesen deploys the concept of “securitization” associated with the Copenhagen school of critical security studies to examine the sectarianization process in Bahrain; Bassel Salloukh draws on Foucault, Gramsci, and James Tully in his analysis of what he calls the disciplinary logic of the sectarian system in Lebanon; Timothy D. Sisk draws on the growing body of research on ethnic and religious violence and post-conflict peacebuilding in search of lessons for de-sectarianization.

J: How does this book connect to and/or depart from your previous work?

(DP and NH): In 2013 and 2014 we were deeply engaged in the literature and the debate on the Syrian conflict. We organized two international conferences — one at the University of Denver, one at SOAS in London — and co-edited a book on the subject. It struck us that all sorts of journalists, activists, and even some scholars, across the ideological spectrum, characterized the Syrian conflict in sectarian terms. Prominent Syria commentators referred to the protests that began in March 2011 as a “Sunni uprising.” Diplomats cautioned against the West taking sides in “ancient” blood feuds. Some left-wing journalists and activists unwittingly echoed these essentialist, Orientalist tropes. This narrative of course belied the decidedly non-sectarian origins of the Syrian uprising. The slogans and demands of Syrian protesters throughout the spring and summer of 2011 were exactly those of the other Arab uprisings: dignity, social justice, democratic rights, an end to dictatorship. The Syrians making these demands came from various backgrounds and represented a cross-section of the society: Alawis, Christians, Druze, Ismailis, and Sunnis (Kurds, Armenians, and Arabs alike) took to the streets and demonstrated together, along with secular Syrians.

This history had been erased, and very quickly, in the sectarian narrative that took hold. We wanted to push back on that distorted narrative, but we also wanted to make sense of how exactly the Syrian conflict became sectarianized. So our interest in the sectarianization process emerged very directly out of our work on Syria. But we saw a pattern across the region: uprisings that began as non-sectarian/cross-sectarian but morphed into sectarian battles. In Syria, Bahrain, Yemen, and beyond, the sectarianization process took different forms in different countries, but the underlying dynamic was remarkably consistent. We thus set out to assemble the case studies, drawing on the leading experts on those countries, but also to theorize the phenomenon as a whole.

Our longstanding interest in democratic theory and social movements also animated this project. Nader Hashemi’s book Islam, Secularism, and Liberal Democracy: Toward a Democratic Theory for Muslim Societies makes a case for democratic pluralism in the Islamic world. The sectarianization process has undermined the struggle for democracy in Muslim societies by sowing division and cultivating hatred, to borrow Peter Gay’s felicitous phrase. Danny Postel worked in the US labor movement for several years (for the organization Interfaith Worker Justice, and for a coalition of labor unions and community organizations). His interest in labor movements in the MENA region (and progressive political mobilization more generally) is related to the issue of sectarianization insofar as the former is an example of people organizing around issues of shared interests and aims that transcend religious identity. It’s vital to remember that there have been all kinds of labor movements and other forms of political mobilization in the region and that the politics of the Middle East have not always revolved around sectarianism — nor must they forever.

J: Who do you hope will read this book, and what sort of impact would you like it to have?

(DP and NH): Our aims are ambitious: we want to change the very terms of the public conversation about sectarianism and to put a major dent in the currently ascendant narrative about why the Middle East is awash in violence today. We want to put the term sectarianization into general circulation and see it become part of the vocabulary of political debate.

We hope all sorts of people will read the book — scholars, journalists, researchers, policymakers, diplomats, religious leaders, and practitioners in the world of conflict resolution, peacebuilding, and human development. The book will soon be translated into Arabic, which is hugely important to us. We would love to see the Arabic edition reach not only scholars but people on the ground in the societies the book examines, especially religious leaders and activists engaged in cross-sectarian organizing. Those are the efforts that will chart the path beyond the maelstrom of sectarianization.

J: What other projects are you working on now?

(DP and NH): We’ve been developing a project on cross-ideological coalition building in deeply divided societies focused mainly on Tunisia and Egypt, but also drawing on cases outside the region. Nader is working on an intellectual and political history of Iran’s Green movement, and a volume on Islam and human rights. Danny is writing something on Syria and tragedy. Down the road he hopes to do something on the role of labor movements in the MENA region.

Excerpt from “The Sectarianization Thesis: A Social Theory of Sectarianism”:

This book forcefully challenges the lazy and Orientalist reliance on “sectarianism” as a catch-all explanation for the ills afflicting the Middle East today. We propose to shift the discussion of sectarianism by providing an alternative interpretation of this subject that can better explain the various conflicts in the Middle East and why they have morphed from nonsectarian or cross-sectarian (and nonviolent) uprisings/movements into sectarianized battles and civil wars. The contributors to this volume—who include political scientists, historians, anthropologists, and religious studies scholars—examine this phenomenon as it has unfolded over a definite period of time via specific mechanisms. Through multiple case studies (Iraq, Syria, Lebanon, Bahrain, Yemen, Kuwait, Saudi Arabia, Iran) and with historical and theoretical chapters exploring the nature and evolution of sectarianization, they analyze and map this process, exploring not only how but why it has happened.

Conflict between sectarian Muslim groups has intensified dramatically in recent years. But why? What explains the upsurge in sectarian conflict at this particular moment in multiple Muslim societies? How can we best understand this phenomenon?

To answer this question, we propose the term sectarianization: a process shaped by political actors operating within specific contexts, pursuing political goals that involve popular mobilization around particular (religious) identity markers. Class dynamics, fragile states, and geopolitical rivalries also shape the sectarianization process. The term sectarianism is typically devoid of such reference points. It tends to imply a static given, a trans-historical force—an enduring and immutable characteristic of the Arab Islamic world from the seventh century until today.

The theme of political authoritarianism is central to the sectarianization thesis. This form of political rule has long dominated the politics of the Middle East, and its corrosive legacy has deeply sullied the polities and societies of the region. Authoritarianism, not theology, is the critical factor that shapes the sectarianization process. Authoritarian regimes in the Middle East have deliberately manipulated sectarian identities in various ways as a strategy for deflecting demands for political change and perpetuating their power. This anti-democratic political context is essential for understanding sectarian conflict in Muslim societies today, especially in those societies that contain a mix of Sunni and Shi‘a populations. To paraphrase the famous Clausewitz aphorism about war as a continuation of politics by other means, sectarian conflict in the Middle East today is the perpetuation of political rule via identity mobilization.

[W]hy are these conflicts intensifying now; and why in this particular region of the world? In other words, what explains the flaring of sectarian conflict at specific moments in time and in some places rather than others? Sunni–Shi‘a relations, for example, were not always conflict-ridden, nor was sectarianism a strong political force in modern Muslim politics until recently. How did Syrians and Iraqis with different sectarian identities manage to coexist for centuries without mass bloodshed? How did these pluralistic mosaics come unglued so precipitously? What are the key forces driving sectarianization?

The Geopolitics of Sectarianism: 1979, 2003, 2011

The key regional development that shaped the rise of sectarianism was the 1979 revolution in Iran. Western-backed dictatorships in the Middle East, particularly Saudi Arabia, feared that the spread of revolutionary Islam could cross the Persian Gulf and sweep them from power in the same manner as the Pahlavi monarchy had been toppled. In response, the Saudi kingdom and other Sunni authoritarian regimes invested significant resources in undermining the power and appeal of the Iranian revolution, seeking to portray it as a distinctly Shi‘a/Persian phenomenon based on a corruption of the Islamic tradition. Sunni Muslims, they argued, should not be duped by this distortion of the Prophet Muhammad’s message. Anti-Shi‘a polemics in the Sunni world increased dramatically during this period, fueled by significant sums of Arab Gulf money. Sunni–Shi‘a relations were deeply affected by this development, and Pakistan was an early battleground where this conflict played out.


The key international event at this time was the Soviet occupation of Afghanistan. Western support for the Afghan Mujahedeen, backed by Saudi petrodollars, produced a Sunni militant movement that attracted radical Islamists from around the world, most notably Osama Bin Laden and Ayman al-Zawahiri. This constellation of forces eventually morphed into al-Qaeda. The ideological orientation of these Salafist–jihadi groups was decidedly anti-Shi‘a, both in theory and practice, buttressed as it was by a neo-Wahhabi reading of the world.

The Saudi–Iranian rivalry is critical to understanding the rise of sectarianism in Muslim societies at the end of the twentieth century. Both Tehran and Riyadh lay claim to leadership of the Islamic world, and since 1979 they have battled for hearts and minds across the Middle East, North Africa, and parts of Asia.

[T]he 2003 US invasion and subsequent occupation of Iraq marked a turning point in Saudi–Iranian relations, and subsequently in sectarian relations across the region.

The toppling of Saddam Hussein dramatically affected the regional balance of power. The rise of Shi‘a Islamist parties in Iraq allied with Iran set off alarm bells in the Gulf Cooperation Council (GCC) countries. The subsequent Iraqi civil war, which after 2006 had a clear sectarian dimension to it, further inflamed Sunni–Shi‘a relations across the Middle East. The rise of Hezbollah in Lebanon was also a factor during this period. Its ability to expel Israel from southern Lebanon in 2000 and its perceived victory against Israel in the summer of 2006 increased the popularity and prestige of this Shi‘a militant group as a revolutionary force on the Sunni “Arab street.” An opinion poll at this time listed the Secretary General of Hezbollah, Hassan Nasrallah, as the most popular leader in the region, a fact that highlights both the chasm between state and society in the Arab world and explains how anti-imperialism trumped sectarian identity at the grassroots level during this period.

Around this time, King Abdullah II of Jordan reflected a common concern among Sunni Arab regimes when he invoked the specter of a new “Shi‘a Crescent.” Linking Beirut with Tehran and running through Damascus and Baghdad, this perceived rolling thunder threatened to dominate the politics of the region in the name of a new brand of transnational Shi‘a solidarity.

The “Arab Spring” of 2011 marked another turning point in Saudi–Iran relations and, consequently, in Sunni–Shi‘a relations more broadly. The Arab uprisings shook the foundations of Middle East authoritarianism. Both Iran and Saudi Arabia relied on sectarianism to deflect attention from popular demands for political change and to advance their influence in the region. The Saudi case is easier to diagnose and is better known. The Saudi regime blamed protests in Bahrain and in eastern Saudi Arabia on a Shi‘a conspiracy allegedly orchestrated from Tehran, while the Assad regime and its Iranian backers attributed the (nonviolent) Syrian protests of 2011 to Salafist “terrorists” supported by Riyadh and hell-bent on toppling Iran’s key regional ally in Damascus. The Iranian case of sectarianization is more subtle and less well known.

In the case of Syria, Iran has utilized a distinct sectarian narrative, albeit a subtle one, to mobilize support for the Assad regime, as Eskandar Sadeghi-Boroujerdi explains in his chapter in this volume. While officially Tehran claims that it is supporting the “legitimate” government in Damascus and fighting ISIS, all Syrian rebels are depicted as Salafi–jihadis who are bent on exterminating minorities should Assad be toppled. As the war in Syria has dragged on, Iran has organized a transnational Shi‘a militia movement from among the poor and devout Shi‘a communities of Afghanistan, Pakistan, and Iraq. These militias are recruited through an explicitly sectarian narrative that draws on classic Shi‘a themes of persecution, martyrdom, and sacrifice. The imminent threat of the destruction of Shi‘a shrines in Syria is invoked, and financial compensation, educational opportunities, and Iranian citizenship are offered as an incentive package.

The key claim of this book is that sectarianism fails to explain the current disorder in the Middle East. Viewing the region through a sectarian prism clouds rather than illuminates the complex realities of the region’s politics. The current instability is more accurately seen as rooted in a series of developmental crises stemming from the collapse of state authority. At the dawn of the twenty-first century a series of UN Arab Human Development Reports forecast and predicted that this region was headed for a deep crisis unless these problems were addressed. The foreign policies of leading Western states toward the Arab-Islamic world have only made matters worse.

While it is true that religious identities are more salient in the politics of the Middle East today than they were in previous periods, it is also true that these identities have been politicized by state actors in pursuit of political gain. Authoritarianism is the key context for understanding this problem. In other words, there is a symbiotic relationship between social pressure from below—demands for greater inclusion, rights, recognition, and representation—and the refusal by the state from above to share or relinquish power. This produces a crisis of legitimacy that ruling elites must carefully manage to retain power. The result of this political dynamic is sectarianization.

[Excerpted from Sectarianization: Mapping the New Politics of the Middle East, with author permission, (c) 2017.]


A Response That Isn’t

Chad Wellmon, Andrew Piper, and Yuancheng Zhu

The post by Jordan Brower and Scott Ganz is less a response than an attempt to invalidate by suggestion. Debate over the implications of specific measurements or problems in data collection is essential to scholarly inquiry. But their post fails to deliver empirical evidence for their main argument: that our descriptions of gender bias and the concentration of institutional prestige in leading humanities journals should be met with deep doubts. Their ethos and strategy of argumentation is to instill doubt via suspicion rather than achieve clarity about the issues at stake. They do so by proposing strict disciplinary hierarchies and methodological fault lines as a means of invalidating empirical evidence.

Yet as we will show, their claims are based on a misrepresentation of the essay’s underlying arguments; unqualified overstatements of the invalidity of one measure used in the essay; and the use of anecdotal evidence to disqualify the study’s data. Under the guise of empirical validity, their post conceals its own interpretive agenda and plays the very game of institutional prestige that our article seeks to understand and bring to light.

We welcome and have already received pointed criticisms and incisive engagements from others. We will continue to incorporate these insights as we move forward with our work. We agree with Brower and Ganz that multiple disciplinary perspectives are warranted to fully understand our object of study. For this reason we have invited Yuancheng Zhu, a former PhD in statistics and now research fellow at the Wharton School of the University of Pennsylvania, to review our findings and offer feedback.

With respect to the particular claims Brower and Ganz make, we will show:

  1. they address only a portion of––and only two of seven total tables and figures in––an article whose findings they wish to refute;
  2. their proposed heterogeneity measure is neither mathematically more sound nor empirically sufficient to invalidate the measure we chose to prioritize;
  3. their identification of actual errors in the data set does not invalidate the statistical significance of our findings;
  4. their anecdotal reasoning is ultimately deployed to defend a notion of “quality” as an explanation of extreme institutional inequality, a defense for which they present no evidence.

1. Who Gets to Police Disciplinary Boundaries?

Brower and Ganz argue that our essay belongs to the social sciences and, therefore, that neither the humanities nor the field to which it actually aspires to belong, cultural analytics, has a legitimate claim to the types of argument, evidence, and knowledge that we draw upon. Such boundary keeping is one of the institutional norms we hoped to put into question in our essay, because it is a central strategy for establishing and maintaining epistemic authority.

But Brower and Ganz’s boundary policing is self-serving. Although they identify the entire essay as “social science,” they only discuss sections that account for roughly 35 percent of our original article and only two of seven figures and tables presented as evidence. Our essay sought to address a complex problem, and so we brought together multiple ways of understanding a problem, from historical and conceptual analysis to contemporary data, in order better to understand institutional diversity and gender bias. Brower and Ganz ignored a majority of our essay and yet sought to invalidate it in its entirety.

2. Claiming that HHI Is “Right” Is Wrong.

Brower and Ganz focus on two different methods of measuring inequality as discussed in our essay, and they suggest that our choice of method undermines our entire argument. In the process, they suggest that we did not use two different measures or discuss HHI (or talk about other things like gender bias). They also omit seven other possible measures we could have used. In other words, they present a single statistical measure as a direct representation of reality, rather than one method to model a challenging concept.

If we view the publication status of each year as a probability distribution over the institutions, then coming up with a single score is simply trying to use one number to summarize a multidimensional object. Doing so inevitably requires a loss of information, no matter how one chooses to do it. Just as the mean, median, and mode each summarize the central position of a sample or a distribution in a different way, the type-token score and the HH index each summarize “heterogeneity” from a different perspective. Brower and Ganz call the use of type-token ratio a “serious problem,” but in most circumstances one does not call using the mean rather than the median to summarize data a serious problem.

If there is not a single appropriate score to use, which one should we choose? The first question is what assumptions we are trying to model. The type-token ratio we used assumes that the ratio of institutions to articles is a good representation of inequality. The small number of institutions represented across articles suggests that there is a lack of diversity in the institutional landscape of publishing. The HH index looks at the market share of each actor (here, institutions), so that the more articles that an institution commands, the more concentrated the “industry” is thought to be. Because the HH index is typically used to measure financial competitiveness, it is based on the assumption that simply increasing the number of actors in the field decreases the inequality among institutional representation––that is, that more companies means more competitiveness. But as we argue in our piece, this is not an assumption we wanted to make.

Here is a demonstration of the problem drawn from the email correspondence from Ganz that we received prior to publication:

For example, imagine in year 1, there are 10 articles distributed equally among five institutions. Your heterogeneity metric would provide a score of 5/10 = 0.5.

Then in year 2, there are 18 articles distributed equally among six institutions. We would want this to be a more heterogeneous population (because inequality has remained the same, but the total number of institutions has increased). However, according to your metrics, this would indicate less heterogeneity (6/18 = 0.33).

In our case, we do not actually want the second example to suggest greater heterogeneity. In effect the number of articles has increased by 80 percent (from 10 to 18), but the number of institutions by only 20 percent (from five to six). In our view heterogeneity has decreased in this scenario, not increased. More institutions (the actors in the model) is for us not an inherent good. It’s the ratio of institutions to articles that matters most to us.
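The disagreement over this example can be reproduced in a few lines. The following is a minimal sketch, not the code used in the essay: the institution labels are hypothetical placeholders, and the two functions implement the textbook definitions of the type-token ratio and the HH index.

```python
from collections import Counter


def type_token_ratio(institutions):
    """Distinct institutions (types) divided by articles (tokens)."""
    return len(set(institutions)) / len(institutions)


def hhi(institutions):
    """Herfindahl-Hirschman Index: sum of squared article shares."""
    counts = Counter(institutions)
    total = len(institutions)
    return sum((count / total) ** 2 for count in counts.values())


# Hypothetical labels, one list entry per article.
year1 = [f"inst{i}" for i in range(5) for _ in range(2)]  # 10 articles, 5 institutions
year2 = [f"inst{i}" for i in range(6) for _ in range(3)]  # 18 articles, 6 institutions

for name, year in (("year 1", year1), ("year 2", year2)):
    print(
        name,
        "type-token:", round(type_token_ratio(year), 3),
        "HHI:", round(hhi(year), 3),
        "effective number (1/HHI):", round(1 / hhi(year), 2),
    )
```

Under these definitions the type-token ratio falls from 0.5 to about 0.33 between the two years, while the effective number of institutions (1/HHI) rises from 5 to 6. The two measures disagree here because they encode different assumptions about what counts as heterogeneity, not because one of them is miscalculated.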

The second way to answer the question is to understand the extent to which each measure would (or would not) represent the underlying distributions of the data in different ways. Assuming that the number of articles for each journal is relatively similar each year, the type-token score and the HH index actually belong to the same class of metric, the Rényi entropy. The HH index is equivalent to the entropy with alpha equal to 2 (ignoring the log and the constant), and the type-token score corresponds to alpha equal to 0 (the entropy is then the log of the number of “types”; we assume that the number of tokens is relatively constant). Another special case is alpha equal to 1, which is the usual (Shannon) definition of entropy. To put it in a more mathematical way, the HH index corresponds to the squared L2 norm of the probability vector, and the type-token score corresponds to the L0 norm, that is, the count of nonzero entries. Given that the L1 norm of the probability vector is 1 (probabilities sum to 1), the HH index and type-token score tend to be negatively correlated. There is, then, not much of a difference between the options.
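The relationship between the two measures and the entropy family can be checked numerically. This is a minimal sketch using the standard definition of the entropy of order alpha, H_alpha = log(sum of p_i^alpha) / (1 - alpha); the article shares below are hypothetical and are not taken from our data.

```python
import math


def renyi_entropy(probs, alpha):
    """Entropy of order alpha (natural log) for a probability vector."""
    probs = [p for p in probs if p > 0]
    if alpha == 1:
        # The limit alpha -> 1 is the usual Shannon entropy.
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** alpha for p in probs)) / (1 - alpha)


# Hypothetical article shares for four institutions (they sum to 1).
shares = [0.5, 0.25, 0.15, 0.10]

hhi = sum(p ** 2 for p in shares)   # squared L2 norm of the share vector
n_types = len(shares)               # number of institutions with nonzero share

h0 = renyi_entropy(shares, 0)       # log of the number of types
h1 = renyi_entropy(shares, 1)       # Shannon entropy
h2 = renyi_entropy(shares, 2)       # minus the log of the HH index

print("alpha=0:", h0, "log(types):", math.log(n_types))
print("alpha=1:", h1)
print("alpha=2:", h2, "-log(HHI):", -math.log(hhi))
```

Exponentiating the alpha = 2 value gives 1/HHI, the effective number of institutions, and exponentiating the alpha = 0 value gives the raw count of types. Dividing that count by the number of articles yields the type-token ratio, which is why the constant-token assumption matters.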

A big assumption is that the number of articles each year stays relatively constant. It is also debated and debatable which one (TT score or HH index) is more sensitive to the sample size. If we look at the article distributions for each journal, the assumption of a constant number of articles is in this case a fair one to make. Once it becomes nonzero, the number of publications for each journal stays relatively unchanged in terms of scale. It is indeed the case that sample size will affect both metrics, just as sample entropy is affected by sample size. One could eliminate the effect of sample size by randomly downsampling each year to the same number of articles (or perhaps by aggregating neighboring years and then downsampling).

If the two metrics are similar, then why do they appear to tell different stories? In fact, upon further review they appear to be telling the same story. In figure 1, we see the two scores plotted for each journal. The first row is the type-token scores for each of the four journals, red for institutions and blue for PhDs. The second row is for 1/HHI, the effective number. The first row and the second row agree with each other in terms of the general trend most of the time. In none of the plots do we see the dramatic decrease of heterogeneity in the early years shown in figure 4 of the original essay or the consistently strong increase of heterogeneity that Brower and Ganz argue for. Those trends arise because, in our figure 4 and in Brower and Ganz’s replication, the four journals are aggregated. When two journals (Critical Inquiry and Representations) come into play in the late seventies and the eighties, the scores are dragged down because, on average, those two journals are less diverse. Hence, the two metrics do give us the same trend once the journals are disaggregated.

So when we pull apart the four journals, what story do they tell? If we run a linear regression model on each of the journals individually, since 1990 there has been either no change or a decline in heterogeneity for both measures (with one notable exception: PMLA for author institutions, which has increased). In other words, either nothing changes about our original argument, or things actually look worse from this perspective.
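A per-journal trend check of this kind might look like the following. This is a minimal sketch: the journal names and yearly scores are hypothetical and are not the study's data, and only the least-squares slope is computed. A full analysis would also test whether each slope differs significantly from zero.

```python
def ols_slope(years, scores):
    """Ordinary least squares slope of scores regressed on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores))
    return sxy / sxx


# Hypothetical yearly heterogeneity scores, NOT the study's data.
series = {
    "Journal A": {1985: 0.70, 1990: 0.62, 1995: 0.61, 2000: 0.62, 2005: 0.60, 2010: 0.61},
    "Journal B": {1985: 0.55, 1990: 0.58, 1995: 0.56, 2000: 0.55, 2005: 0.53, 2010: 0.52},
}


def slopes_since(series, start_year=1990):
    """Fit one trend line per journal, using only years >= start_year."""
    result = {}
    for journal, by_year in series.items():
        years = sorted(y for y in by_year if y >= start_year)
        result[journal] = ols_slope(years, [by_year[y] for y in years])
    return result


for journal, slope in slopes_since(series).items():
    print(f"{journal}: slope per year since 1990 = {slope:+.4f}")
```

Fitting each journal separately avoids the aggregation effect described above, where the entry of less diverse journals into the pool shifts the combined score.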

We were grateful to Brower and Ganz when they first shared their thinking about HHI and tried to acknowledge that gratitude, even while disagreeing with their assumptions, in our essay. Understanding different models and different kinds of evidence is, we’d suggest, a central value of scholarship in the social sciences or in the humanities. That is why we discussed the two measures together. But to suggest that the marginal differences between the scores invalidate an entire study is wrong. It is also not accurate to imply that we made this graph a centerpiece of our essay—“its most provocative finding,” in their words.

Consider how we frame our discussion of the time-based findings in our essay. We point out the competing ways of seeing this early trend and emphasize that post-1990 levels of inequality have remained unchanged. Here is our text:

Using a different measure such as the Herfindahl-Hirschman Index (HHI) discussed in note 37 above suggests a different trajectory for the pre-1990 data. According to this measure, prior to 1990 there was a greater amount of homogeneity in both PhD and authorial institutions, but no significant change since then. In other words, while there is some uncertainty surrounding the picture prior to 1990, an uncertainty that is in part related to the changing number of journal articles in our data, since 1990 there has been no significant change to the institutional concentrations at either the authorial or PhD level. It is safe to say that in the last quarter century this problem has not improved.

In other words, based on what we know, it is safe to say that the problem has remained unchanged for the past quarter century, though one could argue that in some instances it has gotten worse. If you turned the post-1990 data into a Gini coefficient, the degree of institutional inequality for PhD training would be 0.82, compared to a Gini of 0.45 for U.S. wealth inequality. But for Brower and Ganz, this recent consistency is overshadowed by the earlier improvement that they detect. To insist that institutional diversity is improving is, at best, to miss the proverbial forest for the trees. At worst, it’s misleading. Their argument is something like: We know there has been no change to the extremely high levels of concentration for the past twenty-five years. But if you just add in twenty more years before that then things have been getting better.
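For readers unfamiliar with the Gini coefficient, here is a minimal sketch of one standard way to compute it from a list of counts. The per-institution article counts are hypothetical, and the formula shown is the common rank-weighted version, which is an assumption about the calculation rather than a record of the exact procedure behind the 0.82 figure.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted formula: G = 2 * sum(rank * x) / (n * total) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical number of articles per PhD-granting institution.
articles_per_institution = [40, 25, 10, 5, 3, 2, 1, 1, 1, 1, 1]

print("Gini:", round(gini(articles_per_institution), 3))
```

A value of 0 means every institution contributes the same number of articles, and the value approaches 1 as a single institution accounts for nearly all of them.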

Their second example about the relative heterogeneity between journals reflects a similar pattern: legitimate concern about a potential effect of the data that is blown into a dramatic rebuttal not supported by the empirical results.

In the one example of PhD heterogeneity, they show that a random sample of articles always has PMLA with more diversity than Representations and yet our measure shows that Representations exhibits more diversity than PMLA. What is going on here?

It appears Representations is unfairly being promoted in our model because it publishes so many fewer articles than PMLA (PMLA has more than twice as many articles as Representations). But notice how they choose to focus on the two most disparate journals to make their point. What about for the rest of the categories?

Interestingly, when it comes to institutional diversity, the only difference that their proposed measure makes is to shift the relative ranking of Representations. What is troubling here is the fact that they chose not to show this similarity when they replicated our findings, which are shown here:

       Author               PhD
   Ours      HHI        Ours      HHI
   CI        Rep        CI        Rep


In other words, our essay overestimates one journal’s relative diversity. We agree that their example is valid, important, interesting and worth using. But as an argument for invalidation it fails. How could their measure invalidate our broader argument when it reproduces all but one of our findings?

Given both measures’ strong correlation with article output, we would argue that the best recourse is to randomly sample from the pools to control for sample size rather than rely on yearly data. In this way we avoid the trap of type-token ratios that are overly sensitive to sample size and the HHI assumption of more institutions being an inherent good. Doing so for 1,000 random samples (of 100 articles per sample), we replicate the rankings produced by the HHI score (ditto for a Gini coefficient). So Brower and Ganz are correct in arguing that we overrepresented Representations’ diversity, which should be the lowest for all journals in both categories. We are happy to revise this in our initial essay. But to suggest as they do that this invalidates the study is a gross oversimplification.
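The resampling idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run for the essay: samples are drawn without replacement, the score is averaged across samples, and the function name `mean_resampled_hhi`, the journal names, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def hhi(labels):
    """Herfindahl-Hirschman Index: the sum of squared shares per institution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


def mean_resampled_hhi(labels, sample_size=100, n_samples=1000, seed=0):
    """Average HHI over equal-sized random samples drawn without replacement."""
    rng = random.Random(seed)
    scores = [hhi(rng.sample(labels, sample_size)) for _ in range(n_samples)]
    return sum(scores) / n_samples


# Hypothetical journals with very different article counts.
journals = {
    # 400 articles spread evenly over 40 institutions.
    "large_journal": [f"U{i % 40}" for i in range(400)],
    # 150 articles, two thirds of them from a single institution.
    "small_journal": ["U0"] * 100 + [f"U{1 + i % 10}" for i in range(50)],
}

scores = {name: mean_resampled_hhi(labels) for name, labels in journals.items()}
ranking = sorted(scores, key=scores.get)  # most diverse (lowest HHI) first
print(ranking)
```

Because every journal is scored on samples of the same size, a journal cannot look more or less diverse simply because it publishes more or fewer articles.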

3. When Is Error a Problem?

Brower and Ganz’s final major point is this: “We are concerned that the data, in its current state, is sufficiently error-laden to call into question any claims the authors wish to make.” This is indeed a major cause for concern. But Brower and Ganz provide little evidence for their sweeping claim.

Brower and Ganz are correct to point out errors in our data set, a data set which we made public months ago precisely in hopes that colleagues would help us improve it. This is indeed a nascent field, and we do not have the same long-standing infrastructures in place for data collection that the social sciences do. We’re learning, and we are grateful to have generous people reading our work in advance and helping contribute to the collective effort of improving data for public consumption.

As with the above discussion about HHI, the real question is, what is the effect of these errors in our data set? Is the data sufficiently “error-laden” to call into question any of the findings, as they assert? That’s a big claim, one that Brower and Ganz could have tested but chose not to.

We can address this issue by testing what effect random errors might have on our findings. This too we can do in two ways. We can either remove an increasing number of articles to see what effect taking erroneous articles out of the data set might have, or we can randomly reassign labels according to a kind of worst-case scenario logic. In the case of gender, it would mean flipping a gender label from its current state to its opposite. What if you changed an increasing number of people’s gender—how would that impact estimates of gender bias? In the case of institutional diversity, we could relabel articles according to some random and completely erroneous university name (“University of WqX30i0Z”) to simulate what would happen if we mistakenly entered data that could not contribute to increased concentration (since we would choose a new fantastic university with every error). How many errors in the data set would be necessary before assumptions about inequality, gender bias, or change over time need to be reconsidered?
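The three kinds of simulated error described above can be sketched in Python. This is a hedged illustration rather than the code used for Figure 2: the function names, the "F"/"M" coding, the "Fictional University" labels, and the toy data are all assumptions made for the example.

```python
import random


def share_women(genders):
    """Fraction of author labels coded as "F"."""
    return sum(g == "F" for g in genders) / len(genders)


def remove_articles(records, fraction, rng):
    """Drop a random fraction of records, as if erroneous entries were deleted."""
    keep = len(records) - round(fraction * len(records))
    return rng.sample(records, keep)


def flip_genders(genders, fraction, rng):
    """Worst case for gender: flip a random fraction of labels to the opposite."""
    out = list(genders)
    for i in rng.sample(range(len(out)), round(fraction * len(out))):
        out[i] = "M" if out[i] == "F" else "F"
    return out


def relabel_institutions(labels, fraction, rng):
    """Worst case for concentration: give each erroneous record a brand-new,
    made-up institution so that it cannot add to any existing cluster."""
    out = list(labels)
    picks = rng.sample(range(len(out)), round(fraction * len(out)))
    for n, i in enumerate(picks):
        out[i] = f"Fictional University {n}"  # assumed not to collide with real names
    return out


# Hypothetical toy data: 100 authors, 30 percent coded as women.
rng = random.Random(0)
genders = ["F"] * 30 + ["M"] * 70
for rate in (0.0, 0.1, 0.25, 0.5):
    noisy = flip_genders(genders, rate, rng)
    print(rate, round(share_women(noisy), 2))
```

Rerunning any metric of interest on the corrupted copies, at increasing error rates, shows how many mistakes it takes before an estimate moves appreciably.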

Figure 2 shows the impact that those two types of errors have on three of our primary metrics. As we can see, removing even fifty percent of articles from our data set has no impact on any of our measures. The results of gender bias and overall concentratedness are more sensitive to random errors. But here too it takes more than 10 percent of articles (or over 500 mistakes) before you see any appreciable shift (before the Gini drops below 0.8 for PhDs and 0.7 for authors). Gender equality is only achieved when you flip 49 percent of all authors to their opposite gender. And in no cases does the problem ever look like it’s improving since 1990.

But what if those errors are more systematic? In other words, what if the errors they identify are not random but have a particular quality about them (for example, if everyone wrongly included had actually gone to Harvard)? So let’s take a look. Here are the errors they identify:

  • 100 mislabeled titles
  • twenty-three letters that should not be considered publications
  • one omitted article that was published but not included because it was over our page filter limit
  • eight articles that appear in duplicate and one in triplicate
  • one mislabeled gender (sorry Lindsay Waters)

First, consider those 100 mislabeled titles. We were not counting titles, but rather institutional affiliations. While they do matter for the record (and we have corrected them; the corrected titles will appear in the revised version of our publicly available data set), they have little bearing on our findings.

In terms of duplicates, all but one duplicate occurred because authors have multiple institutional affiliations. We have clarified this by adding article IDs and a long document explaining all instances of duplicates, which will be included with the revised data.

So what about those letters? Actually, the problem is worse than Brower and Ganz point out. We inadvertently included a number of texts below the six-page filter we had set as our definition of an article. We are thankful that Brower and Ganz have helped identify this error. After a review of our dataset, we found 251 contributions that did not meet our article threshold. These were extremely short documents (one or two pages), such as forums, roundtables, and letters that should not have been included.

So, do these errors call into question our findings? How do they impact the overall results?

Here is a list of the major findings before and after we cleaned our dataset:


                                      Before      After

Gini coefficient
  PhD institution                     0.816       0.816
  Author institution                  0.746       0.743

Diversity over time (since 1990)
  # cases of decrease                 3           3
  # cases of no change                5           5
  # cases of increase                 0           0

Journal Diversity Ranking             PMLA        PMLA
                                      NLH         NLH
                                      CI          CI
                                      Rep         Rep

Gender Bias (% Women)
  4 Journal Yearly Mean               30.4%       30.7%
  4 Journal Yearly Mean Since 2010    39.4%       39.5%


Finally, they say we have failed to adequately define our problem, once again invalidating the whole undertaking:

Wellmon and Piper fail to adequately answer the logically prior question that undergirds their study: what is a publication?

What is a publication, indeed? And why and how did printed publication come to be the arbiter of scholarly legitimacy and authority in the modern research university? We think these are important and “logically prior” questions as well and that’s why we devoted the first 3,262 words of our essay to considering them. This hardly exhausts what is a complex conceptual problem, but to suggest we didn’t consider it is disingenuous.

So let’s start by granting Brower and Ganz their legitimate concern. Confronted with the historical and conceptual difficulty of defining a publication, we made a heuristic choice. For the purposes of our study, we defined an article as a published text of six pages or more in length. It would be interesting to focus on a narrower definition of “publication,” as a “research article” of a specified length that undergoes a particular type of review process across time and publications. But that in no way reflects the vast bulk of “publications” in these journals. Imposing norms that might be better codified in other fields, Brower and Ganz’s desired definition overlooks the very real inconsistencies that surround publication and peer review practices in the humanities generally and in these journals’ histories in particular. As with their insistence on a single measure, they ask for a single immutable definition of a publication for a historical reality that is far more varied than their definition accounts for. Their insistence on definitional clarity is historically anachronistic and disciplinarily incongruous. It is precisely this absence of consensus and self-knowledge within humanities scholarship––and the consequences of such non-knowledge––that our piece aims to bring to light.

Clearly more work can be done here. Subsetting our data by other parameters and testing the extent to which this impacts our findings would indeed be helpful and insightful. And we welcome more collaboration to continue to remove errors in the dataset. In fact, after the publication of our essay, Jonathan Goodwin kindly noted anomalies in our PhD program size numbers, which when adjusted change the correlation between program size and article output from 0.358 to 0.541.

4. Is Quality Measurable?

In sum, we readily concede that the authors raise legitimate concerns about the quality and meaning of different measures and how they might, or might not, tell different stories about our data. This is why we discuss them in our piece in the first place. We also appreciate that they have drawn our attention to errors in the dataset. We would be surprised if there were none. The point of statistical inference is to make estimations of validity given assumptions about error.

What we do not concede is that any of these issues makes the problem of institutional inequality and gender disparity in elite humanities publishing disappear. None of the issues Ganz and Brower raise invalidates or even undermines the basic findings surrounding our sample of contemporary publishing: that scholarly publishing in these four prestige humanities journals is massively concentrated in the hands of a few elite institutions, that most journals do not have gender parity in publishing, and that the situation has not improved in the past quarter century.

There are many ways to think about what to do about this problem. And here the authors are on even shakier evidentiary ground. We make no claims in our piece about what the causes of this admittedly complex problem might be. “Where’s the test for quality?” they ask. This is precisely something we did not test because the data in its current form does not allow for such inferences. In this first essay, which is part of a longer-term project, we simply want readers to be aware of the historical context of academic publication in the humanities and introduce them (and ourselves) to its current state of affairs for this limited sample.

Ganz and Brower, by contrast, assume, in their response at least, that quality (a concept for which they provide no definition and no measure) is the cause of the institutional disparity we found. They suggest that blind peer review is the most effective guarantor of this nebulous concept called “quality.” They provide no evidence for their claims. But there is strong counterevidence that peer review does not, in fact, function as a control mechanism as robust as the authors wish to insinuate. For a brief taste of how complex and relatively recent peer review is, we would recommend Melinda Baldwin’s Making “Nature”: The History of a Scientific Journal as well as studies of other fields such as Rebecca M. Blank’s “The Effects of Double-Blind versus Single-Blind Reviewing: Experimental Evidence from The American Economic Review” or Amber E. Budden’s “Double-Blind Review Favours Increased Representation of Female Authors.”[1]

These are complicated issues with deep institutional and epistemic consequences. It is neither analytically productive nor logically coherent to conclude, as Ganz and Brower do, that because high prestige institutions are disproportionately represented in high prestige publications, high prestige institutions produce higher quality scholarship. It is precisely this kind of circular logic that we hope to question before asserting that the status quo is the best state of affairs.

In our essay we simply argue that whatever filter we in the humanities are using to adjudicate publication systems (call it patronage, call it quality, call it various versions of blind peer review, call it “Harvard and Yale PhDs are just smarter”) has been remarkably effective at maintaining both gender and institutional inequality. This is what we have found. We would welcome a debate about the causes and the competing goods that various filtering systems must inevitably balance. This is precisely the type of debate our article hoped to invoke. But Brower and Ganz sought to invalidate our arguments and findings by anecdote and quantitative obfuscation. And the effect, intended or not, is an argument for the status quo.

[1] See Melinda Baldwin, Making “Nature”: The History of a Scientific Journal (Chicago, 2015); Rebecca M. Blank, “The Effects of Double-Blind versus Single-Blind Reviewing: Experimental Evidence from The American Economic Review” (American Economic Review 81 [Dec. 1991]: 1041–67); and Amber E. Budden et al., “Double-Blind Review Favours Increased Representation of Female Authors” (Trends in Ecology and Evolution 23 [Jan. 2008]: 4–6).

Chad Wellmon is associate professor of German studies at the University of Virginia. He is the author, most recently, of Organizing Enlightenment: Information Overload and the Invention of the Modern Research University and coeditor of Rise of the Research University: A Sourcebook. He can be reached at mcw9d@virginia.edu.

Andrew Piper is professor and William Dawson Scholar of Languages, Literatures, and Cultures at McGill University. He is the director of .txtLAB, a digital humanities laboratory, and author of Book Was There: Reading in Electronic Times.

Yuancheng Zhu is a former PhD in statistics and now research fellow at the Wharton School of the University of Pennsylvania.

