01 · In focus
The structured facts the source records about On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜, the count of declared adjacencies in the corpus, and the federation map zoomed on this node and its neighbours.
publication
2 declared connections
02 · Connections
Split by direction. Direct links are the ones On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜’s source record names; inferred backlinks are records elsewhere in the corpus that point at this entity.
1 link
Links named in this entity's structured fields.
1 link
Other records that name this entity.
1 link
03 · Background
Body prose as it appears in movement-graph’s published markdown for this entity. Links to other corpus entities resolve to their graph page; links to deeper repo paths are kept as text so the page does not invent a route.
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜 is a peer-reviewed conference paper by Emily M. Bender (University of Washington), Timnit Gebru (then at Google), Angelina McMillan-Major (then a University of Washington PhD student), and Margaret Mitchell (then at Google, publishing under the pseudonym "Shmargaret Shmitchell"), presented at the ACM Conference on Fairness, Accountability, and Transparency (FAccT '21) held online from 3-10 March 2021 and published in the conference proceedings as pages 610-623 with DOI 10.1145/3442188.3445922.
The paper sets out a structured critique of the scaling-without-limits trajectory of large language models, organised around four categories of risk: the environmental and financial costs of training ever-larger models; the inscrutability of the massive uncurated training corpora and the dangerous biases embedded in them; the opportunity cost of routing the field's research effort into LLM-centric work that crowds out language-grounding alternatives; and the potential for deception arising from text that reads as meaningful but is produced without communicative intent or grounding in the world. This last risk is the one the paper's signature framing names. The authors characterise a large language model as "a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot", a framing that became the most propagated single piece of LLM-critique vocabulary in the half-decade that followed.
The paper became a public controversy before it was presented. In December 2020, Google management, including the company's AI research head Jeff Dean and Megan Kacholia, the engineering vice-president to whom Gebru reported, demanded that Gebru either retract the paper or remove the names of Google-employed authors, with Dean publicly framing the request as a finding that the paper "didn't meet our bar for publication" on grounds that it ignored too much relevant recent research. Gebru responded with conditions for either resolving the dispute or working out an employment end date. Google's reply, which Gebru interpreted as an unannounced termination and Google described as accepting a resignation she maintained she had not offered, landed on 2 December 2020 and triggered the most public Big Tech research-ethics dispute of the decade. Margaret Mitchell was terminated in February 2021 after she allegedly created automated scripts to crawl Google's internal servers for evidence of Gebru's treatment; a parallel internal Google review (the public "Dean Memo") and a sustained employee-petition response followed across the same months. The controversy's downstream organisational artefact is the Distributed AI Research Institute, which Gebru founded on 2 December 2021, the one-year anniversary of her Google exit, as the independent community-rooted research institute she had named as the appropriate home for algorithmic-accountability work that the in-house Big Tech research labs structurally could not support.
The paper's continuing public role tracks two arcs. As a piece of academic vocabulary, "stochastic parrot" was designated the American Dialect Society's 2023 AI-related Word of the Year, and even Sam Altman tweeted "i am a stochastic parrot, and so r u" shortly after ChatGPT's release. As an organising artefact, the paper's four-risk framing (environmental costs, training-data bias, research-opportunity cost, and deception-without-meaning) became the load-bearing public-policy vocabulary the grassroots side of the make-AI-good movement has carried into its organising on LLM-driven systems: the AI-safety / Pause coalition's protest framings, the EU AI Act civil-society coalition's foundational-rights statements, and the post-2022 wave of artist-, writer-, content-moderator-, and worker-side organising on generative-AI deployment each cite or operationalise the paper's risk taxonomy as their academic-foundational reference point.
Within the corpus, Stochastic Parrots is the foundational academic artefact of the LLM-critique field and the publication-side anchor of the post-DAIR algorithmic-accountability research programme that Timnit Gebru and DAIR carry forward. It fills the LLM-critique peer-reviewed-paper publication sub-type that the corpus's earlier algorithmic-accountability paper-side artefact (the 2018 Gender Shades audit co-authored by Joy Buolamwini and Gebru) does not occupy, and it stands alongside Gender Shades as the second of the corpus's two academic-origin / grassroots-propagation foundational papers. Both papers share the same downstream pattern: an academic-origin empirical or theoretical artefact whose later organising significance is carried by a grassroots research-and-advocacy organisation founded by one of its authors (Algorithmic Justice League on the Buolamwini side; DAIR on the Gebru side), making the two papers the corpus's clearest single pair of academic-to-organising bridges on the algorithmic-accountability axis.
04 · Sources
5 sources listed from the pinned corpus. Links are shown only when the source URL is a valid HTTP(S) address.
ACM Digital Library landing page: primary source for the formal citation (Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT '21, pages 610-623), DOI 10.1145/3442188.3445922, and the four-author byline with Margaret Mitchell publishing under the pseudonym "Shmargaret Shmitchell"
Wikipedia article on the paper: independent secondary source for the four-author byline (Bender / Gebru / McMillan-Major / Mitchell), the four risk categories (environmental and financial costs, opaque training data and embedded bias, opportunity cost of LLM-centric research, potential for deception and "stochastic parrot" framing), the 2021 ACM FAccT publication, the Google retraction-pressure controversy timeline, the paper's continuing role in LLM-critique organising, and the 2023 American Dialect Society "AI-related Word of the Year" designation for "stochastic parrot"
Wikipedia biographical article on Timnit Gebru: main secondary source for the December 2020 Google retraction-pressure controversy from the author side (Jeff Dean's "didn't meet our bar for publication" framing, Megan Kacholia's role, Gebru's email to the Google Brain Women and Allies listserv, the disputed "we accept your resignation" termination, the February 2021 Margaret Mitchell termination after Mitchell allegedly scripted internal-server searches for evidence of Gebru's treatment, and the throughline from the controversy to the 2 December 2021 founding of DAIR)
ACM FAccT 2021 conference site: primary source confirming the conference was held online from 3-10 March 2021 as the fourth annual ACM Conference on Fairness, Accountability, and Transparency
Author-side hosting of the paper PDF from Emily M. Bender's University of Washington faculty page (redirects to the ACM DL canonical record at the time of last check): primary source for the paper's continuing author-side dissemination outside the ACM paywall
Source: entities/publications/pub-stochastic-parrots.md in movement-graph at pin 3cc1a36.