
The Story of arXiv and its Impact on Scientific Research
“Just when I thought I was out, they pull me back in!” With a sly grin, Paul Ginsparg quoted Michael Corleone, voicing a sentiment he understands well. Ginsparg, a physics professor at Cornell University and a MacArthur genius, created arXiv nearly 35 years ago as a digital repository where researchers could share findings before formal review.
Visit arXiv.org (pronounced "archive") and you'll see its Web 1.0 design, a testament to its enduring legacy. This unassuming platform has profoundly reshaped the scientific community. Were arXiv to disappear, scientists worldwide would face immediate disruption. “Everybody in math and physics uses it,” says Scott Aaronson, a computer scientist. “I scan it every night.”
In academia, publishing is a universally acknowledged problem. For-profit giants like Elsevier and Springer dominate: they obtain content from authors for free, rely on unpaid peer reviewers, and then sell access at exorbitant prices. arXiv offered a solution: instant, free access to preprints, which are scientific papers that have not yet been formally vetted.
The Impact of arXiv
arXiv demonstrated that research dissemination could be separated from formal refereeing, says Paul Fendley, an early arXiv moderator. During crises like the Covid pandemic, platforms inspired by arXiv, such as bioRxiv and medRxiv, facilitated rapid dissemination of breakthroughs, potentially saving millions of lives.
While arXiv submissions aren’t peer-reviewed, they are moderated by experts to ensure they meet basic academic standards. In 2021, Nature recognized arXiv as one of the "10 computer codes that transformed science," praising its role in fostering collaboration. Today, arXiv hosts over 2.6 million papers, receives 20,000 submissions a month, and has 5 million active users. Landmark papers, including the "transformers" paper that launched the AI boom, debuted on arXiv.
For scientists, a world without arXiv is unimaginable. However, its inner workings reveal challenges, from bureaucratic strife to outdated code. Ginsparg describes arXiv as “a child I sent off to college but who keeps coming back to camp out in my living room, behaving badly.”
Ginsparg, now 69, remains actively involved, driven by a quest to maintain arXiv's quality. His journey began in 1991 at Los Alamos National Laboratory, where he automated the distribution of physics preprints after a fateful encounter at a conference. Along the way, as he shaped arXiv into a vital resource, he crossed paths with pioneers like Bill Gates and Tim Berners-Lee.
Challenges and Triumphs
Early on, arXiv faced scaling and moderation challenges. Sergey Brin and Larry Page even caused a slowdown while indexing the web for Google. For all its success, arXiv wasn't always championed by Los Alamos, which prompted Ginsparg's return to Cornell University.
At Cornell, arXiv faced administrative hurdles and technical difficulties, and Ginsparg's hands-on approach sometimes clashed with the library's management. arXiv nonetheless persevered, eventually receiving funding from the Simons Foundation and undergoing a major refactoring.
Through the criticisms and controversies, Ginsparg remains dedicated to arXiv. He's driven not by grand ideologies but by a genuine desire to maintain its integrity. As he puts it, "They keep bringing me back," and he finds the challenges, and the opportunity to test ideas, incredibly entertaining.
Source: Wired