Paywallgate: hacker breaches MIT, downloads millions of papers

While Joe Romm and Keith Olbermann spin the most absurd conspiracy theory imaginable related to Climategate, that it was the work of “News of the World”, Murdoch, and/or Wallis, we find another example of academic file hacking, this one far simpler but far larger in volume. Our Keystone Cops Romm and Olbermann miss the obvious: if it was NOTW/Murdoch, why didn’t it show up in those newspapers first, instead of on blogs like CA, tAV, RC, and of course WUWT?

While they rage ridiculously, we now have an example of a scientific hack that illustrates just how simple it is to do, and how bad the academics were (at MIT no less) at preventing it even though they knew it was happening. All it took was one guy, a laptop, some simple scripts, and an unsecured network switch cabinet like this one on campus at right. Apparently the guy just shoved his laptop into the cabinet under the wires and boxes, hooked it to the switch, and MIT was none the wiser.

From the Register:

Reddit programmer charged with massive data theft

Harvard ethics fellow accused of hacking MIT

By Dan Goodin

Posted in Crime, 19th July 2011 17:43 GMT

A former employee of Reddit has been accused of hacking into the computer systems of the Massachusetts Institute of Technology and downloading almost 5 million scholarly documents from a nonprofit archive service.

Aaron Swartz, a 24-year-old researcher in Harvard University’s Center for Ethics, broke into a locked computer-wiring closet in an MIT basement and used a switch there to gain unauthorized access to the college’s network, federal prosecutors alleged Tuesday. He then downloaded 4.8 million articles from JSTOR, an online archive of more than 1,000 academic journals, according to an indictment filed in US District Court in Boston.

When JSTOR blocked the MIT IP address Swartz used in September, for example, the Harvard fellow allegedly incremented a single digit and resumed his wholesale downloading binge, which was streamlined with a custom Python script. JSTOR at times responded by blocking huge ranges of IP addresses, causing legitimate JSTOR users at MIT to be denied access.

More: http://www.theregister.co.uk/2011/07/19/harvard_fellow_indicted/

It has long been speculated (and analysed) that the Climategate release was an inside job, or at the very least done by somebody with inside access. Hooking up your laptop to the internal network via an unsecured switch cabinet seems to be a pretty simple way to go about getting internal access.

Given how sloppy CRU was at leaving files lying around in the open (Steve McIntyre had fun with the “mole” story prior to Climategate), getting onto the internal UEA/CRU network might have been all that was needed.

h/t to WUWT reader AndiC

60 Comments
July 24, 2011 6:59 am

It’s nice. Better to put all the documents on the internet for free downloading; at least other students would benefit.

Enneagram
July 24, 2011 10:18 am

Nothing to lose from corporate science: real science and real discoveries, as history indubitably shows, are done by gifted individuals. (Free market again!)

PaulH
July 24, 2011 10:50 am

Millions of academic papers? I think I see a cure for my insomnia!
/justkidding

Rhys Jaggar
July 24, 2011 10:51 am

Trust me, hacking of this kind is ubiquitous. Your PC can be hacked into even if it’s not on the internet. Trust me, I know. Mine was hacked that way. I don’t know how it’s done, but I know it happened. Several years ago.
It’s really the height of farce to have academics getting hoity toity about plagiarism of young students when professional hacking is so commonplace amongst academic professors.
You’d be shocked at how many sackings there would be if a new law were passed firing all Professors who engaged in it or benefitted from it.
Shocked.

July 24, 2011 11:39 am

So we are supposed to believe that this guy sneaked into some room, where he attached his laptop, to steal the scientific journals & papers of MIT. Strange behaviour indeed, when MIT actually makes these documents available to anybody FREE OF CHARGE!
…… http://vera.mit.edu/
There are even apps for iPhone & Kindle
Strange behaviour for such a “secretive organisation”
Even stranger behaviour that this guy didn’t just go to the M.I.T. library,
where he could have seen all the hard copy journals and papers for himself.
……smells like another “False Flag” operation

July 24, 2011 11:53 am

Oh and by the way “JSTOR” – http://www.jstor.org/
seems like just an index, of all the journals that I
looked at there, they didn’t have a SINGLE one stored
in their so called archive.
All they showed was some short abstract, and then referred
users to an “external site”, like Wiley or Springer & etc.
It is hard to see how any person in such circumstances
could leech any files or content via JSTOR, without having
access to Wiley or Springer & etc.
M.I.T.’s own PUBLIC archives of journals contain far more
information and materials than the so called hacker could
have obtained via JSTOR. Nearly all the Journals which JSTOR
do carry, appear to be from USA Universities, where the public
could obtain these journals first hand, and free of charge, in
most cases.
IMHO JSTOR is just like a glorified video download tracker,
just lists of lists, summaries, and redirects to other websites.

Peter Walsh
July 24, 2011 1:21 pm

gnomish says:
July 23, 2011 at 10:18 am
i have a hypothesis that somebody who puts out a book within days of the climategate leak probably had the emails for a while before then.
how bout that.
Are you pointing at Bishop Hill and his great book:
“The Hockey Stick Illusion”
(Climategate and the Corruption of Science)
I wonder???

Gil R.
July 24, 2011 5:39 pm

Axel, perhaps you wrote your post before reading mine, but your impression of JSTOR is wrong. It *has* the articles online, from every volume of hundreds or thousands of journals. People with access can download a .pdf. JSTOR is NOT an empty site with nothing but links, but you don’t get to see the content without paying. For those of us with university jobs it’s free, meaning the university pays a subscription giving everyone full access, while anyone else on the planet can purchase articles individually.
Adding a bit to my previous post:
Having read about this case elsewhere besides WUWT I know that the accused Harvard fellow wanted to engage in data-mining of some sort, which is why he needed millions of articles. Not sure if he first tried to ask JSTOR for access before trying to take them in a way that he obviously knew was fishy. But even if JSTOR articles were free to him — because as a Harvard fellow I guarantee you he had JSTOR access — there’s no way that under the subscription agreement to which he was indirectly party he had the right to download millions of articles, even if his intention was to further human knowledge.
Three overall thoughts:
1) JSTOR is an outstanding service for those of us in scores of academic fields, and anyone who would ever launch a crusade against them for charging for what they have spent the money to scan is just an ignoramus. (Just read their online mission statement to see that they are a force for good. And I write this as one who has seen the number of journals in his own obscure field — ancient history — multiply over time, with more and more becoming available. That’s been paid for by subscribers and those purchasing articles, and possibly some charitable grants.)
2) This Swartz guy (I think that’s the name) did something he wasn’t supposed to do, and the Feds seem to be making an example of him.
3) Whoever has raised the issue of this scholarship possibly belonging to the public because it was paid for by taxpayers is wrong. The overwhelming majority of articles on JSTOR did not get taxpayer funding. But even for those that did, that doesn’t matter, because the journals hold the copyrights, so it’s not a valid legal argument that the journals should forgo those copyrights when it comes to archived online versions. (And I’d add that, as someone who once got a federal grant for my own humanities research, nowhere in the rules did it say anything about my research belonging to the public in any way. I just was required to credit the granting institution in the acknowledgments. So if the federal government itself doesn’t think receiving a grant makes your research public property, I hardly think we should hold scholars to a higher standard.)
By the way, the original point to this case being mentioned on WUWT was to demonstrate to regular reader Keith Olbermann how easy it is to hack into a university system, and I love the fact that Anthony Watts made that connection.

LarryD
July 25, 2011 6:58 am

Just what would constitute “research” in an Ethics Center?
The whole “Professional Journal” industry has been a building sore point inside scholarly/scientific circles for decades; the financial incentives of the journals are in conflict with the incentives of the scholars to get their work widely disseminated.

July 26, 2011 10:12 am

Gil R.
You say “JSTOR is NOT an empty site with nothing but links”
so here are some examples of what I am talking about ……
– please bear with me in this rather intricate expose of the pay-wall robbers
and why YOU don’t need to pay for this type of info, no matter who you are.
Just picked at random latest issues listed ….
Arctic, Antarctic, and Alpine Research (2008-2009) http://www.jstor.org/stable/i40012464
so the volumes 40 & 41 ARE NOT Available, except externally from “BioOne” in that case.
Journal of Ecology (2006-2010) http://www.jstor.org/stable/i27754369
so the volumes 94-98 ARE NOT Available, except externally from “Wiley” in that case.
Paleobiology (2006-2010) http://www.jstor.org/stable/i20445604
so the volumes 33-36 ARE NOT Available, except externally from “BioOne” in that case.
So in the first instance I might wish to see the article:
Snow Cover Effects on Glacier Ice Surface Temperature
Margherita Maggioni, Michele Freppaz, Paolo Piccini, Mark W. Williams and Ermanno Zanini
Arctic, Antarctic, and Alpine Research – Vol. 41, No. 3 (Aug., 2009), pp. 323-329
JSTOR says …
“JSTOR does not currently archive this article.
You may have access to this article on an external site.”
http://www.jstor.org/pss/40305840
Should you then follow the link to see “Article on external site”
You arrive at another paywall operated by BioOne
“Currently you do not have access to this article, please log in or see options below”
http://www.bioone.org/doi/full/10.1657/1938-4246-41.3.323
THIS IS VERY COMMON
……..and so on. So yes JSTOR does have many journals, but they are on the whole, older editions, as much as ten years old in many cases. The latest three or four years are inevitably missing and users ARE referred to external sources. Having a JSTOR account doesn’t automatically log you in to Wiley or any of those others, but your University Library will most likely have subscriptions to all those external archives, and so for the user it will be transparent. Sometimes you may be asked to input your OWN university credentials if they have a group subscription.
Still an unofficial researcher will have serious difficulty, unless they have a brain !!!
DO LIKE THIS :
http://www.google.com/search?hl=en&source=hp&q=%22Snow+Cover+Effects+on+Glacier+Ice+Surface+Temperature%22
AND GET THIS RESULT:
http://snobear.colorado.edu/Markw/Research/09_italy_snow.pdf
from one of the authors of that publication, in full, FREE of charge.
Similarly for any other article that I found on JSTOR lists of lists.
JSTOR says – Just Shuttling That Old Redirection
MyBrain says – Maybe You Better Research Another Itinerary Nowadays
Google is a – Great Old Oracle, Good Library Engine
– QED
🙂