Forensic analysis of the fake Heartland 'Climate Strategy Memo' concludes Peter Gleick is the likely forger

Readers may recall that on February 22nd, I offered up some open source stylometry/textometry software called JGAAP (Java Graphical Authorship Attribution Program), with a suggestion that readers make use of it to determine the authorship of the faked Heartland strategy memo disseminated to the media by Peter Gleick.

A link to that article is here:

An online and open exercise in stylometry/textometry: Crowdsourcing the Gleick “Climate Strategy Memo” authorship

The reason I did that was that many had speculated that Dr. Peter Gleick was the author. Gleick, who admitted to obtaining the Heartland board meeting documents under false pretenses, and likely illegally, denies he wrote it. Except for a few holdouts and those who won’t give an opinion, like Andy Revkin, other prominent voices of the online community such as Megan McArdle of The Atlantic think otherwise, and she doesn’t even see it as a professionally written memo:

“…their Top Secret Here’s All the Bad Stuff We’re Gonna Do This Year memo…reads like it was written from the secret villain lair in a Batman comic. By an intern.”

In posting about JGAAP software crowdsourcing, I had hoped that the wide professional base of readers could make use of this software and would be able to come to conclusions using it, but there were complications that made the task more difficult than it would normally be. These complications included the fact that there were cut and pasted elements of other stolen Heartland documents in the “Climate Strategy Memo,” making it difficult for the software to delineate the separate writing styles without knowledgeable fine tuning.

These complications became especially evident when writer Shawn Otto at the Huffington Post used the JGAAP software to do his own analysis, coming to the conclusion that Joe Bast, president of the Heartland Institute, had authored the fake memo.  The problem was that Mr. Otto did not perform the due diligence required in his selection of documents and the JGAAP software controls, and this led to an erroneous result.

In the end I realized that only professionals familiar with the science of stylometry/textometry would be able to make a credible determination as to the authorship. So, I asked for help.

On February 23, 2012 I sent the Evaluating Variations in Language Laboratory (the group responsible for the JGAAP software) a request for assistance. Mainly what I was looking for initially was tips on how to best operate their software, but given the high profile nature of this issue, and the unique situation, they referred me to Juola & Associates and its president, Patrick Brennan, who responded with an even better offer. They would use their larger collection of tools and techniques reserved for their forensics consulting work and apply it to the task, pro bono. Normally such professional analysis for courtroom quality work nets them fees comparable to what a metropolitan lawyer might charge, so not only was I extremely grateful, but realized it was an offer I couldn’t refuse.

In my email to Brennan on Fri, Feb 24, 2012 at 5:07 PM I wrote:

For the record, I do not know what the outcome might be, but it is always best to consult experts externally who have no financial interest in the outcome of the case.

Here’s the background on the group:

Juola & Associates (www.juolaassoc.com) is the premier provider of expert analysis and testimony in the field of text and authorship. Our scientists are leading, world-recognized experts in the fields of stylometry, authorship attribution, authorship verification, and author analysis.  Every written document is a snapshot of the person who wrote it; through our analysis, we can determine everything from sociological information to biographical information, even the identity of the author.  We provide sound, tested, and legally-recognized analysis as well as expert testimony by Dr. Patrick Juola, arguably one of the world’s leaders in the field of Forensic Stylometry.

We have worked with groups as wide-ranging as multinational companies, Federal courts, research groups, and individuals seeking political asylum.  We have literally written the book (ISBN 978-1-60198-118-9) on computational methods for authorship analysis and profiling.

The lead analysis was conducted by Patrick Juola, Ph.D., Director of Research, and director of the Evaluating Variations in Language Laboratory at Duquesne University in Pittsburgh. Juola & Associates, headed by President Patrick Brennan, is a separate commercial entity that provides analysis and consultation on stylometry.

Dr. Juola has published his analysis of the “Climate Strategy Memo,” which I present first and in entirety here at WUWT.

First, the short read:

Stylometric Report – Heartland Institute Memo

Patrick Juola, Ph.D.

Summary

As an expert in computational and forensic linguistics, I have reviewed the alleged Heartland memo to determine who the primary author of the report is, and more specifically whether the primary author was Peter Gleick or Joseph Bast. I conclude, based on a computational analysis, that the author is more likely to be Gleick than Bast.

And the larger excerpt of the document, bolds mine:

Analysis

24 This task is challenging for several reasons, some technical and some linguistic.

25 First, the Heartland memo as published contains a great many quotations taken from other sources. As originally published, the memo contains approximately 717 words, but at least 266 of those words have been identified as belonging to phrases (or paraphrases of phrases) found elsewhere in the stolen documents. [N.b. this identification was done by the Heartland Institute, who admit that these 266 words are “paraphrases [of] text appearing in one of the stolen documents.”]

As paraphrases, they may or may not reflect the style of the original authors, and they also may or may not reflect the style of the alleged forger. For this reason, we analyzed both the full document as well as the 451-word redacted document with the controversial passages removed.

26 Second, even the full-length document is rather short for an accurate analysis. Most authorship attribution experts recommend larger samples if possible. (E.g., Eder recommends 3500 words per sample, noting that results obtained from fewer than 3000 words “are simply disastrous.”)

27 Thirdly, perhaps as a result of the previous factors, we have observed that Bast and Gleick appear to have extremely similar writing styles.

Results

28 Despite this difficulty, we were able to identify and calibrate an appropriate analysis method. Using this method, we analyzed both the complete Heartland memo and the selections from the Heartland memo that had been identified as not copied from other stolen documents. In both analyses, the JGAAP system identified the author as Peter Gleick.

29 In particular, the JGAAP system identified the author of the complete (unredacted) memo as Peter Gleick, despite the large amount of text that even Bast admits is largely taken from genuine writings of the Heartland Institute. We justify this result by observing, first, that much of the quotation is actual paraphrase, and the amount of undisputed writing is still nearly 2/3 of the full memo.

Conclusions

30 In response to the question of who wrote the disputed Heartland strategy memo, it is difficult to deliver an answer with complete certainty. The writing styles are similar and the sample is extremely small, both of which act to reduce the accuracy of our analysis. Our procedure by assumption excluded every possible author but Bast and Gleick. Nevertheless, the analytic method that correctly and reliably identified twelve of twelve authors in calibration testing also selected Gleick as the author of the disputed document. Having examined these documents and their results, I therefore consider it more likely than not that Gleick is in fact the author/compiler of the document entitled ”Confidential Memo: 2012 Heartland Climate Strategy,” and further that the document does not represent a genuine strategy memo from the Heartland Institute.
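The figures quoted in the report can be sanity-checked with a few lines of arithmetic. The sketch below recomputes the size of the undisputed text and asks how likely a perfect twelve-of-twelve calibration run would be by blind guessing. The guessing model is an assumption made here for illustration, not something the report states: it treats each calibration trial as an independent two-way choice with a 1/2 chance of success.

```python
from fractions import Fraction

# Word counts quoted in paragraph 25 of the report.
total_words = 717
flagged_words = 266

# Words left once the flagged (copied or paraphrased) passages are removed.
undisputed_words = total_words - flagged_words
undisputed_share = undisputed_words / total_words

# ASSUMPTION (not from the report): each calibration trial is an independent
# two-way choice in which blind guessing succeeds with probability 1/2.
calibration_trials = 12
chance_of_perfect_run = Fraction(1, 2) ** calibration_trials

print(f"undisputed words: {undisputed_words}")
print(f"undisputed share of memo: {undisputed_share:.3f}")
print(f"chance of a perfect run by guessing: {chance_of_perfect_run}")
```

Under that assumption a perfect calibration run is very unlikely to be luck, but this says nothing about how the method behaves on a short, partly pasted document, which is the caveat the report itself raises.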

It seems very likely then, given the result of this analysis, plus the circumstances, proximity, motive, and opportunity, that Dr. Peter Gleick forged the document known as ”Confidential Memo: 2012 Heartland Climate Strategy.” The preponderance of the evidence points squarely to Gleick. According to Wikipedia’s entry on the “legal burden of proof”:

Preponderance of the evidence, also known as balance of probabilities is the standard required in most civil cases. This is also the standard of proof used in Grand Jury indictment proceedings (which, unlike civil proceedings, are procedurally unrebuttable).

Further, it is abundantly clear that this document was not authored by Heartland’s Joe Bast, nor was it included as part of the board package of documents Dr. Gleick (by his own admission) phished under false pretenses from Heartland.

The complete analysis by Dr. Juola is available here: MemoReport (PDF 101k)

166 Comments
Gary Swift
March 14, 2012 7:34 am

Evaluating Variations in Language Laboratory
Anthony, did you make that up? Really? EViLL?
“…their Top Secret Here’s All the Bad Stuff We’re Gonna Do This Year memo…reads like it was written from the secret villain lair in a Batman comic. By an intern.”
lolz. Absolutely fantastic.

Mike M
March 14, 2012 7:37 am

Joe should challenge Peter to a lie detector test – or vice versa.
I don’t leave out the possibility that Joe realized it was Peter who was phishing – or someone else at Heartland realized it – then sent the fake to Peter hoping he’d be stupid enough to take the bait. This makes it possible for Peter and Joe to both be telling the truth.

Tom in Florida
March 14, 2012 7:37 am

My, my how what goes around comes around. So the analysis eliminates Bast and the only one left is Gleick. Much like the reasoning of the AGW crowd that says once we’ve eliminated natural causes for warming the only thing left is man made CO2.

March 14, 2012 7:40 am

I would add to my previous comment:
1. Brennan and Greenstadt couldn’t determine an author above chance from a choice of 15 possible authors, whereas Dr Juola had to consider only 2, so his task is easier than the baseline.
2. One key difference between Dr Juola’s analysis and Otto’s is that Dr Juola knows what he is doing rather than blindly running software he doesn’t understand.
But: I would personally feel more detail is needed before I would consider this analysis to be compelling.

March 14, 2012 7:41 am

An error in my first comment on this thread:
(b) can’t understand Heartland’s spreadsheets (there are two maths errors – Koch & the double counting of $88,000) – which makes a Heartland insider likely
I meant of course:
(b) can’t understand Heartland’s spreadsheets (there are two maths errors – Koch & the double counting of $88,000) – which makes a Heartland insider *UN*likely
REPLY: Fixed, Anthony

March 14, 2012 7:44 am

Simple old fashioned police logic:
1. Suspect admits to having stolen the documents.
2. The admittedly forged document contains paraphrases and quotes that could only have come from the stolen documents.
3. Suspect is the forger.
The rest is just more evidence.

ferd berple
March 14, 2012 7:46 am

I don’t understand. Gleick has apparently admitted to engaging in identity theft to obtain documents. Identity theft is a crime under US law. Doesn’t ignoring Gleick tend to cast US law into disrepute?
http://www.justice.gov/criminal/fraud/websites/idtheft.html
What’s the Department of Justice Doing About Identity Theft and Fraud?
The Department of Justice prosecutes cases of identity theft and fraud under a variety of federal statutes. In the fall of 1998, for example, Congress passed the Identity Theft and Assumption Deterrence Act. This legislation created a new offense of identity theft, which prohibits knowingly transfer[ring] or us[ing], without lawful authority, a means of identification of another person with the intent to commit, or to aid or abet, any unlawful activity that constitutes a violation of Federal law, or that constitutes a felony under any applicable State or local law.
18 U.S.C. § 1028(a)(7). This offense, in most circumstances, carries a maximum term of 15 years’ imprisonment, a fine, and criminal forfeiture of any personal property used or intended to be used to commit the offense.
Schemes to commit identity theft or fraud may also involve violations of other statutes such as identification fraud (18 U.S.C. § 1028), credit card fraud (18 U.S.C. § 1029), computer fraud (18 U.S.C. § 1030), mail fraud (18 U.S.C. § 1341), wire fraud (18 U.S.C. § 1343), or financial institution fraud (18 U.S.C. § 1344). Each of these federal offenses are felonies that carry substantial penalties - in some cases, as high as 30 years’ imprisonment, fines, and criminal forfeiture.

Dave
March 14, 2012 7:49 am

One has to wonder why Gleick hasn’t been arrested already since he admitted phishing data from Heartland. Isn’t that illegal?

Shevva
March 14, 2012 7:52 am

I’d just like to add my thanks to Patrick Juola, Ph.D., for his time and effort, and let the courts decide the rest.

Louis Hooffstetter
March 14, 2012 7:57 am

Has Gleick been charged with any crime(s) yet? If not, why not?
If the Justice Department ignores this, that by itself is evidence of corruption at even higher levels.

Severian
March 14, 2012 8:01 am

Let’s see. “More likely than not” is, according to the IPCC and AGW folks, the same as “absolutely certain.”
Seems only fair to use the same standards eh?

MarkW
March 14, 2012 8:02 am

One point that seems to have been missed in the comments so far.
In addition to fingering Gleick as the probable author of the “strategy” document, this test specifically excluded Bast as a possible author.
Regardless of who actually wrote the document, it’s a forgery.

Nylo
March 14, 2012 8:14 am

If I understood the conclusions of the report correctly, they don’t say that it was Gleick, but that it is far more likely to have been Gleick than Bast. This certainly excludes Bast, but doesn’t mean Gleick is guilty, because in no way does it exclude or minimize the chance that it was neither of them, which is Gleick’s defense (someone else passed this fake document to him).
REPLY: See the section about preponderance of the evidence – Anthony

Coach Springer
March 14, 2012 8:24 am

[snip – let’s not make suggestions that may incite others]

dwyoder
March 14, 2012 8:26 am

There is a consensus. The science is irrefutable. It was Gleick.

Mindert Eiting
March 14, 2012 8:33 am

‘One has to wonder why Gleick hasn’t been arrested already since he admitted phishing data from Heartland. Isn’t that illegal?’ Yes Dave, but sometimes it is better to leave a suspect on the street if you want to catch a bigger fish.

Bill
March 14, 2012 8:33 am

Let’s see. The company’s acronym is EViLL and it is called Juola. The J is probably silent, which means it is pronounced OIL-A. EViLL OIL. Gleick was right after all!!

theduke
March 14, 2012 8:34 am

It seems to me that the report does not deal with the possibility that a third person was the author.
Perhaps someone can comment on that. Gleick did not state specifically that Bast or anyone at Heartland was the author although, of course, he would love people to think that.
I think it’s still possible that another person wrote it and that person is known to Gleick. Hell, Gleick may have had his assistant write it if he were purposefully trying to conceal his involvement.

March 14, 2012 8:35 am

I think what would be really compelling, and a good use for crowdsourcing, is to save the setup used by Dr Juola and make it available to the crowd.
Members of the crowd can then go around finding other documents – documents which were not in the training or test sets – but which are known to be written by Gleick or Bast, and run Dr Juola’s test against them.
If we test 100 documents, say 50 written by Gleick and 50 written by Bast, we can get some idea of what percentage it identifies correctly. Of course, such a test would be on a non-hostile author, so easier than analyzing the fake, but if we find the methodology is, say, 99% correct on a non-hostile author, it would be a lot more convincing than if it is, say, 51% correct on a non-hostile author.
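The validation idea in this comment can be sketched in a few lines. The example below is only an illustration: the author labels and predicted outcomes are made up, and the helper names `accuracy` and `chance_tail` are hypothetical, not part of JGAAP or of Dr Juola’s setup. It scores a batch of attributions and then asks how often blind two-way guessing would do at least as well, using an exact binomial tail.

```python
from math import comb

def accuracy(predicted, actual):
    """Fraction of documents whose predicted author matches the known author."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

def chance_tail(correct, trials, p=0.5):
    """Probability of at least `correct` hits in `trials` independent guesses,
    each succeeding with probability `p` (exact binomial upper tail)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# HYPOTHETICAL data: 50 documents known to be by each author.
actual = ["Gleick"] * 50 + ["Bast"] * 50

# Two made-up outcomes matching the comment's examples: 99 and 51 correct.
predicted_strong = ["Bast"] + actual[1:]       # one Gleick document missed
predicted_weak = ["Bast"] * 49 + actual[49:]   # 49 Gleick documents missed

for name, predicted in [("strong", predicted_strong), ("weak", predicted_weak)]:
    acc = accuracy(predicted, actual)
    hits = sum(p == a for p, a in zip(predicted, actual))
    print(f"{name}: accuracy {acc:.2f}, "
          f"chance of doing this well by guessing {chance_tail(hits, len(actual)):.3g}")
```

A result near 51% would be hard to tell apart from guessing, while one near 99% would be extremely unlikely to arise by chance, which is the distinction the comment draws.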

Bill
March 14, 2012 8:35 am

I also like the “more likely than not” Reminds me of the IPCC language that says there is at least a 51% chance (+/- 50%??) of something or other.

mikegeo
March 14, 2012 8:38 am

Nylo
Gleick said he got the memo and then he phished from Heartland. This analysis says the memo contains cut and pasted phrases from the phished documents. Hence the memo was created afterwards and not received by Gleick beforehand as he tried to claim.
So perhaps Gleick left every bit of ethics he had at that AGU posting and hence couldn’t help himself?

DirkH
March 14, 2012 8:43 am

Tom in Florida says:
March 14, 2012 at 7:37 am
“My, my how what goes around comes around. So the analysis eliminates Bast and the only one left is Gleick. Much like the reasoning of the AGW crowd that says once we’ve eliminated natural causes for warming the only thing left is man made CO2.”
Tom, imagine I send a document to some bloggers that I scanned, and that looks like a bad parody of a DoD document, that says “we plan to bomb country XXX on that and that day”.
I later “confess” that yes, I scanned it, and I got the original in the mail, and I thought it could be genuine so I sent it to the bloggers without telling them how I got it, letting them also believe it’s genuine – let’s assume for the moment that those bloggers are so gullible they don’t question the document.
Wouldn’t that make me just a tiny bit suspect?

Gail Combs
March 14, 2012 8:44 am

Ric Werme says:
March 14, 2012 at 6:40 am
Preponderance of the evidence, also known as balance of probabilities is the standard required in most civil cases.
It’s worth noting, especially for non-US readers, that this is a much lower bar than the criminal standard “Beyond reasonable doubt”……
_____________________________________
In other words it is good enough to get a search warrant but not good enough to hang him.

Editor
March 14, 2012 8:52 am

Louis Hooffstetter says:
March 14, 2012 at 7:57 am
> Has Gleick been charged with any crime(s) yet? If not, why not?
Not yet. I think there’s enough evidence to charge him, however:
1) He’s not a threat to cause bodily harm to anyone.
2) He’s likely not a threat to cause financial harm in the immediate future.
3) This may need to go through a grand jury.
4) Affected parties are in multiple states; the investigation and interviewing witnesses is not just a matter of driving around town.
There’s no rush. It’s more important to get the charges right and the evidence lined up than it is to arraign him with an incomplete investigation.
> If the Justice Department ignores this, that by itself is evidence of corruption at even higher levels.
Or it’s evidence of more pressing matters. Perhaps you can research other federal white collar crime cases and come up with a frequency distribution of time between opening the investigation and arraignment. One thing that may reduce the time in this case is there’s likely no ongoing criminal activity. When there is, the investigation time can be lengthened in order to get well documented evidence, videos, etc.
Patience.

HowardG
March 14, 2012 8:53 am

If possible I think it would be a satisfactory end if Heartland would agree not to press charges against Gleick for the crimes related to impersonation of a Heartland board member if Gleick would confess to authorship of the fake memo and apologize. This saves the taxpayers some money and resolves the who done it question.