Forensic analysis of the fake Heartland 'Climate Strategy Memo' concludes Peter Gleick is the likely forger

Readers may recall that on February 22nd, I offered up some open source stylometry/textometry software called JGAAP (Java Graphical Authorship Attribution Program), with a suggestion that readers make use of it to determine the authorship of the faked Heartland strategy memo disseminated to the media by Peter Gleick.

A link to that article is here:

An online and open exercise in stylometry/textometry: Crowdsourcing the Gleick “Climate Strategy Memo” authorship

The reason I did that was that many had speculated that Dr. Peter Gleick was the author. Gleick, who admitted to obtaining the Heartland board meeting documents under false pretenses, and likely illegally, denies he wrote it. Except for a few holdouts and those who won’t give an opinion, like Andy Revkin, prominent voices of the online community think otherwise. Megan McArdle of The Atlantic, for one, doesn’t even see it as a professionally written memo:

“…their Top Secret Here’s All the Bad Stuff We’re Gonna Do This Year memo…reads like it was written from the secret villain lair in a Batman comic. By an intern.”

In posting about JGAAP software crowdsourcing, I had hoped that the wide professional base of readers could make use of this software and would be able to come to conclusions using it, but there were complications that made the task more difficult than it would normally be. These complications included the fact that there were cut and pasted elements of other stolen Heartland documents in the “Climate Strategy Memo,” making it difficult for the software to delineate the separate writing styles without knowledgeable fine tuning.

These complications became especially evident when writer Shawn Otto at the Huffington Post used the JGAAP software to do his own analysis, coming to the conclusion that Joe Bast, president of the Heartland Institute, had authored the fake memo.  The problem was that Mr. Otto did not perform the due diligence required in his selection of documents and the JGAAP software controls, and this led to an erroneous result.

In the end I realized that only professionals familiar with the science of stylometry/textometry would be able to make a credible determination as to the authorship. So, I asked for help.

On February 23, 2012 I sent the Evaluating Variations in Language Laboratory (the group responsible for the JGAAP software) a request for assistance. Mainly what I was looking for initially was tips on how to best operate their software, but given the high profile nature of this issue, and the unique situation, they referred me to Juola & Associates and its president, Patrick Brennan, who responded with an even better offer. They would use their larger collection of tools and techniques reserved for their forensics consulting work and apply it to the task, pro bono. Normally such professional analysis for courtroom quality work nets them fees comparable to what a metropolitan lawyer might charge, so not only was I extremely grateful, but realized it was an offer I couldn’t refuse.

In my email to Brennan on Fri, Feb 24, 2012 at 5:07 PM I wrote:

For the record, I do not know what the outcome might be, but it is always best to consult experts externally who have no financial interest in the outcome of the case.

Here’s the background on the group:

Juola & Associates (www.juolaassoc.com) is the premier provider of expert analysis and testimony in the field of text and authorship. Our scientists are leading, world-recognized experts in the fields of stylometry, authorship attribution, authorship verification, and author analysis.  Every written document is a snapshot of the person who wrote it; through our analysis, we can determine everything from sociological information to biographical information, even the identity of the author.  We provide sound, tested, and legally-recognized analysis as well as expert testimony by Dr. Patrick Juola, arguably one of the world’s leaders in the field of Forensic Stylometry.

We have worked with groups as wide-ranging as multinational companies, Federal courts, research groups, and individuals seeking political asylum.  We have literally written the book (ISBN 978-1-60198-118-9) on computational methods for authorship analysis and profiling.

The lead analysis was conducted by Patrick Juola, Ph.D., Director of Research, and director of the Evaluating Variations in Language Laboratory at Duquesne University in Pittsburgh. Juola & Associates, headed by President Patrick Brennan, is a separate commercial entity that provides analysis and consultation on stylometry.

Dr. Juola has published his analysis of the “Climate Strategy Memo,” which I present first and in entirety here at WUWT.

First, the short read:

Stylometric Report – Heartland Institute Memo

Patrick Juola, Ph.D.

Summary

As an expert in computational and forensic linguistics, I have reviewed the alleged Heartland memo to determine who the primary author of the report is, and more specifically whether the primary author was Peter Gleick or Joseph Bast. I conclude, based on a computational analysis, that the author is more likely to be Gleick than Bast.

And the larger excerpt of the document, bolds mine:

Analysis

24 This task is challenging for several reasons, some technical and some linguistic.

25 First, the Heartland memo as published contains a great many quotations taken from other sources. As originally published, the memo contains approximately 717 words, but at least 266 of those words have been identified as belonging to phrases (or paraphrases of phrases) found elsewhere in the stolen documents. [N.b. this identification was done by the Heartland Institute, who admit that these 266 words are “paraphrases [of] text appearing in one of the stolen documents.”]

As paraphrases, they may or may not reflect the style of the original authors, and they also may or may not reflect the style of the alleged forger. For this reason, we analyzed both the full document as well as the 451-word redacted document with the controversial passages removed.

26 Second, even the full-length document is rather short for an accurate analysis. Most authorship attribution experts recommend larger samples if possible. (E.g., Eder recommends 3500 words per sample, noting that results obtained from fewer than 3000 words “are simply disastrous.”)

27 Thirdly, perhaps as a result of the previous factors, we have observed that Bast and Gleick appear to have extremely similar writing styles.

Results

28 Despite this difficulty, we were able to identify and calibrate an appropriate analysis method. Using this method, we analyzed both the complete Heartland memo and the selections from the Heartland memo that had been identified as not copied from other stolen documents. In both analyses, the JGAAP system identified the author as Peter Gleick.

29 In particular, the JGAAP system identified the author of the complete (unredacted) memo as Peter Gleick, despite the large amount of text that even Bast admits is largely taken from genuine writings of the Heartland Institute. We justify this result by observing, first, that much of the quotation is actual paraphrase, and the amount of undisputed writing is still nearly 2/3 of the full memo.

Conclusions

30 In response to the question of who wrote the disputed Heartland strategy memo, it is difficult to deliver an answer with complete certainty. The writing styles are similar and the sample is extremely small, both of which act to reduce the accuracy of our analysis. Our procedure by assumption excluded every possible author but Bast and Gleick. Nevertheless, the analytic method that correctly and reliably identified twelve of twelve authors in calibration testing also selected Gleick as the author of the disputed document. Having examined these documents and their results, I therefore consider it more likely than not that Gleick is in fact the author/compiler of the document entitled ”Confidential Memo: 2012 Heartland Climate Strategy,” and further that the document does not represent a genuine strategy memo from the Heartland Institute.

It seems very likely then, given the result of this analysis, plus the circumstances, proximity, motive, and opportunity, that Dr. Peter Gleick forged the document known as “Confidential Memo: 2012 Heartland Climate Strategy.” The preponderance of the evidence points squarely to Gleick. According to Wikipedia’s entry on the “legal burden of proof”:

Preponderance of the evidence, also known as balance of probabilities is the standard required in most civil cases. This is also the standard of proof used in Grand Jury indictment proceedings (which, unlike civil proceedings, are procedurally unrebuttable).

Further, it is abundantly clear that this document was not authored by Heartland’s Joe Bast, nor was it included as part of the board package of documents Dr. Gleick (by his own admission) phished under false pretenses from Heartland.

The complete analysis by Dr. Juola is available here: MemoReport (PDF 101k)
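For readers who want to check the word counts quoted in the report excerpt above, here is a minimal sketch of the arithmetic. It uses only the figures given in the excerpt (717 words in the full memo, 266 identified as paraphrased, and Eder's recommended 3500 words per sample); the variable and function names are my own, chosen for illustration, and have nothing to do with JGAAP or Dr. Juola's method.

```python
# Figures quoted in the report excerpt above (paragraphs 25 and 26).
FULL_MEMO_WORDS = 717          # approximate length of the memo as published
PARAPHRASED_WORDS = 266        # words identified as paraphrased from other documents
EDER_RECOMMENDED_WORDS = 3500  # sample size Eder recommends, per the report


def fraction(part: int, whole: int) -> float:
    """Return part/whole as a float, refusing a zero denominator."""
    if whole == 0:
        raise ValueError("whole must be non-zero")
    return part / whole


# Words left once the disputed (paraphrased) passages are removed.
redacted_words = FULL_MEMO_WORDS - PARAPHRASED_WORDS

# Share of the full memo that is undisputed writing ("nearly 2/3" in the report).
undisputed_share = fraction(redacted_words, FULL_MEMO_WORDS)

# How the full memo compares with the recommended sample size.
share_of_recommended = fraction(FULL_MEMO_WORDS, EDER_RECOMMENDED_WORDS)

print(redacted_words)                  # 451
print(round(undisputed_share, 3))      # 0.629
print(round(share_of_recommended, 3))  # 0.205
```

The numbers agree with the report: removing the 266 paraphrased words leaves the 451-word redacted document, which is about 63% of the memo, and even the full memo is only about a fifth of the sample size Eder recommends. That second figure is the caveat Dr. Juola raises in paragraph 26.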

geo

This is certainly interesting, but I’m having a hard time getting past him acknowledging that other experts warn that results from samples of fewer than 3,000 words “are simply disastrous” when this sample is so much smaller in either form (paraphrases included or not).
That’s a pretty big caveat.

Harold Ambler

The MSM, NYT included of course, is waiting for this to blow over. In so doing, they are making life a lot easier for a criminal.

So despite the alarmist crowd still desperately attempting to blame Heartland for the memo, it was Gleick the reprobate after all. Anyone who still employs that guy, or has anything to do with him, is condoning dishonesty. Gleick needs to just go away, and let someone honest take over.

richard verney

It is a serious allegation that he actually forged the document. I wonder whether Heartland would have the nerve to run with that allegation.
Of course, Gleick could lay such allegations to rest by simply disclosing how he came into possession of the ‘faked’ document. He could open up the paper trail, if there truly is a trail.

David L

By pro bono do you actually mean it’s being funded by Big Oil? /sarc

Latitude

…I don’t see anything wrong with the memo in the first place

The Black Adder

`It seems very likely then, given the result of this analysis, plus the circumstances, proximity, motive, and opportunity, that Dr. Peter Gleick forged the document known as ”Confidential Memo: 2012 Heartland Climate Strategy.” `
Well, well, well…
Nothing about this business surprises me anymore.
It is dirty, dishonest and disingenuous.
Science is the loser. Thank you for nothing, Dr Gleick!

The report is impressive and potentially devastating for Gleick. Although a number of WUWT readers tend to be scientifically (or perhaps analytically is better) orientated, they should avoid taking a scientific method-like approach to analyzing the report and its conclusions. Rather, they should view it as a member of a jury would view introduced evidence. In this light, the report certainly lends credence to Gleick as the strategy memo’s author, and through a quantitative analysis it goes beyond the level of speculation or conjecture.
I’m still curious, though, as to how Heartland (and/or the FBI) will address Gleick’s clearly criminal activity from a legal perspective – false moral justifications from Gleick apologists notwithstanding.

anon

Yes, interesting but difficult to take very seriously. Would a genuine language expert use “The quick brown fox jumped over the lazy dog” (missing the letter ‘s’) as an example?

John F. Hultquist

Given a 300 word essay by my wife and another by someone else, I would be able to tell which was which about half way through. What Dr. Patrick Juola has going for him here is that the two most likely writers are both known. The analysis (with high probability) says eliminate Bast. Then it says, also with high probability, that the other fellow is the culprit. Also, knowing that the other fellow was a bit unhinged at the time, we have the third strike. Back to the dugout, Gleick.

Russ R.

Well done Anthony. I especially appreciate that you took care to point out the various caveats and limitations in the analysis.
I only wish that all scientists and journalists displayed such integrity.

Preponderance of the evidence, also known as balance of probabilities is the standard required in most civil cases.

It’s worth noting, especially for non-US readers, that this is a much lower bar than the criminal standard “Beyond reasonable doubt”.
This is what allows things like OJ Simpson to be found not guilty of murder, but also a civil result that he was responsible for his wife’s death. (Well, that and some really poor handling by the prosecution, but let’s not go there.)
So, from the self-critique, it appears this analysis is inadequate by itself to prove Gleick violated federal wire fraud statutes. However, it could have important implications for any civil proceedings between Heartland and Gleick.
I suspect both the FBI and Heartland know that already.

Wouldn’t it be great if AR5 were written with this much attention to uncertainties?

Has Peter Gleick actually denied penning that ‘secret strategy memo’?
If I recall correctly, the wordings were carefully (lawyerly) crafted around the hot topics and questions pertaining to that memo.
In the same manner as he did claim, after having been ridiculed and pressured for ‘reviewing’ Donna F’s book without touching upon its contents, that he actually had read it. But carefully avoided the real relevant question whether he actually had read it prior to his Amazon ‘review’.

John Blake

If expert testimony, with caveats, gives odds sufficient to state that the “weight of evidence” identifies the sole plausible candidate to have perpetrated this egregious fraud,
then as in forensic genealogy –relating to trusts and estates– an impartial jury is fully entitled to convict Peter Gleick of criminal forgery (among other things).
For over a generation now, pseudo-intellectual poseurs like Gleick have poisoned climate science with impunity. Let’s hope that Joe Bast on Heartland’s behalf pursues this case, prosecuting Gleick as the simpering phony that he is.

johnnyjapan

This is what I got from your link….
Navigation
Links
Contact
Login
JComputing was started from a authorship attribution / stylometry research group working out of Duquesne University under the direction of Prof. Patrick Juola.
Johnny Japan

oMan

As the experts themselves caution us, this was a probe at (or beyond?) the limits of detection. A tiny sample and a binary decision: eliminate Bast and Gleick is the only one left. Ideally the software would have been given a “line up” of hundreds or thousands of possible culprits and big samples to compare. But we don’t live in an ideal world. I agree this might not be enough to hang Gleick but it does put him in the hot seat. I doubt he will respond; his best strategy is to go to ground, quietly settle with Heartland (possibly by applying pressure through unseen or indirect channels: the War Against The Donors is his main strategic objective in any case) and then re-emerge in a few months as loud and proud as ever.
I think Heartland either breaks him, very publicly and convincingly, or he and his team will eventually destroy Heartland. This is for reals. Just my uninformed opinion, of course.

Steve S

Based on all this, I wonder if P. Gleick would voluntarily submit to a polygraph. I’m guessing…’No’.

Jimbo

Now if Dr. Gleick has to go to court I wonder who Joe Bast will call in as an expert witness?
I’ve said it before and I’ll say it again Gleick should do the cathartic thing and just own up. This is an open secret for goodness sake man. You lied once why wouldn’t you lie again.

kadaka (KD Knoebel)

Juola & Associates (http://www.juolaassoc.com) is the premier provider of expert analysis and testimony in the field of text and authorship.
Could we start up a fund drive to hire them to see if Dr. Mann’s latest novel of historical fiction was truly authored by him?

Scott Covert

Thanks, to the “Evaluating Variations in Language Laboratory”. I suppose the payback for the Pro Bono could be a good will thing but something they clearly do not need.
Their donation of time and resources to this matter is very much appreciated, at least, by myself.
Good idea to go to the experts on this one to get a truly objective opinion. Just think what a game changer this would be if Bast was fingered as the author. In any case the truth is more important than motivation and I am sure all you are after is the truth.
Thanks Anthony, and thanks Patrick Juola, Ph.D.

Jonas N says:
March 14, 2012 at 6:52 am
Has Peter Gleick actually denied penning that ‘secret strategy memo’?
No he hasn’t. He denied writing the alleged anonymous document which he allegedly received through the alleged US Postal Service. One is led to infer that this was the 2012 Strategy Memo but nowhere does Gleick confirm this.
If Gleick thought the 2012 Strategy Memo was genuine, why did he not explicitly ask Heartland to email him a copy when he was phishing for confirming documents? The embedded document characteristics would have proved to the world that this was a genuine Heartland document.
My bet is that not only did Gleick write the fake memo after phishing and finding no dirt, but also that the anonymous document never existed.

Game, Set, Match. (with emphasis on the match!).

I find the analysis interesting but uncompelling for the reason identified previously by myself, and in more detail by my colleague Byronic, in relation to Otto’s and Greg Laden’s analysis – stylometric analysis is very difficult against a hostile author.
http://books.google.co.uk/books?id=NdnMX5NUBJQC&pg=PA117&lpg=PA117&dq=jgaap+obfuscation&source=bl&ots=M5J5HnXZ7-&sig=DloBYZcM5vbdnIt_yeKDGJNdYRc&hl=en&sa=X&ei=sdhUT9jqN4eT8gPd1KjxBQ&ved=0CDcQ6AEwAw#v=onepage&q=jgaap%20obfuscation&f=false
“Brennan and Greenstadt applied three fairly standard stylometric methods to determine authorship of obfuscated or imitative essays. Their results for obfuscated essays were essentially at chance, suggesting that attempts to disguise or imitate style are likely to be successful against stylometric methods.”
For the record, I can’t speak for Byronic, but I believe that Gleick is the author for a very simple reason:
The memo must have been written by
(a) somebody who had access to Heartland documents prior to February 13th – which narrows it down to somebody at Heartland or Gleick
(b) can’t understand Heartland’s spreadsheets (there are two maths errors – Koch & the double counting of $88,000) – which makes a Heartland insider unlikely
(c) believed that Gleick and his Forbes column was not just important, but very important – which rules out everybody except for Gleick & perhaps his mum.

richard verney

Ric Werme says:
March 14, 2012 at 6:40 am
///////////////////////////////////////////////////////
Your summary of the burden of proof and why you can therefore see different outcomes in related civil and criminal proceedings is useful.
However, whilst in the UK we have that distinction, the boundaries are sometimes blurred in that for example in a civil case which involves an allegation of fraud, my understanding is that the criminal burden of proof (or at any rate ‘clear and convincing evidence’) is required in order to substantiate that allegation.

Gary Swift

Evaluating Variations in Language Laboratory
Anthony, did you make that up? Really? EViLL?
“…their Top Secret Here’s All the Bad Stuff We’re Gonna Do This Year memo…reads like it was written from the secret villain lair in a Batman comic. By an intern.”
lolz. Absolutely fantastic.

Mike M

Joe should challenge Peter to a lie detector test – or vice versa.
I don’t leave out the possibility that Joe realized it was Peter who was phishing – or someone else at Heartland realized it – then sent the fake to Peter hoping he’d be stupid enough to take the bait. This makes it possible for Peter and Joe to both be telling the truth.

Tom in Florida

My, my how what goes around comes around. So the analysis eliminates Bast and the only one left is Gleick. Much like the reasoning of the AGW crowd that says once we’ve eliminated natural causes for warming the only thing left is man made CO2.

I would add to my previous comment:
1. Brennan and Greenstadt couldn’t determine an author above chance from a choice of 15 possible authors – whereas Dr Juola had to only consider 2 – so his task is easier than the baseline
2. One key difference between Dr Juola’s analysis and Otto’s is that Dr Juola knows what he is doing rather than blindly running software he doesn’t understand.
But: I would personally feel more detail is needed, before I would consider this analysis to be compelling.

An error in my first comment on this thread:
(b) can’t understand Heartland’s spreadsheets (there are two maths errors – Koch & the double counting of $88,000) – which makes a Heartland insider likely
I meant of course:
(b) can’t understand Heartland’s spreadsheets (there are two maths errors – Koch & the double counting of $88,000) – which makes a Heartland insider *UN*likely
REPLY: Fixed, Anthony

Simple old fashioned police logic:
1. Suspect admits to having stolen the documents.
2. The admittedly forged document contains paraphrases and quotes that could only have come from the stolen documents.
3. Suspect is the forger.
The rest is just more evidence.

ferdberple

I don’t understand. Gleick has apparently admitted to engaging in identity theft to obtain documents. Identity theft is a crime under US law. Doesn’t ignoring Gleick tend to cast US law into disrepute?
http://www.justice.gov/criminal/fraud/websites/idtheft.html
What’s the Department of Justice Doing About Identity Theft and Fraud?
The Department of Justice prosecutes cases of identity theft and fraud under a variety of federal statutes. In the fall of 1998, for example, Congress passed the Identity Theft and Assumption Deterrence Act. This legislation created a new offense of identity theft, which prohibits knowingly transfer[ring] or us[ing], without lawful authority, a means of identification of another person with the intent to commit, or to aid or abet, any unlawful activity that constitutes a violation of Federal law, or that constitutes a felony under any applicable State or local law.
18 U.S.C. § 1028(a)(7). This offense, in most circumstances, carries a maximum term of 15 years’ imprisonment, a fine, and criminal forfeiture of any personal property used or intended to be used to commit the offense.
Schemes to commit identity theft or fraud may also involve violations of other statutes such as identification fraud (18 U.S.C. § 1028), credit card fraud (18 U.S.C. § 1029), computer fraud (18 U.S.C. § 1030), mail fraud (18 U.S.C. § 1341), wire fraud (18 U.S.C. § 1343), or financial institution fraud (18 U.S.C. § 1344). Each of these federal offenses are felonies that carry substantial penalties – in some cases, as high as 30 years’ imprisonment, fines, and criminal forfeiture.

Dave

One has to wonder why Gleick hasn’t been arrested already since he admitted phishing data from Heartland. Isn’t that illegal?

Shevva

I’d just like to add my thanks to Patrick Juola, Ph.D., for his time and effort, and let the courts decide the rest.

Louis Hooffstetter

Has Gleick been charged with any crime(s) yet? If not, why not?
If the Justice Department ignores this, that by itself is evidence of corruption at even higher levels.

Severian

Let’s see. “More likely than not” is, according to the IPCC and AGW folks, the same as “absolutely certain.”
Seems only fair to use the same standards eh?

MarkW

One point that seems to have been missed in the comments so far.
In addition to fingering Gleick as the probable author of the “strategy” document, this test specifically excluded Bast as a possible author.
Regardless of who actually wrote the document, it’s a forgery.

Nylo

If I understood the conclusions of the report correctly, they don’t say that it was Gleick, but that it is far more likely that it was Gleick than Bast. This certainly excludes Bast, but doesn’t mean Gleick is guilty, because in no way does it exclude or minimize the chance that it was neither of them, which is Gleick’s defense (someone else passed this fake document to him).
REPLY: See the section about preponderance of the evidence – Anthony

Coach Springer

[snip – let’s not make suggestions that may incite others]

dwyoder

There is a consensus. The science is irrefutable. It was Gleick.

Mindert Eiting

‘One has to wonder why Gleick hasn’t been arrested already since he admitted phishing data from Heartland. Isn’t that illegal?’ Yes Dave, but sometimes it is better to leave a suspect on the street if you want to catch a bigger fish.

Bill

Let’s see. The company’s acronym is EViLL and it is called Juola. The J is probably silent, which means it is pronounced OIL-A. EViLL OIL. Gleick was right after all!!

theduke

It seems to me that the report does not deal with the possibility that a third person was the author.
Perhaps someone can comment on that. Gleick did not state specifically that Bast or anyone at Heartland was the author although, of course, he would love people to think that.
I think it’s still possible that another person wrote it and that person is known to Gleick. Hell, Gleick may have had his assistant write it if he were purposefully trying to conceal his involvement.

I think what would be really compelling, and a good use for crowdsourcing, is to save the setup used by Dr Juola and make it available to the crowd.
Members of the crowd can then go around finding other documents – documents which were not in the training or test sets – but which are known to be written by Gleick or Bast, and run Dr Juola’s test against them.
If we test 100 documents, say 50 written by Gleick and 50 written by Bast, we can get some idea what percentage it identifies correctly. Of course, such a test would be on a non-hostile author, and so easier than analysing the fake, but if we find the methodology is, say, 99% correct on a non-hostile author, it would be a lot more convincing than if it is, say, 51% correct on a non-hostile author.

Bill

I also like the “more likely than not” Reminds me of the IPCC language that says there is at least a 51% chance (+/- 50%??) of something or other.

mikegeo

Nylo
Gleick said he got the memo and then he phished from Heartland. This analysis says the memo contains cut and pasted phrases from the phished documents. Hence the memo was created afterwards and not received by Gleick beforehand as he tried to claim.
So perhaps Gleick left every bit of ethics he had at that AGU posting and hence couldn’t help himself?

DirkH

Tom in Florida says:
March 14, 2012 at 7:37 am
“My, my how what goes around comes around. So the analysis eliminates Bast and the only one left is Gleick. Much like the reasoning of the AGW crowd that says once we’ve eliminated natural causes for warming the only thing left is man made CO2.”
Tom, imagine I send a document to some bloggers that I scanned, and that looks like a bad parody of a DoD document, that says “we plan to bomb country XXX on that and that day”.
I later “confess” that yes, I scanned it, and I got the original in the mail, and I thought it could be genuine so I sent it to the bloggers without telling them how I got it, letting them also believe it’s genuine – let’s assume for the moment that those bloggers are so gullible they don’t question the document.
Wouldn’t that make me just a tiny bit suspect?

Gail Combs

Ric Werme says:
March 14, 2012 at 6:40 am
Preponderance of the evidence, also known as balance of probabilities is the standard required in most civil cases.
It’s worth noting, especially for non-US readers, that this is a much lower bar than the criminal standard “Beyond reasonable doubt”……
_____________________________________
In other words it is good enough to get a search warrant but not good enough to hang him.

Louis Hooffstetter says:
March 14, 2012 at 7:57 am
> Has Gleick been charged with any crime(s) yet? If not, why not?
Not yet. I think there’s enough evidence to charge him, however:
1) He’s not a threat to cause bodily harm to anyone.
2) He’s likely not a threat to cause financial harm in the immediate future.
3) This may need to go through a grand jury.
4) Affected parties are in multiple states, the investigation and interviewing witnesses is not just a matter of driving around town.
There’s no rush. It’s more important to get the charges right and the evidence lined up than it is to arraign him with an incomplete investigation.
> If the Justice Department ignores this, that by itself is evidence of corruption at even higher levels.
Or it’s evidence of more pressing matters. Perhaps you can research other federal white collar crime cases and come up with a frequency distribution of time between opening the investigation and arraignment. One thing that may reduce the time in this case is there’s likely no ongoing criminal activity. When there is, the investigation time can be lengthened in order to get well documented evidence, videos, etc.
Patience.

HowardG

If possible I think it would be a satisfactory end if Heartland would agree not to press charges against Gleick for the crimes related to impersonation of a Heartland board member if Gleick would confess to authorship of the fake memo and apologize. This saves the taxpayers some money and resolves the who done it question.