Can 'big data' make sense of the big sticky mess of 'climate change'?

This press release via Eurekalert reads more like an advertisement than it does serious science. But then, we are dealing with a science that in some cases has lost all sense of seriousness, such as the bonkers claim that “climate change” will start killing off Felis catus en masse in just a few years.

Within a mere nine years, global warming could produce temperature spikes so elevated as to generate massive cat mortality? The idea is so ludicrous that I hardly know where to begin.

Source: Geocurrents. Eco-Authoritarian Catastrophism: The Dismal and Deluded Vision of Naomi Oreskes and Erik M. Conway

h/t to Bishop Hill for that one. The abstract of the sales-pitch paper they are citing starts out like this:

Global climate change and its impact on human life has become one of our era’s greatest challenges. Despite the urgency, data science has had little impact on furthering our understanding of our planet in spite of the abundance of climate data. This is a stark contrast from other fields such as advertising or electronic commerce where big data has been a great success story.

As a result, big data–induced progress within climate science has been slower compared with big data’s success in other fields such as biology or advertising. The slow progress has been vexing given that climate science has become one of the most data-rich domains in terms of data volume, velocity, and variety.

Of course they are assuming the climate data are all valid, just as so many people assume Mann’s interpretations of tree ring data are valid.

So please excuse me if I think that “big data” analysis might only lead to big ludicrous Oreskian style claims, especially when it is packaged as a sales pitch like this one.


 

New Rochelle, October 14, 2014 – Big Data analytics are helping to provide answers to many complex problems in science and society, but they have not contributed to a better understanding of climate science, despite an abundance of climate data. When it comes to analyzing the climate system, Big Data methods alone are not enough; sound scientific theory must guide data modeling techniques and the interpretation of results, according to an insightful article in Big Data, the highly innovative, peer-reviewed journal from Mary Ann Liebert, Inc., publishers. The article is available free on the Big Data website.

In “A Big Data Guide to Understanding Climate Change: The Case for Theory-Guided Data Science,” James Faghmous, PhD, and Vipin Kumar, PhD, of the University of Minnesota–Twin Cities, explore the challenges and opportunities of mining large climate datasets and the subtle changes of approach that are needed, compared with traditional Big Data methods, if accurate conclusions are to be drawn. The authors discuss the importance of combining scientific theory and first principles with Big Data analytics and use examples from existing research to illustrate their approach.
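To make the press release’s “theory-guided” idea concrete, here is a minimal sketch (not the authors’ actual method; the data, bounds, and function names are all hypothetical) of the difference between a purely data-driven fit and one constrained by a physical plausibility bound:

```python
# Sketch: purely data-driven fitting vs. "theory-guided" fitting.
# Hypothetical example: estimate a trend from a noisy series, but reject
# fits that violate a physically plausible range for the slope.

def ols_slope(xs, ys):
    """Ordinary least-squares slope of y on x (purely data-driven)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

def theory_guided_slope(xs, ys, lo, hi):
    """Same slope, but clamped to a physically plausible range [lo, hi]."""
    return max(lo, min(hi, ols_slope(xs, ys)))

# Synthetic "anomaly" series where one bad reading dominates the naive fit
years = list(range(10))
anoms = [0.01 * t for t in years]
anoms[-1] += 5.0  # a single spurious spike

raw = ols_slope(years, anoms)
guided = theory_guided_slope(years, anoms, lo=-0.05, hi=0.05)
print(raw, guided)
```

The clamp here is a crude stand-in for the richer physical constraints the paper argues for; the point is only that the data-driven estimate and the theory-bounded estimate can diverge sharply when the data are contaminated.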

“This paper is a great example of leveraging the abundance of climate data with powerful analytical methods, scientific theory, and solid data engineering to explain and predict important climate change phenomena,” says Big Data Editor-in-Chief Vasant Dhar, Co-Director, Center for Business Analytics, Stern School of Business, New York University.

###

About the Journal

Big Data, published quarterly in print and online, facilitates and supports the efforts of researchers, analysts, statisticians, business leaders, and policymakers to improve operations, profitability, and communications within their organizations. Spanning a broad array of disciplines focusing on novel big data technologies, policies, and innovations, the Journal brings together the community to address the challenges and discover new breakthroughs and trends within this information. Complete tables of contents and a sample issue may be viewed on the Big Data website.

October 15, 2014 7:15 am

Leveraging big data! First, this is from a business school! Second, for a real theory of science, the hypothesis has to be strong enough to have been suggested with little data. Don’t forget, the “theory” we are ‘stuck with’ in climate science was promulgated at the beginning of all this in the 1980s. Now we have 30 years of Big Fiddled Data that came into being to shore up the original ‘theory’. So the idea is, if we manipulate the sea of ‘data’ we will tease out what we need to support the theory. If it doesn’t look right, we have an arsenal of novel statistical tools to adjust it. 97% of climate scientists won’t accept that the little-data scenario (even squeezed and twisted as it is beyond recognition) has permitted the CO2 control knob to be falsified. Big data won’t help. It’s already been bled dry.

Jimbo
October 15, 2014 7:23 am

“This paper is a great example of leveraging the abundance of climate data with powerful analytical methods, scientific theory, and solid data engineering to explain and predict important climate change phenomena,”

After billions of dollars have been spent on climate research, supercomputers, etc., the IPCC comes out with embarrassing failed projections. You could quadruple the spending and the data and you would still be wrong. Climate sensitivity does not care about data.

Harold
October 15, 2014 7:35 am

A non-solution in search of a non-problem.

Reply to  Harold
October 15, 2014 1:12 pm

Good definition.

David in Texas
October 15, 2014 7:35 am

“Despite the urgency, data science has had little impact on furthering our understanding of our planet in spite of the abundance of climate data.”
“Little impact”!? It produced ‘settled science’. What more could you want?

more soylent green!
October 15, 2014 10:15 am

GIGO
/’nuff said

David Small
October 15, 2014 11:35 am

“Big data” is only a tool, not an end to itself. The tools are powerful if used responsibly and intelligently. They are only as good or bad as the people applying them. They must be applied by people with a deep understanding of the underlying science and the statistics that are inherent to the methods. Unfortunately, very few climate scientists understand statistics and even fewer people who understand statistics know the first thing about the atmosphere. I fear that the method will be poorly used to “prove” catastrophic global warming and then discredited and discarded. That would be unfortunate because they have proven powerful for making short-range forecasts of extreme events.

AnonyMoose
October 15, 2014 12:12 pm

Instead of starting with theory-guided data, how about letting more of the statisticians who understand big data process the raw climate data to see what real statistics discovers? Don’t start by massaging the data to fit the theories.

TerryMN
October 15, 2014 12:53 pm

“Big Data” is a buzzword. The platform for it, 98% of the time, is Hadoop. And it’s just that – a data platform – that lets you (whether you’re an internet property, a financial institution, a climate scientist, a manufacturer, or whatever) store, process, and analyze a decent, large, or huge amount of data cheaper and faster. The underlying design came out of Google, and after two papers they published in 2003 and 2004 (on the Google File System and MapReduce), it was re-implemented as open source by Doug Cutting and Mike Cafarella.
It’s the platform that is the backbone of Facebook, Twitter, LinkedIn, Yahoo, and pretty much every good-sized web property you can think of. In the last 5 or 6 years it has been adopted by lots of enterprises and three-letter government agencies because of its lower cost and better performance.
Lots of tools that ran in legacy environments, like SAS, R, etc., now run on top of Hadoop, and lots of statistical analyses are now run on this platform. That isn’t to say it does your stats for you, or anything like that – it just provides a cheaper and faster platform to run large-scale data storage and processing on.
Last, a decent sized cluster has hundreds to thousands of cores, several terabytes of RAM, and room for several petabytes of data, co-located with compute, so is generally a great (in terms of economics and speed) platform to run climate models on – with all of the above mentioned caveats about garbage in/garbage out. Using an execution framework like Spark, they can probably run their models lots faster, on lots more data, for cheaper than what they’re doing today.
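For readers unfamiliar with the programming model TerryMN is describing, here is the MapReduce pattern Hadoop implements, sketched in plain single-machine Python (no Hadoop or Spark required; the station IDs and readings are invented for illustration):

```python
# MapReduce in miniature: map emits (key, value) pairs, a shuffle groups
# them by key, and reduce aggregates each group. Hadoop performs the same
# three phases distributed across a cluster.
from collections import defaultdict

records = [
    ("stn_A", 14.1), ("stn_B", 9.7), ("stn_A", 15.3),
    ("stn_B", 10.1), ("stn_A", 13.8),
]

# Map phase: emit (station, (temp, 1)) so reduce can compute a mean
mapped = [(stn, (temp, 1)) for stn, temp in records]

# Shuffle phase: group values by key (done across the network on a cluster)
grouped = defaultdict(list)
for key, value in mapped:
    grouped[key].append(value)

# Reduce phase: per-station mean temperature
means = {
    stn: sum(t for t, _ in vals) / sum(c for _, c in vals)
    for stn, vals in grouped.items()
}
print(means)
```

The framework’s value is that the shuffle and the parallelism come for free at scale; as TerryMN says, it does nothing to validate what goes in.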

kenw
Reply to  TerryMN
October 15, 2014 1:51 pm

…so it is still GIGO…. just hi-res GIGO done really, really fast.

TerryMN
Reply to  kenw
October 15, 2014 2:19 pm

It can be. Again, it’s just a platform that lets you store and process more data cheaper and faster. For most workloads, it’s the same model/algorithm/whatever, just cheaper and faster. I work with a few companies who went from around a day to sequence a genome in their old environment to between 10 minutes and an hour. On a smallish cluster. Nothing new, but it allows them to do a few orders of magnitudes more of sequences, so that many, many things that wouldn’t have been explored before can be now.
Because of the economics and hardware architecture generally used in legacy HPC, there are some use cases that weren’t possible with a legacy environment that are now. But mostly think of more questions, not better ones – in a very general sense.
As with any HW/SW platform, GIGO is always a possibility. Just think of it as a bigger, faster, cheaper mouse trap or tool, depending on your level of cynicism. 🙂

Michael J. Dunn
October 15, 2014 1:17 pm

Awwww… And here I wanted to learn more about that adorable cat.

Tommy E
Reply to  Michael J. Dunn
October 19, 2014 12:06 am

Being a big data practitioner who has built several large high-performance compute clusters in a multi-megawatt data center to perform computational studies on petabytes of data from x-ray crystallography and nuclear magnetic resonance imaging, screening for candidate compounds suitable for drug development at a big pharma company, I can tell you that when I come home from work and have to explain how my day went, my twenty-year-old blind arthritic hypertensive hypothyroid weak-kidneyed example of Felis silvestris catus will only purr if she (1) has a warm lap to sit in, or (2) is allowed to “listen” while curled up on a floor register. She doesn’t care how big the data is, or how it was sifted through, or what results were returned, unless there is more warmth at the end. Oh wait … she only cares about man-made warming! I might have been harboring a seven-pound warmista all this time and never even knew.
It IS worse than we thought!
LOL, as I am writing this, John Cleese and Taylor Swift are sitting on Graham Norton’s couch, and John just insulted Taylor about her cat, which happens to be a Scottish Fold, just like the one in the above photo. You can watch it at … http://www.eonline.com/news/587587/taylor-swift-s-cat-insulted-by-john-cleese-on-the-graham-norton-show-watch-and-find-out-how-she-reacted

ROM
October 15, 2014 2:11 pm

The most prominent and accomplished of the climate catastrophe predicting scientists are very quietly vacating and fading from the scene as the Great Catastrophic Manmade Global Warming racket steadily unravels at an ever faster rate.
As room is now appearing at the top in the climate catastrophe sensationalist end of the media as the climate big wheels retreat from their previous dire predictions, the third, fourth and fifth rate wannabe “scientists” [ ?? ] with their breakfast cereal packet degrees are rushing in to get their share of the honours and glory as predictors of even more far fetched and extreme climatic futures.
Catastrophic “future” predictions, always “future” predictions, which finally guarantee them a level of prominence and public exposure in the lowest intellectual strata of the sensationalist end of the media.

JeffC
October 16, 2014 7:07 am

Big Data sells itself as the magic bullet that will find the signal in the chaos … it’s pure BS … it’s a way to sell hardware and software to gullible managers … remember, big data doesn’t measure anything … it’s just storage … the problem is, and will continue to be, not enough sensors evenly distributed on the planet … big data can’t solve that …

Reply to  JeffC
October 17, 2014 5:24 pm

Yes Jeff. You are so right. If we have trouble forecasting the weather, what does that tell us? Warnings are better than no warnings, but sometimes an extreme weather event can pop up so quickly, and affect, say, only a small specific part of a region, that it can’t be helped.