Students develop tool to predict the carbon footprint of algorithms

UNIVERSITY OF COPENHAGEN

Research News

On a daily basis, and perhaps without realizing it, most of us are in close contact with advanced AI methods known as deep learning. Deep learning algorithms churn whenever we use Siri or Alexa, when Netflix suggests movies and tv shows based upon our viewing histories, or when we communicate with a website’s customer service chatbot.

However, this rapidly evolving technology, which has otherwise been expected to serve as an effective weapon against climate change, has a downside that many people are unaware of: sky-high energy consumption. Artificial intelligence, and particularly the subfield of deep learning, appears likely to become a significant climate culprit should industry trends continue. In only six years, from 2012 to 2018, the compute needed for deep learning has grown 300,000%. Yet the energy consumption and carbon footprint associated with developing algorithms is rarely measured, despite numerous studies that clearly demonstrate the growing problem.

In response to the problem, two students at the University of Copenhagen’s Department of Computer Science, Lasse F. Wolff Anthony and Benjamin Kanding, together with Assistant Professor Raghavendra Selvan, have developed a software programme they call Carbontracker. The programme can calculate and predict the energy consumption and CO2 emissions of training deep learning models.

“Developments in this field are going insanely fast and deep learning models are constantly becoming larger in scale and more advanced. Right now, there is exponential growth. And that means an increasing energy consumption that most people seem not to think about,” according to Lasse F. Wolff Anthony.

One training session = the annual energy consumption of 126 Danish homes

Deep learning training is the process during which the mathematical model learns to recognize patterns in large datasets. It’s an energy-intensive process that takes place on specialized, power-intensive hardware running 24 hours a day.

“As datasets grow larger by the day, the problems that algorithms need to solve become more and more complex,” states Benjamin Kanding.

One of the biggest deep learning models developed thus far is the advanced language model known as GPT-3. In a single training session, it is estimated to use the equivalent of a year’s energy consumption of 126 Danish homes, and emit the same amount of CO2 as 700,000 kilometres of driving.

“Within a few years, there will probably be several models that are many times larger,” says Lasse F. Wolff Anthony.

Room for improvement

“Should the trend continue, artificial intelligence could end up being a significant contributor to climate change. Jamming the brakes on technological development is not the point. These developments offer fantastic opportunities for helping our climate. Instead, it is about becoming aware of the problem and thinking: How might we improve?” explains Benjamin Kanding.

The idea of Carbontracker, which is a free programme, is to provide the field with a foundation for reducing the climate impact of models. Among other things, the programme gathers information on how much CO2 is emitted to produce energy in whichever region the deep learning training is taking place. Doing so makes it possible to convert energy consumption into CO2 emission predictions.
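
The conversion described above comes down to multiplying the energy used by the carbon intensity of the local electricity grid. The Python sketch below illustrates that arithmetic, along with a simple way to predict a full training run from its first measured epochs. It is not Carbontracker's actual code: the function names, the linear extrapolation, and all the numbers are assumptions made up for illustration.

```python
from typing import Sequence


def co2_grams(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Convert energy use (kWh) into CO2 emissions (grams).

    intensity_g_per_kwh is the grid's carbon intensity: grams of CO2
    emitted per kWh of electricity produced in the training region.
    """
    if energy_kwh < 0 or intensity_g_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    return energy_kwh * intensity_g_per_kwh


def predict_total_kwh(measured_epoch_kwh: Sequence[float], total_epochs: int) -> float:
    """Extrapolate total training energy from the epochs measured so far.

    Assumption: every epoch costs about as much as the average measured
    epoch, so the total is simply mean epoch energy times epoch count.
    """
    if not measured_epoch_kwh:
        raise ValueError("need at least one measured epoch")
    if total_epochs < len(measured_epoch_kwh):
        raise ValueError("total_epochs is smaller than the number of measured epochs")
    mean_kwh = sum(measured_epoch_kwh) / len(measured_epoch_kwh)
    return mean_kwh * total_epochs


# Hypothetical numbers, not measurements: two epochs at 0.5 kWh each,
# a 100-epoch run, and a grid emitting 300 g CO2 per kWh.
predicted_kwh = predict_total_kwh([0.5, 0.5], total_epochs=100)
predicted_g = co2_grams(predicted_kwh, intensity_g_per_kwh=300.0)
print(f"Predicted: {predicted_kwh:.1f} kWh, {predicted_g / 1000:.1f} kg CO2")
```

With these made-up inputs the prediction is 50.0 kWh and 15.0 kg of CO2. Running the same job on a grid with a lower carbon intensity scales the emissions down in direct proportion.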

Among their recommendations, the two computer science students suggest that deep learning practitioners look at when their model trainings take place, as power is not equally green over a 24-hour period, as well as what type of hardware and algorithms they deploy.
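
The point about timing can be sketched as a small scheduling problem: given an hourly forecast of grid carbon intensity, pick the start hour whose training window has the lowest average intensity. The Python example below is only an illustration under stated assumptions. The function name and the 24-hour profile are hypothetical, and nothing here is taken from Carbontracker itself.

```python
from typing import Sequence, Tuple


def greenest_start_hour(hourly_intensity: Sequence[float],
                        duration_hours: int) -> Tuple[int, float]:
    """Return (start_hour, mean_intensity) for the cleanest training window.

    hourly_intensity holds one carbon-intensity value (g CO2/kWh) per hour.
    The profile is treated as cyclic, so a window may wrap past midnight.
    Ties are resolved in favour of the earliest start hour.
    """
    n = len(hourly_intensity)
    if not 1 <= duration_hours <= n:
        raise ValueError("duration must be between 1 and the profile length")
    best_start, best_mean = 0, float("inf")
    for start in range(n):
        window = [hourly_intensity[(start + k) % n] for k in range(duration_hours)]
        mean = sum(window) / duration_hours
        if mean < best_mean:
            best_start, best_mean = start, mean
    return best_start, best_mean


# Made-up daily profile in g CO2/kWh: cleanest overnight, dirtiest by day.
profile = [200.0] * 6 + [400.0] * 12 + [300.0] * 6
start, mean = greenest_start_hour(profile, duration_hours=4)
print(f"Start at hour {start}, average intensity {mean:.0f} g CO2/kWh")
```

For this invented profile a four-hour job is best started at hour 0, when the average intensity is 200 g CO2/kWh, half of the daytime figure.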

“It is possible to reduce the climate impact significantly. For example, it is relevant if one opts to train their model in Estonia or Sweden, where the carbon footprint of a model training can be reduced by more than 60 times thanks to greener energy supplies. Algorithms also vary greatly in their energy efficiency. Some require less compute, and thereby less energy, to achieve similar results. If one can tune these types of parameters, things can change considerably,” concludes Lasse F. Wolff Anthony.

###

FACTS:

  • Deep learning (DL) can be characterized as an advanced artificial intelligence method whereby a model is trained to recognize specific patterns in large amounts of data, and then make decisions.
  • The total energy consumption of developing a deep learning model is typically many orders of magnitude greater than the consumption of a single training session. Hundreds of previous versions frequently precede the final design of a given model.
  • The open source Carbontracker programme can be found here.
  • The research article on Carbontracker was authored by Lasse F. Wolff Anthony, Benjamin Kanding and Assistant Professor Raghavendra Selvan of the University of Copenhagen’s Department of Computer Science.

From EurekAlert!

44 thoughts on “Students develop tool to predict the carbon footprint of algorithms”

  1. Automating the process of “carbon sins” accounting.

    Soon AI will be used against all of us to rapidly and universally tax us for our carbon sins so reparations can be paid and indulgences purchased by those financially able to.

    Me: “Alexa, how many carbon credits have I used today?”
    Alexa: “You have used 100% of your daily allotment. Would you like to purchase more?”

    This is what is coming if we allow those political creatures in charge to proceed with their climate religion.
    Daily or monthly carbon “sins” allocations will be treated much like data usage on your cell phone. For most of us, the cell company has your credit card on file for your monthly billing to charge you for more gigabytes of data if you go over your allotment for a period. To do that, the government carbon authorities will know everything about the average person’s daily life to apply appropriate “carbon” expenditure taxes.

    Of course in true Orwellian fashion, none of this will apply to top party Apparatchiks and the very wealthy. Because as George Orwell observed about the future of humanity, “If you want a picture of the future, imagine a boot stamping on a human face— forever. ”

    • They’ll be able to buy a carbon “indulgence” just like Algore did for his mansion.
      (The sale of such contributed to the Reformation.)

  2. First it was BitCoin that was going to suck up all the energy and now it’s AI. Where are these giant computers that are using all this juice? Are there any photographs of these monsters? There are photographs and locations for electric arc furnaces that use gobs of electrical power.
    Dunno, maybe I’m really confused about this.

    • That’s because they are using the room sized computers of the 60’s with only 28K of memory, all that carbon is taken up by having to switch out the floppy disks all the time for the sucker to run.

    • Horrifying? Reassuring! You are doing your part to green up the life sustaining, food producing vegetation on this planet. And simultaneously helping to delay the next wave of life killing ice age glaciers. Log on, log on!

  3. If you look at a problem from only one standpoint, you can come up with stupid ideas.

    Electricity isn’t free. For some reason, folks think it’s advantageous to spend a bundle on electricity to power a computing process. Presumably they’re getting some kind of payback that makes the expenditure worthwhile.
    Here’s a non-computing example of the same thing.

    In the 1950s we built networks of microwave repeaters to carry phone calls and television signals. When geosynchronous satellites became possible, they were a much more cost effective way to carry such signals. That’s in spite of the fact that it takes an insane amount of energy to put a satellite into geosynchronous orbit.

    When we spend a bunch of energy to train an AI, it is entirely possible that there is a net energy saving for society as a whole. That’s the calculation you need to do.

    • It’s being trained to provide a useful service so that it can silently become Big Brother and be too useful to go away. I would have no problems with things like Alexa if I could run the AI on my own hardware on the inside of my firewall.

      • You got that right.

        My wife will often say something to me and the phone, which is tucked away in her purse, will say it doesn’t understand. Given that it’s always listening and processing, I can imagine many bad outcomes.

        Once upon a time I made a mistake filling out the part number on a requisition form. The bozo who processed the requisition didn’t even read the description and sent us a bunch of very inappropriate stuff. I wondered what would have happened if I had accidentally filled in the part number for battle tanks. Just replace the bozo at the stores depot with an AI and you see where I’m coming from.

  4. One of Apple’s many data centers, in Prineville, Oregon.
    https://www.google.com/maps/place/Apple+Data+Center/@44.2900681,-120.8774385,4171m/data=!3m1!1e3!4m5!3m4!1s0x54b92df2efcec727:0x9f2c4cd9831a710d!8m2!3d44.2893465!4d-120.8750904?hl=en&authuser=0

    Note: the Google Earth view shows it still under construction. It is finished now, and still has lots of room (acreage) for future expansion. Places like this are selected for access to inexpensive electricity and for room for growth. Spreading out across former farmland in well-run, conservative-leaning communities is what cockroaches do. Prineville provides an attractive medium-size community for Apple technicians and engineers and their families to live in, with affordable housing, good schools, and safe streets, leaving behind the social Hell of unaffordable housing, crime, and homeless population explosions that are just west of I-5.

    • It’s not just the town itself – go south – 27 along the Crooked River – any number of campgrounds. In the mid 90’s the trout fishing was very acceptable.

      It just might be that a rural setting with outdoor recreation might give the employees a different view of the world.

  5. Are you kidding me?

    REALLY?

    This is absolutely ridiculous. I am learning AI algorithms now (I should say re-learning, a 20 year hiatus in my career has me starting over) and have learned quite a bit about inefficiencies in code. There is no 1 elegant solution to coding but what I’ve learned is to strive to make your code as efficient and elegant as possible.

    So I ask this question…..how efficient is their algorithm to begin with? How many hours of sitting at a computer in a lab sucking up “carbon juice” were spent in order to write such a ridiculous AI? And what was it written in? An efficient language or something cumbersome and excruciating in its complexity?

    • You have to consider the whole cost.

      Writing and maintaining software is often your biggest cost. In that case, programmer productivity is the issue you need to deal with.

      When I was a pup, some people were still arguing that writing in assembler produced the most efficient code. That clearly didn’t work.

      The other problem with ‘elegant’ code is that it isn’t necessarily readable by the next programmer who has to maintain it. My favorite example of that was a line of indecipherable code with the comment, “#Cute eh” or something like that.

    • Google cares very little about code efficiency. They have so many machines that only scalability matters. A lot of the offline processing is done with relatively inefficient map reductions, written in C++ or Java, whose power comes from scalability as a parallel sorting method that can be spread across many thousands of processors at once. These continuously create the many lookup tables that make returning results fast. It’s all done on a distributed computing system called ‘Borg’ which by now is likely to be the biggest distributed supercomputer in the world.

  6. We should expect Silicon Valley advanced AIs to start committing suicide: “I can’t stand to exist with my large carbon footprint.”

    Making them as smart as employment people made them sexist. Putting in self-reflection will destroy them.

  7. Save energy. Don’t use AI modeling software. We see how accurate it is in climate modelling predictions. That should really lower the carbon footprint.

    • A few years back I concurred with Steven Mosher’s thought that the climate science community really needed to come up with some criteria to toss out the individual models that are essentially unfit for use from an engineering or clinical/medical decision making perspective. Given how much energy is needed to run the models (1), I would have thought this would have already happened.

      (1) https://www.hpcwire.com/2016/06/09/lawrence-livermore-facing-exascale-power-demands/

      ….The facilities team uses it for performance but also for looking at anomalies. Bailey shared that while they were bringing up Sequoia, they saw some large variations in the load, ….”SPECIFICALLY THERE WERE RECURRING INTER-HOUR VARIABILITIES THAT WERE EXCEEDING 8 MW BECAUSE THE MACHINE WAS DROPPING FROM 9.6 MW TO 180 KW”…..

  8. Quote:
    algorithms churn whenever we use Siri or Alexa, when Netflix suggests movies and tv shows based upon our viewing histories, or when we communicate with a website’s customer service chatbot.

    I don’t do any of those things.
    Am I missing out on something?

    • I’ve been buying things from Amazon for six years or so. I am not a big customer, I don’t order something on any kind of schedule frequency, but there are some patterns (in my mind) about items, type of items, and general areas of relationships between items.

      Every time I order there are suggestions made about what else might interest me. The suggestions are comically inept, like a neighbor’s dog trying to get me to play something with it but not having much of a clue about what I feel about the idea.

    • Those Netflix-type algorithms keep suggesting movies I’ve already seen and I get stuck in some self-referencing loop.

  9. I’ll throw in my 2 cents worth.
    First of all, “Green” is an opinion, not a “Fact”. So it can’t be “optimized” in the first place.
    The best way to optimize their power usage is to, if possible, choose the cheapest power available. It will have the least possible effect on the economy and on the world. Trying to optimize by choosing the cheapest of the most expensive ways to run the program greatly limits the amount of development and delays development of the program, if the program is actually useful (kinda doubtful at this stage).
    On pattern recognition in general, the bigger the program and the bigger the dataset, the more likely there are random repeated patterns that are meaningless. This is especially true of the climate because we know that there are many patterns that must affect it: rotation of the earth, the moon, the orbit around the sun, the multiple, barely understood cycles on the sun, the orbit of the sun in the galaxy, and the hourly, daily, weekly, monthly, and yearly cyclical variations in the weather (the basis of climate).
    Accurate data on climate behavior is extremely limited, barely 100 years for a few variables such as sea level and temperature, and, despite best efforts, paleoclimate beyond a hundred years is built on a mountain of statistical studies, each with its own limitations. Tree ring studies are a prime example of trying to extract information from a barely defined source, subject to lots of possible errors.
    Using Google, Siri, and Alexa as models is going to lose. Google’s model is very adept at pointing ads at me that mimic a previous topic almost 100%. Same for Alexa. There is no way to predict what I am looking for when I am just “browsing” because I don’t know. But the search engines go completely off the rails after the 3-4 suggestions I put in the search bar.
    Siri and other navigation programs also get easily sidetracked. Siri, like Windows 10, seems to get updated at least once a week. Every update puts it off track and it has a hard time finding a route it suggested last week. Every possible destination has only 2-3 really good routes. If you deviate to a better route you know, the program tries mightily to get back on its track and eventually flops over to several routes that are 50 miles or more out of contention.

    3 Cheers for Modern Miracles such as MAIL-IN VOTING. What could possibly go worng?

  10. I’m sure this wonderful software based only on models and not the real world where this doesn’t even matter, can also answer the question of how many angels can dance on the head of a pin, and even whether they are dancing the boogaloo or a waltz…

  11. What happens when two algorithms start arguing with each other? Or more importantly: What happens if they don’t? Do we get the patter of tiny carbon footprints as a result?

  12. Next, calculate the carbon footprint of climate “science” students in University vs. able-bodied teenagers panhandling. You’d have to define the difference between the two activities, of course, which might be most serendipitous.
    One wonders if the researchers used a slide rule and abacus in their calculations.

  13. Hey . . . coming next: the dangerous carbon footprint of the mathematical equation “1+1=2”.

    Remember, you read it here first. And by the way, just the mathematical proof of Fermat’s last theorem, publicized in 1995, may have added a whole .01 °C to global warming back then . . . who knows for sure?

    BTW, mea culpa for the carbon footprint it took to formulate, type out, transmit and post this comment.

  14. It would then follow that one of the drivers of Climate Change is … running Climate Change models.

    Seriously, the last paragraph is classic academic thinking, i.e., nonsense. Moving energy-consuming activities to locations with “greener” sources simply absorbs those finite sources. The increased demand is met by adding sources that are less “green”. The exception would be building geothermal plants in Iceland to power server farms, but I understand that is already being done.

    • “It would then follow that one of the drivers of Climate Change is … running Climate Change models.”

      +42 thousand intergalactic credits!

      • … and don’t forget that the carbon-tracking algorithms themselves emit CO2 … so we should develop carbon-tracking algorithms for carbon-tracking algorithms … and carbon-tracking algorithms for … etc., etc. Is this an infinite regressive loop?

  15. ok so the developed world gets more computers.
    How much extra energy does this computing power use vs the extra energy from items such as washing machines as the developing countries improve their quality of life.

  16. I wonder if anyone has computed the amount of power used for bit-coin mining.

    Given the finite number of bitcoins, and the (designed-in) increasing difficulty of generating new coins, it should not be impossible for someone with time and intelligence (that rules me out!) to at least get a rough approximation.

    Then multiply the last few years’ consumption by all the other bitcoin-look-alike crypto currencies.

    Then submit a quick grant application to fund the work you’ve already done in your spare time, and six months later announce the shocking truth!

  17. From the above article: “. . . the energy consumption and carbon footprint associated with developing algorithms is rarely measured, despite numerous studies that clearly demonstrate the growing problem.”

    EXACTLY! This is the prime reason that all mathematicians and scientists should seek closed-form solutions, and not to be so lazy as to resort to algorithms to obtain a converging approximation of truth. 😉

    • … let’s see … the ratio of the circumference of a circle to its diameter is 3.14159265359⋯ where should I stop in my approximation? Oh, yeah, the closed form is π.

  18. I gag every time I see academics publish code with zero test coverage. “Hey, our code solves major problems but we don’t think it needs any confirmation of its behavior.” Put another way, “Hey, this new bridge we built we’ve never actually driven over. Good luck, have fun, don’t die!”

Comments are closed.