NOAA to ‘release the code’

From the “all you need is a multi-million dollar supercomputer” department comes this sort of good news.

NOAA launches new approach to accelerate innovation and new science in weather modeling

Partnership will make weather and climate modeling code public

March 11, 2020: NOAA is accelerating the advancement of numerical weather prediction — the backbone of life-saving weather and water forecasts in the U.S. — by sharing the first batch of computer code behind National Weather Service models with the scientific community. Today, NOAA released the first version of user-friendly code for medium-range weather prediction in an open, collaborative development environment. This new approach of collaborating across the Weather Enterprise is an effort to engage the community to improve NOAA models using the Unified Forecast System (UFS).

Sharing this code will enable academic and industry researchers to help NOAA accelerate the transition of research innovations into operations. This UFS code is being developed by a broad community and is openly available to the public, with documentation and support for users. In February 2019, NOAA and NCAR announced a partnership to design a common modeling infrastructure, marking NOAA’s shift toward community modeling. 

“Sharing NOAA’s model code with the broader scientific community will help us accelerate model advancements — with the ultimate goal of co-creating the best operational numerical prediction system in the world,” said Neil Jacobs, Ph.D., acting NOAA administrator.

“We invite researchers and modelers around the world to download and work with the code, so together we can advance numerical weather prediction to improve life-saving forecasts and warnings.”

The success of UFS will be bolstered by NOAA’s Earth Prediction Innovation Center (EPIC), made possible by Congressional authorization in 2019. EPIC will ensure that UFS efforts facilitate research and development in the modeling community, whose work can then be targeted at improving operational forecasts.

“At the National Weather Service, we are excited about this first step to make operational model codes available to scientists and students around the world, knowing that they will help advance our unified forecast system that will provide the basis for all of our weather, water and climate forecasts,” said Louis W. Uccellini, Ph.D., director of NOAA’s National Weather Service.

UFS will enable NOAA to simplify its production suite of forecasting models from many independent systems, each of which has to be improved and maintained separately, to a single seamless modeling system with fewer, more comprehensive applications. UFS applications, which each provide guidance for a particular forecast, span local to global domains and predictive time scales from sub-hourly analyses to seasonal predictions. 

The first release of a UFS application is the UFS Medium-Range Weather Application version 1.0, which targets predictions of global atmospheric behavior out to two weeks. The software is now distributed and maintained through GitHub, and releases of additional applications are planned in the coming year. NOAA will host workshops and provide supporting documentation alongside the applications to facilitate their use by the broader community. Collaborating researchers can use the application in real time, and promising research code will be considered for inclusion in future versions of the operational model. Future releases of model code will enable the research community to continue to advance it for operational use. NOAA and the modeling community also worked together to ensure the code is ready for use by students at the graduate level.

On the heels of a major supercomputer upgrade announced in February and an upgrade to its Global Forecast System last summer, NOAA is pressing forward with this next step in the effort to build a true community weather forecast model and improve forecast accuracy to save lives and protect property nationwide.


57 thoughts on “NOAA to ‘release the code’”

  1. A good idea, but a bit limited initially.

    The notes say:

    There is no data assimilation capability.
    A verification package is not included in the workflow.
    The atmospheric model is not coupled to ocean, wave, or ice models.
    A limited set of developmental physics suites is supported.
    Future releases will address these limitations and add other improvements.

    • Looks to be reams of consistency issues.
      Nor is there a rough plan for what could be achieved; e.g. repeatable results at 1 day, 2 days, 3 days…

      I saw Fortran line-length concerns, which suggests the code is still a mishmash.

        • Steven,
          Can you pass it? If not, maybe you should follow your own advice and just shut up.

          In the end, this is just software, so professional Software Engineers with decades of experience should be able to comment on basic software issues, regardless of the application. Though I suppose you, with your oh-so-important degree in English, feel qualified to disqualify us.

        • This is one reason why posting open code is difficult: random dudes come in, make vague comments, and then run away rather than pitching in to address the issue.

          ROFLMFAO!!! We can’t have that! Random dudes making vague comments! Of course, you’ve never done that……..

        • Absolutely right.
          You would have to spend a great deal of time analysing the code to even make a half-decent suggestion. You would also have to be very well versed in Fortran 90 (it’s only 30 years out of date!)
          The biggest drawback to the project is that all the best minds are working on how to predict when you will buy your next can of dog food.
          I hope it’s a success. But it’s a niche area with what looks like a lot of legacy code.

        • OMG – It’s a code monkey cut and paste:

          This test assesses how easy it is to access and run a UFS code, modify a physics parameter, re-run the code and then compare results.

          So let’s see how proficient you are at clicking the right shiny thing.

  2. so this ‘supercomputer’ could tell me what to wear for my holidays in 2021…….wellingtons (gumboots), or get the shorts out again?? I will wait and see………regards, from Trevor in New Zealand.

  3. Too many cooks does what? I don’t think this is how the European model got to be better.

    Not the NOAA code or anything, I just like to say it. Hardly ever wrong.

  5. A step in the right direction.
    For my thesis I had to do (at that time) a lot of fairly complex matrix algebra coding (in Fortran) with very limited computer run time availability. It was possible to do a run and then spend a week or two revising (subtle bugs), improving (closer to the theoretical math being simulated), simplifying (speed up) the code for the next run. You don’t need a supercomputer to improve supercomputer code. You just need to know how a massively parallel supercomputer works.

    • Not many of us have our own supercomputer, or even access to one. However, it does offer the opportunity for the more experienced FORTRAN [note: Microsoft spelling!] programmers to review the code for obvious mistakes, and possibly run some of the subroutines to see if they can be broken.

  6. On a brighter threadjacking note: (US) Folding@home is now able to distribute a “22” calculations project for GPUs that is working on COVID-19

    Just create a GPU slot and use “client-type” “advanced”

  7. How about they release all non-doctored data instead and we will build our own, verifiable models?


      • Mr. Stokes,

        1) GHCN-CONUS does not agree with USHCN, except prior to 1989. Interesting.
        2) 400+ stations blocked out of reporting since 1989
        3) massive redacting of recordings, marked -9999 in the .dly files
        4) Estimate records

          • USHCN was replaced in March 2014.
          At that time, all USHCN stations were in GHCN V3, and USHCN raw data was the same as GHCN unadjusted.
            USHCN stations made wide use of the Co-op network. The people who maintained these discontinued for various reasons (getting old, etc.).
          Datasets are never perfect; there is always missing data. Rather uniquely, NOAA makes available the original B19 forms submitted by the observers. You can see where the missing data originates.

          I’ve no idea what “Estimate records” means.

            • USHCN was not replaced. You know that. It updates constantly, except for the 400-odd stations which have been blacklisted. Those stations did not “drift away” or “get old.” That is your delusion. Why, when untold billions and billions of taxpayer money gets poured into climate study, would 400 stations, almost all of which had been proudly reporting for 80-100 years, “SUDDENLY” stop? Nope.

              Even that does not explain why USHCN does not agree with GHCN-CONUS.

            The so-called “missing data”, marked -9999 by NOAA, is not “missing.” It is redacted.

              In the USHCN there are about 6% of records marked “e” for estimated. Those are gridded or pairwise homogenized.
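For reference, the -9999 values mentioned above are the missing-data sentinel in NOAA’s GHCN-Daily `.dly` fixed-width files. A minimal parsing sketch (layout per NOAA’s published readme; the example record below is invented, not a real station report):

```python
# Minimal sketch of parsing one GHCN-Daily ".dly" record line.
# Layout follows NOAA's published readme: 11-char station ID, 4-char year,
# 2-char month, 4-char element, then 31 groups of 8 chars each
# (5-char value, MFLAG, QFLAG, SFLAG). Values for TMAX/TMIN are in
# tenths of degrees C; -9999 marks a missing day.

def parse_dly_line(line):
    """Return (station, year, month, element, [daily values or None])."""
    station = line[0:11]
    year = int(line[11:15])
    month = int(line[15:17])
    element = line[17:21]
    values = []
    for day in range(31):
        start = 21 + day * 8
        raw = int(line[start:start + 5])
        values.append(None if raw == -9999 else raw / 10.0)  # tenths of deg C
    return station, year, month, element, values

# Invented example record: TMAX for January, day 1 = 7.2 C, day 2 missing,
# remaining days padded as missing.
rec = "USC00011084" + "1956" + "01" + "TMAX" + "   72   " + "-9999   " * 30
station, year, month, element, vals = parse_dly_line(rec)
print(station, element, vals[0], vals[1])  # USC00011084 TMAX 7.2 None
```

Whether a given -9999 represents data never observed or data later withheld cannot be determined from the file itself, which is the point being argued here.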

    • Hi Oddgeir,
      Which original data set is unavailable? And what is stopping you building your own model now?

  8. If nothing else, it should provide a window into the assumptions, which is quite important. An open, collaborative approach can identify shortcomings that result in erroneous forecasts. Of course, it’s a big jump from successful short-term forecasts to accurate and reliable climate models.

  9. This would downgrade weather forecasting to 5th-century technology (flat Earth) and 19th-century technology (atmospheric carbon dioxide traps heat when it’s not dried to remove the water). And the code is a mess – it’s had too many cooks already. I would guess there’s no one left at NOAA who understands the code well enough to continue the hacks. But then, how hard is it to cool the past 1 degree and heat the present by one degree? Then use the middle segment of the resulting temperature range as the “average” and calculate the deviation? The only reason they need a supercomputer is to adjust the resulting temperatures.

  10. Open source has proven to be worthwhile in many applications. There are quite a few outstanding coders that can provide a lot of help and maybe they will “think outside the box”. As bad as the US model is, it certainly won’t be worse and I think it is worth a shot.

    They just need to get some people on board who know how to manage a large-scale open-source project. Call Red Hat (what’s left after the IBM merger). They built a multibillion-dollar company on open source.

  11. Gee, thanks.


    1) USHCN raw data prior to ‘adjustments,’ gridding, flagging, and “GoneMissing -9999”;
    2) Data from 400+ stations blacked out since MannDay in 1989;

    This is the taxpayers data. Hand it over.

  12. After specifying some metric defining accuracy of prediction, there should be two open-source teams working on this:

    1) One team headed by Anthony Watts
    2) One team headed by Michael Mann

    All glory, laud, and honor going to the victors. (No cheating, tricks, or fudge factors allowed.)

  13. Nick,
    Is it not time to agree that the customary historic daily temperature data collected around the world for decades is simply unfit for purpose, when that purpose is to assist in GCM-type calculations that rely not only on accurate data, but on data that are valid for the subsequent computational math?
    As noted before, the traditional Tmax + Tmin halved to give a Taverage does not compare well with (say) the average of 48 daily T observations taken a half hour apart, the latter method plausibly more relevant to climate modelling.
    You can make only so much progress with noisy, erroneous data before your assumptions become so heroic as to be misleading. There are so many, often uncorrectable, errors in these data sets that, as noted, they are unfit for climate modelling purposes or for global climate policy.
    You know as well as I do about these errors for Australia. You seem to accept the “near enough is good enough” approach. I prefer the deep dig into the data to understand strength and weakness. There is a lot of weakness there, beyond the depth that most people have dug, or want to dig.
    Note that I am not claiming that these temperatures are direct inputs to models. They are, however, used to create impressions of past climate for testing model hindcasts, and in this they are misleading. Badly so.
    Deep digging supports about 0.4C of national Australian warming, not the official claim of 0.9C for the century since 1910.
    Can we at least agree on that, for nobody seems able to repudiate it. Geoff S
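The divergence Geoff describes between (Tmax+Tmin)/2 and a true sub-daily mean is easy to illustrate with a synthetic, exaggerated diurnal cycle (invented numbers, not station data):

```python
import math

# Synthetic illustration (made-up profile, not real observations): a day with
# a narrow afternoon peak, sampled every 30 minutes. (Tmax + Tmin) / 2 and
# the mean of all 48 readings disagree whenever the diurnal shape is
# asymmetric, because min/max averaging ignores how long the day spends
# near each extreme.

readings = []
for i in range(48):
    hour = i * 0.5
    if 6.0 <= hour <= 18.0:
        s = math.sin(math.pi * (hour - 6.0) / 12.0)
        temp = 10.0 + 10.0 * s ** 4   # sharp daytime peak, 20 C at noon
    else:
        temp = 10.0                   # flat overnight minimum
    readings.append(temp)

minmax_mean = (max(readings) + min(readings)) / 2
full_mean = sum(readings) / len(readings)

print(f"(Tmax+Tmin)/2  : {minmax_mean:.2f} C")
print(f"48-sample mean : {full_mean:.2f} C")
```

For this shape the min/max mean overstates the all-sample mean by more than 3 C; a symmetric cycle would show no gap, which is why the size of the distortion in real records is an empirical question.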

    • Geoff, I’m challenging your premise about direct measured temp. Far from agreeing that it is time to abandon the dataset, I suggest we agree to restore it to find out if there is any abnormal warming. Here’s my radical counter:

      Who cares if the 1-billion recordings of TMAX and TMIN* in GHCN are “erroneous?” Stop caring if you do.

      If this is the crucial question: “Is there indication of abnormal warming or cooling” in that record, then do not adjust anything. Do not redact anything. Do not homogenize anything. Do not substitute modelled recordings for actual. Do not blacklist 400 stations from USHCN.

      Just MapTheBillion. You will see the trend as a sine curve over 120-150 years. If there is abnormal warming, it will show up.

      *I agree that (TMAX+TMIN)/2 is a distortion. Just pick one or the other and map it. The curving (not linear) trend will emerge. [The actual number of TMAX recordings 1900-2019 is 450,396,126, by the way.]
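The “map every strand” idea can be sketched on synthetic data (invented stations and numbers, not GHCN; whether real station biases really behave this benignly is of course the contested point in this thread):

```python
import math
import random

# Sketch of overlaying many station "strands" (synthetic, not real GHCN):
# each station reports a long series of annual TMAX means carrying its own
# fixed site bias plus random error. Taking a cross-station median per year
# recovers the shape of the common signal, since fixed biases drop out once
# the overall level is removed.

random.seed(0)
years = range(1900, 2020)
n_stations = 200

# Common underlying signal: a slow oscillation in deg C (purely illustrative).
signal = {y: 15.0 + 0.5 * math.sin(2 * math.pi * (y - 1900) / 60.0) for y in years}

strands = []
for _ in range(n_stations):
    bias = random.gauss(0.0, 3.0)  # site-to-site offset (altitude, exposure, etc.)
    strands.append({y: signal[y] + bias + random.gauss(0.0, 0.5) for y in years})

# Median across stations, year by year.
median_by_year = {y: sorted(s[y] for s in strands)[n_stations // 2] for y in years}

# Compare shapes with the constant level removed.
med_level = sum(median_by_year.values()) / len(median_by_year)
sig_level = sum(signal.values()) / len(signal)
worst = max(abs((median_by_year[y] - med_level) - (signal[y] - sig_level))
            for y in years)
print(f"worst shape mismatch: {worst:.2f} C")
```

The sketch assumes station biases are fixed over time; drifting biases (UHI, instrument changes) would not cancel this way, which is precisely why homogenization is attempted at all.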

      • windlord-sun. –> “*I agree that (TMAX+TMIN)/2 is a distortion. Just pick one or the other and map it. The curving (not linear) trend will emerge. [The actual number of TMAX recordings 1900-2019 is 450,396,126, by the way.]”

        It not only is a distortion but it begins the hiding of information, i.e., variance of the temperatures. Averaging a station with a variance of 10 degrees with a station that has a variance of 25 degrees is absolutely ridiculous.

        Tmin or Tmax should be treated separately so we can see what is happening to day and night temps.

        • “Tmin or Tmax should be treated separately so we can see what is happening to day and night temps.”

          Yes. You can see both on my site linked in previous post. I am an unfriend of linear trend lines.

          • Your graphs bring the results of the climate model predictions into stark relief when compared with reality.

            Average global temps are only good for hiding data, especially when the measurement devices exist at different altitudes, latitudes, etc. Actual heat content, i.e. enthalpy, is related to kinetic energy (think temperature), to pressure times volume, and to humidity (i.e. mass). Averaging temperature readings at 5000 ft above sea level with temperature readings at 0 ft above sea level makes little scientific sense, at least to me. It’s apples and oranges.
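The enthalpy point can be made concrete with the standard first-order psychrometric approximation (textbook constants; the two parcels below are invented examples, not measurements):

```python
# First-order psychrometric sketch: two air parcels at the same dry-bulb
# temperature but different humidity ratios carry very different enthalpy.
# Standard approximation (T in deg C, w in kg water vapor per kg dry air):
#   h ~= 1.006*T + w*(2501 + 1.86*T)   [kJ per kg dry air]

def moist_air_enthalpy(temp_c, humidity_ratio):
    return 1.006 * temp_c + humidity_ratio * (2501.0 + 1.86 * temp_c)

humid = moist_air_enthalpy(30.0, 0.020)  # e.g. a humid coastal site
dry = moist_air_enthalpy(30.0, 0.005)    # e.g. a dry high-altitude site

print(f"humid parcel : {humid:.1f} kJ/kg")
print(f"dry parcel   : {dry:.1f} kJ/kg")
```

Both parcels read 30 C on a thermometer, yet the humid one holds nearly twice the heat content per kilogram of dry air, which is the commenter’s apples-and-oranges complaint in numbers.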

      • Windlord,
        The time of day that Tmax is shown on the recording LIG thermometer varies a lot. This is because of competition between various effects, some trying to increase T, some to decrease, until a maximum is reached for the day.
        Some of these competing effects, like change of whitewash properties of the screen, act in the long term and have nothing to do with climate. Another effect, say a nearby shade tree falling down, can be short term but again has nothing to do with climate. Arguing by extension, the Tmax sits in a sea of errors that are unrelated to climate. We can get some feel for the magnitude of short term errors by profiling the T change over a day to see how representative the simultaneous Tmax is, but this has not been done for the large majority of those 450 million observations. When it is done for modern times, the instrumentation is invariably different to an extent that we cannot recreate.
        One of the biggest influences on Tmax is local rainfall, which will often also involve local cloud effects of unknown size. One can use regression analysis to “correct” for rainfall effects, which are typically larger than the figure for a century of global warming.
        Because rainfall is related to climate, we can ask if we should use raw Tmax or rain-corrected Tmax in subsequent physics equations in modelling. Until we know an answer to that, we have no option to accept that Tmax is an approximation with large unknown errors that should not inform modelling that is used for setting policy on remediation etc.
        Given that each observing site has its own set of properties affecting Tmax, one cannot do stats by combining many sites, because they are not samples from one population. Goodbye, law of large numbers.
        Much the same applies to Tmin, with some different competing factors not affecting Tmax. It is absolutely incorrect to average them for a Tmean, because they are measuring different influences.
        Now, refute any of this if you can. (You cannot).

        • You countered my post by ignoring all my arguments, then gave a lecture on Climate Orthodoxy 101, which no one here needs.

          So, there is nothing to refute.

          Here, I’ll make you a trade. You get NOAA to release the raw data that has been marred, and get them to post the ~4-million missing records from USHCN from blacklisted sites. In return, I’ll concede that improving accuracy of measurement is a good thing for modelling … but ONLY for the future. The trend of the past and present (no abnormal warming) stands.

          • Everyone here needs to understand that the lazy science of accepting these old temperatures as suitable for modern modelling calculations has to be questioned over and over until it is realized how misleading they are. I have been through the exercises that you mention. Few others have, they simply take T for granted. Not good enough. Geoff S

          • “the lazy science of accepting these old temperatures…”

            Is that why NOAA damages and hides them? Because it has decided to model on vapor data instead, and doesn’t want anyone to realize how misleading they are?

            You can model all you want, rooted in anything you choose, but if you make a claim other than what 1 billion direct measurements show – that there is no abnormal warming – and claim instead there is, you have to explain why your warming did not show up when 40,000 stations measured the temp.

  14. There is no point to expensive code in the absence of high quality data for input. There are significant examples of this defect, for example the part it plays in GCM temperature projections that each year show substantial divergence from measured temperatures.
    There are better ways to spend big dollars. Geoff S

    • Trying to distinguish 0.01deg differentials from data that has, at best, +/- 0.5deg uncertainty is a losing battle. Even trying to distinguish 0.1deg differentials is impossible. It’s mathematical buffoonery. 2 + 2 is *not* equal to 4.0.

      Uncertainty intervals are never shown in the graphs of GCM outputs. That’s because they are off the page. If the scales were decreased to show them, you couldn’t see any annual differentials on the page!

      • “Trying to distinguish 0.01deg differentials from data that has, at best, +/- 0.5deg uncertainty is a losing battle.”

        Agree, for the ranking of “hottest months/hottest years”. They are just MSM mentions. But for statistically/physically significant trends, we’re talking much larger changes. If you do the trend evals correctly, you find that the chances that they are directionally incorrect are too small to calculate on home computing equipment…

        • “But for statistically/physically significant trends, we’re talking much larger changes.”

          No way. If the uncertainty of the basic data is +/- 0.5 degrees, you cannot reduce it statistically. These data are one-time measurements, and there is no way to reduce uncertainty through multiple measurements of the same thing.

          • “No way. If uncertainty of basic data is +/- 0.5 degrees, you can not reduce it statistically. ”

            I agree. The chances ARE often calculable on home equipment. Let’s look at UAH6 data, from 1980 to present. If you define “+/- 0.5 degrees” as a 95% chance that any one monthly reading will be within those bounds, then the chance that the trend is not positive is 2.99615E-39. Most iterations of this are even lower, but yes, you CAN calculate it. Thanks for correcting me.

            See Central Limit Theorem for more information…


          • Jim, rookie error on my part. I calculated the trend, but failed to include the standard error of that trend. If you evaluate using both the standard error of the trend and the proper standard deviation of each monthly data point, then the chances of that trend being negative jumps all the way up to 4.32E-28. I’ll try and be more careful in the future…
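The calculation being described can be sketched like this (a synthetic monthly anomaly series stands in for the real UAH6 data, which is not reproduced here; the normal approximation for the slope’s sampling distribution is an assumption):

```python
import math
import random

# Sketch of the trend-significance calculation described above, on synthetic
# data: fit an ordinary-least-squares slope, compute its standard error,
# and turn the slope/standard-error ratio into a probability that the true
# trend is negative, assuming normally distributed residuals.

random.seed(42)
n = 480                          # ~40 years of monthly values
x = list(range(n))
# Invented series: roughly 0.16 C/decade trend plus noise with sd 0.25 C.
y = [0.0013 * xi + random.gauss(0.0, 0.25) for xi in x]

x_mean = sum(x) / n
y_mean = sum(y) / n
sxx = sum((xi - x_mean) ** 2 for xi in x)
slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx

# Residual variance and the standard error of the fitted slope.
resid_var = sum((yi - y_mean - slope * (xi - x_mean)) ** 2
                for xi, yi in zip(x, y)) / (n - 2)
slope_se = math.sqrt(resid_var / sxx)

z = slope / slope_se
p_negative = 0.5 * math.erfc(z / math.sqrt(2))  # P(true trend < 0), normal approx.

print(f"slope = {slope:.5f} C/month, se = {slope_se:.5f}")
print(f"P(trend < 0) ~= {p_negative:.2e}")
```

Whether a stated +/- 0.5 C per-reading uncertainty can legitimately be folded into the residual variance this way is exactly what the rest of the thread disputes; the sketch only shows the mechanics of the calculation.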

            You need to review some past threads here about uncertainties and their propagation. The Central Limit Theorem and dividing by sqrt(N) don’t apply to uncertainty propagation in this case.

            “You need to review some past threads here about uncertainties and their propagation. The Central Limit Theorem and dividing by sqrt(N) don’t apply to uncertainty propagation in this case.”

            I gave you the data. You can easily replicate my results with any calc freeware and a computer newer than 10-15 years old. But more likely, you’ve been listening to Pat Frank again. When one of the first tenets of Engineering Statistics 101 does not apply, in a textbook application in superterranea, that’s all I need to know about the few dozen circle-enabling denizens of Watts Up….

          • bigoilbob,

            “When one of the first tenets of Engineering Statistics 101 does not apply”

            Jim Gorman is correct. Statistics, be they engineering statistics or something else, applies to a population that can generate a probability density, i.e. multiple measurements of the same thing using the same measurement device. Single measurements of different things by different measuring devices do *NOT* represent a population generating a probability density.

            As far as *trends*, you totally ignore the fact that the uncertainty is Plus AND Minus. If you can’t determine the true value exactly then you cannot calculate a trend either. It’s like I said, the uncertainty interval associated with the climate models is off the page – in both the positive and negative directions! Meaning it’s impossible to identify a trend due to the uncertainty!

          • Tim Gorman

            “As far as *trends*, you totally ignore the fact that the uncertainty is Plus AND Minus.

            Uh, no. I specifically used Jim’s +/- 0.5 degrees to find the probability density function for every one of the 480+ individual monthly data points in my example.

            ” It’s like I said, the uncertainty interval associated with the climate models is off the page – in both the positive and negative directions! ”

            An opinion with nothing to back it up. “Off the page” is not a statistical term. Data is data, from whatever source. It all has uncertainty and correlability between -1 and 1 (all known), and can be statistically “aggregated” for both contemporaneous evaluation or for time trending. I agree that you can imagine/wish for a degree of data uncertainty out there that would make trend assessment of little use. But the +/- 0.5 degrees offered as an example of unevaluable data is not it.

          • In figure skating and gymnastics, you get interpretive scoring from nine judges. You throw out the highest and lowest, and take the rest into your calculation. In GHCN, you get constant measuring 1 billion times from 40,000 “things” each performing its protocol.

            You don’t care that one station in Yuma, AZ, has a TMAX of 71f in 1956 and another in Murmansk is 59f that same year. It does not even matter how precise the actual temp measurement is at any one given station. It matters that you have a long history of consistency over a huge sample. Stations that “go bad” at some point over 120 years are over/under randomly, and cancel each other out. [UHI issues notwithstanding.]

            What counts is the spaghetti. There are 40,145 stations in GHCN with TMAX records. The plot of each is a strand. “Seen” together, you get a sine curve, the organic trend.

            What can kill this is a cancer spreading through the raw data: redactions, adjustments, changes by homogenizing, blacklisting of stations, etc. It breaks the strands.

          • “probability density function for every one of the 480+ individual monthly data points in my example.”

            There is *NO* probability density associated with 480+ INDIVIDUAL monthly data points. These temperatures are independent data points that are NOT CORRELATED with each other.

            “It all has uncertainty and correlability between -1 and 1 (all known), and can be statistically “aggregated” for both contemporaneous evaluation or for time trending.”

            Uncertainty cannot be statistically “aggregated” because uncertainty doesn’t represent a probability.

            “But the +/- 0.5 degrees offered as an example of unevaluable data is not it.”

            If you treat uncertainty the same way you treat the variance of independent random variables then those variances add when you try to combine them. That means that the +/- 0.5deg uncertainty grows with each independent data point you add. That *does* make any trend you develop have an uncertainty far larger than any difference you can possibly discern from the trend.

            ““Off the page” is not a statistical term.”

            Of course it is not a statistical term. It is a term describing the physical graphing of so-called average differentials. Graphing differentials of 0.1deg to take up the entire graph hides the fact that the uncertainty is at least 5 times as large. So you really don’t know if the differential being graphed is anywhere near a “true” value or not. It’s a simple fact of hiding data to push an agenda.
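The two propagation rules being argued over can be put side by side (illustrative arithmetic only; which rule applies to single temperature measurements is exactly the point in dispute in this exchange):

```python
import math

# Two textbook propagation rules for N single measurements that each carry
# the same stated uncertainty u:
#   - fully correlated (worst case): the sum carries N*u, so the mean of
#     the N values still carries u;
#   - independent random errors (root-sum-square): the sum carries
#     sqrt(N)*u, so the mean carries u/sqrt(N).
# The thread above disagrees about which rule fits one-time temperature
# readings; this just shows what each rule implies numerically.

n = 480          # e.g. 40 years of monthly values
u = 0.5          # stated per-measurement uncertainty, deg C

sum_corr = n * u
mean_corr = sum_corr / n

sum_rss = math.sqrt(n) * u
mean_rss = sum_rss / n

print(f"correlated : sum +/- {sum_corr:.1f}, mean +/- {mean_corr:.3f}")
print(f"independent: sum +/- {sum_rss:.1f}, mean +/- {mean_rss:.3f}")
```

Under the correlated rule the mean is no better known than any single reading (+/- 0.5 C); under the independence assumption it shrinks to about +/- 0.023 C, which is the gap between the two positions above.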

  15. If the code for a “climate model” is released, it would give AGW skeptics an opportunity to find the assumptions embedded within the model, some of which may not be based on reality.

  16. When I compiled and ran it, three weeks went by and then it printed “Hello world.”
