
Guest post by Dennis Ray Wingo
Introduction
The foundation of all observational science is data. This is true whether the data consists of temperature measurements from ground networks, satellite observations, or records of anything else in nature that can be observed, quantified, and recorded. After data is recorded it must be archived so that future researchers who seek to extend or question conclusions drawn from that data can go back to the original source to replicate results. This is a fundamental premise of the scientific method; without it we can make no reliable statements about nature and call them science. This holds whether the subject is climate change, planetary motion, or any other scientific discipline. This missive is about the supremely important subject of data archival and how you, the reader, can support our lunar data archival project. First, a historical digression.
The Importance of the Recording and Archival of Scientific Data
In the era before computers and the Internet, data archival was the responsibility of the scientist who obtained and recorded scientific observations. Johannes Kepler used Tycho Brahe's archived records of meticulous observations of planetary motion to calculate the elliptical orbit of Mars and thus develop his laws of planetary motion. After the laws were published, anyone could check Kepler's work by going to the observatory and doing their own calculations based on the archived data. The archived work of Brahe and Kepler underpinned Sir Isaac Newton's formulation of his theory of gravity. Without archived data, Newton would have had no basis for his calculations. A scientist's archives, stored at institutes of learning, were the standard method of preserving data and results until the era of the computer.
Data Archiving in the Modern Age
In recent times a structural deficiency has emerged in the sciences related to the storage, archiving, and availability of original data. Beginning in the World War II years and exploding afterward, scientific data in many fields of the physical sciences began to be obtained through electronic means. Strip charts, oscilloscopes, and waveforms from analog and digital sensors began to be fed into calculating programs, and results obtained. These results were, and are, used to develop and/or confirm hypotheses. The practice exploded in the 1960s and has continued to the point where today it is ubiquitous. However, there has been a decoupling in the scientific process between the recording and archiving of data and the ability to replicate results. The following example is just one of a legion of problems that exist in this realm.
In the 1960s, when data was obtained and fed into a computer, it was often truncated due to the limited memory and computational speed of the machines of the era. For example, NASA published a paper, NASA TM X-55954, entitled:
The Radiation Balance of the Earth-Atmosphere System Over Both Polar Regions Obtained From Radiation Measurements of the Nimbus II Meteorological Satellite
This is probably the first definitive study of the radiation balance of the Earth-Atmosphere system published in the space era. Figure 1 is a figure from that paper:
Figure 1: Radiation Balance of the Earth-Atmosphere System From Nimbus 1966
This is an important paper in climate studies, as it was the first to quantify the radiation balance based on satellite data. However, the question is: where is the original data that was fed into the computers to produce these results?
Recovering the HRIR Data
In the paper, the primary data used to produce the temperature gradients came from the Medium Resolution Infrared Radiometer (MRIR) that flew on the Nimbus I-III meteorological satellites, the first satellites to carry a sensor of this quality. Where is that data today? I actually don't know much about the MRIR data, but I do know quite a lot about the High Resolution Infrared Radiometer (HRIR) that was a companion experiment on the early Nimbus birds.
During the missions the data from the spacecraft was transmitted in analog form to ground stations, where it was recorded and then sent for processing at NASA Goddard Space Flight Center in Greenbelt, Maryland. Figure 2 shows the design of the HRIR instrument and the computerized method of processing the data:
Figure 2a, 2b: HRIR Calibration and HRIR Data Processing
Looking at Figure 2a on the left, you see that a laboratory calibration was done against a known blackbody target. An in-flight calibration standard was measured at the same time, and a reference calibration for the instrument obtained. The same in-flight calibration reference blackbody (shown in the upper left) is scanned on each swath (a swath is a line of recording representing an 8.25 x 1100 km section of the Earth), providing a continuous means of maintaining the instrument's calibration in flight. Figure 3 shows a trace of a swath of HRIR analog data:
Figure 3: Nimbus HRIR Swath Trace With and Without Calibration Stair Step
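To make the stair step concrete, here is a minimal Python sketch of the kind of two-point calibration this enables; the signal levels and reference temperatures are made up for illustration and are not actual HRIR values:

```python
import numpy as np

def calibrate(raw_swath, ref_levels, ref_temps):
    """Linear two-point calibration: fit the known blackbody reference
    levels seen on each swath, then map raw signal to temperature."""
    gain, offset = np.polyfit(ref_levels, ref_temps, 1)
    return gain * np.asarray(raw_swath) + offset

# Made-up signal levels and reference points, for illustration only
swath = [0.42, 0.55, 0.61]
print(calibrate(swath, ref_levels=[0.20, 0.80], ref_temps=[210.0, 310.0]))
```

Lose the stair step, and the gain and offset for each swath are lost with it; the raw signal can then only be compared in relative terms.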
In 2009 my company, as a result of our work on the 1966 Lunar Orbiter data, was contracted by the National Snow and Ice Data Center (NSIDC) to take raw Nimbus HRIR data, correct errors, and translate it into the modern NetCDF-4 format so that it could be used in studies of pre-1979 Arctic and Antarctic ice extent. The HRIR data had been digitized through the diligent effort of NASA Goddard scientists, who retrieved the surviving tapes from the federal records center. Since no tape drives that can read the tapes exist anymore, a company was contracted to use an MRI-type machine to read these low-data-density tapes. This worked remarkably well, and the data from over 1700 of these tapes was provided to us. However, it turns out that these tapes do not hold the original analog data, and the original analog tapes no longer exist.
The digitized data that we used is, as best we can tell, an intermediate product derived from the IBM 1704 computer processing. The swaths no longer have the calibration stair step or sync pulses, but each one does have a metadata file with geo-positioning data. We reprocessed the data and re-gridded it to comply with modern NetCDF-4 conventions. The HRIR images produced are then used by the NSIDC to find the edges of the polar ice. We also translated the files into .kml files for display on Google Earth, with dramatic effect. Our work is described in an AGU poster (IN41A-1108, 2009). Figure 4 is a .kml file mapped in Google Earth.
Figure 4: Google Earth .kml File of the Nimbus II HRIR Data, August 23, 1966
This image is centered near Indonesia. Bluer temperatures are colder and clearly show the monsoon clouds. The contrast between the ocean and Australia is clearly evident. Colder temperatures in the Himalayas are visible, as are the heat of the Persian Gulf and the deep cold cloud tops in the upper right from typhoons Helen and Ida. The HRIR data can be used for many purposes, but due to the loss of calibration, only a relative comparison with modern IR data can be obtained. This also renders replication of the findings of the radiation balance paper nearly impossible. So, what the heck does all of this have to do with lunar images?
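For readers curious what the NetCDF-4 step looks like in practice, here is a minimal sketch using the netCDF4 Python library; the file name, grid, and variable layout are illustrative assumptions, not the actual NSIDC product:

```python
from netCDF4 import Dataset
import numpy as np

# Hypothetical 1-degree grid; the real product follows NSIDC conventions.
with Dataset("hrir_regridded.nc", "w", format="NETCDF4") as nc:
    nc.title = "Nimbus II HRIR brightness temperature (re-gridded)"
    nc.createDimension("lat", 180)
    nc.createDimension("lon", 360)
    lat = nc.createVariable("lat", "f4", ("lat",))
    lon = nc.createVariable("lon", "f4", ("lon",))
    tb = nc.createVariable("brightness_temperature", "f4", ("lat", "lon"),
                           zlib=True, fill_value=-9999.0)
    lat.units, lon.units, tb.units = "degrees_north", "degrees_east", "K"
    lat[:] = np.arange(-89.5, 90.0, 1.0)
    lon[:] = np.arange(-179.5, 180.0, 1.0)
    tb[:, :] = -9999.0  # start as missing; swath data is written in later
```

The virtue of a self-describing format like this is that the units, grid, and missing-data convention travel with the numbers, which is exactly what the surviving intermediate tapes lack.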
The Lunar Orbiter Image Recovery Project (LOIRP)
In 1966-67 NASA sent five spacecraft to orbit the Moon on a photoreconnaissance mission to scout landing sites for the Apollo landings. Today's reader must remember that prior to these missions mankind had never seen the Moon up close. The first three Lunar Orbiters flew in near-equatorial orbits and the last two in polar orbits for general mapping. Each carried two visible-light cameras: a 24" focal length instrument obtaining images at about 1 meter resolution, and an 8" focal length instrument at about 5-7 meters resolution, covering the lunar near side. The images were recorded on 70mm SO-243 photographic film, which was processed on board. This film was then scanned with a 5 micron spot beam that modulated an analog signal transmitted to the Earth. This is shown in Figure 5:
Figure 5: Lunar Orbiter Image Capture, Scan, Transmit, Storage and Print Process
The images were captured on the Earth via two dissimilar processes. The first, at the lower left and of the most interest to our project, was the recording of the pre-demodulated, combined raw analog and digital data on a 2" Ampex FR-900 instrumentation tape drive. The second process demodulated the signal to produce a video signal that was sent to a long-persistence phosphor display called a kinescope. The resulting image was photographed by a 35mm film camera. The 35mm film strip positives were then assembled into a larger sub-image that was filmed again to create a large 35mm negative, which was processed into a 35mm print used by the photo analysts to look for landing sites. However, as one might suspect, the quality of the images degraded in going through this many steps.
I was aware of this quality reduction, as I had worked with the film records in the late 1980s at the University of Alabama in Huntsville. At that time I had researched the tapes but was informed that they were unavailable, though rumors were that someone was digitizing them. However, this never happened, and all the archived images, such as those in the excellent repositories at the USGS in Flagstaff, Arizona and at the Lunar and Planetary Institute (LPI) in Houston, were derived from the films; these were the only high resolution images of the Moon available.
In 2007, quite by accident, I read a newsgroup posting that Nancy Evans, a retired JPL researcher, was retiring from her second career as a veterinarian and had four FR-900 tape drives that she wanted to give away. I later found that she was the responsible official at NASA JPL in the 1980s who had saved the original Lunar Orbiter analog tapes, and that they were still in storage at JPL. I contacted Nancy and JPL; she was willing to donate the tape drives, and JPL was willing to loan the tapes to NASA Ames, where we had donated facilities to attempt to restore the tape drives and read the tapes. I raised a bit of funding from NASA Watch editor Keith Cowing. We loaded two trucks with the 1478 tapes, weighing over 28,000 lbs, and the four tape drives, weighing a thousand pounds each, and drove to NASA Ames.
The reason that previous efforts by Nancy Evans and engineer Mark Nelson from Caltech had been unsuccessful was that NASA was not convinced of the value of the original data. I had known of the tapes before, but we had to quantify the benefits to NASA before we could obtain funding. We found the "money quote," as we called it, in an obscure NASA memo from 1966. In brief, this memo said (Figure 6):
Figure 6: NASA Memo Regarding Superiority of Mag Tape Lunar Images
This had originally been suggested by Charles Byrne, an employee of NASA contractor Bellcomm, as a means to improve the methods used to analyze landing sites for the dangers posed by large boulders and excessive slopes. If rocks were too big or the slope more than eleven degrees, it would be a bad day for the crews seeking to land. With this memo in hand, NASA headquarters provided us with initial funding to get one of the four tape drives operational and see if we could produce one image. We had three questions to answer:
1. Could we get a 40+ year-old tape drive operational again?
2. Even if the tape drive is operational, is there any data still on the tapes?
3. Even if there is surviving data, is it of higher quality than the USGS and LPI archives of the film images?
Suffice to say we answered all three questions in the affirmative, and in November of 2008 we unveiled to the world our first image, which just happened to be the famous "Earthrise" image of the Earth as seen from lunar orbit on August 23, 1966. The original image and our restored image are shown in Figure 7:
Figure 7: Earthrise 1966 and Earthrise 2008!
From the documentation, we found a factor-of-four improvement in dynamic range over the ground 35mm film (roughly 1000:1 on the tapes vs. 250:1 on film). The raw data also preserves the sync pulses used to rectify each line of the data, and when we use oversampling techniques (10x in frequency and bit depth) we can produce much larger images (the Earthrise image at full resolution is 60' x 25' at 300 dpi). With modern digitizing cards and inexpensive terabyte-class drives, this became a very manageable affair. For more information, this link is from a lunch presentation that I gave at Apple's Worldwide Developers Conference (WWDC) in 2009. Here is a link to an LPI paper.
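As an aside on the oversampling idea, here is a toy Python sketch of the principle (not our actual pipeline): sampling a noisy line at 10x and block-averaging trades the excess samples for roughly a sqrt(10) reduction in noise.

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.linspace(0.0, 1.0, 10_000)
signal = np.sin(2 * np.pi * 5 * t)             # stand-in for one video line
noisy = signal + rng.normal(0.0, 0.1, t.size)  # sensor/tape noise

# 10x oversampling then block-averaging: each output sample is the mean
# of 10 raw samples, cutting the noise by about sqrt(10) ~ 3.2x.
avg = noisy.reshape(-1, 10).mean(axis=1)
truth = signal.reshape(-1, 10).mean(axis=1)

print((noisy - signal).std())  # ~0.100 raw noise level
print((avg - truth).std())     # ~0.032 after averaging
```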
Where We Are in 2013
After our success, NASA Headquarters' Exploration Systems Mission Directorate provided further funding. However, since ours was basically an unsolicited proposal, that funding was limited. Each of the Lunar Orbiters (LO) acquired approximately 215 medium and high resolution images. The most important images are from Lunar Orbiters II and III, followed by LO-V, then LO-I, then LO-IV. The reason is that LO-II and LO-III have the best high resolution images of the near side equatorial region. The digitized raw images best preserve the data in a form that can be integrated into a multilayer dataset for comparison with today's data, which we have done on an experimental basis. In contrast to the Nimbus HRIR data, the LO data fully preserves the calibration marks, which appear on the tapes every 22 seconds. LO-I lost its image-motion compensation sensor early in the mission, resulting in blurred high resolution images; its medium resolution images are fine, though they are less relevant for comparison purposes due to their lower resolution. LO-V has almost all of its high resolution images at 2 meters, making it a good comparison to LRO. The lowest priority are the LO-IV images, which were obtained from a much higher altitude than the other missions and are thus of mostly historical value.
Our project has successfully digitized 98% of the LO-III images, with only six images lost to tape-related causes (erased tapes), and we have found several images that are not in the existing USGS and LPI archives. We have so far digitized about 40% of the LO-II images, and about 10% each of the LO-V, LO-IV, and LO-I images.
We Need Your Help
We are currently raising funds through the crowdfunding site:
http://www.rockethub.com/projects/14882-lunar-orbiter-image-recovery-project
We are doing this because we do not expect further NASA funding, and there is only a limited amount of time still available to digitize these tapes. The FR-900 tape drives use a head with four iron tips that rotate at 15,000 rpm. These heads are in direct contact with the tapes, which move past at 12.5 inches per second, creating a sandpaper effect that quickly wears the heads down. Here is a video from a couple of years ago with a tour of the lab, which, by the way, is in an old McDonald's at the old Navy base at Moffett Field, CA. Only a few dozen tapes can be played before the heads wear out, necessitating a refurbishment that costs well over $7000 each time.
We also have to pay our engineer to maintain the drive, our students to manage, assemble, and quality-check the images, and myself to manage the project and operate the tape drives (I worked in video production for years and thus do the operations and real-time quality control during image capture). We are also preparing this data for subsequent archiving at the National Space Science Data Center, though we also have the images archived at the NASA Lunar Science Institute and at our www.moonviews.com site, where anyone is welcome to download them. We also have a Lunar Orbiter Facebook page that you are welcome to join.
Scientific Value
The images that we are producing, along with the raw data, will be available to anyone for their own purposes. We have students doing real science, comparing the LOIRP digitized images with the latest images from the NASA LRO mission. Why is this important? Since the Moon has no atmosphere, even the smallest meteors impact the surface and make a crater. With a resolution of about one meter on both LO and LRO, we can examine the lunar surface in detail over thousands of square kilometers across a period of almost half a century. We can then see what the frequency of small impactors on the Moon is. Not only does this provide information for crew safety out on the surface of the Moon, it provides a statistical representation of the asteroid risk in near-Earth space. The bolide that exploded over Russia is thought to represent a one-in-one-hundred-year event. What if that risk is higher? Our images, coupled with the LRO LROC camera images, can help to better bound this risk.
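The arithmetic behind such an estimate is simple. Here is a toy version in Python with entirely hypothetical numbers, just to show how a flux estimate falls out of a two-epoch comparison:

```python
# Hypothetical counts for illustration only -- not LOIRP results.
new_craters = 25        # craters seen in LRO images but not in LO images
area_km2 = 50_000       # overlapping area surveyed at ~1 m resolution
baseline_years = 45     # ~1966/67 (LO) to ~2011/12 (LRO)

flux = new_craters / (area_km2 * baseline_years)
print(f"~{flux:.1e} new craters per km^2 per year")  # ~1.1e-05
```

The half-century baseline is what makes the comparison powerful: the longer the interval, the tighter the statistics on rare, small impacts.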
Our project has been honored by Congress, our images were used in a presentation by NASA to the President in 2009, and they were part of a package of NASA photos provided in the inaugural package this year. We have had extensive coverage of our efforts in what we have termed "techno-archaeology," literally the archaeology of technology. Many of these links are at the end of this article. However, with all of that, it is a very difficult funding environment, and that is why we need your help.
What Is on the Crowdfunding Site
We are offering a lot of rewards for your donation on the site. We have collectible and historical images, printed back during the Apollo era, at varying price levels. We have models of the Lunar Orbiter with a stand, suitable for your desk. We have microfilm from the original photographs, and if you cannot afford any of that, you can just make a donation!
This is what we call citizen science: the chance to have a part in an ongoing effort to archive data that can never be archived again. Our tapes are gradually degrading, and the tape drives cannot function without heads. Our engineering team is comprised of retired engineers who won't be around forever. In 2008 NASA JPL estimated that to recreate what we have would cost over $6 million. We have done it with a tenth of that amount, and with your generous donation we will complete our task by the end of this September.
The Big Picture
Stories like ours regarding the actual and potential loss of valuable original data are not a rarity. Due to funding cuts, on October 1, 1977 NASA turned off the Apollo lunar surface experiments that we spent billions putting there. The majority of the data obtained before the experiments were turned off was in great danger of being lost. Retired scientists and interested parties at NASA recently put together a team that retrieved these records from as far away as Perth, Australia, and the NASA Lunar Science Institute has a focus group dedicated to this effort. Sadly, some of this data is still in limbo and may indeed be lost forever due to poor record keeping and preservation of the original data.
Most readers of WUWT are well aware of the issues associated with the adjustment of original data in the field of climate science. The integrity of science is preconditioned on the ability to replicate results, and the archival and preservation of original data is one of the highest priorities in science. We are doing our small part here with the Lunar Orbiter images. One of our team members is Charles Byrne, who just happens to be the one who wrote the original memo that resulted in the purchase of the tape drives. In talking with Charlie, it is clear he never in a million years thought that a generation later he would be able to work with the original data. He has developed several algorithms that we are currently using to remove instrument-related artifacts from our images. Charlie is still doing original science with Lunar Orbiter images and is the author of the near side mega-basin theory.
One of the reasons I started thinking about original data is that at the same time I was working with the fourth-generation Lunar Orbiter film in the late 1980s, Dr. John Christy was working just down the hall from me at UAH, recovering satellite data from the 1970s; for all practical purposes that work was the genesis of the era of the climate skeptic. Did he think that his work would have such a long-lasting effect? Just think: did Brahe in his wildest dreams imagine that his meticulous work would lead to the theory of gravitation? We don't know what may come in the future from the raw data that we are preserving, but we do know that an original record from 1966-67 could not be replicated at any price, and with your support we will preserve this record for posterity.
A Selection of Published Articles About Our Project
http://news.cnet.com/2300-11386_3-10004237.html
http://www.theregister.co.uk/2009/07/22/destination_moon/print.html
http://www.sciencebuzz.org/buzz-tags/dennis-wingo
http://news.nationalgeographic.com/news/2009/05/090505-moon-photos-video-ap.html
http://articles.latimes.com/2009/mar/22/nation/na-lunar22
http://www.nasa.gov/topics/moonmars/features/LOIRP/index.html
http://boingboing.net/2012/07/12/inside-the-lunar-orbiter-image.html
http://news.cnet.com/8301-13772_3-10097025-52.html
Apple Worldwide Developer Conference Slide Show
Wikipedia Page
http://en.wikipedia.org/wiki/Lunar_Orbiter_Image_Recovery_Project
LOIRP Gigapans
http://gigapan.com/profiles/loirp
Comments
Yes, I have.
The particular value of the original code isn't necessarily in "wading through" the code line-by-line so much as in being able to see certain crucial lines and thus eliminate sources of error.
When you have two people nominally applying exactly the same method and coming up with different answers, it is mighty handy to be able to “debug” and go “Aha! They used Runge-Kutta (or whatever), but they only used floats!” Therefore one can say “I’m more confident in my results and here’s why.”
Otherwise you end up with "Person A says X" and "Person B says Y." If the analyses are black boxes, you can have quite a few people finding "Y" and hearing "You are doing it wrong. Somewhere."
I don’t have a credit card.
You have a little over a month to set up some other payment method that I can use.
DaveE.
Alan S. Blue says:
February 26, 2013 at 12:19 pm
being able to see certain crucial lines
And how to find those? And how do you avoid missing some obscure detail somewhere that makes a big difference? Or scale it up to the millions of lines of code that modern systems approach? The only way to be sure is to replicate the analysis with new code. This may be impossible because metadata is missing. Granted, that metadata can sometimes be extracted from the old code, but that can be extremely difficult. The code can be subtle. A real-life example [from a COBOL program I once saw]: MOVE A TO B, making a local copy of A. The problem was that A was signed and B was not. The code used -1 for missing data, so you see the problem: all of a sudden B = 1 became good data even when A was missing.
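For readers who don't speak COBOL, here is that failure mode sketched in Python; the sentinel convention is taken from the comment above, and move_to_unsigned is a hypothetical stand-in for the unsigned picture clause:

```python
MISSING = -1  # sentinel value meaning "no data"

def move_to_unsigned(a: int) -> int:
    """Mimics COBOL's MOVE A TO B when B's picture clause is unsigned:
    the sign is silently dropped in the copy."""
    return abs(a)

readings = [42, MISSING, 17]
copies = [move_to_unsigned(r) for r in readings]
print(copies)  # [42, 1, 17] -- the missing-data flag now looks like a valid reading
```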
There are companies that can coat the recorder heads with CVD diamond. The life of the heads is extended considerably.
“We loaded two trucks with the 1478 tapes weighing over 28,000 lbs and the four tape drives weighing a thousand pounds each and drove to NASA Ames.”
The bandwidth of this kind of data transfer surely beats the fastest link available on the internet today.
There are companies that can coat the recorder heads with CVD diamond. The life of the heads is extended considerably.
These are not that type of head. What you are talking about works on heads like those in an 8-track player or other low-frequency heads. These heads have to have a flat response out to 20 MHz.
If you have any links that say otherwise we are always willing to listen.
Folks
Here is an example of an image that we recently captured.
individual subframes
http://lunarscience.arc.nasa.gov/files/LOIRP/5041_H1.tif
http://lunarscience.arc.nasa.gov/files/LOIRP/5041_H2.tif
http://lunarscience.arc.nasa.gov/files/LOIRP/5041_H3.tif
assembled subframes
http://lunarscience.arc.nasa.gov/files/LOIRP/5041_FULL.tif
This is the same as this image at the Lunar and Planetary Institute website, derived from the film data.
http://www.lpi.usra.edu/resources/lunarorbiter/frame/?5041
The difference is readily apparent.
On the very relevant question about "transmitting" data at FedEx speeds…
The real answer for various storage devices is at the http://www.xkcd.com "what if" site, which tackles that very question about FedEx bandwidth. Read the complete explanation at http://what-if.xkcd.com/31/. The short answer follows:
FedEx Bandwidth
If you want to transfer a few hundred gigabytes of data, it’s generally faster to FedEx a hard drive than to send the files over the internet. This isn’t a new idea—it’s often dubbed SneakerNet—and it’s how Google transfers large amounts of data internally.
But will it always be faster?
Cisco estimates that total internet traffic currently averages 167 terabits per second. FedEx has a fleet of 654 aircraft with a lift capacity of 26.5 million pounds daily. A solid-state laptop drive weighs about 78 grams and can hold up to a terabyte.
That means FedEx is capable of transferring 150 exabytes of data per day, or 14 petabits per second—almost a hundred times the current throughput of the internet.
Sneakernet
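Those figures can be sanity-checked in a few lines of Python (numbers taken from the excerpt above):

```python
# Back-of-the-envelope check of the xkcd FedEx-bandwidth estimate.
daily_lift_lb = 26.5e6           # FedEx fleet lift capacity per day
grams_per_lb = 453.6
drive_grams, drive_tb = 78, 1.0  # solid-state laptop drive

drives_per_day = daily_lift_lb * grams_per_lb / drive_grams
exabytes_per_day = drives_per_day * drive_tb / 1e6   # 1 EB = 1e6 TB
petabits_per_sec = exabytes_per_day * 8e3 / 86_400   # 1 EB = 8e3 Pbit

print(round(exabytes_per_day), round(petabits_per_sec))  # ~154 EB/day, ~14 Pbit/s
```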
Yep. Here are the specs for the FR-900
http://images.spaceref.com/news/2011/FR-900.Brochure.pdf
The recorder was the highest-density data storage device of the 1960s. When used in a digital mode it recorded 20 megabits/sec. At that rate a one-hour tape holds 20 x 3,600 = 72,000 megabits, or about 9 gigabytes; multiplied by 1478 tapes, that is roughly 13.3 terabytes.
The drive from Moorpark (where the tapes were in storage) to Ames is about 420 miles and we did it in about 8 hours. Thus the Sneakernet rate was about 3.7 gigabits/sec.
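A quick check of that arithmetic in Python (the 20 Mbit/s digital rate is the figure given in the FR-900 brochure linked above; everything else follows from it):

```python
rate_bps = 20e6           # FR-900 digital-mode data rate, bits per second
tape_seconds = 3600       # one-hour tapes
tapes = 1478
trip_seconds = 8 * 3600   # Moorpark to Ames

bytes_per_tape = rate_bps * tape_seconds / 8          # 9.0e9 -> 9 GB per tape
total_bytes = bytes_per_tape * tapes                  # ~13.3 TB overall
sneakernet_gbps = total_bytes * 8 / trip_seconds / 1e9

print(total_bytes / 1e12, round(sneakernet_gbps, 1))  # ~13.3 TB, ~3.7 Gbit/s
```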
A marvellous project, I’ve signed up and will be forwarding this link to friends and family.
Did you really say one meter resolution? Reading between the lines …
This had originally been suggested by Charles Byrne, an employee of NASA contractor Bellcomm, as a means to improve the methods used to analyze landing sites for the dangers posed by large boulders and excessive slopes. If rocks were too big or the slope more than eleven degrees, it would be a bad day for the crews seeking to land.
The bold fragment suggests to me the existence of elevation data. Is this project only about images, or is there any elevation data also, and if so are you planning to recover the elevation data too? A 1m DEM of the lunar surface, even just a part of it, with similar resolution images mapped to that terrain … for someone with a hobby interest in digital elevation mapping, that would be a Very Good Thing.
denniswingo says:
February 26, 2013 at 2:12 pm
assembled subframes
http://lunarscience.arc.nasa.gov/files/LOIRP/5041_FULL.tif
at 2GB this is too big for my image viewer…
Brian H. and DaveE., both sans credit cards:
Are you also sans friends who actually do have credit cards? You slip 'em the green, they make the equivalent donation. They might even throw in some extra for such a cause.
The bold fragment suggests to me the existence of elevation data. Is this project only about images, or is there any elevation data also, and if so are you planning to recover the elevation data too? A 1m DEM of the lunar surface, even just a part of it, with similar resolution images mapped to that terrain … for someone with a hobby interest in digital elevation mapping, that would be a Very Good Thing.
Derek
The slope information was derived indirectly from the images alone, as we did not have laser altimeters at that time. Here is a great document related to the effort that Bellcomm was doing for NASA at the time. Google "project slope" for more information.
http://ntrs.nasa.gov/search.jsp?R=19660029202
We actually do have some marvelous 1 meter resolution LIDAR data that has come from the Lunar Reconnaissance Orbiter during its mission. We have been doing some work in the polar regions, where the data has the highest density, and have been having a great time with it. Also check out the LROC imaging camera, which has made some awesome pictures of the Moon, the only ones comparable with Lunar Orbiter.
at 2GB this is too big for my image viewer…
You can download the subframes and view them separately.
They are quite amazing…
As old as that tape is, it's possible the lubrication has dried up, resulting in additional wear on the heads. You might try contacting tape manufacturers to see if they can suggest a solution to restore the lubrication on the tape before you attempt to read it.
Why bother archiving data? In another few decades the full effects of the economic and social damage done by King Obama and the EU technocrats on Western nations will result in food and energy riots, anarchy, chaos, and an end to liberal democracies. The complete breakdown of Western society will render the question of science moot. Looking forward to the great asteroid strike, or whatever other natural calamity strikes, when we will find out how woefully unprepared humankind is to adapt to natural environmental change. Our brains are already significantly smaller than those of the much smarter early modern man, and our socialist education system is out to ensure that evolution in reverse continues by making competition an unheard-of concept. We are being turned into H.G. Wells' Eloi. The progressives seem not to be aware that being forced to adapt to rapid environmental changes was the driving force behind the evolution of our large brain. I would be very surprised if, when the time comes, more than a handful of people in our civilization will have the cognitive and physical capacity to adapt, and adapt they will have to, as the one constant in the Earth's history is change and the extinction of those who cannot adapt.
lsvalgaard says:
February 26, 2013 at 12:54 pm
Alan S. Blue says:
February 26, 2013 at 12:19 pm
being able to see certain crucial lines
And how to find those?
==========
Without the code there is no possibility that you can find the error in the original work. The original result is your reference data. Your new result is what you compare to the reference. If there is a mismatch then the hunt is on. You are looking for either the error in your new code or the error in the old code.
A secondary issue is the problem with code revisions. Code is rarely static. So by archiving the code one of the first questions to be answered is whether the archived code can generate the archived result from the archived raw data. Or as more likely, was the result generated with an earlier version of the code than the version archived?
I’m with Alan on this. The code has huge value as does the data from an archive point of view. However, like old data there are problems in simply running old code. You may find that your CP/M version of VisiCalc with all the original data and formulas won’t run on any machine you have available.
Just wow! Thanks Dennis. I’m a plant phys guy by training but got roped into the AGW thing via my father’s collab with Fred Singer (Unstoppable Global Warming book). While far afield of my training, the discussions here always remind me of fundamental science basics. Data is key. Without it we’re lost. Truly at the mercy of ideologues. And keeping data for future scientists is just as critical.
A few years back, I had to nearly sue to get North Carolina to cough up its water quality data. Why did the NC DENR not want me to have the taxpayer-funded data? Because it showed that the hog farmers in the eastern part of the state had not ruined the water quality (it was better 10 years after a 500% increase in hog population in the watersheds than before significant hog farming ops). When I tried to repeat the analysis in other states (to defend their farmers from scurrilous eco-wacktivist attacks), I found that it was impossible because the states had never collected the data in the first place (b/c it was another unfunded mandate from the Feds, so the states ignored it!). But even NC was hard because they only had 8 years of data from the 1970s. Without data, the politicos and wacktivists can make up any lie they want, and they were getting away with it until we showed them there was no there there.
The funny thing is, farmers were afraid of the data and most told me if it'd been up to them, they never would have wanted the WQ data collected; and I just exploded with earnestness back at them, "The data is your only defense! Without it, you're screwed!"
And that is SOOOOO true with the GW issue, as we here at Watts' site know so well.
Again, THANKS Dennis! 🙂
From school days, I gained a love of science. I continue to read about science and its principles.
While this data endeavor is great, the entire post, and comments on the post, show weaknesses in the general conceptions of science as held by fellow science lovers.
First of all: “DATA” IS PLURAL.
Yes, I am shouting. Please stop showing your ignorance, or weakness in grammar. “Datum” is singular, and “data” is plural. To say, “Data is necessary…” is to be wrong. Wrong. Data are necessary. A datum is necessary, etc.
This error is repeated a lot here. I will post a couple of other issues regarding science once I review this post to capture the obvious ones.
“I do have one question. In the paragraph that starts “For the reader of WUWT most of you are well aware of the issues associated with the adjustments of original data in the field of climate science.” and near the end states that Charles Byrne “has developed several algorithms that we are currently using to remove instrument related artifacts from our images.”, are these adjustments he is making not very similar to the adjustments climate scientists make for TOBs, instrument changes, station moves, UHI, etc.?”
Yes. If you have seen the raw data from temperature stations, you'll quickly understand that raw data is filled with errors: readings of -15000C, the same figure repeated for days, wrong units; about 10-20 different classifications of mistakes in all. Then there are the changes in operation:
changing the time of observation (something almost unique to the US) and changing the instrumentation.
What most of us have argued for is this:
1. a copy of the raw data where it exists. We term this level 0 data, or the first report.
2. a copy of the tool, code, or procedure used to make the adjustment.
3. proper treatment of the uncertainties.
With TOBS I can tell you that skeptics have looked at this adjustment three ways from Sunday, and the adjustment is required. JerryB, a commenter at ClimateAudit, had his independent analysis posted at John Daly's site. That file is still there. I re-ran his entire analysis back in 2007-08
when we discussed TOBS at Climate Audit. The adjustment for TOBS is needed. When you change the time of observation, it changes the min/max recorded. At first I could not see how, but after going over the data prepared by a skeptic, I was convinced. Later I would look at CRN data and find the same thing: change the time of observation from one time to another and you change the mins and maxes. Almost six years later people still discuss TOBS as if it were a conspiracy. Yet the data sits there at John Daly's site for anyone to look at, to prove it to themselves.
The CRN data sits there, so you can prove it to yourself. But 99.9% of people refuse to lift a finger to prove it to themselves. They want somebody else to prove it to them.
Let me tell you why. Most folks know that you can fight any argument anyone raises. Any argument. You can't convince me of something I refuse to be convinced of. But if I do the work myself, if I look at it myself, then I have no choice but to accept my own work.
So.
Start here: http://www.john-daly.com/tob/TOBSUM.HTM
Download the files. 190 stations with hourly data. The files will show you what happens if you change the time of observation.
If you don't want to look at the data for yourself, then ask yourself why not.
Back in 2007 I thought TOBS was a crock. It made no sense. I took the data.
I looked for myself. I didn't ask someone to prove it to me; I proved it to myself.
First step: recognize that an adjustment is needed. If you can't prove that to yourself by looking at the data, then there really is no point in discussing it.
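To see the mechanism without downloading anything, here is a minimal Python sketch on synthetic data (not the John Daly files): it mimics a max/min thermometer read and reset once a day, and shows that the choice of observation hour alone changes the long-run record.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days = 3650
hours = np.arange(n_days * 24)

# Synthetic hourly temperatures: diurnal cycle + day-to-day weather swings
diurnal = 8 * np.sin(2 * np.pi * (hours % 24 - 9) / 24)   # peak mid-afternoon
weather = np.repeat(rng.normal(0, 4, n_days), 24)         # daily anomalies
temps = 15 + diurnal + weather + rng.normal(0, 1, hours.size)

def mean_minmax(obs_hour):
    """Mean of (min+max)/2 over successive 24-h periods ending at obs_hour,
    mimicking a max/min thermometer reset at that hour each day."""
    chunk = temps[obs_hour:obs_hour + (n_days - 1) * 24].reshape(n_days - 1, 24)
    return ((chunk.min(axis=1) + chunk.max(axis=1)) / 2).mean()

for h in (0, 7, 17):  # midnight, morning, late-afternoon observers
    print(f"obs hour {h:2d}: mean = {mean_minmax(h):.3f}")
# The three observers record measurably different long-run means from
# identical weather: an afternoon reset lets one hot spell set the extreme
# for two adjacent days, while a morning reset does the same for cold snaps.
```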
“The foundation of all observational science is data.”
I have no idea what "observational science" is, apart from science itself. Science necessarily requires observations. Those are compared to theory/hypothesis-based predictions to judge the degree to which we ought to continue entertaining the hypothesis, or discard it altogether, or work with it (a la Hegelian dialectic).
Possibly, what is meant is observational analyses, where an investigator does not manipulate an independent variable, as contrasted with experimental analysis, where the scientist does manipulate the IV.
We cannot manipulate stars. We can, however, set up hypotheses about their nature, figure out how any hypothesis might be tested by an observation, then conduct the observation, then evaluate the fit of the observed data with what the hypothesis would have predicted, and so have a scientific test of phenomena we cannot manipulate.
That is the difference between dealing with observational and experimental data. I am guessing this is what is meant by "observational science." Much of climatology is observational analysis, while things such as the recreation of global warming with CO2 in soda bottles under various light exposures would be experimental.
Steven Mosher says:
“If you dont want to look at the data for yourself. Then ask yourself why not.”
"After data is [sic] recorded it must be archived so that future researchers who seek to extend or question conclusions drawn from that data can go back to the original source to replicate results."
Sigh. Replicability does not refer to an ability to re-run some mathematical analysis on the numeric data; it refers to the scientific assumption that everything in the universe has some underlying essential nature that is physical, and that causes lead to effects in orderly, lawful, predictable ways, so that if an observation of natural phenomena was carried out in one place and offered up as "knowledge," then it ought to be replicable elsewhere; ultimately, at any locale in our universe.
Mathematically, if you take the same data and run the same operations, tautologically, and by definition, you will get the same result. Proofs are proofs. Laws and rules of math are laws and rules. Unless you make a mistake with your pencil, or the computer program has been changed, the same analysis run with the same data by the same operations will yield the same results time after time.
All that to say that the “replicability” ethos of science is not a matter of re-running the same data, but of being able to gather yet another set of data that is predicted by a theory, and yet again testing the hypothesis.
This is fundamental, essential science.
If you can get the "media" (5 1/4″ or 8″ disks, etc.) read, virtual machines can do it…
Google: CP/M "virtual machine" z80
When discussing science, this ethos or value is rarely made explicit, but it is essential: honesty.
The reason to make data available is so that others can figure out whether you are being honest, or deceptive.
In my mind, this is a big deal. It is flouted a lot.
This is often described as “transparency.”
If you won’t “show your work,” in my humble opinion, you are not conducting “science.”
I believe that whole-heartedly. Even though that has huge ramifications. If you develop a secret formula – an adhesive, or a drug, perhaps – you cannot keep it private, and also consider it “scientific.”
A patentable drug: possibly, you could contract with an outside group to replicate your study while agreeing to not reveal your secret. That could move “authority” (“trust me”) toward “science,” but it still would not be science.
CERN, etc. Sure, you should be able to benefit from your original work. Why give away the data the day after you publish?
I naively believe that science requires this. Otherwise, “knowledge” and “evidence” are a matter of “authority,” not “science.”
“Trust us – we ran calculations and found the Higgs Boson – but you cannot examine our raw data or algorithms.”
That is not transparency. It is just a tough aspect of science that it does not have, built in, a way to ensure career or commercial success from some observation. But science is not a means for building a career or commercial success. It is about gaining knowledge, and that alone.
Enough of my opinions for now.