Dave Burton
Have you noticed that, in most fields, technology tends to advance in fits and starts? For long periods, technology just creeps along; then there’s a sudden surge of progress. Well, it’s happening again… I think.
For the last sixty years, computer scientists have been trying, with little success, to make programming computers more productive. The creation of high-level languages in the 1950s and early 1960s yielded big productivity gains, but then progress slowed to a crawl. Despite astonishing increases in computing power (increases of five to six orders of magnitude in speed, RAM, disk storage, and display resolution in my own home computers, over five decades!), despite improved tools, and despite a series of fads in programming methodology, not much really changed, and software developers have remained about as (un)productive as we were in 1975, when Fred Brooks wrote The Mythical Man-Month.
Newer programming languages, like Python, JavaScript and Rust, helped a little bit, but programmers in those languages are only slightly more productive than programmers were in Pascal and C a half-century ago. StackOverflow on the Internet is more convenient than a shelf full of paper books, but that mostly just trims the learning curve a bit. Newer tools, like the VSCode editor, are very nice (certainly nicer than punched cards!), but they don’t make computer programmers wildly more productive.
But now that has apparently changed at last. The long-anticipated sudden surge has finally arrived. Cutting-edge programmers are now using “AI Assistants” like Microsoft/GitHub Copilot to help them write code, and they are reporting dramatic improvements in productivity, and perhaps even improvements in code quality.
It does not shock me that tools created by computer programmers are helpful to computer programmers. But, surprisingly, this technology is also very, very good at parsing, responding to, and writing “natural language” (e.g., English).
Here’s an eye-opening example. I asked ChatGPT 3.5 this question, which I’m confident the vast majority of professional “climate scientists” would not answer correctly:
“With reference to the agronomy literature, quantify how elevated CO2 helps plants use water more efficiently, and increases crops’ drought resilience, by reducing stomatal conductance and water loss through transpiration.”
ChatGPT’s answer was remarkably good:
“In agronomy, the effects of elevated CO2 on plant water use efficiency and drought resilience are extensively studied. One of the key mechanisms through which elevated CO2 levels improve water use efficiency is by reducing stomatal conductance and, consequently, water loss through transpiration.
“Stomata are small pores on the surface of plant leaves that regulate gas exchange, including the uptake of CO2 for photosynthesis and the release of water vapor through transpiration. When CO2 levels are elevated, plants can maintain the same or higher rate of photosynthesis while reducing stomatal conductance. This reduction in stomatal conductance leads to a decrease in water loss through transpiration without significantly affecting CO2 uptake, resulting in improved water use efficiency.”
Wow! I was very impressed. That answer is exactly correct, and the prose is perfect. I’ve been explaining that to people for years, and I think ChatGPT’s explanation is better than mine.
But all is not rainbows and roses, because that doesn’t always happen. I then asked ChatGPT 3.5 what I thought was a much simpler question. It requires a trivial bit of physical and mathematical reasoning, but I think even most climate scientists could probably answer it correctly:
“How much is 5.7 PgC in ppmv of CO2?”
I expected that AI tools would have no problem answering that simple question. But they all failed. In fact, they gave me ten different wrong answers.
Conventionally, “1 ppmv CO2” is shorthand for 1 µmol of CO2 per mole of dry atmosphere (the “dry molar fraction”). 1 ppmv CO2 = 7.8024 Gt CO2 = 2.12940 PgC, so the correct answer to the question I asked is 5.7 / 2.1294 = 2.67681 ppmv. (Water vapor adds about 0.25% to the mass of the atmosphere, which is about 0.41% by molar fraction, so the answer is very slightly smaller if water vapor is included rather than omitted.)
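For readers who want to check the arithmetic themselves, here is a minimal Python sketch of the conversion, using the 2.12940 PgC-per-ppmv factor quoted above (the function names are just for illustration; the Gt CO2 figure follows from the molar masses of CO2 and carbon):

```python
# Convert a mass of carbon (PgC) to the equivalent CO2 dry molar fraction (ppmv),
# using the conventional conversion factor quoted above.
PGC_PER_PPMV = 2.12940      # PgC of carbon per 1 ppmv of CO2 (dry-air convention)
MOLAR_MASS_CO2 = 44.0095    # g/mol
MOLAR_MASS_C = 12.011       # g/mol

def pgc_to_ppmv(pgc: float) -> float:
    """Petagrams of carbon -> ppmv of CO2 in the dry atmosphere."""
    return pgc / PGC_PER_PPMV

def ppmv_to_gt_co2(ppmv: float) -> float:
    """ppmv of CO2 -> gigatonnes of CO2 (mass of the whole molecule, not just the carbon)."""
    return ppmv * PGC_PER_PPMV * MOLAR_MASS_CO2 / MOLAR_MASS_C

print(pgc_to_ppmv(5.7))      # ~2.6768 ppmv
print(ppmv_to_gt_co2(1.0))   # ~7.80 Gt CO2 per ppmv
```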
ChatGPT 3.5 completely fell on its face. It said:
“5.7 PgC is approximately equivalent to 61.9 ppmv of CO2 in the atmosphere.”
ChatGPT 4.0 isn’t free on the OpenAI website, but Popai has an option to use ChatGPT 4.0 as its engine, so I tried that. It gave a different wrong answer:
“5.7 PgC is approximately equivalent to 4.07 ppmv of CO2 in the atmosphere.”
I asked Mistral AI, and it also gave a wrong answer:
“5.7 PgC is approximately equivalent to 569 ppmv of CO2 when evenly distributed throughout the entire atmosphere.”
I asked Microsoft Copilot, and it also gave a wrong answer:
“5.7 PgC is approximately equivalent to 12.198 ppmv of CO₂.”
CoPilot is also integrated into the Microsoft Edge Browser. Oddly, that version gave a different wrong answer:
“Unfortunately, without knowing the specific atmospheric temperature and other factors, we cannot provide an exact ppmv value.”
I asked Llama 2, and it also gave a wrong answer:
“5.7 PgC is equivalent to 1220 ppmv of CO2.”
I asked Claude, and it also gave a wrong answer:
“5.7 PgC (petagrams of carbon) is approximately equivalent to 9.86 ppmv (parts per million by volume) of CO2 in the atmosphere.”
I asked Scite, and it also gave a wrong answer:
“5.7 PgC is equivalent to approximately 25.2 ppmv of CO2 in the atmosphere”
I asked Aria (in the Opera browser), and it also gave a wrong answer:
“5.7 PgC * 2.12 ppmv/PgC = 12.084 ppmv”
I also asked Google Gemini, and it gave the most hilariously wrong answer of all:
“5.7 PgC is equivalent to 5700000000000000000.00 ppmv of CO2.”
Two fervent advocates of the position that AI is transformative technology are Matt Welsh and Matthew Berman (both on YouTube, of course). Here’s a contrary view (profanity warning). IMO, the jury is still out, but…
👉 My first conclusion is that the AI tools are like the kids who pass their science classes by “memorizing the formulas.” They cannot “do science.”
👉 My second conclusion is that they can nevertheless be quite useful. Just always remember President Reagan’s advice for dealing with the Soviets: “Trust, but verify.”
👉 My third conclusion is that when using these tools it helps to lead them along by the nose. Be very, very specific with your questions. For instance, if you’d rather that they rely on the (robust) agronomy literature, rather than the (hopelessly politicized) climate science literature, then say so. Here’s a video with some suggestions about how to make best use of ChatGPT.
Dave Burton
I think a misconception among the public is that AI can actually think and reason. Even learn. And that AI is unbiased.
It isn’t unbiased and it can’t think or reason or learn.
It just runs programs.
(“Sonny” and “Johnny 5” are fiction.)
I use CoPilot and Jetbrains AI Assistant.
It’s like pair programming with an eager intern who happens to lie a lot.
An experienced engineer can spot the API hallucinations and the language and framework version mismatches, and can use the tools to write guard language, test cases, comments, PR summaries, and object mappings. They’re also fairly good at doing research, which removes the step of having to hit up Stack Overflow or the documentation for guidance.
e.g., “How do I discover the CLSID on a COM or OPOS object?”
The tools also help me write better method signatures and better property, method, and class names, and they find opportunities to optimize and refactor code.
You are using it as a tool, a keyword search on steroids, so to speak.
You know how to get around the obvious deficiencies.
What percentage of the general population would have the critical thinking skills and patience to do the same?
Exactly. The AI tools don’t ask questions. These machines fail at Theory of Mind. They aren’t even very good at using the sources they cite. I asked Edge CoPilot what the five largest US embassies were, and received answers that weren’t even close – just embassies that are mentioned more frequently in the news. Edge CoPilot provides its sources, and when I checked, I discovered that the ordered list in the citation was wildly different from CoPilot’s own list.
I’m guessing that social media and the dreck that is curated there provide the GIGO results.
Ah, Artificial Ignorance, just another appeal to authority within an echo chamber.
It can’t do anything truly new, it is all derivative from a source of golden truth.
Large language models are very good at reproducing natural language, because it is a rules-based communication process, and we are able to define the rules.
If you then give an LLM a set of words from sociology texts, it can write you a credible sociology paper. Which is what some of those early spoofs did.
If you have the hardware, you can feed it all the data on the Web. This is what is currently called AI – though there is no intelligence being exhibited.
What is happening is that the LLM is simply reproducing the ‘average’ of the data on the Web. When it is asked for the capital city of France, the Web average is going to be clear and right. When it is asked for something ambiguous or controversial, the answer is going to be much less useful. Much data on the Web is currently mildly left wing, so that will be the typical bias.
AI is good for averages. But don’t expect a blinding flash of intelligence – because average is all that it can do…
And it appears that the “average” is actually a modal value, rather than an arithmetic mean.
From the above article: “1 ppmv CO2 = 7.8024 Gt CO2 = 2.12940 PgC”
Really??? That’s a witch’s brew of mixed units, saying there is equivalence of volumetric concentration (expressed as ppmv) with absolute mass (Gt CO2, or PgC).
Therefore, the question asked of the various AIs in the above article, “How much is 5.7 PgC in ppmv of CO2?”, is fundamentally nonsensical without referencing Earth’s atmosphere. So, IMHO, it’s not all that surprising that different AI sources might have trouble responding to such a confusing question.
As an independent check:
3.7 gigatonnes (Gt) of carbon dioxide contain 1 Gt of carbon (C) and 2.7 Gt of oxygen (O2).
Also, one Gt of carbon is 10^15 grams of carbon, the same as one petagram of carbon (PgC).
With dry air having an average molecular weight of 28.96 g/mol and CO2 having a molecular weight of 44 g/mol, 1 ppmv of CO2 in air translates to (44/28.96) ppmw of CO2 in air = 1.52 ppmw of CO2.
The mass of the Earth’s atmosphere is stated to be 5.14×10^21 g, so if there were 5.7 PgC (5.7 gigatonnes of carbon) present as CO2, that would be equivalent to a CO2 concentration of (5.7×10^15 g)×(3.7/1)/(5.14×10^21 g) = 4.10×10^-6 = 4.10 ppmw of CO2, and 4.10 ppmw × (1 ppmv / 1.52 ppmw) = 2.70 ppmv.
That 2.70 ppmv is acceptably consistent with your asserted-to-be-correct value of 2.68, rounded off.
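For what it’s worth, here is a minimal Python sketch of the same sanity check, done by counting moles directly rather than via mass fractions; the atmosphere mass and molar masses are the same round figures quoted above:

```python
# Sanity check: express 5.7 PgC as a CO2 molar fraction of the atmosphere.
ATMOSPHERE_MASS_G = 5.14e21   # grams of air (round figure quoted above)
MOLAR_MASS_AIR = 28.96        # g/mol, average for dry air
MOLAR_MASS_C = 12.011         # g/mol; one carbon atom per CO2 molecule

moles_air = ATMOSPHERE_MASS_G / MOLAR_MASS_AIR
moles_co2 = 5.7e15 / MOLAR_MASS_C      # 5.7 PgC = 5.7e15 grams of carbon

ppmv = moles_co2 / moles_air * 1e6
print(round(ppmv, 2))   # ~2.67 ppmv, consistent with the 2.68 and 2.70 figures above
```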
And this highlights one of the many weaknesses of the software. It lacks the ability to respond in the way you have.
This was one of the well-known solutions to the Turing test. Given a question beyond human knowledge, a human will respond almost instantly with “I don’t know”. A computer program mimicking human behavior will take much longer to provide the same response.
Yes, “really” — but you’re right that I should have explicitly specified Earth’s atmosphere, rather than just assuming it. (It would have been funny if one of the AI tools had given me an answer for Mars.)
I also should have linked to the calculation of the precise conversion factors. It is here:
https://sealevel.info/carbon/carbonflux.html#massequiv
Here’s a better version of the question:
“How much is 5.7 PgC in units of ppmv of CO2 in Earth’s dry atmosphere?”
Hmmmm… I just tried that revised question in most of those tools. (I’ve apparently used up my quota of free queries at PopAI.) They all still got it wrong, but several gave different wrong answers.
Gemini did better than it did with the first version of the question. It (wrongly) said the conversion wasn’t possible, but that’s a lot better than giving an answer which is wrong by 18 orders of magnitude. 😏
…and, just now I was shown an ad which says:
“Ppm Of Co2 On Official Temu.com – Shop Ppm Of Co2 Online on Temu®” [Ad Choice]

That presumably means one or more of these tools (or perhaps one of the browsers I used) is harvesting my search data to sell to internet ad agencies.
Dave, thank you very much for your follow-up comment.
I found it interesting that your clarification of the original question to include reference to Earth’s dry atmosphere resulted in several of the AI bots giving yet again different answers.
And why weren’t these—ahem—”intelligent” bots smart enough to ask initially for clarification as to the phrasing of your original question?
Finally, I’ll just observe that the last three paragraphs of your comment are rather alarming if, under the claim of providing AI services, some businesses are really just doing data mining of human intelligence (“inputs”) for commercial purposes without first obtaining permission to do so.
I’m not persuaded that “AI” exists, and for the following reason.
I watch a lot of tv news programs and documentaries in French, as I live in Quebec. Because my conversational French is not up to snuff, I always have captions on, and because it’s a pain to toggle captions on and off, I leave them on for English broadcasts as well.
So I end up watching BBC, CBC, and PBS news and documentaries with captions. And I have seen no evidence of any intelligence in the captioning. The only progress, if one can call it that, is that the captions have become “intelligent” enough to often simply skip over (i.e., omit) the proper names of people and places. But they continue to make the most astounding mistakes with everyday nouns, where the correct word would be obvious to any person of average intelligence.
Often the usage is simply senseless. The obvious word is replaced by the most improbable and used in a context where it’s nothing but meaningless noise.
Of course it’s unreasonable to expect artificial intelligence to be created by genuine stupidity. And stupidity in the English language is flourishing. Take the use of the word “literally”. This word, which sadly seems to be rearing its ugly head again, is almost never uttered in a rational way in the newscasts I’ve heard.
It’s utterly senseless to use the word “literally” for a situation that could never be “virtual”. Yet it happens all the time. For instance “I literally cried”, or even “I literally died”. How would it ever be possible for a computer program to make sense of this? It would “literally” have to be just as stupid as its creator, and then the product would properly have to be called “artificial stupidity”…
P.S. — The APA has guidelines for citing ChatGPT, here:
https://apastyle.apa.org/blog/how-to-cite-chatgpt
and for how it can be used in academic publications, here:
https://www.apa.org/pubs/journals/resources/publishing-policies?tab=3
(I should have included that info in my article.)
Excellent! I’m now waiting for the first scientific paper authored totally by an AI bot to pass peer-review and then be published.