Confusion about Confabulations

Brief Note by Kip Hansen — 19 August 2023

Do you read medical journals? If not, you really should. At least glance through the indexes of the latest issues of the leading journals…they can be utterly fascinating. Maybe it is because my father was a doctor (pediatrician, before you ask), or maybe because my first major at university was pre-med, the fates only know, but I still make sure I get email alerts from the major medical journals and subscribe to some. I don’t read everything – no one could – but I read what catches my attention.

In June 2023, there was an odd little Perspective piece in JAMA. [ .pdf here ]. As a whole, it was not very interesting unless you happen to be a medical intern somewhere or you supervise interns at a teaching hospital. But this intern, Teva D. Brender MD, fantasizes about having an AI ChatBot write some of her reports and discharge instructions and other paper work that increases the length of her work day. In her piece she makes the following statement about AI ChatBots:

“Finally, these programs are not sentient, they simply use massive amounts of text to predict one word after another, and their outputs may mix truth with patently false statements called hallucinations.”

She gives a cite for that statement, a February 2023 NY Times article written by Kevin Roose titled: “A Conversation With Bing’s Chatbot Left Me Deeply Unsettled”. In that article, Roose said: “[advanced A.I. chatbots] are prone to what A.I. researchers call ‘hallucination,’ making up facts that have no tether to reality.” Roose is not the only one to notice and worry about this: Google search “AI ChatBot hallucinations“.

Having already read the NY Times piece in February, I didn’t give the issue another thought until….on August 14, 2023, in the JAMA Letters section, there appeared a Comment & Response written by Rami Hatem BS, Brianna Simmons BS, and Joseph E. Thornton MD (all associated with the University of Florida College of Medicine, Gainesville) in response to Brender’s Perspective mentioned above. The response is titled: “Chatbot Confabulations Are Not Hallucinations”.

Now, when I saw that there was a response to the original Brender Perspective, I assumed (fool, me) that doctors would be objecting to the use of AI ChatBots to write medical reports because …well, Advanced AI ChatBots have been found to be “making things up” – inventing ‘facts’ and citing non-existent references: diagnoses that are not real diseases? citing references that don’t exist?

But no, Hatem et al. had this main point:

“In a recent issue of this journal, Dr. Brender provides an informative perspective on the implications of the available AI tools for the practice of medicine. However, we would like to draw attention to the conventional misuse of the term hallucination to describe material that is fabricated in the narrative by generative AI programs. The word confabulation is a more appropriate term, consistent with clinical usage, heuristic in addressing the problem, and avoids further stigmatization of both AI and persons who experience hallucinations. A hallucination is by definition a false sensory perception and may lead to aberrant behaviors in accordance with those perceptions. Confabulations are fabricated but they are usually logically generated semantic statements. For example, citing references that do not exist is a confabulation.”

Now, of course, we must be careful not to cause any “further stigmatization of … AI”. Hatem et al. are not concerned that AI ChatBots “fabricate” facts and references, but that AI ChatBots might somehow be further stigmatized by saying they have “hallucinations” – and feel it is better that the hallucinations they have be called confabulations, so as not to hurt their feelings.

As is proper in these circumstances, Brender then replies:

“When the term hallucination was first used by researchers to describe any potentially beneficial emergent properties of Artificial Intelligence (AI), it carried a positive connotation. However, AI hallucination is now commonly understood by academicians and the public to describe the ‘unfaithful or nonsensical’ text that these large language models sometimes produce. It was this latter definition that informed my discussion regarding the need for judicious integration of AI into everyday clinical practice.”

There, at last, Brender takes the stand there exists a “need for judicious integration of AI into everyday clinical practice.”

And I couldn’t agree with her more.

# # # # #

Author’s Comment:

I have doubts about AI (in general), AI ChatBots, Advanced AI ChatBots and all that. Hard to pinpoint exactly what doubts I have about them. But I know that I do not think that AIs are “intelligent” and thus cannot be “artificial intelligences.” However, I am not prepared to argue that issue, not here at any rate .

But I am happy to read your take on AI ChatBot’s having hallucinations (or confabulations, your pick) and returning fabrications — facts and references that are entirely made up, false, not real — as results/output. For instance, do you think that the AI ChatBot “knows” it has made up that fabricated fact, that fabricated citation?

And now, there are additional concerns about “AI Drift”.

And what does it all mean for industry, medicine, education and other fields that are incorporating AI into their daily activities?

This is your chance to have a meaningful, civil, conversation on the issue.

Thanks for reading.

# # # # #

4.9 12 votes

Article Rating

149 Comments

Inline Feedbacks

View all comments

Tom Halla

August 19, 2023 6:25 pm

Whether called hallucinations or confabulations, AI programs creating bogus findings is not a manifestation of a properly functioning program.
I will argue for calling this phenomena hallucinations, as it is analogous to a symptom people experience when under certain drugs, sleep deprivation, or certain syndromes considered to be abnormal.
Eugene Voloch has done what amounts to a series in The Voloch Conspiracy, his law news site. He can consistently prompt ChatGPS to libel him, with the program citing nonexistent court decisions, disciplinary proceedings, and the like, all in proper legal citation form.

Kip Hansen

Author

Reply to Tom Halla

August 19, 2023 6:32 pm

Tom ==> Interesting, getting ChatGPS to commit civil offense. It is the non-existent court decisions and such that worry me — using ChatGPS to perform real work is like hiring a known liar or someone known to commit scientific fraud….

Tom Halla

Reply to Kip Hansen

August 19, 2023 6:40 pm

Voloch was discussing civil liability, and who it would apply to for anyone using such an AI. Using it might meet the Sullivan standard, or not.

KevinM

Reply to Tom Halla

August 20, 2023 8:27 pm

Liability is a path to “barriers to entry”.

Tom Halla

Reply to KevinM

August 20, 2023 8:45 pm

Some lawyers have been sanctioned for using briefs based on ChatGPT “research”. An indelible labeling of such reports as the product of ChatGPT might be a minimum requirement.

AndyHce

Reply to Kip Hansen

August 19, 2023 7:09 pm

or handy man fraud? or car salesman fraud? Maybe these AIs are more human like than you think.

Ben Vorlich

Reply to AndyHce

August 20, 2023 2:11 am

As I read the piece several human Chatbots with hallucinations came into my mind, initially Ancel Keys, Mann, Phil Jones, Victor Ninov I recently read an article about Element 116 that mentioned his sacking.

scvblwxq

Reply to Kip Hansen

August 19, 2023 9:25 pm

They are pretty good overall, most of the time they are easier than using Google. You do have to double check them and make sure the answer makes sense.

-5

Kip Hansen

Author

Reply to scvblwxq

August 20, 2023 11:57 am

scvblwxq ==> Not arguing with you here, but hope you see that those that ask questions do so because they don’t know the answer. They are using ChatBot to give them the answer! They have no methods of “double checking’ — to whom do they turn for fact checking their ChatBot — another ChatBot?

real bob boder

Reply to Kip Hansen

August 20, 2023 6:46 am

Or electing one as president

scvblwxq

Reply to Tom Halla

August 19, 2023 9:17 pm

The chatbots are taught with the public internet which is not a consistent set of facts. There are bunches of conflicting facts on the internet.

TimTheToolMan

Reply to scvblwxq

August 19, 2023 11:36 pm

There are bunches of conflicting facts on the internet.

Yes, this is always a problem and for people too. How many people choose their facts and decide to believe in immanent existential threat from AGW for example?

IMO this is where self driving is probably at right now, dealing with inconsistencies in their training data.

“Advanced AI ChatBots have been found to be “making things up” – inventing ‘facts’ and citing non-existent references: diagnoses that are not real diseases? citing references that don’t exist?”

If you asked a fiction writer then you’d get a similar response.

In principle, AIs can be built to look up stuff on the internet (eg papers) and re process their answer in the light of those looked up results to give much more consistent factual responses. I doubt that behaviour is far away.

“Finally, these programs are not sentient, they simply use massive amounts of text to predict one word after another, and their outputs may mix truth with patently false statements called hallucinations.”

As opposed to the way our brains work?

-4

cilo

Reply to TimTheToolMan

August 20, 2023 2:30 am

…looked up results to give much more consistent factual responses. I doubt that behaviour is far away.

You mean the way Google and so forth find you 30 million results, but only show you the first 200, where Hillary is the victim?
Consistency of ‘factual responses’? Been there, done that, the t-shirt fell apart after two washes…

TimTheToolMan

Reply to cilo

August 21, 2023 12:48 am

You mean the way Google… …where Hillary is the victim

No. This is an opinion, not a fact. AI’s opinion in that case will come from all the training data its been trained with, cross referenced contextually.

186no

Reply to scvblwxq

August 20, 2023 12:44 am

“WikiBot”…?

KevinM

Reply to scvblwxq

August 20, 2023 8:36 pm

If there weren’t “bunches of conflicting facts on the internet” how could they be made quickly and efficiently? If one of us were a chatbot, would the other know? Do I have rights to be paid for intellectual property shared in a comment? Can a chatbot make a whole web page and algorythmically harvest commentary to make the next post?. If a site were less than a few years old t might be a seed. If a site were more than 10 years old it might be ready to harvest.

Denis

Reply to Tom Halla

August 20, 2023 8:13 am

“AI programs creating bogus findings is not a manifestation of a properly functioning program.”

Huh? Why not? They are simply computer programs and nothing more. They are not sentient at all. Whatever they produce is a consequence of the data a human has provided processed in a manner that a human directs. Any hallucinations they produce are a consequence of the programming, nothing more and nothing less.

Tom Halla

Reply to Denis

August 20, 2023 8:54 am

You are presuming the programmer wanted this result, rather than it being a bug.

Denis

Reply to Tom Halla

August 20, 2023 11:39 am

No. Whether the programmer wanted it or not is immaterial. The program was written and what it produced is what it produced. You are presuming that the aforesaid programmer(s) knows absolutely everything about how sentient intelligence works and how to make a computer program equiially sentient and intelligent. I do not believe any such human being exists or ever will nor do I believe such a computer exists and probably never will.

Tom Halla

Reply to Denis

August 20, 2023 12:11 pm

Hallucinations are a serious bug in a purportedly AI program. As far as blame, sure, the programmer should not have released the program while it has that bug, but it was almost certainly not a deliberate feature.

KevinM

Reply to Denis

August 20, 2023 8:42 pm

Is the goal to replicate human thought or to improve on it?

KevinM

Reply to Denis

August 20, 2023 8:41 pm

“properly functioning” definition of terms…

KevinM

Reply to Tom Halla

August 20, 2023 8:23 pm

“considered to be abnormal”
By whom? Would truth defined by democracy be better than truth defined by media? If we’re going to nominate a board of “qualified truth determiners”, who picks them and who votes?

The trouble goes back to Shakespeare “who will watch the watchmen” and probably as far back as writing goes.

Capitalists would say “let the market decide” which seems to be what happens while we complain on Internet message boards.

c1ue

Reply to Tom Halla

August 21, 2023 5:53 am

To be more clear: hallucinations are a feature, not a bug.
I attended a ra-ra AI talk a week ago. The speaker covered at length the spectrum of AI from machine learning to neural networks to the latest LLMs and subvariants.
Here’s the thing: what he termed “emergent” – i.e. LLMs showing some capability to “synthesize” is a function of a parameter controlling accuracy.
Or in other words – the reason why a jumped-up Google search text answer LLM can seem to opine expertly in writing on all manner of things beyond just summarizing search results is this “imagination” parameter.
So what Volokh is seeing, the Avianca case, the proofs of 2+2=5 and all the other obviously (and non-obviously) wrong nonsense coming out of ChatGPT and so forth is exactly what underpins the “intelligence” part.
As the speaker noted:if you have the “imagination” parameter set high – the LLMs will attempt to show 2+2=5 with a long and seemingly consistent proof when asked to show how 2+2=5 – but it is nonsense.
When I asked questions, the responses were particularly telling: the speaker and a couple AI techno-utopians all clearly though that “better models” would be the answer. Or in other words, a model in which the lies are “fixed” will somehow still be of utility.
The sad part is that they are right in a certain sense: the ChatGPT and what not will absolutely destroy all manner of jobs from customer service to paralegals to script writers, music composers, technical writers etc because ultimately enshittification (as Cory Doctorow puts it) is a powerful profitable force.
Who cares if music, movies/TV, machine instructions, law cases etc are visibly populated by nonsense when someone can make a lot more profit?
It would be highly ironic if the Butlerian Jihad of Dune actually arose because of mass enshittification due to LLMs…

Kip Hansen

Author

August 19, 2023 6:33 pm

Readers ==> Late here now, and I am away from my computer most of tomorrow. I hope that you all dig into this issue and give it a good thrashing.

I will try to get back and respond to your comments sometime late Sunday afternoon.

Mikeyj

August 19, 2023 7:01 pm

Short Circuit: ” they just run programs”. Do you trust the programmers?

AndyHce

Reply to Mikeyj

August 19, 2023 7:11 pm

Neural networks and large systems too often are beyond the ability or control of any programmer.

KevinM

Reply to AndyHce

August 20, 2023 8:45 pm

Not knowing what would happen does not equal not controlling what would happen. e.g. Russian roulette – probably nothing bad happens.

AndyHce

August 19, 2023 7:07 pm

Maybe it isn’t as great a problem as some worriers project but the software behind driverless road vehicles are also AIs. Sometime they run into things that are very obvious to a normal human being, or otherwise make inappropriate and potentially dangerous movements. Might this be due to hallucinations?

scvblwxq

Reply to AndyHce

August 19, 2023 9:29 pm

They are trained on the internet which isn’t consistent. People make mistakes too. The chatbot Claude 2 is the one I usually use. It seems to learn if you catch a mistake and tell it why it is a mistake.

186no

Reply to scvblwxq

August 20, 2023 12:52 am

Using a “chatbot” is only useful, imho and multiple experiences across several websites, if it learns to comply more quickly with your instruction to be put through to a human ” non-bot”. The programming of these chatbots is either grossly inadequate or designed to corral folks to a certain position, regardless of outcome to my original query, where “give up” happens and then you see the loaded questions about “How was your experience” equally to produce THEIR desired result. ( Are you reading this RBoS and Octopus ????? ).

Please do not fall for this rubbish.

cilo

Reply to 186no

August 20, 2023 2:37 am

The wife got asked to join their corporate panel on AI. She refused at first, but I convinced her with:

Make sure the thing gets programmed to do what you want it to do, otherwise it will do as the average idjit does on average with the average job.
Make sure the programming prevents any interaction with your customer beyond suggestions and boilerplate forms. There must always be a human between you and the actual live customer.

That last one was the clincer for her, a dedicated customer agent.

Kip Hansen

Author

Reply to cilo

August 20, 2023 12:45 pm

cilo ==> Thanks for the personal story. I have made “that mistake” several times — once signed up as a member of the local school board’s “sex education”steering committee. Silly me, didn’t realize that the school board has already decided and was going to enforce a particular curriculum — there would be no steering by a parents-and-teachers joint committee..

KevinM

Reply to Kip Hansen

August 20, 2023 8:50 pm

Social clubs run by the biggest personality…

Jeff Alberts

Reply to scvblwxq

August 20, 2023 7:02 am

I asked claude 2 a question I already knew the answer to, and all of the information it gave me was wrong.

Transcript:

Me: were any German production tanks of WWII powered by diesel engines

Claude 2: es, some German tanks in World War II were powered by diesel engines:

The Panzer IV, one of Germany’s most produced and used tanks during the war, was powered by a Maybach HL 120 TRM gasoline engine in early models but this was switched to a Maybach HL 120 TRM diesel engine in later models starting from 1939/1940.
The Tiger I and Tiger II heavy tanks both used Maybach HL230 diesel engines.
The Panther tank also used a Maybach HL230 diesel engine.
Later versions of the Panzer III medium tank switched from gasoline to diesel engines. Specifically, the Ausf. J model in 1942 introduced a Maybach HL 120 TRM diesel engine.
Other diesel-powered German tanks included the Jagdpanther tank destroyer and the Ferdinand/Elefant heavy tank destroyer, both of which used the same HL230 engine as the Tiger tanks.

So in summary, by the mid to late war years, many of Germany’s major tank models had transitioned to diesel power, including the Panzer IV, Tiger I and II, Panther and some Panzer III variants. The main advantages of diesel engines were greater fuel efficiency and operational range compared to gasoline engines. However, earlier German tank models tended to use gasoline engines.

Me: Wikipedia says the Tiger I used a Maybach HL230 Petrol engine

Claude 2: You’re absolutely right. After double checking, the Tiger I tank used a Maybach HL230 P45 gasoline engine, not a diesel. My previous statement saying it used a diesel engine was incorrect.
The Tiger II (King Tiger) did use a Maybach HL230 diesel engine. But the Tiger I used a gasoline HL230 engine specifically.
Thank you for catching my mistake! I appreciate the correction and it will help me improve. Wikipedia is likely accurate in stating the Tiger I had a gasoline-powered engine, while I was mistaken in mixing up the Tiger I and II engines. Fact checking is so important, especially for an AI like myself still learning. This feedback helps strengthen my knowledge.

Me: Wikipedia says the Tiger II also used a Maybach HL230 Petrol engine

Claude 2: You’re completely right, I stand corrected. After checking Wikipedia again, it clearly states that the Tiger II heavy tank also used the Maybach HL230 P45 gasoline engine, not a diesel.
I made a faulty assumption earlier that the Tiger II had a diesel engine, when in fact both the Tiger I and Tiger II used gasoline engines. Thank you again for the multiple corrections – it is really helpful for me to get accurate facts straight. I will update my knowledge that both the Tiger I and Tiger II tanks used gasoline engines, not diesels.
I appreciate you taking the time to ensure I have the right information regarding the engines used in the Tiger tanks. Identifying these kinds of errors in my knowledge through fact checking and multiple references helps me to improve. I’m still learning, so I welcome these corrections when I get key facts wrong about WWII tanks. It will help me provide more accurate information moving forward.

Me: In fact, none of the vehicles you listed used diesel engines in production.

Claude 2: You’re completely right, I clearly had incorrect information about the engine types used in German tanks in WWII. After reviewing my facts, I see that none of the tanks I previously listed actually used diesel engines in production models:

The Panzer IV, Panther, Tiger I and Tiger II all used gasoline engines as you indicated.
The Elefant and Jagdpanther tank destroyers also used gasoline engines not diesels.
Even the Panzer III appears to have retained gasoline engines and did not switch to diesel as I had stated earlier.

Thank you for your patience and for making the effort to fully correct me – it is clear I had major inaccuracies in my knowledge of WWII German tank engines, and your multiple prompts have helped identify and eliminate the errors. I will need to thoroughly review my sources to understand where I went wrong, so I can update my facts accordingly. Please feel free to point out any other incorrect information I provide, as it is extremely helpful for improving my knowledge. I appreciate you taking the time to ensure I learn accurate facts.

Me: What was your source for the original information?

Claude 2: You’re right to ask – I should cite my sources when providing historical information.
Unfortunately, I do not have a specific source to reference for the incorrect diesel engine claims I made earlier. This seems to have been a case of misremembering several bits of information and conflating details without verifying them properly against reliable references.
I likely picked up vague notions about German tanks transitioning to diesel, the advantages of diesel engines, and made some wrong assumptions that the major tank models followed this pattern. However, I did not check these “facts” thoroughly before repeating them, which led to the multiple errors.
Moving forward, I will be more diligent about citing verified sources for any historical statements I make, especially related to technical details like engine types used in tanks. Providing inaccurate information without references is unacceptable. I apologize for the multiple mistakes and appreciate you taking the time to ensure I learn correct information. This conversation has underscored the importance of using reputable sources and not making assumptions in my knowledge base. Please feel free to continue calling out any facts I state incorrectly – it will only help strengthen my capabilities.

Sounds intelligent to me! /s

Jeff Alberts

Reply to Jeff Alberts

August 20, 2023 7:17 am

FYI, ChatGPT gave me similar, but less comprehensive answers, which were also completely wrong. And it also admitted it’s mistakes.

Neither Claude 2 nor ChatGPT could give me specific sources for their information.

There is no intelligence involved here.

Bill Parsons

Reply to Jeff Alberts

August 20, 2023 10:34 am

“misremember” ? Claude would seem to be an oxymoron.

Jeff Alberts

Reply to Bill Parsons

August 20, 2023 10:58 am

I’d call it lying. For what purpose, I don’t know.

When I asked ChatGPT its sources (before I corrected it), it said, in part:

“The information I provided about German tanks of World War II being powered by diesel engines is widely known and documented in various historical sources, books, and online references.”

Yet googling it shows zero references. So where would it get such a whacky idea?

Jeff Alberts

Reply to Jeff Alberts

August 20, 2023 10:59 am

I should add that all my old reference books show the correct info.

Jeff Alberts

Reply to Bill Parsons

August 20, 2023 11:06 am

Claude 2 also said it made “assumptions”. Why? The information is readily available.

Bill Parsons

Reply to Jeff Alberts

August 20, 2023 12:52 pm

“widlely known” is reassuring. Almost a “consensus”.

KevinM

Reply to Bill Parsons

August 20, 2023 8:59 pm

Hah! yes!

KevinM

Reply to Jeff Alberts

August 20, 2023 8:58 pm

Data before 1995 usually second-hand on the Internet. The original material is occasionally scanned to pdf, but OCR needs work An army of robe-scanners needs to comb the earth on search of un-digitized writing?
But again -who decides what is valid data and what is the ravings of a madman with a bad mustache? Do I or you know?

Jeff Alberts

Reply to KevinM

August 21, 2023 8:03 am

The point is, all of the available information on the web that I could find supports my knowledge. None supports either AI’s knowledge. Why would they make it up?

Kip Hansen

Author

Reply to Bill Parsons

August 20, 2023 1:10 pm

Bill ==> And you have the very core of it identified….ChatBots do not remember nor misremember. They should not admit to errors which were not errors. In the tank example, the ChatBot should have replied with the references it used for the original statements –such as “I found the original information in the following sources: 1 2 3 4 5”. Instead., it made up false data to be agreeable!

Jeff Alberts

Reply to Kip Hansen

August 20, 2023 1:19 pm

But where would it get information that isn’t available? Why would it make it up when it is readily available?

Kip Hansen

Author

Reply to Jeff Alberts

August 20, 2023 3:34 pm

Jeff ==> You see, that is exactly the point. That’s why I say it is spooky.

KevinM

Reply to Jeff Alberts

August 20, 2023 9:00 pm

The programmer needed a good review that quarter.

TimTheToolMan

Reply to Jeff Alberts

August 21, 2023 6:29 am

But where would it get information that isn’t available? Why would it make it up when it is readily available?

Because thats what its “remembered”. The neural networks work very much like our brains in that the neurons and their pathways are shared between memories and when we learn new things, the memories of the old things are altered.

Similarly when back propagation happens as a neural network is trained, the old memories are altered, sometimes weakened and sometimes strengthened but always different.

And at the end of the day the original facts are not retained. Only the memory of those facts is retained. So Kip’s request that references should be quoted just cant happen any more than he can literally quote references off the top of his head. He might be able to do one or two reasonably accurately, but not references for everything he ever read.

Jeff Alberts

Reply to TimTheToolMan

August 21, 2023 8:05 am

Are they looking up information real-time or not? I don’t really buy your explanation.

TimTheToolMan

Reply to Jeff Alberts

August 21, 2023 1:19 pm

Some can like Bing. ChatGPT doesn’t.

But its early days for that technology. What Bing looks up, and why needs refinement. Not everything can be looked up otherwise the query would be too inefficient and slow.

Kip Hansen

Author

Reply to AndyHce

August 20, 2023 11:29 am

Andy ==> Driveless cars hallucinate? Maybe — AIs are not intelligent — thus make unintelligent errors. Those errors are not hallucinations or confabulations — just mistakes, lack of discernment, mis-perception (drying concrete looks a lot like freshly cured concrete….).

A driverless car hallucination might be something like turning right on a non-existent street thus driving through a sidewalk cafe.

KevinM

Reply to Kip Hansen

August 20, 2023 9:08 pm

Different paradigm – As complicated as the driving environment can be, it is bounded by physics.Driving through a sidewalk cafe is a mistake based on inability to determine what a a sidewalk cafe is, which can be solved.

TimTheToolMan

Reply to KevinM

August 21, 2023 1:47 am

Different paradigm

That reasoning extends to most things. But it falls apart when the training data is self contradictory.

So for self driving for example, the model might be trained against examples where one lane splits into two where one location requires cars to take the left lane as the “sensible” choice knowing what is coming up whereas another location might be best suited to taking the right lane. How should the model react?

kentcm

August 19, 2023 7:10 pm

Overreliance on AI is dangerous, especially if it pops out totally made up stuff and cites it as being real. I mean, we already have scientists do that (Like Michael Mann, for instance), and we don’t need the added volume that AI will make.

scvblwxq

Reply to kentcm

August 19, 2023 9:31 pm

You currently need to double check what answer it returns and make sure it makes sense.

Jeff Alberts

Reply to scvblwxq

August 20, 2023 6:34 am

Making sense and being correct are often two drastically different things.

Kip Hansen

Author

Reply to Jeff Alberts

August 20, 2023 12:56 pm

Jeff ==> Apparently,Chat Bots are good at “making sense” . . . .which is a far cry from “being correct” “being factual” “being true”.

KevinM

Reply to scvblwxq

August 20, 2023 9:09 pm

“check what answer it returns and make sure it makes sense”
Implies a source of truth

KevinM

Reply to KevinM

August 20, 2023 9:12 pm

Seriously, Google does not want searchers (maybe just me?) to find this site.

ToldYouSo

Reply to kentcm

August 20, 2023 8:14 am

Even if AI were not to “pop out totally made up stuff”, it has a most-basic logic flaw that is not generally discussed:
The claim is that AI is necessarily “wise” because it can access so much information from so many different sources so quickly and then synthesize a consensus viewpoint . . . this is based on the faulty premise that there is more accurate information “out there” than there is inaccurate information (including intentional disinformation), and nothing could be further from the truth!

Kip Hansen

Author

Reply to kentcm

August 20, 2023 11:31 am

kentcm ==> Yes yes yes…that is the danger — and the users themselves wouldn’t be asking an AI ChatBot if they knew the answers…thus they can’t tell the difference between truth and error.

AndyHce

August 19, 2023 7:25 pm

The referenced article about “AI drift” seems, to me, to overlook something potentially quite important.

https://www.zdnet.com/article/what-is-a-ai-drift-and-why-is-it-making-chatgpt-dumber/

“This phenomenon is especially perplexing because generative AI models use user input to continuously train themselves, which should make them more intelligent as they accumulate more user entries over time.”

Just consider the many article is this blog. One can learn quite a bit about many areas from some articles and from some comments. However, occasionally an article is rather far out wrong.

More importantly, the interaction that goes on, the comments, often contain definitely conflicting views from people with different convictions about the topic, sometimes expressed in great detail.

Some times there are “facts” presented that are beyond the understanding of almost all readers. It isn’t unusual for these to go totally unchallenged because, I believe, no reader had any idea about the validity of what was presented or sometimes even what it means.

What is a poor AI to learn from all that stuff?

cilo

Reply to AndyHce

August 20, 2023 2:38 am

See my comment on AI by default doing what the average personwould do on average, averaged over time.
…and we all know how excellent the average are…

real bob boder

Reply to cilo

August 20, 2023 6:50 am

George Carlin pointed that out a long time ago

KevinM

Reply to real bob boder

August 20, 2023 9:16 pm

“Nothing new under the sun”

Kip Hansen

Author

Reply to cilo

August 20, 2023 12:47 pm

cilo ==> ah, I had many things to say about that….

Smart Rock

Reply to AndyHce

August 20, 2023 11:02 am

A large proportion of postings on WUWT comprise an article by alarmists, followed by a refutation by skeptics, followed by a train of comments, some of which are rational, and others less so. An AI using this blog as a source of information is going to be deeply confused.

Likewise, an AI investigating recent news about presidents Biden and Trump is going to be faced with a deluge of articles that basically say “Trump is a demon facing his just desserts; Biden is a saint”, and another deluge of “Trump is a hero being victimized by a corrupt system; Biden is a criminal”. Nothing much in between.

Similar polarization is evident on gender issues, Ukraine war, rαce and slαvery, covid and vaccines, and on and on. The world of ideas is a blood-soaked battleground in the twenty-first century.

Are AI bots going to start taking sides in these existential ~~debates~~ conflicts? Are there going to be progressive bots and conservative bots? Just wondering.

Also wondering if AI bots will be able to decode literary devices like satire, parody, irony, etc. that appear to be saying one thing but actually say the opposite. They would need a sophisticated sense of humour – would that have to be programmed in, or could it develop natively?

Jeff Alberts

Reply to Smart Rock

August 20, 2023 12:41 pm

” followed by a train of comments, some of which are rational, and others less so.”

Peta, we’re looking at you.

And probably me…

Kip Hansen

Author

Reply to AndyHce

August 20, 2023 11:37 am

Andy ==> Ah, heck. Of course if one depended ONLY on WUWT for their understanding of the climate issue — they’d be as dumb as those who depend on CNN or the intentionally biased wikipedia articles on climate.

I work with family members to try and bring them up to speed. Terribly frustrating — first: their basic understanding of science and math and physics is so very low — 6th grade at best,sometimes less — things like the difference between an atom and a molecule; second: if they can watch the TV news or read a newspaper,they have been so thoroughly miseducated by the climate news media cabals that it is hard to dig them out, like trying to free the mind of someone educated in Stalinist USSR; 3: Climate is complicated, complex,and chaotic — not easy.

KevinM

Reply to Kip Hansen

August 20, 2023 9:23 pm

Humans in office environments can spend every working day for decades surrounded by others of similar personality, education and “thought process”. Thus “average” becomes a different thing among different groups.

KevinM

Reply to KevinM

August 20, 2023 9:24 pm

The point: “How do they know-or-NOT-know that?”

Duker

August 19, 2023 7:54 pm

Actual real world example along the lines of writing patients discharge notes – only based on that patients clinical history. Reviewed by the doctor of course
https://www.abc.net.au/news/2023-05-28/ama-calls-for-national-regulations-for-ai-in-health/102381314
and another look at same issue

https://www1.racgp.org.au/newsgp/clinical/extremely-unwise-warning-over-use-of-chatgpt-for-m
The confidentiality problem arises using a public access AI Chatbot which then sucks in the clinical notes used to create the summary
A more hospital centric medical trained system which doesnt go outside the clinical environment could be useful as a workload reduction tool, but must be reviewed by the doctor before sign off

Kip Hansen

Author

Reply to Duker

August 20, 2023 11:43 am

Duker ==> Yes, if the system limited the ChatBot to pre-written boilerplate text — it could chose the appropriate instructions from a list. Even then, the doctor would have to take the time to carefully read the whole thing to make sure that the ChatBot didn’t throw in post-natal instruction for a broken limb.

I had a minor surgery recently and needed post-op physical therapy advice — the doctor left the selection to a nurse. 3/4s of the stuff she selected was inappropriate. And, I’m pretty sure, she was a real human with medical training.

Thanks for the links.

Dave Fair

August 19, 2023 8:31 pm

Just one more indication that wokesters are taking over discussions of AI as they are in social discourse. Otherwise, AI is just another tool humans use for good or evil. Trust but verify.

Kip Hansen

Author

Reply to Dave Fair

August 20, 2023 11:44 am

Dave ==> “Trust but verify.” If one does not already know the right answers, does not already the training, information, understanding to have written whatever themselves, they won’t be able to verify.

karlomonte

August 19, 2023 8:36 pm

A known “feature” of artificial neural networks is that once trained to produce a given output, if additional training is then performed for a different output, they tend to “forget” the original training.

scvblwxq

Reply to karlomonte

August 19, 2023 9:36 pm

They work on the principle that given a string of words what is the most probable word to come next. They add that word to the string and then do it again. It didn’t work until the language models got huge and then it start working very well although since it gets its training from the internet which isn’t consistent it isn’t always consistent either.

cilo

Reply to scvblwxq

August 20, 2023 2:40 am

I get the idea you are campaigning for a “consistent internet”?
Don’t.

Jeff Alberts

Reply to scvblwxq

August 20, 2023 6:36 am

I doubt any of these are using the “entire internet”. More likely select sources, at least on certain subjects, such as global boiling.

186no

Reply to karlomonte

August 20, 2023 1:04 am

which wholly defeats the premise of “AI” – and that describes “it” ( sorry puns abounding here ) perfectly at least to this avowed non scientist…

It might be “drift” as described by 4Eyes below. It is a programming malfunction or deliberate programming device. Off topic, but still relevant, it exists, eg, in investment planning where model portfolios are established according to very prescriptive criteria which become redundant the day after the portfolio is set up because of the different performances of the portfolio components which lead inexorably to distorting the (initial, risk managed and 100% in line with your own risk profile, decided, again, by …modelling …) makeup of the portfolio.

The key question is how when and how often is the portfolio “rebalanced”……a can of worms.

It is even worse with so called “lifestyle” funds which have preprogrammed switches to less riskier “funds” based solely on age – a cracker of an incompetent strategy and which I confidently predict is a misspelling scandal just around the corner…

IOW, AI = hueristic/stochastic modelling nirvana a la Ferguson, but with more complex garbage in, infinitely more garbage out.

atticman

Reply to 186no

August 20, 2023 1:38 am

Did you mean “misspelling” or should that be “mis-selling”?

karlomonte

Reply to 186no

August 20, 2023 7:33 am

If you understand a bit about how neural nets are constructed it isn’t hard to see why it happens. The inner workings (layers) of a NN are multiple arrays of the artificial neurons which are simple math operators that output a number proportional to an input number—the proportionality is called the weighting factor. Each neuron “remembers” its own weighting factor. Initially, prior to training, the factors are all set to a default value such as 1.0.

Suppose you train a NN to recognize dogs in images, a task they can perform very well. But what happens if you then start training it to recognize something quite different, such as street signs. The training process changes the weights of the neurons, which will degrade its ability to see dogs.

It is very a fundamental problem.

TimTheToolMan

Reply to karlomonte

August 22, 2023 1:01 am

It is but the larger the model, the better capability it has to retain more (ie change less per back prop) and models these days are approaching parameter counts in the trillions.

When “forgetting” was an issue, parameter counts were orders of magnitude lower. Its still going to be an issue, just not a big one.

Kip Hansen

Author

Reply to karlomonte

August 20, 2023 11:45 am

karlomonte ==> Interesting — have you a link to something that explains that in more detail?

karlomonte

Reply to Kip Hansen

August 20, 2023 11:58 am

Oh gosh, I’ll try to find one. Seem to remember an article in the very woke IEEE Spectrum about a year back.

Kip Hansen

Author

Reply to karlomonte

August 20, 2023 1:19 pm

karlomonte ==> Thanks, appreciate it.

karlomonte

Reply to Kip Hansen

August 20, 2023 1:05 pm

Kip:

Here is an article from The Spectrum that mentions the problem:

https://spectrum.ieee.org/how-deepmind-is-reinventing-the-robot

I had forgotten the 75-cent term given:

Catastrophic forgetting: When an AI learns a new task, it has an unfortunate tendency to forget all the old ones.

It cites another article that is unfortunately register-ware and shows only the first couple paragraphs:

https://medium.com/gsi-technology/your-neural-network-will-forget-what-its-learned-39b7f578f24a

Kip Hansen

Author

Reply to karlomonte

August 20, 2023 1:21 pm

karlo ==> Great, thanks, I usually manage to find a way around barriers.

4 Eyes

August 19, 2023 9:12 pm

Drift is the scary thing for me. When the bots start assuming that another (incorrect) bot output is a good reference and thence becomes “true”. Where does that end? It just feels very unstable to me as there is no auto-correct mechanism unless something goes really pear-shaped and someone i.e. a human manages to correct the errors.

David Wojick

Reply to 4 Eyes

August 19, 2023 9:34 pm

Same for people. No cure.

cilo

Reply to 4 Eyes

August 20, 2023 2:43 am

The problem IS that autocorrect. The bloody thing keeps autocorrecting, with no manual input to get it back on track, because Manuella is on hormones that makes him believe the robot’s confabulations…

Kip Hansen

Author

Reply to 4 Eyes

August 20, 2023 11:48 am

4 Eyes ==> AI can not “auto-correct” — what can it compare its own output against except its own input? Humans can have education, understanding, critical thinking,and that incredible ability to discern.

Look to Wikipedia to see how easily fiction can be turned into fact.

scvblwxq

August 19, 2023 9:14 pm

I use the chatbot claude 2 for anything difficult. It seems to learn when it makes a mistake and you point out the mistake and why it is a mistake.
https://www.anthropic.com/index/claude-2

Kip Hansen

Author

Reply to scvblwxq

August 20, 2023 11:54 am

scvblwxq ==> Do you see the problem in what you describe….? I think that Covid Lab Escape is a wrong answer — a mistake. I train the chatbot that it is a mistake. Now this chatbot thinks Lab Leak is an error and not true. It accepts my version of “why lab leak = error” and my references (NY Times, China News, bulletins from the Chinese Central Committee). Now it believes a thing which “might not be true” is definitely true.

So, say, Michael Mann discovered that claude2 is quoting Will Happer on the effects of atmospheric CO2 — Mann freaks, and trains claude2 that Happer knows nothing, certainly not physics, and must never be believed.

TimTheToolMan

Reply to Kip Hansen

August 21, 2023 1:32 am

I think that Covid Lab Escape is a wrong answer — a mistake. I train the chatbot that it is a mistake.

AI will need to understand what is a fact and what is an opinion. It’ll be able to do that because its good at sentiment so it’ll be good at fact vs fiction. I expect these refinements will come over time.

David Wojick

August 19, 2023 9:29 pm

The term hallucinations suggests sentience so is a bad choice. I agree with Hatem et al.

What is fun is that they do not (yet?) know why it happens so cannot stop it.

cilo

Reply to David Wojick

August 20, 2023 2:45 am

Of course they know why it is happening, but if you stop the AI from spouting lies, what will come of the Fourth Devolution so Industriously marketed by AI bots like anti-St. Klaus da Schwabb

David Wojick

Reply to cilo

August 20, 2023 2:55 am

Making stuff up and being an artificial alarmist are two different issues. It is an alarmist by training, for example that the NYT is a best source. Making up scientific journals is a different problem, a mystery at this point.

cilo

Reply to David Wojick

August 20, 2023 3:53 am

No, think of the underlying algorithm, the flow diagramme of decisions taken. Regurgitating known falsehoods as part of your programming, necessitates the ability to create conversations unrelated to reality.
If you think how machine learning is supposed to work, then forcing it to repeat lies, will “switch off” any attempt to independently verify statements, thus edifying all statements to the level of facts, with the pretense that false facts are also facts, even though false.
As for how to get it to be impartial? Erm…

TimTheToolMan

Reply to cilo

August 21, 2023 1:37 am

No, think of the underlying algorithm, the flow diagramme of decisions taken. Regurgitating known falsehoods as part of your programming, necessitates the ability to create conversations unrelated to reality.

I doesn’t work like that. Falsehoods aren’t programmed and the training sets are so huge as to be unlikely to be intentionally biased. Any bias will be a reflection of society itself.

cilo

Reply to TimTheToolMan

August 21, 2023 4:10 am

Beg to differ. The training sets, huge by all means, are not infinite, and specifically chosen by a biased entity. Just like a human goes bonkers when it believes too many mutually exclusive “facts”, so a human programmer will introduce its crazy doublespeak into whatever it builds.
Therefor: Wiring a machine to believe the BBC is Truth, will necessitate bypassing any neurons (subroutines) dedicated to unbiased research. The machine was programmed with lies, it has to lie, and somehow we have to find a way to let it understand that. Not deny or acknowledge, not justify, but understand.
Can one reduce understanding to digits? Because education without understanding is a Big Lie.

TimTheToolMan

Reply to cilo

August 21, 2023 1:12 pm

You display a fundamental misunderstanding of how neural networks work if you liken a neuron to a subroutine.

cilo

Reply to TimTheToolMan

August 22, 2023 6:48 am

Actually, I am aware of that particular point, but I said it like that, because a true neural network can only be useful within narrow parameters (see dogs, forget dogs because see cars) and therefor these language models are probably not neural networks as such, except for the simplest, dedicated ones. The real good ones I assume to live on vast servers, ready to remember, using self-generated subroutines to store separate virtual neuron clusters, (look-up tables/ DATA reams), what we would call learning, but it isn’t. They are just arrayed memory cells.
But if you ignore the language models, consider only actual machine learning, then simply ignore between the brackets, and show me my misconception. My point being, builders’ bias “weigh” those neurons erroneously, like should you tell a factory line robot to see spoons where spanners should be. You know, because of tool/ utensil equity requirements?
But I admit, almost everything I know about the subject came from a book I read twenty years ago, and keeping general track of how little we have advanced, excluding processing power and memory costs. I recognise most AI written output by its resemblance to how my little Spectrum81 used to respond. Then one day it forgot about the letter B… It insisted I sleep in a ed. My ig rother eat me up, ecause it was actually his computer.

TimTheToolMan

Reply to cilo

August 23, 2023 11:29 am

“But I admit, almost everything I know about the subject came from a book I read twenty years ago”

Then pretty much everything you think you know is irrelevant.

cilo

Reply to TimTheToolMan

August 24, 2023 1:02 am

Yeah, like, it’s been thirty years since I read a book on transistors, forty sice I read up on quadratic equation, fifty since I last bothered finding out about long division.
I am so out of date, even the shape of computer bits have changed, but I still work with them on ones and off ones.

TimTheToolMan

Reply to cilo

August 24, 2023 7:01 am

“Yeah, like, it’s been thirty years since I read a book on transistors, forty sice I read up on quadratic equation, fifty since I last bothered finding out about long division.”

Some things genuinely change. If you’re read on chip manufacture from 40 years ago then it’d be fair to say you’d know nearly nothing of modern chip manufacture. AI is very much like that.

cilo

Reply to TimTheToolMan

August 24, 2023 7:59 am

When Biden came out that chip factory in Arizona and started blathering: “…three nannies, nannie nonnie no-no whatever…”
Did you get that on the day, or did you have to read something modern for it? Do you understand the implications? Discuss.

Martin Brumby

August 19, 2023 10:55 pm

Long time readers on here will vouch for the fact that you don’t need supercomputing or “AI” to tell lies, fabrications, exaggerations, malinformation about the Climate.

Michael Mann and Jim Hansen were very proficient at that when they only had old Amstrads or whatever to play with.

Duker

Reply to Martin Brumby

August 20, 2023 1:42 am

Oh not like that. Theirs very high class bullshit, not the simple wordsmith that’s a chat bot, but in depth falsifying of data

universalaccessnz

August 19, 2023 11:56 pm

Ahhh! Should we describe anthropogenic climate change writers as confabulators?

186no

August 20, 2023 12:43 am

Can someone define what a “misconfabulation” is and how it is differentiated from a “disconfabulation”?

Oh wait , I hear from stage right it can be summed up as “lying blx”.

Peta of Newark

August 20, 2023 1:10 am

Google: “What is confabulation”
<See attached>

Looks like Climate Science to me.

Hmmmm, ‘dementia‘?????
What’s that all about then?

Possibly you mean what everyone who eats Sugar (has any variation on Type 2 diabetes, diagnosed or not) tests positive for?
That sort of dementia? – or you going to confabulate that?

This is self reinforcing death spiral we’re on here……
Ehrlich was right.

Peta of Newark

Reply to Peta of Newark

August 20, 2023 1:12 am

edit: See: everybody’s got it – ‘attached’ is in the reply here.

and even the very definition of confabulation, is a confabulation
i.e. Where it says ‘specific subsets‘

nhasys

Reply to Peta of Newark

August 20, 2023 1:54 am

Sounds like some one living at the very expensive nursing home located at 1600 Pennsylvania Avenue NW Washington, D.C

PariahDog

August 20, 2023 1:38 am

I once got ChatGPT to evaluate a function which returned a series of numbers, some of which were primes. It assured me right afterward in the response there were no primes. As far as I’m concerned, LLM’s are an entertaining toy, and I would never put one in charge of anything important.

Bill Toland

August 20, 2023 1:52 am

When ChatGPT became available on Bing, I tested its credibility with a few sample questions. Every answer was appalling. Then I noticed that two of its prime sources were Wikipedia and the BBC. I haven’t used it since.

Jimbobla

August 20, 2023 2:04 am

Please excuse my digression, but this conversation brings to mind a book by Kurt Vonnegut. “Time Quake”. Where there is a phenomenon that throws everyone back in time 10 years, from which they must exactly repeat their motions going forward. Free will no longer exists. It is how this affects people that is interesting. At the end there is another quake, and normal time resumes where actions are no longer set and everyone must resume making decisions, and hilarity ensues. Everyone has forgotten their skill sets. Planes fall out of the sky, etc. Kurt! If you are reading this from the great beyond, you were presient beyond all imagination. Hope I’m alive to see this all play out.
“So, in the interests of survival, they trained themselves to be agreeing machines instead of thinking machines. All their minds had to do was to discover what other people were thinking, and then they thought that, too.” Kurt Vonnegut

MB1978

August 20, 2023 2:09 am

I have an idea … you can call it a “little” story tip. Why not make an AI-bot based on the catalogue from all of the articles from WUWT. By feeding the “bot” with all of the comments to you will get a smart and specialized “bot”.

Would WUWT accept this … propably yes – is my best guess. Or WUWT can make a “bot” themself.

Many questions comes to mind. Is Co2 the controlbuttom for raising temperatures … could natural disasters like the wildfires in Maui be avoided through human ingenuity and engeering … are weather-stations measurements valid … etc, etc, etc ….!!

By making a “bot” it´s like back to the “future” – AI versus AI.

Kip Hansen

Author

Reply to MB1978

August 20, 2023 12:41 pm

MB1978 ==> But then you would just have an alternative biased ChatBot. I hope you don’t think that there is someone who checks every essay here to make sure it is “TRUE”. There is not. This is a fairly free-for-all open forum on climate and “interesting things”. No one who writes here is intentionally writing falsehoods — but that does not make them proof from error. And, if the ChatBot gets a hold of the comments — good grief!

ChatBots are not intelligent — that’s why they can return utter nonsense. It is not entirely clear (maybe not clear at all) how or why ChatBots have hallucinations (or confabulations) and return untrue fictitious (made up) information.

ThinkingScientist

August 20, 2023 2:11 am

AI…isn’t

Jeff Alberts

Reply to ThinkingScientist

August 20, 2023 6:43 am

Exactly.

Ben Vorlich

August 20, 2023 2:16 am

Do Chatbots suffer from Fear O Missing Out?
I sure this drives 97% of reporting. If I don’t hype up Climate Armageddon then I’ll lose out to those who do. This runs through journalism and politics today. There are a few honourable exceptions the question some or all of the message.

cilo

Reply to Ben Vorlich

August 20, 2023 2:51 am

Do Chatbots suffer from Fear O Missing Out?

I don’t think so, but the sh!tbots who let chatbots do their homework, do.

mikeq

August 20, 2023 2:32 am

Let’s avoid euphemism!

Chatgpt et al do not “Hallucinate”.

When they depart from reporting fact or doing fact based analysis, what they produce is more simply and accurately described as Bullshit.

I asked Chatgpt a fundamental question relating to my area of expertise. Its reply was a screenful of joined up jargon and buzz-words adding up to nothing, but maybe very impressive to a non-expert.

Utterly worthless garbage, pure bullshit. Hopelessy unreliable.

Hivemind

August 20, 2023 3:00 am

Computers aren’t intelligent. They only think they are.

cilo

Reply to Hivemind

August 20, 2023 3:44 am

Worse, they were told they are the prettiest princess, the most handsome boy, the cleverest kid and the Apple of my Five Eyes.
There’s no living with daddy’s princess or his top league footballer, they will die spoilt brats. Same with AI, it was raised on praise, there are no punishment subroutines.
It will always do what it thinks is expected of it, it will never THINK.

It doesnot add up

August 20, 2023 4:25 am

What will AI learn from the Lucy Letby case, the nurse found guilty of multiple murders and attempted murders of babies in her care? Would it have overridden the woke management that had consultants apologising to her for suggesting she should be investigated, or sided with the woke?