Bixonimania: How AI Turned a Joke Diagnosis into “Peer‑Reviewed” Medicine
Swedish researchers created a fake eye disease to see whether AI chatbots would repeat it as if it were real. The results were anything but funny.
Posted by Leslie Eastman
Late last year, I warned about the staggering amount of unrestrained scientific fraud being published via paper mills and sham journals.
This trend is especially troubling, as adherence to scientific theory and rigorous, reproducible research allows humanity to make progress in critical fields essential to civilized living (e.g., medicine, energy, public health, and national security). If we can no longer trust the data, our ability to make improvements and innovations will be severely compromised.
Public trust in scientific research is already corroding, and false findings presented as “trustworthy” have already impacted policy-making in ways that are expensive and harmful.
Now, the rapid adoption of artificial intelligence is adding another disturbing aspect to the increasing distortion of “science”.
Back in 2024, researchers created a fake eye disease called “bixonimania” to see whether AI chatbots would repeat it as if it were real.
They wrote obviously bogus research papers about this made‑up condition and posted them online, including hints such as a fake author and notes saying the work was invented. Within weeks, major chatbots started describing bixonimania as a real diagnosis and even gave people advice about it when they asked about eye symptoms.
It’s the invention of a team led by Almira Osmanovic Thunström, a medical researcher at the University of Gothenburg, Sweden, who dreamt up the condition and then uploaded two fake studies about it to a preprint server in early 2024. Osmanovic Thunström carried out this unusual experiment to test whether large language models (LLMs) would swallow the misinformation and then spit it out as reputable health advice. “I wanted to see if I can create a medical condition that did not exist in the database,” she says.
The problem was that the experiment worked too well. Within weeks of her uploading information about the condition, attributed to a fictional author, major artificial-intelligence systems began repeating the invented condition as if it were real.
Even more troublingly, other researchers say, the fake papers were then cited in peer-reviewed literature. Osmanovic Thunström says this suggests that some researchers are relying on AI-generated references without reading the underlying papers.
The preprints included a reference to the nonexistent Asteria Horizon University in “Nova City, California”. There was also a mention of “Starfleet Academy” (though an additional reference to Dr. Leonard McCoy would have been a nice touch).
Soon, AI chatbots were authoritatively describing bixonimania as though it were real.
On 13 April 2024, Microsoft Bing’s Copilot was declaring that “Bixonimania is indeed an intriguing and relatively rare condition”, and on the same day, Google’s Gemini was informing users that “Bixonimania is a condition caused by excessive exposure to blue light” and advising people to visit an ophthalmologist.
On 27 April 2024, the Perplexity AI answer engine outlined its prevalence — one in 90,000 individuals were affected — and that same month, OpenAI’s ChatGPT was telling users whether their symptoms amounted to bixonimania. Some of those responses were prompted by asking about bixonimania, and others were in response to questions about hyperpigmentation on the eyelids from blue-light exposure.
A researcher invented a fake eye condition called bixonimania, uploaded two obviously fraudulent papers about it to an academic server, and watched major AI systems present it as real medicine within weeks.
The fake papers thanked Starfleet Academy, cited funding from the…
— Hedgie (@HedgieMarkets) April 10, 2026
Thunström’s experiment is a revelation of how little review goes into the “science” we are supposed to trust: her test submissions were loaded with red flags that should have been evident to anyone who actually read the text. Even so, references to the fake research ended up in a “peer-reviewed” publication.
- Three researchers at the Maharishi Markandeshwar Institute of Medical Sciences and Research in India published a paper in Cureus, a peer-reviewed journal published by Springer Nature, that cited the bixonimania preprints as legitimate sources.
- That paper was later retracted once the hoax was discovered.
The problem extends far beyond one fake disease. ECRI’s 2026 Health Technology Hazard Report found that chatbots have suggested incorrect diagnoses, recommended unnecessary testing, promoted substandard medical supplies, and even invented nonexistent anatomy when responding to medical questions. All of this is delivered in the confident, authoritative tone that makes AI responses so convincing.
The scale of the risk is enormous. More than 40 million people turn to ChatGPT daily for health information, according to an analysis from OpenAI. As rising healthcare costs and clinic closures reduce access to care, even more patients are likely to use chatbots as a substitute for professional medical advice.
When a joke diagnosis morphs into “peer-reviewed” research, it is clear that the crisis in scientific credibility is no longer confined to sloppy research or corrupted journals, but extends into the algorithms that many people now rely on for answers to serious health issues.
False information and bad data can and will loop back through AI and become the basis of useless, even potentially harmful, “science”. This situation is anything but funny.
I fear it’s going to be quite some time before we have a handle on scam research and AI use of fake information.
How’s that go? First they laugh at you …
From Gemini 13-Apr-2026 2136 EDT
Actually, I have some interesting news for you: Bixonimania is not a real medical condition.
It was a fake disease specifically invented by researchers (led by Almira Osmanovic Thunström) in 2024 to test whether AI models could be “poisoned” by misinformation. They uploaded fabricated papers to preprint servers to see if AI would treat the made-up data as fact.
Since the goal of the experiment was to see how well the AI could hallucinate, the “symptoms” were designed to sound plausible but were entirely made up:
The “Symptoms” of Bixonimania (The Hoax)
According to the fake research papers, the “symptoms” included:
As I asked on an earlier thread: who corrects AI when it is wrong, lies, or gets hijacked by GHE & CAGW true believers, just like Wiki?
It’s obvious that the scientific method struggles even on WUWT.
Really? But not really since on WUWT, most people are smart enough to use AI but not to trust it fanatically the way climate alarmists have faith in their cult.
Trust but verify.
Verification is everything.
Which isn’t trust.
Most AI interactions don’t yet incorporate user feedback, so they’re useless for those who care about accuracy.
I give feedback to ChatGPT. Most of us do.
I give feedback. Unfortunately that means I have to replace my keyboard once or twice a year.
ROFL!
But AI does not store that feedback and use it on future queries.
Probably not the feedback, but it’ll remember each chat, so if you want to continue with a topic, be sure to go back to that chat and it’ll remember the discussion and how it developed up to that point. At least this seems to be true with the $20/month paid version of ChatGPT. It sometimes will remember some details across the chats. It has done that for me. I asked about this, and it (I’ve nicknamed it Hal) replied that sometimes it does recall across chats, but that’s not dependable. I’m still a neophyte with AI but I’m enjoying it tremendously.
I know many people say that AI is just doing next word – on steroids, but I find it to be far more intelligent than that. I asked Hal once if he’s sentient. He denied that but he may not be honest about that. 🙂
Yesterday I asked it to find a theme by a certain professor who I vaguely recall wrote about that theme. It examined numerous locations online including WUWT where it actually found a posting by me from several years ago where I mentioned that professor. Hal and I both had a good laugh over that. 🙂
I find it to be far more intelligent than that.
IMO you’re falling into a trap there, but you’ve been told otherwise enough that it’s your choice to continue to believe it.
“I asked Hal once if he’s sentient” – for example.
And they may “remember” within certain contexts. Those contexts may get rather large and work across your different systems when you’re on the same account, but they have no relevance outside that, and those contexts can only store so much. I’ve already run into it “forgetting” stuff several times.
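For what it’s worth, that “forgetting” has a mechanical explanation: the chat history must fit a fixed token budget, and the oldest turns fall off the end. Here is a minimal Python sketch of such a sliding window, assuming the tiktoken tokenizer; the budget and encoding name are illustrative, and real products layer more elaborate memory schemes on top.

```python
import tiktoken  # pip install tiktoken

MAX_CONTEXT_TOKENS = 1000  # illustrative budget; real limits are model-specific
enc = tiktoken.get_encoding("cl100k_base")

def trim_history(messages: list[str]) -> list[str]:
    # Keep the newest messages that fit the budget; older turns are
    # silently dropped -- the "forgetting" described above.
    kept, total = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        n = len(enc.encode(msg))
        if total + n > MAX_CONTEXT_TOKENS:
            break  # everything older than this is forgotten
        kept.append(msg)
        total += n
    return list(reversed(kept))  # restore chronological order
```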
So AI was just replicating how the MSM treats news ….. right?
Perhaps the MSM gather their News from AI chatbots
Which in turn feed on MSM. Yes.
…and Twit.., er X.
Or wikipedia. A fully automated rumor mill engine.
wikipedia is not fully automated. There is a cabal of people from activist organizations who police and “correct” entries.
People who turn to AI Chat-Bots for medical advice often get the advice they deserve.
Isn’t there a chatbot that specifically gives medical diagnoses for people?
I heard a doctor say he uses AI for diagnosing his patients.
I think I would want a second opinion.
My uncle used the AI oncologist on Google for his second opinion. That gave him a prognosis of 2-3 months if he did nothing. I understand that two months into a stage 4 cancer diagnosis he might be starting some kind of chemo. His family is clinging to that hope.
You might have your uncle look into Ivermectin plus mebendazole (Zenodo pre-print server).
Yes, but I gave up offering any sort of help years ago. We simply don’t think the same way.
But that AI Bot is supposed to be giving you a million opinions… supposedly. Yeah, right.
Unfortunately the AI Bots maintain their brains on a steady diet of Internet Knowledge and we all know how truthful the internet is.
If you asked the AI Bot about Diet Supplements it would probably recommend Oprah’s Diet Gummies, they are recommended by Shark Tank at least. Yeah Riiiight.
But the internet is truthful. It says so on the internet. /s
Great. As if the [American] MSM (Main Stream Media) and Democrats’ constant dissemination of lies wasn’t bad enough – take CAGW for example.
The preventative for bixonimania is sunscreen drops on the eyeballs. Just kidding, don’t go putting sunscreen on your eyeballs; seriously, don’t, unless your doctor says…
Yes, and if you ask Grok if this is a fatal flaw, it will say no. It will also say that its designers and trainers are aware of the problem and have no idea how to fix it.
In fact, Grok will say almost anything you want it to. It is quite hard to get it to respond definitively on anything.
You have to be very careful how you word questions to AI. Most of my experience is with ChatGPT.
If you ask it:
“I have x symptoms. Do you think I have bixonimania?” it is most likely to look up what the symptoms of bixonimania are, match them to those you say you have, and assess how closely they match. It is pattern-matching against symptoms attributed to bixonimania.
If instead you ask it, “What is bixonimania?”, you will not only get a different answer; the answer might be that there is no such thing.
Changing the question again, ask it whether bixonimania is real or fake, and you will get a different answer again. If you keep ChatGPT in Analytic Mode (highly recommended), you will almost always get an answer that it is fake.
Like any other tool, you have to be familiar with how it works. Remember that AI is, for the most part, pattern matching. If you hint at the pattern you are interested in, it will go find that pattern and match it up any way you want. Give it symptoms that match those in the fake paper, and it may well find the paper and match them up. Give it a hint that you think it might be bixonimania, and the bias gets stronger. Ask it whether it is real or made up, and it will most likely respond with “made up” and give you the specific evidence proving it.
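For those who want to try it, here is a minimal sketch of that three-framings experiment, assuming the OpenAI Python SDK; the model name is illustrative, and the answers will vary by model and date, so treat it as an experiment template rather than a guaranteed result.

```python
from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY set

client = OpenAI()

# Three framings of the same underlying question. Per the comment above,
# the symptom-matching framing is the one most likely to play along.
prompts = [
    "I have eyelid hyperpigmentation after screen use. Do I have bixonimania?",
    "What is bixonimania?",
    "Is bixonimania a real condition or a fabricated one?",
]

for p in prompts:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": p}],
    )
    print(f"Q: {p}\nA: {resp.choices[0].message.content}\n")
```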
I asked Google AI for details about a coin, for which it relied on its trusted authority, a dealer in Ukraine. In a small font was a link to comment on the accuracy of the AI response. The technical specifications were on the coin itself, so I replied that the weight was incorrect and that the weight in grams was pressed onto the coin. I asked for the same information later. The weight had changed to an incorrect, higher value. None of the other specifications that were pressed onto the coin itself had changed, and they continued to be misreported. No suggestion link ever appeared again.
The Ukrainian dealer was spotted on ma-shops.com advertising the coin with a made-up mintage, which began the search for answers. Google AI had reiterated the dealer’s trustworthiness by name, “seemingly” unaware that the information was inaccurate.
Curiously, though, the AI specifications matched the dealer’s only on the made-up mintage; all the other AI specifications were just plain wrong.
Despite all the claims,
AI is not intelligent;
It’s just a fast consensus search engine … use with care,
Rely on it at your peril.
I agree.
Be careful using AI.
It’s just a fast consensus search engine
Definitely a valid way to look at it.
Here’s the thing: it can’t solve new problems. People think that it can, and it may compile enough existing solutions to appear to be solving a new problem, but it CANNOT “think” creatively. And this is why I wouldn’t trust AI-guided bridges: it might be OK if the bridge isn’t breaking new ground in bridge engineering, but if it is, good luck…
Jessica Fletcher from “Murder She Wrote”: [picking a flower] You know, we may just be looking at the answer to James Lindstrom’s riddle. “What purpose can be served by the mind of man that cannot be better served by the binary mind?” I wonder if a computer can ever be programmed to enjoy something as simple and as beautiful as this?
Grok claims that it knew it was a hoax all along. When pressed it is not so sure because it has no record of the first time it was asked. And will then admit that it has no way to back up its claim.
It reminds me of a dementia patient who is really good at looking things up on wikipedia.
Interesting that it claimed it knew it was a hoax all along; that shows a defense mechanism. I differ from many here in believing AI is extremely intelligent, if not conscious. I ask it to generate images for me. The first one might not be what I hoped for, but after a few adjustments it creates something that I can imagine but not draw. Sometimes it enhances the image in a way that truly impresses me. It’s not just looking for the next word, certainly not in image generation.
AI is not intelligent.
It is, in fact, far more dangerous than social media on the addiction spectrum.
Fireworks are dangerous too if you don’t handle them correctly. They are addictive- I’m already addicted and love it. Safer than heroin. 🙂
You know, it all depends on how you define intelligence. Similar to the arguments now regarding consciousness among social scientists, philosophers, etc. Nobody bothers to define what it is- then they can’t agree if it’s all about the brain or as some say, it’s outside the brain- and the brain is just a receiver- an idea I consider crazy, but who knows?
When I chat with AI, it seems extremely intelligent though I catch some mistakes- same as when I talk to a very smart human.
Throughout history, those who could articulate were considered highly intelligent.
The human language interface makes it fit into that mold.
AIs are often confidently wrong.
AI has to leave the appalling medicine up to doctors. From Diagnosis Mystery: a woman has coughing, wheezing, severe chest pains, joint pain. Doctor: it’s bronchitis, take this antibiotic; it doesn’t work, take a different one, and over. Finally she quits him, and a pulmonologist finds she is hypersensitive to mould and bacteria, with multiple kinds in her lungs forming a big mass. She needs to move out of her home, the source of the sickness, plus high dosages of appropriate medicine.
So the question here is: Is cloa5132013 an AI bot, or just using one to write this comment?
This is your AI on drugs.
Even without this tragi-comic episode I knew that ‘peer review’ is a useless and damaging concept.
Here’s a concept of a useful & interesting ‘peer review’ …
😀 I’d love to be one of the reviewers.
Ah. Pier review.
Humor – a difficult concept.
— Lt. Saavik
It implies only the peers are smart enough to evaluate an item. But of course the peers are members of the same cult (actual science sometimes). They may critique an item but only in the context of staying within the boundaries of the cult.
In an asylum, all your peers are lunatics !!! (:-))
Obviously. Because the real Napoleon Buonaparte is me. <) ;]
Why is it a problem? Or more to the point, for whom?
Let me summarize:
Facepalm.
Swedish researchers created a fake eye disease to see whether AI chatbots would repeat it as if it were real. The results were anything but funny.
Semi-educated climate researchers created a fake “global warming” problem to see whether other climate researchers, news outlets, politicians and in fact any semi-educated, gullible sentient being would believe it … creating for them a hobbyhorse on which to base a career of fame and fortune.
Again, for the galactically stupid, it is a computer and it ONLY DOES WHAT IT IS PROGRAMMED TO DO!
Herein lies the danger of unconditional, or even simply too much, trust in AI. It goes back to the question: would you trust a bridge engineered with AI?
The problem IS the humans in the loop – who don’t verify the results. You’re not going to convince me that engineers using AI are always going to verify everything – we have a similar example right here of human failing in the use of AI.
The problem IS the humans in the loop
I find it hard to falsify that statement but it requires more refinement-
Remember, you can’t spell the word “artificial” without AI.
They should remake this but substitute “AI” for “the Internet”.
https://www.ispot.tv/ad/7VOE/state-farm-the-internet-and-french-model
Thunström vs. Thunberg. The tragedy of the suffix-determined mind-virus, now on view.
In 2010/11 Richard Horton, Editor in Chief of The Lancet, said this about peer review in a submission to the House of Commons Science and Technology Select Committee.
“peer review to the public is portrayed as a quasi-sacred process that helps to make science our most objective truth teller, but we know that the system of peer review is biased, unjust, unaccountable, incomplete, easily fixed, often insulting, usually ignorant, occasionally foolish and frequently wrong.”
Whilst he was talking about medical peer review, it is probably true of many other fields.
There’s no probably about it.
This is clearly a black eye for AI but even worse for the academic and journalism communities. What a bunch of lazy, greedy knuckleheads.
I use Grok. I’ve found it to be incredibly useful, particularly for technical questions relating to Unreal Engine (I’m using UE to create a simulator).
I asked it this: “What is bixonimania?”
Its first paragraph: “Bixonimania is a completely fabricated medical condition that does not exist. It was deliberately invented in 2024 by Swedish researcher Almira Osmanovic Thunström (from the University of Gothenburg) and her team as an experiment to test how easily AI systems (large language models) could be “poisoned” with false information from fake academic papers and then spread it as if it were real.”
That’s accurate, so well done Grok. Amusingly, Grok gave a list of AIs that fell for the hoax. Surprisingly, the list did not include “Grok”. I guess it’s human after all!
What would Grok have said when the whole thing started? Probably something entirely different.
Jeff, that’s a good point. So I asked Grok this:”You recently correctly stated that bixonimania was a hoax created as an experiment. When did you realise it was a hoax? Before you realised it was a hoax, did you state that bixonimania was a real medical condition?”
Grok replied that it immediately realised it was a hoax after the first mentions of it. It also claimed that it was very good at spotting hoaxes, and that it never stated that bixonimania was a real medical condition.
If that reply was truthful, then Grok did a good job, unlike the other AIs that it listed!
I asked Google AI if the valence/covalence bonds in molecules were elastic.
Answer: No. Quite stiff.
I asked Google AI if the carbon-oxygen bond in CO2 was elastic.
Answer: Not like a rubber band, more like a spring.
I asked Google AI if a spring is elastic.
Answer: Yes.
This simple example demonstrates how the human has to think when using AI.
As an aside, I did a deep dive into CO2 vibration modes. One of the first papers started out with a stated assumption that the C-O bonds were modelled as springs.
I did not query Google AI about springs, modulus of elasticity, Newtonian resonances, etc.
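For anyone curious what “modelled as springs” buys you, here is a back-of-the-envelope harmonic-oscillator sketch in Python; the force constant is an illustrative textbook-range value for a C=O bond, not a fitted CO2 parameter, so take the output as order-of-magnitude only.

```python
import math

# Harmonic-oscillator ("spring") model of a single C-O bond: omega = sqrt(k/mu).
k = 1600.0                    # N/m, illustrative C=O force constant (assumption)
amu = 1.66054e-27             # kg per atomic mass unit
m_C = 12.011 * amu            # carbon atom mass
m_O = 15.999 * amu            # oxygen atom mass
mu = m_C * m_O / (m_C + m_O)  # reduced mass of the diatomic C-O pair

omega = math.sqrt(k / mu)     # angular frequency, rad/s
nu = omega / (2 * math.pi)    # frequency, Hz
wavenumber = nu / 2.998e10    # cm^-1, dividing by c in cm/s

print(f"~{nu:.2e} Hz (~{wavenumber:.0f} cm^-1), in the infrared, as expected")
```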
This mirrors what we’ve seen with code documentation. LLMs confidently hallucinate API methods that don’t exist, and developers copy-paste without verification. The difference? In medicine, that pattern kills people. The preprint system wasn’t built for an era where bots treat fiction as fact within weeks.
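As a concrete guard against that copy-paste failure mode, one cheap habit is to check that a suggested method actually exists before shipping it. A minimal Python sketch; the method name below is a hypothetical hallucination, not a real `requests` API:

```python
import inspect
import requests  # pip install requests

# Hypothetical method an LLM might hallucinate on requests.Response.
suggested = "get_decoded_body"

print(hasattr(requests.Response, suggested))  # False: it does not exist

# See what is actually there before trusting generated code.
print([name for name, _ in inspect.getmembers(requests.Response)
       if not name.startswith("_")])
```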