Okay, one more time for the people in the back.
The "AI" craze of the past few years is all about Large Language Models. This immediately tells us that the only thing these systems "know" is trends/patterns in the ways that people write, to the extent that those patterns are expressed in the text that was used to train the model. Even the common term, "hallucination," gives these things far too much credit: a hallucination is a departure from reality, but an LLM has no concept of reality to depart from!
An LLM does exactly one thing: you give it a chunk of text, and it predicts which word will come next after the end of the chunk. That's it. An LLM-powered chatbot will then stick that word onto the end of the chunk and feed the resulting, slightly longer chunk back into the model to predict the next word, and then do it again for the next, etc. Such a chatbot's output is unreliable by design, because there are many linguistically valid continuations to any chunk of text, and the model usually reflects that by having an output that means, "There is a 63% chance that the next word is X, a 14% chance that it's Y, etc." The text produced by these chatbots is often not even correlated with factual correctness, because the models are trained on works of fiction and non-fiction alike.
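If it helps to see that loop written down, here's a rough sketch in Python. To be clear, this is a toy: TOY_MODEL, predict_next, and generate are names I made up for illustration, the probabilities are invented, and the "model" only looks at the last word, whereas a real LLM looks at the whole chunk (and works on word fragments rather than whole words). The shape of the loop is the point.

```python
import random

# Hypothetical toy "model": maps the last word of the text to a made-up
# probability distribution over possible next words. A real LLM conditions
# on the entire chunk of text, not just its final word.
TOY_MODEL = {
    "the": {"cat": 0.63, "dog": 0.14, "moon": 0.23},
    "cat": {"sat": 0.7, "slept": 0.3},
    "dog": {"sat": 0.5, "barked": 0.5},
    "moon": {"rose": 1.0},
}


def predict_next(words):
    """Return a {next_word: probability} table for the given text."""
    # Words the toy table doesn't cover simply end the text.
    return TOY_MODEL.get(words[-1], {"<end>": 1.0})


def generate(prompt, max_words, rng):
    """Repeatedly predict a next word, append it, and feed the result back in."""
    words = prompt.split()
    for _ in range(max_words):
        dist = predict_next(words)
        # Pick one continuation at random, weighted by its probability.
        next_word = rng.choices(list(dist), weights=list(dist.values()))[0]
        if next_word == "<end>":
            break
        words.append(next_word)
    return " ".join(words)


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so each run can differ
    for _ in range(3):
        print(generate("the", 5, rng))
```

Run it a few times and the same prompt produces different continuations, all of them equally "valid" as far as the table is concerned. Nothing in the loop checks whether the result is true.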
For example, when you ask a chatbot what 2 + 2 is, it will usually say it's 4, but not because the model knows anything about math. It's because when people write about asking that question, the text that they write next is usually a statement that the answer is 4. But if the model's training data includes Orwell's Nineteen Eighty-Four (or certain texts that discuss the book or its ideas), then the chatbot will, on rare occasions, say that the answer is 5 instead, because convincing people that that is the answer is a plot point in the book.
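Here's the 2 + 2 case as a quick Python sketch. The probabilities are numbers I invented for illustration (no real model was measured), and next_word_probs and sample_answers are hypothetical names. All it does is draw the "next word" many times from a fixed table and count what comes out.

```python
import random
from collections import Counter

# Made-up probabilities for the word following "2 + 2 is".
# These are assumptions for illustration, not measurements of any real model.
next_word_probs = {"4": 0.97, "four": 0.02, "5": 0.01}


def sample_answers(probs, n, seed=0):
    """Draw n next words from the table and tally how often each appears."""
    rng = random.Random(seed)
    draws = rng.choices(list(probs), weights=list(probs.values()), k=n)
    return Counter(draws)


counts = sample_answers(next_word_probs, 10_000)

if __name__ == "__main__":
    for word, count in counts.most_common():
        print(word, count)
```

The "5" answers are not a malfunction. They come out of the same sampling step that produces the "4" answers, just from a less likely row of the table.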
If you're still having trouble, you can think of it this way: when you ask one of these chatbots a question, it does not give you the answer; it gives you an example of what—linguistically speaking—an answer might look like. Or, to put it even more succinctly: these things are not the Star Trek ship's computer; they are very impressive autocomplete.
So LLMs are fundamentally a poor fit for any task that is some form of, "producing factually correct information." But if you really wanted to try to force it and damn the torpedoes, then I'd say you basically have two options. I'll tell you what they are in a reply.