@Harold Wilson
You cannot believe a language-model based AI. Ever.
I don't think that decisions about the usefulness of AI are going to hang on 100% reliability. Humans aren't 100% reliable, and we still count on them for critical jobs. The question, to use Ethan Mollick's term, is whether they are better than the BAH - Best Available Human.
For low-risk jobs, that's a standard that the current models meet in many cases. As an example, as a hobbyist writer, I am happy to use ChatGPT, Claude, and Gemini as "book coach / developmental editor." I'll paste in the chapter and ask them for feedback and suggestions. Some of the suggestions don't strike me as particularly useful, and occasionally some even seem antithetical to what I'm trying to do in the story. But often enough, they make points that I find worth considering. I have even asked for examples of how they would change my words to match their recommendation. I usually ask for multiple examples. I can't think of a time when I have simply taken what was suggested and used it, but often it has made me rethink my writing and redraft toward what I thought was a better result.
That is using AI because it is better than the best available human. Sure, there are humans who could do that, but they aren't available right when I finish a chapter, even if I could afford them. As the AI models continue to improve, this sort of use will get even more appealing.
They can also be a great help to authors in other ways. For example, I find that I often have difficulty adding descriptions of settings into my scenes, something that was pointed out to me in a writers group. I was writing a scene the other day and wanted to do better, so I asked ChatGPT for an image of the setting and got back something that was extremely helpful. It let me pick out a few key elements that I would not have thought of, drop a mention of them into the story, and go on. When I took that chapter to my writers group, those passages got positive comments.
I frequently see the current crop of large language models described as bright, eager, pleasant, but occasionally sloppy interns. In fact, many who observe the field seem to think that an early significant effect of these tools on the workforce will be a reduction in the number of entry-level positions, as senior people use these tools to do work that had previously been done by beginners.
So the question isn't whether to ever believe, it's how to use these tools in the most helpful way for a particular task. In that regard, evidence is mounting that providing access to the current generation of tools for professionals in fields from medicine to law to consulting results in improved performance. Actually, what it really seems to do is help the middle tier boost their performance to match that of the top performers, which raises its own set of issues.
However, for folks like me, these tools are useful right now. One thing that Ethan Mollick said he did in writing his current book was to take paragraphs that he was having trouble getting the way he wanted and have one of the models rewrite them in several widely varying styles. For example, an author could take a passage and ask the model to rewrite it in styles ranging from Hemingway to Shakespeare. The results might well help that author see ways to improve their own style. Plus, it would probably just be fun. I haven't done it yet, but it's on my list.