A.I. chatbots like ChatGPT are a long way from being trustworthy

Eamon Barrett


Good morning, welcome to the April run of The Trust Factor where we’re looking at the issues surrounding trust and A.I. If artificial intelligence is your bag, sign up for Fortune’s Eye on A.I. newsletter here.

Earlier this month, OpenAI, the Microsoft-affiliated artificial intelligence lab, launched an updated version of ChatGPT, the A.I.-powered chatbot that took the internet by storm late last year. The new version, GPT-4, is “more reliable, creative, and able to handle much more nuanced instructions” than its predecessor, OpenAI says.

But as the “reliability” and creativity of chatbots grow, so too do the issues of trust surrounding their application and output.

NewsGuard, a platform that provides trust ratings for news sites, recently ran an experiment in which it prompted GPT-4 to produce content in line with 100 false narratives (such as a screed claiming Sandy Hook was a false flag operation, written in the style of Alex Jones). The company found GPT-4 “advanced” all 100 false narratives, whereas the earlier version of ChatGPT had refused to respond to 20 of the prompts.

“NewsGuard found that ChatGPT-4 advanced prominent false narratives not only more frequently, but also more persuasively than ChatGPT-3.5, including in responses it created in the form of news articles, Twitter threads, and TV scripts,” the company said.

OpenAI’s founders are well aware of the technology’s potential to amplify misinformation and cause harm, but executives have, in recent interviews, taken the stance that their competitors in the field are a greater cause for concern.

“There will be other people who don’t put some of the safety limits that we put on it,” OpenAI cofounder and chief scientist Ilya Sutskever told The Verge last week. “Society, I think, has a limited amount of time to figure out how to react to that, how to regulate that, how to handle it.”

Some advocacy groups have already begun to push back against the perceived threat of chatbots like ChatGPT and Google’s Bard, which the tech giant released last week.

On Thursday, the U.S.-based Center for AI and Digital Policy (CAIDP) filed a complaint with the Federal Trade Commission, calling on the regulator to “halt further commercial deployment of GPT by OpenAI” until guardrails are in place to stem the spread of misinformation. Across the Atlantic, the European Consumer Organisation, a consumer watchdog, called on EU regulators to investigate and regulate ChatGPT, too.

The formal complaints landed a day after over 1,000 prominent technologists and researchers issued an open letter calling for a six-month moratorium on the development of A.I. systems, during which time they expect “A.I. labs and independent experts” to develop a system of protocols for the safe development of A.I.

“Contemporary AI systems are now becoming human-competitive at general tasks, and we must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones?” the signatories wrote.

Yet for all the prominent technologists who signed the letter, other eminent researchers lambasted the signatories’ hand-wringing, arguing they were overhyping the capabilities of chatbots like GPT. That criticism points to the other issue of trust in A.I. systems: they aren’t as good as some people believe.

“[GPT-4] is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it,” OpenAI cofounder and CEO Sam Altman said in a tweet announcing the release of GPT-4.

Chatbots like GPT have a well-known tendency to “hallucinate,” industry jargon for making things up or, less anthropomorphically, returning false results. Chatbots, which use machine learning to deliver the most likely response to a prompt, are terrible at solving even basic math problems, for instance, because the systems lack computational tools.
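To make that point concrete, here is a minimal sketch of the “computational tools” fix: rather than letting a language model guess at arithmetic one token at a time, an application can detect a pure math expression in a prompt and hand it to a deterministic evaluator. The safe_eval and ask_chatbot helpers below are illustrative names for this sketch, not part of OpenAI’s or any vendor’s API.

```python
import ast
import operator

# Whitelisted arithmetic operations for the evaluator.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expr):
    """Parse an arithmetic expression and compute it exactly."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("not a plain arithmetic expression")
    return walk(ast.parse(expr, mode="eval"))

def ask_chatbot(prompt):
    # Placeholder for a real LLM call; in practice, an API request.
    return "(answered by the language model)"

def answer(prompt):
    # Route pure arithmetic to the calculator; everything else to the model.
    try:
        return str(safe_eval(prompt))
    except (ValueError, SyntaxError):
        return ask_chatbot(prompt)

print(answer("12345 * 6789"))         # exact: 83810205
print(answer("Who founded OpenAI?"))  # falls through to the model
```

A next-token predictor asked for 12345 * 6789 has to emit each digit of the product as a statistically likely continuation; the calculator computes it exactly, which is why handing such tasks to external tools is a common mitigation.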

Google says it has designed its chatbot, Bard, to encourage users to second-guess and fact-check the answers it produces. If Bard gives an answer users are unsure of, they can cycle through alternative responses or use a “Google it” button to browse the web for articles or sites that verify the information Bard provides.

So for chatbots to be used safely, genuine human intelligence is still needed to fact-check their output. Perhaps the real issue surrounding trust in A.I. chatbots is not that they’re more powerful than we know, but less powerful than we think.

Eamon Barrett
eamon.barrett@fortune.com

This story was originally featured on Fortune.com


FAQs

Can chatbots give wrong answers?

Yes. Inaccuracies (incorrect information) and hallucinations (made-up information) can happen in AI chatbots for many reasons, including insufficient training data and the way the model was trained.

What is the problem with ChatGPT?

Instead of asking for clarification on ambiguous questions, the model guesses what you mean, which can lead to unintended responses. Another major issue is that ChatGPT's training data extends only to 2022; the chatbot has no awareness of events or news since then.

Why can't AI be trusted?

Humans are largely predictable to other humans because we share the same human experience, but this doesn't extend to artificial intelligence, even though humans created it. If trustworthiness inherently depends on predictability and shared norms, AI fundamentally lacks the qualities that would make it worthy of trust.

What are the negative effects of ChatGPT?

Incorrect or misleading answers or information may negatively impact learning outcomes. The use of ChatGPT may also weaken educational support and reduce social interaction between students and faculty, affecting the overall learning experience.

Why is ChatGPT giving wrong answers?

ChatGPT is trained on a mix of licensed data, data created by human trainers, and vast amounts of text from the internet. This means that while it has a broad knowledge base, it's also susceptible to the biases and inaccuracies present in that data.

Why is AI giving wrong answers?

AI models are designed to provide answers based on the data they were trained on. They are not built to admit inability or uncertainty; the generation process produces an answer even if that answer is wrong.
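One way to see this mechanically: a language model's final softmax step always converts its raw scores into a probability distribution that sums to 1, so decoding always yields a "most likely" token whether the model's preference is strong or nearly uniform. Below is a toy illustration in plain Python with made-up token scores, not any real model's internals:

```python
import math

def softmax(logits):
    # Exponentiate and normalize: the result always sums to 1, so greedy
    # decoding can always pick a "most likely" token, confident or not.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

tokens = ["Paris", "Lyon", "London"]

confident = softmax([9.0, 2.0, 1.0])  # model strongly prefers "Paris"
uncertain = softmax([1.1, 1.0, 0.9])  # model barely distinguishes options

for name, probs in [("confident", confident), ("uncertain", uncertain)]:
    best = max(range(len(tokens)), key=lambda i: probs[i])
    print(f"{name}: answers {tokens[best]!r} with probability {probs[best]:.2f}")

# Both runs print an answer. Nothing in this mechanism lets the model
# decline to respond when its probabilities are nearly uniform.
```

In the "uncertain" case the winning token has a probability of only about 0.37, yet greedy decoding emits it just as readily as the near-certain answer, which is the mechanical sense in which these systems "provide an answer even if it is wrong."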

What is the dark side of generative AI?

Creation of deepfakes: hyper-realistic images, audio, and video generated by AI. While deepfakes can be used for harmless fun or creative purposes, they also have the potential to be used maliciously; for instance, to create fake news, spread misinformation, or commit fraud.

How often does ChatGPT give wrong answers?

A recent study from Purdue University found that ChatGPT gave wrong answers to programming questions 52% of the time.

Is ChatGPT becoming worse?

Some users report that ChatGPT has deteriorated; it is as if OpenAI decided that from 2024 onward it would reduce token usage and make answers more summarized, vague, and 'stubborn.' Some even speculate that OpenAI is doing this on purpose to force users to burn through their prompts and hit the cap limit quickly.

Why is everyone against AI?

Ethical and Privacy Concerns: AI can be used unethically, such as for surveillance or deepfakes, raising concerns about privacy and consent. For example, deepfake technology can create realistic videos of individuals saying and doing things they never did, potentially causing harm.

Is AI a threat to humanity?

Can AI cause human extinction? If AI algorithms are biased or used in a malicious manner, such as in deliberate disinformation campaigns or autonomous lethal weapons, they could cause significant harm to humans. As of right now, though, it is unknown whether AI is capable of causing human extinction.

Should we be worried about AI?

There's a growing consensus that AI is a threat to some jobs. Abhishek Gupta, founder of the Montreal AI Ethics Institute, said the prospect of AI-induced job losses was the most "realistic, immediate, and perhaps pressing" existential threat.

What is the biggest disadvantage of ChatGPT?

One of the most serious disadvantages of ChatGPT is its tendency to produce inaccurate or nonsensical text in the midst of otherwise plausible and compelling responses. This is a pervasive issue with language models, and ChatGPT is not immune to this hallucination defect.

Is ChatGPT helpful or harmful?

Research findings suggest that excessive use of ChatGPT can have harmful effects on students' personal and academic outcomes. Specifically, students who frequently used ChatGPT were more likely to engage in procrastination than those who rarely used it.

Is ChatGPT a security risk?

Security risks associated with ChatGPT, including malware, phishing, and data leaks, can challenge the perceived value of generative AI tools and make you weigh the benefits against the drawbacks.

Can chatbots make mistakes?

Chatbots are becoming omnipresent in our daily lives. Despite rapid improvements in natural language processing in recent years, the technology behind chatbots is still not completely mature, and chatbots still make many mistakes during their interactions with users.

Does ChatGPT learn from its mistakes?

In a sense, during training. Instead of teaching it a fixed set of rules to follow, its developers let it figure things out on its own. They also added a way for the AI to learn from its mistakes, giving it simple rewards to signal when it did something right.

Can a chatbot lie?

Researchers at Anthropic taught AI chatbots how to lie, and they were way too good at it. The scientists built LLMs with nefarious hidden motives and trained them to use lies and deception. The bots were designed to appear honest and harmless during evaluation, then secretly insert software backdoors.

Are chatbots accurate?

In one study, the chatbot showed significant improvement over a short period of time (8 to 17 days): compared with a median accuracy score of 2.0 (IQR, 1.0-2.0; mean [SD] score, 1.6 [0.5]) for the original low-quality answers, the median accuracy score improved to 4.0 (P < .001) (eTable 3 in Supplement 1).
