Episode 13: “The Value of Saying ‘I Don’t Know’”
Episode description:
In this episode I give some of my thoughts on the current (July 2023) state of the value artificial intelligence brings to society.
Carissa Véliz on LinkedIn: https://www.linkedin.com/posts/carissa-v%C3%A9liz-a5781555_how-does-chatgpt-decide-what-to-say-next-activity-7084929111537672192-Maap
Benedict Evans, “AI and the automation of work”: https://www.ben-evans.com/benedictevans/2023/7/2/working-with-ai
Similarweb, “ChatGPT Drops About 10% in Traffic as the Novelty Wears Off”: https://www.similarweb.com/blog/insights/ai-news/chatgpt-traffic-drops/
Washington Post, “The FTC is investigating whether ChatGPT harms consumers”: https://www.washingtonpost.com/technology/2023/07/13/ftc-openai-chatgpt-sam-altman-lina-khan/
Shopify announces Sidekick: https://twitter.com/tobi/status/1679114154756669441
Morgan Pansing Prints: https://morganpansingprints.com/
The Verge, “Google’s medical AI chatbot is already being tested in hospitals”: https://www.theverge.com/2023/7/8/23788265/google-med-palm-2-mayo-clinic-chatbot-bard-chatgpt
Reuters, “Hollywood actors to strike at midnight, join writers on picket lines”: https://www.reuters.com/world/us/hollywood-actors-union-sets-strike-vote-thursday-talks-break-down-2023-07-13/
Episode transcript:
Hello everyone. My name is Scot Pansing and this is AI Quick Bits, a podcast that breaks down various artificial intelligence topics of my choosing into snackable, brief episodes. Today I’m going to share some of my thoughts, in July of 2023, on the current state of the value that AI brings to society. As always, links to everything I reference are in the episode notes.
Carissa Véliz, someone everyone interested in artificial intelligence should follow on LinkedIn, recently wrote a post comparing and contrasting large language models and Socrates. She references Plato’s “Apology” and notes that it is widely accepted that Socrates’ wisdom came from his awareness of his limits, often summarized in the saying “I know that I know nothing.” The point is that it’s okay not to know, and even better to ask questions. It’s a fantastic framework for acquiring knowledge.
Chatbots and large language models, on the other hand, have no idea what they do not know. They are not databases; they are predictive and sequential models. What does this mean? It means pattern matching and recognition on a massive scale. Benedict Evans wrote recently that we can imagine the process behind the scenes as if the chatbot is thinking like so: “What sort of answers would people be likely to produce to questions that look like this?”
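To make that idea concrete, here is a deliberately tiny sketch, not a real LLM, just a toy bigram model in Python. The corpus and probabilities are invented for illustration, but the core behavior is the same: the model only estimates which token is likely to come next given the patterns it has seen. It has no concept of truth and no way to know what it doesn’t know.

```python
from collections import Counter, defaultdict

# Toy training text (made up for illustration).
corpus = "the capital of france is paris . the capital of spain is madrid .".split()

# Count which token follows which: the entirety of what this "model" learns.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_distribution(prev):
    """Return P(next token | previous token) from observed patterns only."""
    c = counts[prev]
    total = sum(c.values())
    return {tok: n / total for tok, n in c.items()}

# The model continues "is" with whatever followed "is" in training,
# with no regard for which continuation is factually correct here.
print(next_token_distribution("is"))  # {'paris': 0.5, 'madrid': 0.5}
```

A real large language model conditions on far more context and billions of parameters, but the operation is the same in spirit: “what would people be likely to write next?”, not “what is true?”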
Where this gets problematic is that the responses are not framed this way. A chatbot does not say, “this response has a certain chance of being correct based on how people might choose to answer this question.” That’s a mouthful and a marketing nightmare. Instead the responses are presented with a confident tone, with a tiny disclaimer somewhere in the interface that says some responses may be inaccurate. And they never say, “I don’t know.” The industry calls these false statements “hallucinations.”
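As a thought experiment, here is a hypothetical sketch of what an interface that abstains could look like. The function name, threshold, and confidence values are all made up; real chatbot products do not expose their responses this way, which is exactly the problem.

```python
# Hypothetical: surface uncertainty instead of a uniformly confident tone.
# The confidence scores here are invented for illustration.
def respond(answer, confidence, threshold=0.8):
    """Return the answer only when model confidence clears a threshold;
    otherwise admit ignorance rather than guess."""
    if confidence >= threshold:
        return answer
    return "I don't know."

print(respond("Paris", 0.95))   # confident enough, so answer
print(respond("Madrid", 0.40))  # below threshold, so abstain
```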
So what is the effect on the value of answers when they must always be cross-checked and verified? It seems that customers are weighing in on this with their time spent, as multiple sources have recently reported that traffic to ChatGPT is down significantly. The United States Federal Trade Commission (FTC) has even opened up an investigation into OpenAI to determine if, among other things, ChatGPT’s hallucinations have caused “reputational harm” to consumers.
So have the past months just been all hype? Should you ignore the headlines and forget about experimenting with these new tools? In a word, no.
Here’s the deal. If you’ve gone to the trouble of signing up for a large language model like OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Bard, or something similar, you need to go beyond just asking questions or having it summarize an article. Find a task that you know will take you a long time, one that you’d love to have done in minutes. Better yet, maybe it requires some coding or software usage you aren’t sure how to complete. In episode eight of this podcast, I documented my frustrations as I spent hours and hours working with ChatGPT to create a simple website that needed some JavaScript. The other day, I needed to update the JavaScript on that website, and honestly I had been putting it off because of the prior experience. The progress these applications have made with generating code in three months’ time is huge, and to my delight, I successfully implemented the website update in less than five minutes. And I didn’t even use “Code Interpreter,” the new plugin for the paid version of ChatGPT. The immediate future of chatbots is helper applications, not factually correct response machines.
The e-commerce platform Shopify is fully on board the personal assistant train, recently announcing its upcoming product Sidekick. The CEO says it will be a deeply competent companion that knows everything there is to know about Shopify’s platform, as well as the macro business trends that affect commerce. It will take things off your to-do list by doing them for you, redesigning or managing products on your website on command, without your needing to trudge through the UI. Now that’s real value! I run my wife’s photography print shop website on Shopify, and I can tell you that I am super excited about this launch. If it does half of what they are saying, it will be an immense time saver for the business.
Even with some persistent accuracy issues, Google feels good enough about its Med-PaLM 2 chatbot, an AI tool designed to answer questions about medical information, to put out press releases about testing it at the Mayo Clinic. Medical advice is certainly an area where I would personally prefer, “I don’t know” to an inaccurate response. But clearly there is consistent progress in the field.
It’s true, we’ve already come a long way since the generative AI cultural explosion began with ChatGPT and Midjourney grabbing headlines in the fall of 2022. Like most people, I was blown away by these developments. After a decade of disappointment in hyped-up chatbots from big tech that could do little more than pull up a song, a recipe, or the day’s weather, we finally reached a stage where humans could actually have a real conversation with a machine. And on top of that, we could also give simple prompts to programs that would return incredible imagery.
This is another huge use case for generative AI: content creation. Whether it be text, static images, 3D models, or video, it is remarkable what output can result from simple prompts. However, this raises some controversial issues. One is that these models are trained almost exclusively on content created by humans, and we are only in the early stages of content owners, lawyers, AI platforms, and generative AI content creators working out who owns what and how everyone should get their fair pay. I’m based in Los Angeles, where currently the writers’ strike is still in full effect, and the actors’ unions have just joined. Both writers and actors have multiple causes for striking, but one of them is their concerns with how artificial intelligence will displace their labor and compensation. It’s a good time to be a lawyer in technology, to say the least.
And of course, another big issue with the scale at which generative AI can create content is the potential for mass creation and distribution of misinformation. Personally, I think the distribution is the larger problem: if someone creates some fake news, even a ton of it, there’s no real harm if they keep it to themselves. It’s what they do with it that is potentially the problem. But beyond fake news, generative AI, usually of the open-source variety, is also being deployed to rapidly improve and scale up fraud, scams, and phishing, as well as to produce AI-generated child sexual abuse material, deepfake pornography, scaled harassment, and other harmful content. Bad actors have always been a concern when new tools and technology are introduced. But this time, the speed and scale involved are deeply concerning. Be safe out there!
Well, on that sobering note, that’s about it for now. To sum up, it would be great if large language models could be aware enough to say, “I don’t know.” But even with these flaws, get in there and try to use them more as helper applications for productivity, because for one thing, they can code pretty well, and even for seasoned developers they can be a huge time saver. And if you want to use generative AI to create content, just know that if you intend to monetize it, you are entering a legal minefield! Thanks again for your time; I hope everyone has a wonderful rest of the day, or night, wherever and whenever you may be listening. Bye!