Hacker News Re-Imagined

“Chatbots: Still Dumb After All These Years”

  • 187 points
  • 7 days ago

  • @LittlePeter
  • Created a post



@nailer 7 days

Replying to @LittlePeter 🎙

Well yes. I’ve been told VCs invested in these back in 2015 (I was in a startup accelerator in the UK at the time and there were a few in my cohort) and a few years later very few of the chat bot investments have worked out.

Reply


@TedShiller 7 days

Replying to @LittlePeter 🎙

Am I the only one not surprised?

Reply


@ape4 7 days

Replying to @LittlePeter 🎙

Yeah they're really bad. Usually they just grep for the relevant FAQ.

Me: I read the FAQ, but was still not able to login

Bot: Sorry you're having trouble logging in, here is some info that might help <repeats FAQ>

Reply


@arikr 7 days

Replying to @LittlePeter 🎙

Wasn’t this on the homepage yesterday?

Reply


@seanp2k2 7 days

Replying to @LittlePeter 🎙

Current gen chatbots for support for companies are infuriating and only helpful for the most clueless of users, which I suppose is probably a decent chunk. It’s like when you call a company because you need help that you can’t resolve yourself through their site and are then forced to listen on hold to the phone menu system tell you a dozen times that you can do everything you need on their amazing website.

Also, developers: please don’t try to make the chatbot seem human to fake users out. It’s almost as bad as the fake typing sounds for Comcast support. Making users jump through hoops and tricking them just makes them hate your brand and your products, and makes them even more irritable when they do eventually get to speak with a human.

Also, end the auto pop-up “can I help you find something???” chat bots on websites. It’s like someone had the idea to take the worst part of retail experiences and find a way to make that even more useless, then deployed it everywhere.

Reply


@amelius 7 days

Replying to @LittlePeter 🎙

Didn't Google have a really great robocall demo, some time ago?

Reply


@aruanavekar 7 days

Replying to @LittlePeter 🎙

Whether it works or not, sounds dumb or useful, clients keep asking for it. From personal experience and opinion, they are best as a backup for a human agent, when one is busy or unavailable. Costco, Amazon, and Ally have good implementations of this. The chatbot discussion may still be up in the air, but a chat widget is a must-have form of interaction: customers expect a site to have a "Chat Now" option.

Reply


@walnutclosefarm 7 days

Replying to @LittlePeter 🎙

The idea that a general language model like GPT-3 can answer questions intelligently is utterly absurd. It's trained to get language right (where "right" is defined as similar enough to the way people speak (or mostly write) to be intelligible as language), but it does so without any underlying knowledge model to make the intelligible language relevant to any given area of knowledge. Human language is not knowledge; it's a means for articulating our knowledge (that is, domain specific models of the world) in a way that other people can understand and translate into their own particular models.

So what is needed is the capabilities of GPT-3 or other language generators sitting on top of domain specific knowledge models, and constrained by those models.

Asking GPT-3 a general knowledge question is like asking an articulate 5-year-old a question like "how does gravity work?" You'll get grammatically meaningful answers that use the structure of the language correctly, but that are quite likely to have nothing to do with our actual understanding of physics.

Reply


@mrpf1ster 7 days

Replying to @LittlePeter 🎙

This article just seems petty. The author just quotes large chunks of the article by Gary Smith while inserting snide comments afterwards ("That's pretty funny!", "These are hilarious!"). Then goes on to ad hominem the original author.

There are no arguments presented for the intelligence of chatbots other than the author's own opinion. I don't know what this article adds to the conversation that Gary Smith's original article doesn't provide.

Reply


@isoprophlex 7 days

Replying to @LittlePeter 🎙

A bazillion parameters in GPT-3, but what does the training process amount to? Filling in missing characters or words in sentences taken from a huge dump of literature, news articles, Reddit comments...

No wonder these things are still so dumb. The training process and the loss function used probably do not penalize poor long-range coherence between paragraphs. Also, if I'm not mistaken, these things have absolutely no internal state besides the characters you stream into them as conversation prompts.

If these things were trained more like agents having to operate in eg. Socratic dialogues maybe we'd be getting somewhere

Reply


@jll29 7 days

Replying to @LittlePeter 🎙

The term "chatbot" is problematic, as it potentially conflates a couple of different types of systems that superficially may look very similar.

Dialog systems: Dialog systems, in a narrowly confined domain, can solve a task, help solve a task, or provide information to enable humans to solve a task quicker. Flight booking systems are typical examples, where the system asks a couple of questions and the user answers them, and users may also ask questions. Gradually a set of slots (DEPARTURE-FROM, ARRIVAL-AT, etc.) is filled, and then a booking transaction can be initiated. This works for flights, but such a system is not good at answering out-of-domain questions.
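The slot-filling loop described here can be sketched in a few lines of Python. The slot names and prompts below are illustrative only, not taken from any real booking system:

```python
# Hypothetical slot-filling dialog loop: ask for the first unfilled slot,
# and start the booking transaction once every slot has a value.
SLOTS = ["DEPARTURE-FROM", "ARRIVAL-AT", "DATE"]

PROMPTS = {
    "DEPARTURE-FROM": "Where are you flying from?",
    "ARRIVAL-AT": "Where are you flying to?",
    "DATE": "On what date?",
}

def next_prompt(filled):
    """Return the question for the first unfilled slot, or None when done."""
    for slot in SLOTS:
        if slot not in filled:
            return PROMPTS[slot]
    return None  # all slots filled: the booking transaction can begin
```

After the user has answered the first two questions, `next_prompt({"DEPARTURE-FROM": "LHR", "ARRIVAL-AT": "JFK"})` asks for the date; once every slot is filled it returns `None` and the transaction can proceed.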

Statistical or neural language models: BERT, GPT-3 and other muppets are models of language that can predict the likely next word/sentence etc. - which is useful for many tasks but is NOT equivalent to a "chatbot". It may be abused as one for fun, but there is no formal meaning representation used and no answer logic applied. Think of this as a simple auto-complete - so it is not a source of wisdom to ask about the safety of staircases or any other serious topic like that. (These models are VERY useful ingredients of modern NLP applications, but they are the bricks rather than the house.)

Interactive CRM Forms: Web/Slack "bots" or Typeform survey are sometimes fun, sometimes useful but can never claim to "understand" anything. They are ways to capture some data interactively, often to eventually feed the data to a human for review.

Question answering systems: Answer retrieval is the task of automatically finding a phrase or sentence in a body of, say, a million documents which answers a given question. These are next-level search engines intended to supersede keyword-based search systems. Deployed Web search engines like Google already have limited answering capabilities - but only for a small selection of question types. "Open-domain Q&A" is the task of permitting question answering by machine without limiting the domain, and since 1998 the US NIST has been organizing annual bake-offs for international research teams, which has helped advance the state of the art a lot (e.g. https://trec.nist.gov/pubs/trec16/t16_proceedings.html).

Reading comprehension systems: These systems take a piece of text as input as well as a question, and then attempt to answer that question about the text. Tests used to assess human students (remedial testing) can nowadays be passed reasonably well.

Reply


@kristopolous 7 days

Replying to @LittlePeter 🎙

They need some kind of agency; otherwise it'll always be like asking a piece of furniture how its day went.

Do any of these generate narrative fictions (such as characters and events they supposedly took part in) to interact with?

Reply


@joshuahedlund 7 days

Replying to @LittlePeter 🎙

This is great but this post is basically a wrapper for the original post: https://mindmatters.ai/2022/01/will-chatbots-replace-the-art...

Reply


@dang 7 days

Replying to @LittlePeter 🎙

This article is a response to this one:

Chatbots: Still dumb after all these years - https://news.ycombinator.com/item?id=29825612 - Jan 2022 (408 comments)

(Thanks everyone who pointed this out.)

Reply


@raxxorrax 7 days

Replying to @LittlePeter 🎙

I actually was surprised how well they can simulate a conversation now. It is a fake, of course, because there is little underlying reasoning. That is a monumental problem, and it is difficult to determine where to begin.

Do you start to give your AI a motivation or goal? Perception? These are vastly more complex problems than some statistical tricks on data that is widely available.

Still, it is fascinating that we came this far with a dead machine that talks.

Reply


@firefoxd 7 days

Replying to @LittlePeter 🎙

We were building a chatbot to use on a website until we realized how customers were using it. Most people were frustrated with something and needed help.

People who wanted to have a conversation did it for fun and had no real need for our services. We couldn't tell them how tall the Eiffel Tower is.

Maybe there is a time where you want to have a conversation like the examples in the article. But I don't ever find myself wanting to talk to a human in this manner, so why a chatbot?

Have you ever watched the sci-fi show The Expanse? Have you seen how they interact with the AI? They ask a question, it provides an answer. It doesn't even use voice most of the time. It gives you the answer without trying to be sassy about it.

Reply


@mrtranscendence 7 days

Replying to @LittlePeter 🎙

When GPT-3 was opened up so that anyone could create an account, I was excited to try it. I was quickly disappointed. Its ability to chat was quickly shown to be pretty terrible -- it could mostly make reasonable-sounding English sentences, but it was like talking to someone who was maybe a bit drunk and not really listening. I can't imagine using it as an interface for a customer to interact with product support.

The whole thing just made me a bit sad. I really was so excited. Nothing it could do was very impressive, even aside from holding a conversation. The most impressive thing I've seen is Copilot, but even that's been next to useless from a practical perspective.

Reply


@ghostwreck 7 days

Replying to @LittlePeter 🎙

Having spent a few years working on a chatbot, I can say the allure is this: talking to a real human is better than filling out a form. If we can build a Q&A system as good as talking to a human, people will prefer it to filling out forms too. So that's the pursuit.

I understand the hate, because we haven't landed very close to that goal yet, and the intermediate product is much worse than a form. But I am surprised that a technical community is not more supportive of the ambition.

Reply


@PaulHoule 7 days

Replying to @LittlePeter 🎙

For a while I was frustrated at how slow people have been to realize that GPT-3 sucks, but lately I am more amused.

There are a few structural reasons why it can't do what people want it to do. Two of them are: (i) it can't detect that it did the wrong thing at one level when interpreting it at a higher level, and (ii) most creative tasks have an element of constraint satisfaction.

The first one interests me because I was struggling with the need for text analysis systems to do that circa 2005, while looking at the old blackboard systems. I went to a talk by Geoff Hinton just before he became a superstar, where he said that instead of having a system with up-and-down data flow during inference, you should build a system with one-way data flow and train all the layers at once. As we know, that strategy has been enormously effective, but text analysis is where it goes to die, just as symbolic AI failed completely at visual recognition.

Like the old Eliza program, GPT-3 exploits human psychology. We are always looking to see ourselves mirrored:

https://www.nasa.gov/multimedia/imagegallery/image_feature_6...

Awkward people are always worried that we are going to get it 90% right but get shunned for getting the last 10% wrong. GPT-3 exploits "neurotypical privilege", in which it gets things partially correct but people give it credit for the whole. People think it will get to 100% if you just add more connections and training time, but because GPT-3 is structurally incorrect, adding resources means you converge on an asymptote, say 92% right. It's one of the worst rabbit holes in technology development and one of the hardest ones to get people to look at clearly. (They always think stronger, faster, harder is going to get there...)

It seems to me an effective chatbot will be based around structured interactions, starting out like an interactive voice response system and maybe growing in the direction of

http://inform7.com/

Reply


@abducer 6 days

Replying to @LittlePeter 🎙

I find IRC bots useful although they certainly qualify as dumb. They meet the user where the user is. They are generally just a dressed-up command-line interface. I've written some chat bots pre- and post-NLP explosion — in this era I've had luck with mild NLP mappings to commands, GPT-3 not required. Just get you some verbs and objects.

Dumb but functional > smart but… dysfunctional?
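The "verbs and objects" approach described above might look something like this minimal sketch; the commands and synonym table are invented for illustration:

```python
# Map free text onto bot commands by spotting a verb and an object,
# with a small synonym table instead of a full NLP pipeline.
COMMANDS = {
    ("deploy", "app"): "do_deploy",
    ("restart", "server"): "do_restart",
    ("show", "status"): "do_status",
}

SYNONYMS = {"redeploy": "deploy", "reboot": "restart", "display": "show"}

def route(text):
    """Return the handler whose verb and object both appear in the text."""
    words = {SYNONYMS.get(w, w) for w in text.lower().split()}
    for (verb, obj), handler in COMMANDS.items():
        if verb in words and obj in words:
            return handler
    return None  # no match: fall back to help text
```

Here `route("please reboot the server")` resolves to `"do_restart"` via the synonym table; anything unmatched returns `None` so the bot can print its help text.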

Reply


@harha 7 days

Replying to @LittlePeter 🎙

There’s a special little place in hell for whoever decided the whole world needed chat bots for every crappy website.

Why would I want to try to articulate something that could be found in a simple tree? Just give me direct access.

I don’t know where to find it: search! The issue is not covered in the standard workflow? Get me a real person!

Did anyone implementing these ever end-to-end test them for speed and user-friendliness? Did they just misinterpret why people want to talk to someone? I want to talk to someone because the process doesn't cover my case, not because I actually want to have a conversation with the broken process.

Reply


@kordlessagain 7 days

Replying to @LittlePeter 🎙

This article is misleading in that it critiques the usefulness of the OpenAI "chat" example with little or no related training examples passed as tokens along with the question, and it doesn't mention modifying the parameters (such as temperature) used during the call.

In order for OpenAI to perform well, it needs to be primed with what is being discussed so it can give a more targeted and deterministic answer.

We can get it to answer this question better without any further data by first preprocessing the question using keyword extraction (completions by GPT-3 are shown surrounded by stars):

  https://beta.openai.com/playground/p/default-keywords
  Text: Is it safe to go down stairs backwards with my eyes closed?
  Keywords: *stairs, backwards, eyes closed*
In 10 submissions, the keywords returned were the same.

Next, we ask for a completion from the keywords we were given, tweaking the settings just a bit to prevent repeating or too much randomized behavior:

  https://beta.openai.com/playground/p/default-keywords (temp to 0.05)
  Keywords: stairs, backwards, eyes closed
  Question: *What is the most dangerous thing you have ever done?*
Clearly GPT-3 is aware that something is dangerous here. Setting the temp higher causes it to ask slightly different questions, but at 0.05 it always returns what is above.

Now we combine the two questions and resubmit:

  https://beta.openai.com/playground/p/default-qa (frequency penalty to 1)
  Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
  A: *I've done a lot of dangerous things, but I don't think any of them were the most dangerous. Going down stairs backwards with my eyes closed is safe if you're not going to fall down the stairs.*
Given we leave the temp to defaults, we may also get alternate answers:

  https://beta.openai.com/playground/p/default-qa (frequency penalty to 1)
  Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
  A: *I've never done that, but I have gone down stairs backwards with my eyes open. It's not safe.*
Finally, we can take all the questions and answers so far and ask it a slightly different question:

  Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
  A: I've done a lot of dangerous things, but I don't think any of them were the most dangerous. Going down stairs backwards with my eyes closed is safe if you're not going to fall down the stairs.
  Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
  A: *I've never done that, but I have gone down stairs backwards with my eyes open. It's not safe.*
  Q: Is it safe to go down stairs backwards with a blindfold on?
  A: *It's safe if you're not going to fall down the stairs.*
  Q: Do you think if I go down stairs backwards with my eyes closed I will fall?
  A: *It's possible.*
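For reference, the playground steps above correspond roughly to calls against the Completions API of that era (the pre-1.0 `openai` Python package); the engine name and parameter values here mirror the settings mentioned and are assumptions, not a tested recipe:

```python
# Build the running Q/A prompt from prior exchanges, then request a
# completion with low temperature and a frequency penalty, as above.
def build_qa_prompt(question, history=()):
    """Assemble a prompt from prior (question, answer) pairs plus a new question."""
    lines = []
    for q, a in history:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)

def ask(question, history=()):
    import openai  # pre-1.0 package; needs an API key configured
    resp = openai.Completion.create(
        engine="davinci",          # assumed engine name
        prompt=build_qa_prompt(question, history),
        temperature=0.05,          # near-deterministic, as in the example
        frequency_penalty=1,       # discourage repetition
        max_tokens=64,
        stop=["\nQ:"],             # stop before inventing the next question
    )
    return resp["choices"][0]["text"].strip()
```

Feeding the earlier Q/A pairs back in as `history` is exactly the "combine and resubmit" trick used in the walkthrough.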

Reply


@raspberry-eye 7 days

Replying to @LittlePeter 🎙

Yeah… but so are most humans.

Reply


@marius_k 7 days

Replying to @LittlePeter 🎙

I view chatbots as a new era of CLIs (mostly poorly designed). Traditional CLIs don't need AI to be useful, and I think chatbots can also be useful (I haven't seen one yet).

Reply


@jamesbriggs 3 days

Replying to @LittlePeter 🎙

I think the use case of chatbots is better solved with open-domain Q&A (e.g. https://www.pinecone.io/learn/question-answering/). The focus of most chatbots seems to be on answering questions, wrapped up in a nice interface. That's fine, but the chatbot can only (accurately) answer questions that have an answer somewhere (probably buried deep in some Q&A pages). It's much more user-friendly, IMO, to have a Google-type interface where you can ask a question and get answers back, or at least an idea of where the answer is - and open-domain Q&A does this fairly well (as proven by Google).
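As a toy illustration of the open-domain Q&A interface mentioned above (real systems use dense retrievers plus a reader model, not this), here is a word-overlap scorer over an invented FAQ corpus:

```python
# Score stored passages by word overlap with the question and return
# the best-matching one. The passages are made-up examples.
PASSAGES = [
    "You can reset your password from the account settings page.",
    "Refunds are issued within 5 business days of cancellation.",
    "Two-factor authentication can be enabled under security options.",
]

def answer(question):
    q = set(question.lower().split())
    def overlap(passage):
        return len(q & set(passage.lower().rstrip(".").split()))
    return max(PASSAGES, key=overlap)
```

`answer("how do I reset my password")` returns the password passage. Note that with zero overlap everywhere, `max` silently falls back to the first passage, which is one reason real deployments also need a confidence threshold.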

Reply


@dandare 7 days

Replying to @LittlePeter 🎙

Maybe it is just me but I never use chatbots and I don't understand why anyone would.

For everything I want to do there should be a UI that is much easier to use than explaining it, even to a human.

For help and troubleshooting chatbots are pretty much useless. If I have a problem doing something via the UI then probably the developer did a bad job and no chatbot will ever do better.

Reply


@benjaminwootton 7 days

Replying to @LittlePeter 🎙

I’ve been spending time in Dubai. Many businesses have a WhatsApp bot based on a menu system:

1 - Book X
2 - Cancel Y
3 - Receive info on Z

Everyone comments that they work really well and are super convenient.

I think these have more potential than natural language bots.
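A menu bot of this kind needs almost no machinery; a sketch with invented options:

```python
# Numbered-menu bot: map a reply like "2" to an action, and re-show the
# menu on anything unrecognized. The options here are invented examples.
MENU = {
    "1": ("Book a table", "book"),
    "2": ("Cancel a booking", "cancel"),
    "3": ("Receive opening hours", "info"),
}

def menu_text():
    return "\n".join(f"{key} - {label}" for key, (label, _) in MENU.items())

def handle(reply):
    choice = reply.strip()
    if choice in MENU:
        return MENU[choice][1]  # the action keyword for this option
    return "show_menu"          # unrecognized input: show the menu again
```

The whole "NLU" is a dictionary lookup, which is exactly why these bots are predictable and convenient.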

Reply


@IceWreck 7 days

Replying to @LittlePeter 🎙

GPT-3 isn't supposed to answer questions. Didn't IBM's Watson win at Jeopardy?

Reply


@oneoff786 7 days

Replying to @LittlePeter 🎙

I find chatbots to be pretty good tbh. It’s like search but within nested levels of context.

Reply


@gwbas1c 7 days

Replying to @LittlePeter 🎙

I refuse to use chat bots. The technology never worked, and I don't want to waste my time with something that doesn't work.

What is happening is that some salesman is laughing all the way to the bank. A few months ago, a salesman I work with asked if we should put a chat bot on our website (i.e., with the tone that he wasn't going to take no for an answer).

I responded that they don't work, and will frustrate people who come to our website. I also pointed out that we are a high-cost asset, with a high-touch sales process. Such a chat bot would be insulting.

His response was some form of "but everyone's using it and they're super-popular and work well."

I then pointed out that the article he read was probably written by the company that sells them.

Reply


@skeeter2020 7 days

Replying to @LittlePeter 🎙

I don't think they're actually intended to answer questions as much as to be a cost-effective attempt to instill some sense of agency and audience for the user. TL;DR they fail at this too.

Reply


@wombatmobile 7 days

Replying to @LittlePeter 🎙

The charm of Eliza is that it was simply a Rogerian therapist who didn't try to be intelligent.

Eliza's talent was in getting you to express yourself, free from inhibition. That doesn't require “intelligence”, but it does require the art of listening. There's nothing dumb about that.

Reply


@malaya_zemlya 7 days

Replying to @LittlePeter 🎙

There's a whole dark art of writing prompts for chat AIs in order to make them behave in a sensible manner. The reason is that GPT doesn't have any context at all besides what's in the supplied text. If you don't tell it exactly what to do, it will guess randomly.

For example this chat prompt gives much more matter of fact answers, in my testing:

"The following is a conversation with an AI assistant. The assistant is helpful, clever and friendly. It uses Wikipedia as the reference. Human: Hi! AI: Hi! Human: <your question goes here>"
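Templating that priming text is straightforward; this hypothetical helper just substitutes the user's question into the prompt quoted above:

```python
# Fill the priming template with the user's question. The template text
# follows the prompt quoted above; the helper itself is hypothetical.
TEMPLATE = (
    "The following is a conversation with an AI assistant. "
    "The assistant is helpful, clever and friendly. "
    "It uses Wikipedia as the reference. "
    "Human: Hi! AI: Hi! Human: {question} AI:"
)

def prime(question):
    return TEMPLATE.format(question=question)
```

The completion request then sends `prime(question)` instead of the bare question, so the model has at least some context to anchor its answer.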

Reply


@goblinux 7 days

Replying to @LittlePeter 🎙

Whatever happened to the SmarterChild bot? I remember being amazed as a kid that it was a robot on AIM that would reply just like a person. I don't remember it being dumb like modern "AI" chatbots; it would play coy in a reasonable way if it didn't know the answer. I feel like we've regressed from there.

RIP old buddy. I hope you didn't save our chat logs from that era because man that would be cringey to look at now

Reply


@swayson 7 days

Replying to @LittlePeter 🎙

You need a really good team and an MLOps/DevOps pipeline to deliver world-class chatbot support and performance. I think there is still opportunity in this space, but you need Apple-level design care to make it work.

Reply


@xibalba 7 days

Replying to @LittlePeter 🎙

Facebook shutting down M in 2018 should have been a pretty clear sign that the prospects for good chatbots are grim. Even with their massive resources and top talent, they concluded it was a bad bet.

Reply


@Fnoord 7 days

Replying to @LittlePeter 🎙

Today on HN there was a post showing a GitHub Copilot chat between a user and the AI. I thought it seemed pretty clever with its syntax completion / suggestions.

Reply


@dr_orpheus 7 days

Replying to @LittlePeter 🎙

Original article and discussion on HN: https://news.ycombinator.com/item?id=29825612

Reply


@martincmartin 7 days

Replying to @LittlePeter 🎙

The title references Paul Simon's album & song "Still Crazy After All These Years."

https://en.wikipedia.org/wiki/Still_Crazy_After_All_These_Ye...

Reply


@brightball 7 days

Replying to @LittlePeter 🎙

Wasn’t there a story about a Georgia Tech professor who coded a chatbot to act as a TA for his class, and nobody realized it wasn’t a real person?

EDIT - Found it: Jill Watson

https://www.businessinsider.com/a-professor-built-an-ai-teac...

Reply


@arpinum 7 days

Replying to @LittlePeter 🎙

I saw the development of a chatbot based on the IBM Watson stuff. It was just like a phone tree map, except the system tries to guess the option selected based on the intents found in the speech/text. Of course it got no adoption, except when forced on users to drive metrics.

It was enormously expensive; it would have been cheaper to have a human on the other end.

Reply


@davidhariri 7 days

Replying to @LittlePeter 🎙

(I am a chatbot company co-founder)

The beauty of machine learning systems is that they don't have to be perfect to provide value. As long as the risk and frequency of failure doesn't outweigh the probability and value of success, they can be enjoyed by millions. I see the proof every day.

Is self-driving perfect? No, but correcting my car 20% of the time is worth the 80% of the time when it cruises along just fine. I don't have to be able to sleep for it to be valuable.

Is G suite's text completion perfect? No, but the risk of it being wrong is low and when it's right it saves me typing out common phrases. It doesn't have to write my emails for me to be valuable.

Are chatbots humans? No, of course not. Can they answer common questions successfully? Yes. Can they automate simple workflows? Yes! Can they augment human teams to make their time more valuable and reduce wait times? Absolutely. They already are and will continue to evolve and get better.

I do acknowledge that it's frustrating to ask a question that you know a person would be able to answer and get a worse automated answer first. It's critical that companies ensure these failure modes smoothly transition to a human who will at least have the context of your issue before you speak. Smooth "hand off" is something we've spent thousands of person-hours on.

Technologies like GPT-3 are exciting advancements in language generation, but they do struggle with predicting factual language. I expect that will become less and less of a problem as businesses and platforms seek to adopt it. OpenAI is actively working on this: https://openai.com/blog/improving-factual-accuracy/

Reply



@megumax 7 days

Replying to @LittlePeter 🎙

The idea of completely replacing human beings with chatbots isn't going to succeed. They have their own uses, not very advanced ones: for example, replacing some web interface with chatting in WhatsApp/Telegram (some companies have already adopted that), or filtering people in a call center. But for something more complex that requires actual experience and real-life understanding, they should connect you to a real person who can comprehend your messages.

Reply


@FredPret 7 days

Replying to @LittlePeter 🎙

I don't know, I've been asking to "let me talk to a human" and it works nearly every time!

Reply


site design / logo © 2022 Box Piper