Microsoft Research study shows how AI bots fail at long chats

Top AI research labs have released sophisticated AI models, and chatbots built on them, to cement their brands in a fast-changing landscape that is honestly becoming difficult to keep up with. However, users often complain about these offerings, citing hallucinations or outright wrong responses to queries.

A research paper by Microsoft Research and Salesforce analyzed more than 200,000 AI conversations from the most advanced large language models (LLMs), including GPT-4.1, Gemini 2.5 Pro, Claude 3.7 Sonnet, o3, DeepSeek R1, and Llama 4. It found that these tools often get “lost in conversation” when a task is broken up across a natural, multi-turn conversation (via NeuroNad).
