Tuesday, February 14, 2023

A Closer Look Shows Microsoft’s ChatGPT Search Is Just as Flawed as Google’s Bard

Google has been getting raked over the coals since its unveiling of Bard, a ChatGPT competitor that is eventually coming to Google search. Google’s chatbot flubbed one of the few queries posed to it in a demo image, but Microsoft’s GPT-powered Bing didn’t perform perfectly either. A closer analysis of Microsoft’s demo revealed myriad errors, raising the question: Can we trust these machines? Search engine researcher Dmitri Brereton has put Microsoft’s Bing demo under the microscope, revealing that the supposedly more advanced chatbot made more than its fair share of mistakes.

One of the queries in the Microsoft demo included researching pet hair vacuums. According to Brereton, Bing incorrectly claimed one of the models it singled out was loud and had a short cord. However, the sources it cited say it’s quiet and cordless. When helping to plan a trip to Mexico, Bing offered some suggestions for places to enjoy the nightlife, but it claimed several of the recommended bars didn’t have reviews when, in fact, there are hundreds. It also recommended a popular bar without mentioning that it’s a gay bar.

So, Bing missed some important things, but that’s probably fixable. More troubling is how it handled summarizing a PDF. In the demo, Microsoft asked Bing to generate the takeaways from Gap’s Q3 2022 financial report. Here, Bing made up some numbers, for example claiming an operating margin of 5.9%. That number doesn’t appear anywhere in the document. It got even worse when Bing was asked to compare data from Gap and Lululemon: it invented even more numbers out of thin air, making the comparisons meaningless.

Bing confidently and inaccurately summarizes Gap’s financial report. Credit: Microsoft

Microsoft got away with this at the event because no one knows off the top of their head what Gap’s financials look like. Likewise, there aren’t many people who are sufficiently familiar with Mexico City nightlife to spot errors when they’re only on the screen for a moment. However, these answers are just as wrong as Bard’s high-profile flub when asked about the James Webb Space Telescope.

The new chatbot-powered Bing is available to a small number of testers. You can sign up for the waitlist, but if the demo is any indication, the new Bing will need much more testing before it’s worth believing.


Source: ExtremeTech https://ift.tt/xRa7SiO
