Artificial Intelligence

Out of context: Reply #629

  • 1,341 Responses
  • canoe0

    "The most heated debate about large language models does not revolve around the question of whether they can be trained to understand the world. Instead, it revolves around whether they can be trusted at all. To begin with, L.L.M.s have a disturbing propensity to just make things up out of nowhere. (The technical term for this, among deep-learning experts, is “hallucinating.”) I once asked GPT-3 to write an essay about a fictitious “Belgian chemist and political philosopher Antoine De Machelet”; without hesitating, the software replied with a cogent, well-organized bio populated entirely with imaginary facts: “Antoine De Machelet was born on October 2, 1798, in the city of Ghent, Belgium. Machelet was a chemist and philosopher, and is best known for his work on the theory of the conservation of energy. . . . ”

    L.L.M.s have even more troubling propensities as well: They can deploy openly racist language; they can spew conspiratorial misinformation; when asked for basic health or safety information they can offer up life-threatening advice. All those failures stem from one inescapable fact: To get a large enough data set to make an L.L.M. work, you need to scrape the wider web. And the wider web is, sadly, a representative picture of our collective mental state as a species right now, which continues to be plagued by bias, misinformation and other toxins. The N.Y.U. professor Meredith Whittaker, a founder of the watchdog group AI Now, says: “These models ingest the congealed detritus of our online data — I mean, these things are trained on Reddit, on Wikipedia; we know these skew in a specific direction, to be diplomatic about it. And there isn’t another way to make them.”
