It turns out that Google published a research paper back in 2021 detailing how an AI-based information retrieval system could be an alternative to traditional search engines. This predates ChatGPT and the search services built on top of it, like Perplexity.ai. The paper begins with some background on why this would be a worthwhile development:

Given an information need, users often turn to search engines for help. Such systems point them in the direction of one or more relevant items from a corpus. This is appropriate for navigational and transactional intents (e.g. home page finding or online shopping) but typically less ideal for informational needs, where users seek answers to questions they may have… The very fact that ranking is a critical component of this paradigm is a symptom of the retrieval system providing users a selection of potential answers, which induces a rather significant cognitive burden on the user.

[…]

State-of-the-art pre-trained language models are capable of directly generating prose that may be responsive to an information need. However, such models are dilettantes – they do not have a true understanding of the world, they are prone to hallucinating, and crucially they are incapable of justifying their utterances by referring to supporting documents in the corpus they were trained over.

Taking a step back, it is true that search engines like Google are best at retrieving a list of relevant documents, not at answering questions. This is intuitive but easily forgotten amid all of the hype around systems like ChatGPT.

The paper goes on to list some areas in need of further research:

  • How to implement continuous, incremental learning
  • How to make the AI “forget” specific pieces of information, for legal or privacy reasons
  • How to ensure the model is predictable, interpretable, and debuggable
  • How to lower inference costs (i.e. the cost to run each query)

With all of the fervor around AI-based search, it will be interesting to see how many of these points are still open problems a year from now.


In related news, Reed Albergotti at Semafor reported that GPT-4 might appear in Bing soon:

Microsoft’s second-place search engine Bing is poised to incorporate a faster and richer version of ChatGPT, known as GPT-4, into its product in the coming weeks

[…]

OpenAI is also planning to launch a mobile ChatGPT app and test a new feature in its Dall-E image-generating software that would create videos with the help of artificial intelligence.

[…]

The most interesting improvement in the latest version described by sources is GPT-4’s speed

This is such a strange set of rumors, especially the fact that the only noted change in GPT-4 is its speed. Plus, it is being launched as a Bing integration? A new version of Dall-E for video would be super exciting, but I am really skeptical about everything else in this report.