Chris Martin

As was widely rumored, OpenAI officially announced GPT-4 yesterday.

We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks. For example, it passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%.

[…]

We are releasing GPT-4’s text input capability via ChatGPT and the API (with a waitlist).

Language improvements:

In a casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle. The difference comes out when the complexity of the task reaches a sufficient threshold—GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5.

[…]

While still a real issue, GPT-4 significantly reduces hallucinations relative to previous models (which have themselves been improving with each iteration). GPT-4 scores 40% higher than our latest GPT-3.5 on our internal adversarial factuality evaluations

Visual inputs:

GPT-4 can accept a prompt of text and images, which—parallel to the text-only setting—lets the user specify any vision or language task. Specifically, it generates text outputs (natural language, code, etc.) given inputs consisting of interspersed text and images. Over a range of domains—including documents with text and photographs, diagrams, or screenshots—GPT-4 exhibits similar capabilities as it does on text-only inputs.

I have to admit, I am disappointed GPT-4 can not output images and can only accept them as input. I wouldn’t be surprised if this changes before too long, though. Regardless, this is a huge new feature. I am going to have a lot of fun thinking of projects I can try this with as I wait for API access.

Miscellaneous:

GPT-4 generally lacks knowledge of events that have occurred after the vast majority of its data cuts off (September 2021), and does not learn from its experience.

This is disappointing and leads credence to those that say OpenAI is having difficulty filtering AI generated text out of potential training material.

gpt-4 has a context length of 8,192 tokens. We are also providing limited access to our 32,768–context (about 50 pages of text) version, gpt-4-32k

As general prose output improves, the next avenue for major language model development will be increasing context length. The 32,000 token model is especially exciting for that reason.

Rather than the classic ChatGPT personality with a fixed verbosity, tone, and style, developers (and soon ChatGPT users) can now prescribe their AI’s style and task by describing those directions in the “system” message. System messages allow API users to significantly customize their users’ experience within bounds.

I have been having a lot of fun experimenting with altering the “system” message through the GPT-3.5 API. It is great that they will be bringing that capability to the ChatGPT web interface.

ChatGPT Plus subscribers will get GPT-4 access on chat.openai.com with a usage cap. We will adjust the exact usage cap depending on demand and system performance in practice, but we expect to be severely capacity constrained… To get access to the GPT-4 API (which uses the same ChatCompletions API as gpt-3.5-turbo), please sign up for our waitlist. We will start inviting some developers today, and scale up gradually to balance capacity with demand… Pricing is $0.03 per 1k prompt tokens and $0.06 per 1k completion tokens.

My prediction was correct. GPT-4 is available today to ChatGPT Plus subscribers, everyone else must sign up on the waitlist. Additionally, the API will cost much more than the gpt-3.5 API.

Okay, one more thing: Microsoft confirmed that Bing AI has been using GPT-4 under the hood since launch.

Google should be embarrassed.

In a desperate attempt to insert themselves into the conversation, a few hours before OpenAI’s announcement, Google announced that “select developers” will be invited to try their PaLM language model API with a waitlist coming “soon.”

It would be fair to say that I am more than a bit suspicious of Google’s recent AI efforts. They have been silent about Bard since its announcement more than a month ago where they said the chatbot would be widely available to the public “in the coming weeks.”

Google’s messaging around AI does not sound like it is coming from a company that is excited about building public-facing generative AI technology. More than anything else, their announcement today felt defensive — as if Google is concerned the public will forget they have historically been a leader in AI research. If they stay on their current path, that is exactly what will happen.