
Ranking GPT-5 against LLMs

  • David Stephen 

AI benchmarks have created a false impression of how to evaluate AI models: test AI on complex questions that most humans cannot answer, and even when AI does well, conclude that it has not matched vital human intelligence.

If AI gets things right that many humans would, on average, get wrong, should that not count as a significant leap? If AI was trained on quality knowledge of professional value and can accurately reproduce it, does it matter that a human can do something intelligent in the real world that AI cannot, when that ability has no economic value?

Yes, measuring human intelligence by economic value is inappropriate, but society already seems to do so in how resources are allocated and how rewards are split.

When prompted, AI can solve certain problems that would have taken far more effort to solve without it. AI is showing its competitive ability in two principal spheres: against the brain, and in economics [supplying intelligence].

It does not matter whether AI passes a benchmark test, or whether a model is older or newer; if AI outputs intelligence that is useful and valuable, it is doing what the brain does, and showing that it can approach, match, or surpass it [in intelligence].

There are many recommendations that world models are necessary for AI to exceed transformers. Yet many of the human activities that AI cannot do seem simple for humans and are not considered smart. Some say AI cannot solve major new problems, but there are major fields of science where experts who know as much also cannot solve the open problems there. The comparison is not balanced, since AI merely has access to the knowledge; it is almost impossible for a person to learn everything about a major field on a compressed timeline. AI is currently in possession of a lot of important knowledge. Brains may differ.

GPT-5 

The chatter among some is that GPT-5 is not a transcendent leap in AI architecture and that scaling laws are obsolete. From the standpoint of conceptual brain science, such a conclusion is unlikely to hold. AI has come for the human brain. Some look ahead to other AI models, AGI [artificial general intelligence], superintelligence and so on, but any good enough AI [say, in the GPT-4o range] may already slightly displace vital [intelligence] aspects of the brain's functionality.

Gloating over GPT-5 explains nothing about how the brain works. It contributes nothing to the mechanisms that could be used to explain neurological observations. It is better to explore extensively how to improve natural human intelligence, and how to shape the mind against unwanted emotional outcomes from AI, and then to develop rigorous AI alignment and AI safety models that can be adopted across the industry.

The reality of AI is here. It is already adopted socially and productively. Whatever its flaws, it is doing much for many already. AI companies can be criticized for unethical, unfair, and heavy-handed methods, and for not having human intelligence research labs. But what can be done toward a solution?

Human Intelligence 

How does human intelligence work? If someone takes a new advanced class, why is grasping it sometimes hard? Why is the school system graded, requiring years and several curricula? What should be understood about how humans learn, to chart a better path to understanding and problem-solving? If humans went to school to increase their chances of work, but AI can now do some of that work, how should new purposes for learning be structured?

GPT-5 makes errors, and samples of those errors are used to disdain it, but matched against a regular brain on knowledge of value, the story might be different.

There is a new [August 12, 2025] spotlight in The New Yorker, What If A.I. Doesn’t Get Much Better Than This?, stating that, “GPT-5, a new release from OpenAI, is the latest product to suggest that progress on large language models has stalled. If building ever-bigger models was yielding diminishing returns, the tech companies would need a new strategy to strengthen their A.I. products. They soon settled on what could be described as ‘post-training improvements.’ The leading large language models all go through a process called pre-training in which they essentially digest the entire internet to become smart. But it is also possible to refine models later, to help them better make use of the knowledge and abilities they have absorbed. One post-training technique is to apply a machine-learning tool, reinforcement learning, to teach a pre-trained model to behave better on specific types of tasks. Another enables a model to spend more computing time generating responses to demanding queries.”
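To make the quote concrete, the two post-training techniques it names, reinforcement learning on specific tasks and extra computing time per query, can be sketched in a few lines of toy Python. This is a hypothetical illustration, not OpenAI's actual method: the model, the reward, and every name below are invented for the sketch, and "more computing time" is shown as best-of-n sampling, one common way such budgets are spent.

```python
import random

# Toy sketch of post-training (hypothetical throughout; not OpenAI's method).
# The "model" is just a weighted choice over three canned response styles.
CANDIDATES = ["short answer", "step-by-step answer", "hedged answer"]

def generate(weights):
    """Sample one response style, biased by the model's current weights."""
    return random.choices(CANDIDATES, weights=[weights[c] for c in CANDIDATES])[0]

def reward(response):
    """Invented task-specific reward: graders prefer step-by-step answers."""
    return 1.0 if response == "step-by-step answer" else 0.0

def reinforcement_step(weights, lr=0.1):
    """One toy RL update: upweight a sampled response when it earns a reward."""
    response = generate(weights)
    weights[response] += lr * reward(response)

def answer_with_more_compute(weights, n=16):
    """'More computing time' as best-of-n: sample n responses, keep the best."""
    return max((generate(weights) for _ in range(n)), key=reward)

if __name__ == "__main__":
    weights = {c: 1.0 for c in CANDIDATES}  # pre-trained model, uniform to start
    for _ in range(500):                    # post-training loop
        reinforcement_step(weights)
    print("post-trained weights:", weights)
    print("high-compute answer:", answer_with_more_compute(weights))
```

Running it shows the weights drifting toward the rewarded style, which is the point of the quote's framing: post-training adds no new knowledge, it only steers the model to make better use of what it has already absorbed.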

There is a recent [August 13, 2025] article on ZDNet, Why GPT-5’s rocky rollout is the reality check we needed on superintelligence hype, stating that, “In the days since it was released, the new AI model has received a fair amount of negative feedback and negative press — surprising given that, the week before, the reception to the company’s first open-source models in six years was widely acclaimed.”
