Current Affairs Reading: Artificial Intelligence Learns to Reason


Artificial Intelligence Learns to Reason

CET-6 | 480

Contributed by Liu Lijun


Part I. Passage


Artificial Intelligence Learns to Reason


Reasoning is a central feature of human intelligence, but it remains a challenge for large language models (LLMs) like GPT-4. Consider a simple puzzle: Julia has two sisters and one brother. How many sisters does her brother Martin have? Solving this requires reasoning, which LLMs often struggle with. However, a new generation of AI systems, known as large reasoning models (LRMs), is emerging to address this challenge.


LRMs are designed to perform specific types of reasoning and have shown significant improvements in solving complex problems. For example, OpenAI's o1 model, one of the leading LRMs, can generate a chain of thought to solve the puzzle about Julia and Martin. Unlike traditional LLMs, which often fail at such reasoning tasks, o1 can produce a logical sequence of steps to arrive at the correct answer:


1. Julia has two sisters, so there are three girls in total.

2. Julia has one brother, Martin.

3. There are four siblings in total: three girls and one boy.

4. From Martin's perspective, he has three sisters.

5. Therefore, Martin has three sisters.


This ability to generate reasoning steps has enabled LRMs to achieve impressive results on various benchmarks, including math, science, and coding challenges. Some companies are even betting big on LRMs as the basis for commercially lucrative AI assistants. For example, OpenAI has released its best LRMs to subscribers for a monthly fee, with plans to charge even more for advanced reasoning models.


However, not everyone is convinced that LRMs are truly reasoning. Some argue that these models are merely mimicking the human reasoning patterns they were trained on, rather than performing general and robust problem-solving. The debate centers on whether LRMs are genuinely reasoning or just pretending to.


LRMs are built on top of pretrained LLMs and undergo further training to generate chains of thought. This process involves both supervised learning, where models are trained on human-generated reasoning steps, and reinforcement learning, where models are rewarded for generating correct answers and well-structured reasoning steps. Reinforcement learning, in particular, has shown great potential in enabling LRMs to develop advanced problem-solving strategies without explicit human guidance.
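
To make these two training signals concrete, below is a minimal Python sketch, purely illustrative and not any company's actual pipeline: it scores a generated solution the way a simple reinforcement-learning reward might, giving full credit for a correct final answer plus a small bonus (an invented value of 0.2) when the reasoning is laid out as numbered steps.

    # A toy sketch of the reward signal described above. The sample
    # outputs and the 0.2 structure bonus are hypothetical; real LRM
    # training uses learned reward models and far more elaborate checks.
    def reward(chain_of_thought: str, final_answer: str, correct_answer: str) -> float:
        """Score one generated solution: answer correctness plus a structure bonus."""
        score = 1.0 if final_answer.strip() == correct_answer else 0.0
        # Small bonus if every non-empty line begins with a step number.
        lines = [ln for ln in chain_of_thought.splitlines() if ln.strip()]
        if lines and all(ln.lstrip()[0].isdigit() for ln in lines):
            score += 0.2
        return score

    # Two candidate solutions to the Julia/Martin puzzle from the passage.
    good = ("1. Julia has two sisters, so there are three girls.\n"
            "2. Martin is their brother, so he has three sisters.")
    bad = "Martin has two sisters."

    print(reward(good, "3", "3"))  # 1.2 -> correct answer with numbered steps
    print(reward(bad, "2", "3"))   # 0.0 -> wrong answer, no structure bonus

In real systems the reward typically comes from automated checkers or learned reward models rather than simple string comparisons, but the principle of reinforcing correct, well-structured chains of thought is the same.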


Despite their impressive performance on benchmarks, LRMs still face challenges. Their reasoning steps are often long and computationally expensive, and their overall robustness remains largely untested, especially for real-world problems that do not have clear answers or defined solution steps. Additionally, the human-like language used by LRMs may mislead users into trusting them too much, when their underlying mechanisms may not be fully reliable.


More research is needed to determine the true capabilities of LRMs. While some companies remain secretive about their models, others, like DeepSeek, are releasing detailed information about their LRMs to facilitate further study. As the debate over the nature of reasoning in AI continues, the development of LRMs will undoubtedly play a crucial role in shaping the future of artificial intelligence.



Vocabulary


1. intelligence n. intellect; intelligence

2. sibling n. brother or sister

3. benchmark n. benchmark; standard

4. mimic v. to imitate; to copy

5. mechanism n. mechanism; principle


Part II. Questions


Q1. What distinguishes large reasoning models (LRMs) from traditional large language models (LLMs)?

A. LRMs can solve real-world problems without predefined solutions.  

B. LRMs are trained to generate structured reasoning steps.  

C. LRMs rely solely on reinforcement learning for their training.  

D. LRMs achieve higher computational efficiency than LLMs.  


Q2. Why do some researchers remain skeptical about LRMs' reasoning abilities?

A. LRMs lack the computational power to handle reasoning tasks.  

B. LRMs fail to perform well on benchmarks involving human reasoning.  

C. LRMs may not be fundamentally reasoning but mimicking patterns.  

D. LRMs rely too heavily on human-generated reasoning steps.  


Q3. Based on the text, what is the most likely future challenge for LRMs in real-world applications?

A. Achieving computational efficiency for large-scale reasoning tasks.  

B. Developing reasoning capabilities without human supervision.  

C. Replacing traditional LLMs in all AI-related domains.  

D. Ensuring robustness in solving problems with unclear solutions.  


Q4. According to the text, what is a potential risk of relying on LRMs?

A. Their reasoning steps are often too simplistic.  

B. Their human-like language may lead to overtrust.  

C. Their training process lacks transparency.  

D. Their reliance on supervised learning limits scalability.  


Q5. What is the significance of companies like DeepSeek releasing detailed information about their LRMs?

A. It promotes further research into the true capabilities of LRMs.  

B. It helps establish LRMs as the dominant AI models in the industry.  

C. It ensures the commercial success of LRMs in global markets.  

D. It highlights the limitations of LRMs in real-world applications.  


Part III. KEY


Q1. B. [Analysis] Detail question. According to "LRMs are designed to perform specific types of reasoning and have shown significant improvements in solving complex problems" and "Unlike traditional LLMs, which often fail at such reasoning tasks, o1 can produce a logical sequence of steps to arrive at the correct answer," LRMs are trained to generate a structured sequence of reasoning steps, which is what distinguishes them from traditional LLMs. Therefore, the correct answer is B.


Q2. C. [Analysis] Detail question. According to "Some argue that these models are merely mimicking human reasoning patterns they were trained on, rather than performing general and robust problem-solving," some researchers believe these models are only imitating the human reasoning patterns they were trained on rather than genuinely reasoning. Therefore, the correct answer is C.


Q3. D. [Analysis] Inference question. According to "their overall robustness remains largely untested, especially for real-world problems that do not have clear answers or defined solution steps," the overall robustness of LRMs is still largely untested, especially for real-world problems without clear answers or defined solution steps. Therefore, the correct answer is D.


Q4. B. [Analysis] Detail question. According to "Additionally, the human-like language used by LRMs may mislead users into trusting them too much, when their underlying mechanisms may not be fully reliable," the human-like language of LRMs may mislead users into over-trusting them even though their underlying mechanisms may not be fully reliable. Therefore, the correct answer is B.


Q5. A. [Analysis] Detail question. According to "While some companies remain secretive about their models, others, like DeepSeek, are releasing detailed information about their LRMs to facilitate further study," companies such as DeepSeek are releasing detailed information about their LRMs to promote further research. Therefore, the correct answer is A.



  • Word count: 934
  • Readability: Difficult
  • Source: Liu Lijun, 2026-02-13