Nvidia Releases Open Reasoning AI for Self-Driving Vehicles

Source: https://aibusiness.com/automation/nvidia-open-reasoning-ai-self-driving-vehicles
Summary:
At the NeurIPS conference held recently in San Diego, Nvidia unveiled a new AI model named Alpamayo-R1 (AR1), which it describes as the world's first industry-scale open reasoning vision language action model for autonomous driving. The system, named after a treacherous peak in the Peruvian Andes, combines chain-of-thought reasoning with path planning so that, much like a human driver, it can break down a complex traffic scene and weigh multiple options before acting.
The model processes text and images together, allowing a vehicle's sensors to turn what they perceive into natural-language descriptions. Nvidia says this capability is critical for reaching SAE Level 4 automation, the level at which a vehicle drives itself fully under specific conditions. In a technical blog post, company vice president Bryan Catanzaro gave an example: in a pedestrian-heavy area next to a bike lane, AR1 can record the basis for its decisions as reasoning traces and use them to plan a route that moves away from the bike lane or yields to a jaywalking pedestrian.
Beyond that scenario, the model shows human-like reasoning in other complex situations, such as pedestrian-heavy intersections, upcoming lane closures, and bike lanes blocked by illegally parked vehicles. Because its reasoning is externalized, engineers can trace the logic behind each driving decision, a valuable reference for improving autonomous-driving safety. The model is open-sourced on GitHub and Hugging Face, where researchers can customize it for non-commercial use.
Original article:
Alpamayo-R1 was introduced this week at NeurIPS and aims to achieve Level 4 automation with human-like reasoning.
Nvidia used this year's NeurIPS conference to reveal new AI that it hopes will help accelerate progress toward widespread self-driving vehicles.
At the event in San Diego, the company presented Alpamayo-R1 (AR1), which it described as the world's first industry-scale open reasoning vision language action (VLA) model for autonomous driving.
VLA models can process text and images together, meaning vehicle sensors can translate what they "see" into descriptions that use natural language.
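As a rough illustration of that input-output shape, here is a minimal sketch in Python. It assumes a hypothetical interface (the names VlaOutput and describe_and_act are invented for illustration and are not Nvidia's actual AR1 API): camera frames and a text prompt go in, and a natural-language scene description plus a high-level action come out.

    # Hypothetical sketch of a VLA-style interface: frames + prompt in,
    # natural-language description + action out. Not Nvidia's actual AR1 API.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class VlaOutput:
        scene_description: str  # what the sensors "see", in natural language
        action: str             # high-level driving action, e.g. "slow down"

    def describe_and_act(frames: List[bytes], prompt: str) -> VlaOutput:
        # A real VLA model would jointly encode the frames and the prompt and
        # decode text; this stub only shows the shape of the interaction.
        scene = "pedestrians crossing ahead, cyclist in the adjacent bike lane"
        return VlaOutput(scene_description=scene, action="reduce speed and yield")

    result = describe_and_act(frames=[b"<camera frame>"],
                              prompt="Describe the scene and choose an action.")
    print(result.scene_description, "->", result.action)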
Nvidia's software -- named after a mountain in the Peruvian Andes considered challenging to scale -- combines chain of thought AI reasoning with path planning. This allows it to better process complex situations than previous iterations of self-driving software by breaking down a scenario and considering all possible options, just as a human would do, before proceeding.
This ability, Nvidia said, will be "critical" in helping to achieve Level 4 automation -- defined by the Society of Automotive Engineers as when a car is in complete control of the driving process in specific circumstances.
In a blog post published to coincide with the unveiling of Alpamayo-R1, Bryan Catanzaro, Nvidia vice president of applied deep learning research, provided an example of how it would work.
Catanzaro said: "By tapping into the chain-of-thought reasoning enabled by AR1, an AV [autonomous vehicle] driving in a pedestrian-heavy area next to a bike lane could take in data from its path, incorporate reasoning traces -- explanations on why it took certain actions -- and use that information to plan its future trajectory, such as moving away from the bike lane or stopping for a potential jaywalker."
Other nuanced scenarios cited by Nvidia where AR1's human-style reasoning would assist include pedestrian-heavy intersections, an upcoming lane closure or a vehicle double-parked in a bicycle lane.
Because AR1 effectively thinks aloud as it reasons, engineers get clearer insight into why it made a specific decision, which helps them understand how to make vehicles safer.
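A minimal sketch of what such externalized reasoning could look like when logged is shown below; the ReasoningTrace and PlannedTrajectory structures are assumptions made for illustration, not AR1's actual output format.

    # Illustrative only: attach a reasoning trace to a planned trajectory so an
    # engineer can audit why a maneuver was chosen. Not AR1's real data format.
    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class ReasoningTrace:
        observations: List[str]  # e.g. "bike lane occupied by a double-parked car"
        rationale: str           # chain-of-thought summary behind the chosen action
        chosen_action: str       # e.g. "move away from the bike lane"

    @dataclass
    class PlannedTrajectory:
        waypoints: List[Tuple[float, float]]  # (x, y) points in the vehicle frame, meters
        trace: ReasoningTrace                 # explanation logged alongside the plan

    plan = PlannedTrajectory(
        waypoints=[(0.0, 0.0), (5.0, 0.5), (10.0, 1.0)],
        trace=ReasoningTrace(
            observations=["pedestrian near the crosswalk", "cyclist approaching from behind"],
            rationale="Yielding to the pedestrian is safer than holding speed; "
                      "shifting away from the bike lane avoids the cyclist.",
            chosen_action="move away from the bike lane and prepare to stop",
        ),
    )
    # An engineer reviewing logs can read the rationale directly:
    print(plan.trace.rationale)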
The model is based on Nvidia's Cosmos Reason, which launched earlier this year, and its open access will allow researchers to customize it for their own non-commercial use cases, either for benchmarking or building their own AVs.
AR1 is available on GitHub and Hugging Face, and according to Catanzaro, reinforcement learning post-training has proven "especially effective," with researchers reporting "significant improvement" in reasoning capabilities.
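For researchers who want to try the checkpoint locally, a minimal loading sketch follows, assuming the model is published in a standard Hugging Face format; the repository id "nvidia/Alpamayo-R1" is a guess, so check the official GitHub and Hugging Face pages for the real identifier and usage instructions.

    # Assumed loading path via Hugging Face transformers; the repo id below is a
    # placeholder guess, not a confirmed identifier.
    from transformers import AutoModel, AutoProcessor

    repo_id = "nvidia/Alpamayo-R1"  # hypothetical id, replace with the published one
    processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)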