The era of all-access AI agents is here.

Published by qimuai · 18 reads · First-hand translation


Source: https://www.wired.com/story/expired-tired-wired-all-access-ai-agents/

Summary:

The wave of generative AI "agents" is arriving: data privacy faces an unprecedented challenge

For years, the price of using "free" services from tech giants such as Google, Facebook, and Microsoft has been handing over personal data. Uploading your life to the cloud brings convenience, but it also places personal information in the hands of large corporations that often look to monetize it. Now, a new generation of generative AI systems is seeking broader access to that data than ever before.

Over the past two years, generative AI tools, from OpenAI's ChatGPT to Google's Gemini, have evolved from the early text-only chatbots into the "agents" and "assistants" that tech companies are now pushing hard. They promise to take actions and complete tasks on users' behalf, but only if users grant them access to their systems and data. If the early controversy around large language models centered on the unauthorized copying of copyrighted data from the web, AI agents' deep reach into personal data is likely to raise a new set of privacy and security problems.

"AI agents, in order to have their full functionality and to access applications, often need to reach the operating-system level of the device," says Harry Farmer, a senior researcher at the Ada Lovelace Institute, whose research has found that AI assistants may pose a "profound threat" to cybersecurity and privacy. Personalized service usually involves a data trade-off, he notes: "All those things, in order to work, need quite a lot of information about you."

There is no strict definition of an AI agent, but its core characteristic is a generative AI system that has been given some degree of autonomy. Today, assistants of all kinds, including AI browsers, can take over a device, browse the web on the user's behalf, book flights, conduct research, or add items to shopping carts; some can even complete complex tasks involving dozens of steps.

Although current AI agents are still glitchy and often fail to complete the tasks they are given, tech companies are betting that as these systems become more capable they will fundamentally change how millions of people work. The key to their usefulness lies precisely in access to data: a system that manages your schedule and tasks needs access to your calendar, messages, email, and more.

Some more advanced AI products already hint at the scope of access agents could be given: agents built for businesses can read code, emails, databases, Slack messages, Google Drive files, and more; Microsoft's controversial Recall feature takes screenshots of the desktop every few seconds so users can look back through everything they have done on the device; the dating app Tinder has built an AI feature that scans photos on users' phones to "better understand" their interests and personality.

Carissa Véliz, an associate professor at the University of Oxford, points out that consumers mostly have no way to verify whether tech companies handle data the way they claim. "These companies are very promiscuous with data and have repeatedly shown little respect for privacy."

The modern AI industry has never truly respected data rights. Ever since the machine-learning and deep-learning breakthroughs of the early 2010s showed that more data yields better models, the race to hoover up information has only intensified. The facial-recognition firm Clearview scraped millions of photos from across the web; Google once paid people just $5 for facial scans; government agencies allegedly even tested systems on images of exploited children, visa applicants, and the dead.

In recent years, data-hungry AI companies have scraped huge swaths of the web and millions of books, without permission or payment, to build their LLMs and generative AI systems. Having exhausted much of the public web, many companies made training on user data the default, with opting out rather than opting in as the norm.

Although privacy-focused AI systems are being developed and some protections are in place, most of an agent's data processing happens in the cloud, and data moving between systems can create risks. A study commissioned by European data regulators outlined a host of privacy risks tied to agents: sensitive data could be leaked, misused, or intercepted; systems could transmit sensitive information to external systems without safeguards; and data handling could run up against privacy regulations.

Véliz adds: "Even if you yourself are genuinely informed and consent to how your data is used, the people you interact with may not have consented. If the system has access to all your contacts, emails, and calendar, and you reach me through it, then my data is being accessed too, without my consent."

Agent behavior can also threaten existing security practices. So-called prompt-injection attacks, in which malicious instructions are slipped to a large language model through the text it reads, can lead to data leaks. And if an agent is granted deep access to a device, all the data on that device is at risk.

Meredith Whittaker, president of the foundation behind the encrypted messaging app Signal, warned earlier this year: "The future of total infiltration and privacy nullification via agents on the operating system is not here yet, but that is what is being pushed by these companies without the ability for developers to opt out." Agents that can access everything on a device or operating system pose an "existential threat" to Signal and to application-level privacy, she said, calling for clear developer-level opt-out mechanisms.

Farmer, of the Ada Lovelace Institute, cautions that many users have already built intense relationships with existing chatbots and have shared large volumes of sensitive data with them along the way, which sets these AI systems apart from earlier technologies. "Be very careful about the quid pro quo when it comes to your personal data with these sorts of systems. The business model they operate on today may well not be the one they adopt in the future."


English source:

For years, the cost of using “free” services from Google, Facebook, Microsoft, and other Big Tech firms has been handing over your data. Uploading your life into the cloud and using free tech brings conveniences, but it puts personal information in the hands of giant corporations that will often be looking to monetize it. Now, the next wave of generative AI systems are likely to want more access to your data than ever before.
Over the past two years, generative AI tools—such as OpenAI’s ChatGPT and Google’s Gemini—have moved beyond the relatively straightforward, text-only chatbots that the companies initially released. Instead, Big AI is increasingly building and pushing toward the adoption of agents and “assistants” that promise they can take actions and complete tasks on your behalf. The problem? To get the most out of them, you’ll need to grant them access to your systems and data. While much of the initial controversy over large language models (LLMs) was the flagrant copying of copyrighted data online, AI agents’ access to your personal data will likely cause a new host of problems.
“AI agents, in order to have their full functionality, in order to be able to access applications, often need to access the operating system or the OS level of the device on which you’re running them,” says Harry Farmer, a senior researcher at the Ada Lovelace Institute, whose work has included studying the impact of AI assistants and found that they may cause “profound threat” to cybersecurity and privacy. For personalization of chatbots or assistants, Farmer says, there can be data trade-offs. “All those things, in order to work, need quite a lot of information about you,” he says.
While there’s no strict definition of what an AI agent actually is, they’re often best thought of as a generative AI system or LLM that has been given some level of autonomy. At the moment, agents or assistants, including AI web browsers, can take control of your device and browse the web for you, booking flights, conducting research, or adding items to shopping carts. Some can complete tasks that include dozens of individual steps.
While current AI agents are glitchy and often can’t complete the tasks they’ve been set out to do, tech companies are betting the systems will fundamentally change millions of people’s jobs as they become more capable. A key part of their utility likely comes from access to data. So, if you want a system that can provide you with your schedule and tasks, it’ll need access to your calendar, messages, emails, and more.
Some more advanced AI products and features provide a glimpse into how much access agents and systems could be given. Certain agents being developed for businesses can read code, emails, databases, Slack messages, files stored in Google Drive, and more. Microsoft’s controversial Recall product takes screenshots of your desktop every few seconds, so that you can search everything you’ve done on your device. Tinder has created an AI feature that can search through photos on your phone “to better understand” users’ “interests and personality.”
Carissa Véliz, an author and associate professor at the University of Oxford, says most of the time consumers have no real way to check if AI or tech companies are handling their data in the ways they claim to. “These companies are very promiscuous with data,” Véliz says. “They have shown to not be very respectful of privacy.”
The modern AI industry has never really been respectful of data rights. After the machine-learning and deep-learning breakthroughs of the early 2010s showed that the systems could produce better results when they are trained on more data, the race to hoover up as much information as possible intensified. Face recognition firms, such as Clearview, scraped millions of photos of people from across the web. Google paid people just $5 for facial scans; official government agencies allegedly used images of exploited children, visa applicants, and dead people to test their systems.
Fast forward a few years, and data-hungry AI firms scraped huge swaths of the web and copied millions of books—often without permission or payment—to build the LLMs and generative AI systems they’re currently expanding into agents. Having exhausted much of the web, many companies made it their default position to train AI systems on user data, making people opt out instead of opt in.
While some privacy-focused AI systems are being developed, and some privacy protections are in place, much of the data processing by agents will take place in the cloud, and data moving from one system to another could cause problems. One study, commissioned by European data regulators, outlined a host of privacy risks linked to agents, including: how sensitive data could be leaked, misused, or intercepted; how systems could transmit sensitive information to external systems without safeguards in place; and how data handling could rub up against privacy regulations.
“Even if, let's say, you genuinely consent and you genuinely are informed about how your data is used, the people with whom you interact might not be consenting,” Véliz, the Oxford associate professor, says. “If the system has access to all of your contacts and your emails and your calendar and you’re calling me and you have my contact, they're accessing my data too, and I don't want them to.”
The behavior of agents can also threaten existing security practices. So-called prompt-injection attacks, where malicious instructions are fed to an LLM in text it reads or ingests, can lead to leaks. And if agents are given deep access to devices, they pose a threat to all data included on them.
“The future of total infiltration and privacy nullification via agents on the operating system is not here yet, but that is what is being pushed by these companies without the ability for developers to opt out,” Meredith Whittaker, the president of the Signal Foundation, which runs the encrypted Signal messaging app, told WIRED earlier this year. Agents that can access everything on your device or operating system pose an “existential threat” to Signal and application-level privacy, Whittaker said. “What we’re calling for is very clear developer-level opt-outs to say, ‘Do not fucking touch us if you’re an agent.’”
For individuals, Farmer from the Ada Lovelace Institute says many people have already built up intense relationships with existing chatbots and may have shared huge volumes of sensitive data with them during the process, making them different from other systems that have come before. “Be very careful about the quid pro quo when it comes to your personal data with these sorts of systems,” Farmer says. “The business model these systems are operating on currently may well not be the business model that they adopt in the future.”

WIRED | AI Frontier
