Apple uses 3D Gaussian splatting for Personas and for the 3D conversion of photos.

Apple’s Persona technology uses Gaussian splatting to create 3D facial scans

Original link: https://www.cnet.com/tech/computing/apple-talks-to-me-about-vision-pro-personas-where-is-our-virtual-presence-headed/

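## Apple Vision Pro Personas: A Look at the Future of Connection

One of the most striking features of the Apple Vision Pro headset is Personas: highly realistic, real-time 3D avatars of the user. Created from a detailed scan (using Gaussian splatting and machine learning), these avatars aim to reproduce the user's appearance along with subtle facial expressions and body language, creating a surprisingly immersive sense of presence in virtual interactions. Similar technology already exists (Meta, Microsoft, and even early AR glasses experiments), but Apple's implementation is considered the most advanced so far. In a demo, the author reported feeling a genuine connection with colleagues who appeared in his space as Personas, enabling collaboration that goes beyond an ordinary video call.

For now, Personas are limited to the $3,499 Vision Pro. Apple prioritizes on-device processing for privacy and realism. Still, the potential for broader use, even on an iPhone, is intriguing, especially since the scan needs only "a handful" of images. Apple envisions Personas as a tool for authentic representation that bridges physical distance and could transform fields such as medical training. Each user is currently limited to one Persona, but the technology is a significant step toward the "holographic telepresence" long imagined in science fiction.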

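## Apple's Persona Technology and Gaussian Splatting: Summary

Apple's Persona technology uses Gaussian splatting to create realistic 3D facial scans for virtual presence, a marked improvement over the early beta versions; some observers were reportedly fooled for as long as 30 minutes. The technique was also showcased in a recent Corridor Digital video that recreates the bullet-time scene from The Matrix, with the emphasis on capturing the environment rather than modeling it traditionally.

The discussion centers on the technology's potential for immersive experiences such as remote collaboration and long-distance relationships, while also noting the limits of the Vision Pro headset itself, including single-screen mirroring and battery life. Gaussian splatting represents a scene as 3D points that carry color values (with the color varying by viewing angle), offering a new graphics rendering pipeline. Promising as it is, questions remain about latency in real-world use and about whether the technology solves a real problem or is a solution in search of a problem, given existing alternatives such as webcams. The Vision Pro's high price also raises concerns about accessibility and widespread adoption.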

Original article

Buried inside Apple's $3,499 Vision Pro VR headset is a feature that continually wows me, but you've probably never heard of it. The feature, called Personas, involves two or more users, all wearing Vision Pros, chatting with one another in real time but as virtual replicas.

Now out of beta, Personas are part of Apple's avatar system for the Vision Pro, creating replicas of yourself via a 3D photo scan.

Taking a scan of myself isn't a new thing. Some five years ago, I tried telepresence with 3D-scanned avatars on Nreal AR glasses with a company called Spatial. I've gotten peeks at Meta's realistic codec avatars. I explored cartoon avatar telepresence with Microsoft in HoloLens. And I've even scanned myself into all sorts of bizarre AI deepfakes using OpenAI's Sora phone app.

Still, no one is doing anything in VR or AR headsets or glasses as advanced as Apple's Vision Personas. And we haven't seen the beginning of how good things could get.

To learn more, I donned an M5 Vision Pro headset and jumped into a FaceTime for an exclusive chat with Apple's senior director of the Vision Products Group, Jeff Norris, and the senior director of product marketing, Steve Sinclair. The two showed up as Personas in my home office. We wandered in like ghosts when the meeting started, face to face, so to speak. After a few minutes, it felt like we were actually spending time together in person.

Apple doesn't discuss the future. But Norris and Sinclair did explain some of the very cool 3D tech that makes Personas seem so realistic. As we chatted, I imagined that similar scans could be done on places other than Vision Pro, like maybe your iPhone, which would be accessible to more people.

Apple's Personas seem uncanny outside the headset, but not inside.

Apple

Real telepresence is expensive magic

It's hard to find another person who has a Vision Pro, but when I have, the eerie sense of someone ghost-walking into my home is like wizardry. Apple's VisionOS has evolved to allow collaboration between Personas, flexing virtual spaces out for up to five people to see and share virtual objects and apps together. Multiple people in the same room wearing Vision headsets can collaborate with Personas that can beam in remotely, as well.

I've dreamed of that Tony Stark-like, Star Wars-hologram telepresence idea for years now. It's basically here. It's just walled into very expensive hardware.

Smart glasses haven't been able to handle the load of avatars like this yet, although AR glasses from Snap and others may be trying soon. My question for Apple is: What technology is making Personas happen, and could it ever appear anywhere else?

Splatting scan technology uses machine learning

In our meeting, Norris explains that Persona technology uses Gaussian splatting to create those surprisingly convincing 3D facial scans. Gaussian splatting is a key technology behind many 3D applications right now, often applied to scanning objects or large-scale environments. The technique uses AI to knit a 3D image or landscape from a series of 2D images. Meta's Hyperscape Capture app on Quest, for example, can scan whole rooms into 3D-walkable spaces in VR.
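To make the idea of a "splat" concrete, here is a minimal sketch of how a renderer might blend Gaussian splats into a single pixel. It is not Apple's or Meta's implementation, which the article does not describe. The `Splat2D` class and the `splat_alpha` and `shade_pixel` functions are hypothetical names, and the sketch assumes each splat has already been projected onto the image plane as a round (isotropic) blob with one fixed color. Real systems use stretched 3D Gaussians whose color changes with viewing angle.

```python
import math
from dataclasses import dataclass


@dataclass
class Splat2D:
    """A splat already projected to the image plane (simplified, hypothetical)."""
    cx: float       # center x, in pixels
    cy: float       # center y, in pixels
    sigma: float    # standard deviation in pixels (isotropic for simplicity)
    depth: float    # distance from the camera, used for sorting
    opacity: float  # peak opacity in [0, 1]
    color: tuple    # (r, g, b), each in [0, 1]


def splat_alpha(s, px, py):
    """Opacity of splat s at pixel (px, py): peak opacity times a Gaussian falloff."""
    d2 = (px - s.cx) ** 2 + (py - s.cy) ** 2
    return s.opacity * math.exp(-0.5 * d2 / (s.sigma ** 2))


def shade_pixel(splats, px, py):
    """Blend all splats at one pixel, nearest first (front-to-back alpha compositing)."""
    r = g = b = 0.0
    transmittance = 1.0  # fraction of light not yet blocked by nearer splats
    for s in sorted(splats, key=lambda s: s.depth):
        a = splat_alpha(s, px, py)
        w = transmittance * a  # this splat's contribution to the pixel
        r += w * s.color[0]
        g += w * s.color[1]
        b += w * s.color[2]
        transmittance *= 1.0 - a
        if transmittance < 1e-4:  # pixel is effectively opaque, stop early
            break
    return (r, g, b)


# A half-transparent red splat in front of an opaque blue one.
near_red = Splat2D(0.0, 0.0, 2.0, 1.0, 0.5, (1.0, 0.0, 0.0))
far_blue = Splat2D(0.0, 0.0, 2.0, 5.0, 1.0, (0.0, 0.0, 1.0))
print(shade_pixel([far_blue, near_red], 0.0, 0.0))  # (0.5, 0.0, 0.5)
```

At the shared center, the red splat blocks half the light and the blue splat behind it fills in the rest, so the pixel comes out purple. In a trained scene, the positions, sizes, opacities and colors of the splats are what the machine learning step fits to the input photos.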

What makes Personas unique is the focus on scanning yourself instead of your environment. Using VisionOS 26, Norris showed me the key changes from the earlier Persona versions. The renders can now show greater detail at multiple angles and capture details like jewelry and eyelashes. Bodies and faces are scanned together, which makes the render feel more seamless.

"There's machine learning involved, but not many people really realize that it's a concert of networks that come together," says Norris. "We counted them up, it's over a dozen, but we actually reduced the number when we moved to this new version of Personas."

I mentioned the possibility of scanning rooms into Vision Pro down the road (apps like Scaniverse and Polycam already show off 3D scans in headsets). Norris says Apple is already applying Gaussian splatting to the spatial 3D conversions of photos, which now look weirdly immersive in headsets. So, what's next?

Vision Pro headsets can collaborate in the same space and fold in Personas from somewhere else at the same time. You just need to have one of those headsets to participate in the spatial experience.

Apple/Screenshot by Joe Maldonado/CNET

It doesn't take much to capture the photos needed. Could it be done on iPhones?

The Persona scan is done with the Vision Pro itself: I hold the headset up in front of me and turn my head while it scans. Even so, the process doesn't require extensive use of Vision Pro's sensors.

"We only need a handful of images when we are enrolling your Persona," Norris tells me. "That includes a few facial expressions to help our networks understand how your face moves when you're talking and smiling. And that's it."

I wonder whether an iPhone could eventually scan a Persona, which I'd find a lot easier than using the Vision Pro. Norris doesn't answer that directly.

"It's interesting to imagine different ways of accomplishing that," he responds. "But right now, we love that it's self-contained to the device and that all the processing happens on the device. None of these images have to go anywhere in order for that to happen."

Me in my VisionOS Persona during my first demo of the new version at WWDC earlier this year.

Apple

What could this mean for our future sense of virtual identity?

The single Persona I scan and bond to my Apple ID on Vision Pro feels like it's designed to act as a one-to-one mapping for my virtual self. It's the closest thing Apple has to a substitute for using a camera to broadcast my actual face, which can't be done since I'm wearing a headset.

AI companies are already scanning and generating virtual versions of people in increasing numbers of deepfakes, both intentional and unintentional. OpenAI's Sora app is the most prominent example now, and uses a similar type of face-scanning tech on iPhones to generate a "Cameo" of myself I can lend out to others.

I ask Norris where the line can be drawn going forward. He makes it clear that Apple wants to clearly and securely represent a person in real time, not as a reproduction.

"We have focused Personas on that authentic representation goal," he says. "We're trying to grant what I think is a fundamental human wish, which is: 'I wish you were here.' That begins by trying to be faithful to how we appear, and how we're moving, and how we're emoting as we speak."

Can I have more than one Persona, or more customization?

Right now, Apple limits you to using one Persona scan at a time, which surprises me. I'd love a variety of Scott Stein avatars in different moods or simply with different clothes. While Apple doesn't explore identity transformation via scans, I do appreciate the options for realistic glasses, and I'd love to be able to add more accessories.

"People can reenroll or just put on a different shirt and enroll again," says Norris. "I totally understand why that would be something we'd want. But we're focusing on just the one at a time right now."

I tried scanned avatars on Nreal AR glasses back in 2020, using an app by Spatial that could use phones and headsets together. Will Apple do that too?

Spatial

Would Personas ever extend outside Vision Pro?

I'm already thinking about more options for Personas, not just for Apple's expensive headset, but for iPhones and other devices.

What if they could be personal stand-ins on our FaceTime calls? I can already call my wife on FaceTime from Vision Pro, and she can see my 2D Persona there. She laughs at it because it feels somewhat supernatural. If Apple has already opened the doorway this much with Animoji on FaceTime, why not Personas too?

Norris insists that Personas work better in the Vision headset, which I agree with. The renderings feel more convincing, somehow. When we place ourselves in environments that are already half-composed of virtual things, these 3D-scanned identities appear more natural. But physical distance and body expressions can also happen in space. Personas can leave their box and hover around as torsos, hands and faces.

"I can tell a joke and you're gonna get it because you're gonna see my body language, and see my facial expressions that you don't see on a two-dimensional screen," says Sinclair. "Here, we're in the room together, and it feels like we really are here."

As his Persona stands next to my cluttered desk in that virtual form, I realize he's right.

Apple is already receiving feedback about this for business uses. "We're hearing about it in healthcare as well," Norris says. "Doctors who create procedures and want to train other people. They don't have to travel around the country. They can just get on a FaceTime call with their Personas."

I still see a future where iPhones, iPads, laptops and headsets all collaborate together, something companies like Microsoft and Qualcomm have pointed to as a bridge between headsets and flat-screen devices. Samsung and Google are discussing those types of connecting points with Android XR, too. Apple already has ARKit on iPhones and iPads, so the possibilities already exist.

Norris says that Personas outside of a headset would be missing something right now. "To get the full appreciation of the experience, you kind of have to have both the sensing capabilities and the incredible display capabilities. They really have to kind of come together to create a magic moment like this."

As Apple moves toward an expected line of smart glasses in the future, and inevitably toward more advanced iPhones and iPads, that philosophy could evolve. Personas are the start of a fundamental change in how we handle collaboration and connection.

For the moment, however, you'll never experience it unless you're inside a Vision Pro. I look forward to a time when the entry ticket into this magic telepresence world is far more affordable and better distributed, so more people can come aboard.

Right now, my Persona is mainly by itself. I'd love it if I could get some company more often.
