Is there an algorithm for converting 2D video into 3D video?
Is there any algorithm for converting 2D video into 3D video (for viewing using glasses)?
(À la turning Avatar into Avatar for an IMAX 3D experience.) Or at least turn it into video prepared for some 3D viewing experience, à la:
(source: 3dglassesonline.com)
or
(source: 3dglassesonline.com)
Answers (10)
Well, Stanford does have an algorithm for converting 2D photos into 3D models. My guess is that with movies it should be even easier, because then you have several photos instead of just one, so you can extract much more information about depth by comparing neighboring frames.
Arguably, the results will never be quite as good as when you just render/shoot the movie in 3D to begin with.
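Not the Stanford algorithm itself, but a minimal sketch of the "compare neighboring frames" idea using OpenCV (the file name and parameters below are just placeholders): for a camera that translates between frames, nearer objects move more across the image, so dense optical-flow magnitude can serve as a very crude relative-depth cue.

```python
import cv2
import numpy as np

def crude_depth_from_motion(frame_a, frame_b):
    """Estimate a rough relative-depth map from two neighboring video frames."""
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)

    # Dense optical flow (Farneback): args are pyr_scale, levels, winsize,
    # iterations, poly_n, poly_sigma, flags.
    flow = cv2.calcOpticalFlowFarneback(gray_a, gray_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)

    # Larger apparent motion -> assumed closer; normalize to [0, 1].
    return magnitude / (magnitude.max() + 1e-6)

# Usage: read two consecutive frames from a video file.
cap = cv2.VideoCapture("input.mp4")
ok_a, frame_a = cap.read()
ok_b, frame_b = cap.read()
if ok_a and ok_b:
    depth_proxy = crude_depth_from_motion(frame_a, frame_b)
    cv2.imwrite("depth_proxy.png", (depth_proxy * 255).astype(np.uint8))
```

Of course this only works when the camera actually moves, and it confuses object motion with camera motion, which is part of why the results are never as good as native 3D.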
No - 3D video requires extra information (depth) that simply isn't contained in 2D video.
If you have a 2D rendering of a scene (for example, in Toy Story) then it's quite easy to produce a 3D film - you just change the viewing angle of the scene and re-render.
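A toy sketch of that re-render idea (the focal length and eye separation below are assumed values, not from any production pipeline): when the full 3D scene is available, the left and right eye views are just two projections from cameras offset horizontally by roughly the human interocular distance.

```python
import numpy as np

FOCAL = 800.0            # focal length in pixels (assumed pinhole camera)
EYE_SEPARATION = 0.065   # metres, approximate interocular distance

def project(points_xyz, eye_offset_x):
    """Project 3D points (N x 3, camera coordinates, z > 0) for one eye."""
    shifted = points_xyz.copy()
    shifted[:, 0] -= eye_offset_x            # move the camera along x
    u = FOCAL * shifted[:, 0] / shifted[:, 2]
    v = FOCAL * shifted[:, 1] / shifted[:, 2]
    return np.stack([u, v], axis=1)

scene = np.array([[0.0, 0.0, 2.0],           # a point 2 m away
                  [0.5, 0.1, 10.0]])         # a point 10 m away
left  = project(scene, -EYE_SEPARATION / 2)
right = project(scene, +EYE_SEPARATION / 2)

# Horizontal disparity is large for the near point and small for the far one,
# which is exactly the depth cue a stereo pair delivers to the viewer.
print(left[:, 0] - right[:, 0])
```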
It cannot work in general, for a very simple reason: suppose you have a scene with a window in a wall showing a beach, and beside it a photograph of that same window and beach. How can the algorithm differentiate between the two? How can it detect what is reality with depth and what is just a flat photograph?
You should probably understand the difference between polarizing glasses and red/blue glasses. The red/blue glasses 3D effect is simple to do. You simply take two pictures a couple of inches apart (kind of like how our eyes are laid out) and superimpose one image over the other. There is a tutorial on how to do this in Making Anaglyph Images in Adobe Photoshop.
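Here is roughly the same superimposition in code rather than Photoshop - a standard red/cyan anaglyph keeps the red channel of the left-eye image and the green/blue channels of the right-eye image (the file names are placeholders, and both images are assumed to be the same size):

```python
import cv2

left  = cv2.imread("left.jpg")     # shot from the left-eye position
right = cv2.imread("right.jpg")    # shot a couple of inches to the right

anaglyph = right.copy()            # OpenCV stores images as B, G, R
anaglyph[:, :, 2] = left[:, :, 2]  # take the red channel from the left image

cv2.imwrite("anaglyph.jpg", anaglyph)
# Viewed through red/blue (red/cyan) glasses, each eye sees mostly its own
# image, which is what produces the depth impression.
```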
As for the polarizing glasses effect, this is a little harder. If you go to a movie theatre and watch a 3D movie with polarizing glasses, you are seeing true 3D. It works by having two projectors: one projector projects the movie with one type of polarization and the second projects it with the other type. The images are overlaid right on top of each other, so if you're wearing 3D polarizing glasses, the movie appears in 3D.
This can't be done as easily with a TV or computer monitor. Your TV or monitor would have to project two images simultaneously. Due to the popularity of 3D now, though, there are 3D TVs and monitors appearing on the market that don't project two images, yet display 3D. Here's how they work:
A normal computer screen or TV refreshes at a frequency of 60 Hz. This means that 60 times a second, the image you see is refreshed. Because this is so fast, the human eye doesn't see flicker. 3D TVs and monitors refresh at 120 Hz. The two eye images are alternated at a rate of 120 times per second, but since there are two of them, each appears 60 times a second, which is what produces the 3D effect.
I hope this helps you understand a little.
To answer your question, yes, you can create 3D videos, but you would need a 3D monitor or 3D TV to watch them.
Not really. Should the algorithm somehow understand the scene content and extrapolate depth information from that? Remember that 3D video needs depth information. Otherwise there is no way of knowing how much to offset the two frame parts.
You could probably try it by assigning various depths to various degrees of being out of focus, but I doubt anything usable would come out.
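For what it's worth, a minimal sketch of that depth-from-defocus guess, using local sharpness (variance of the Laplacian per block) as a stand-in for depth; as said above, this is a weak cue and unlikely to give anything usable on its own:

```python
import cv2
import numpy as np

def sharpness_depth_proxy(frame_bgr, block=32):
    """Return a coarse 'focus map': sharper blocks are assumed nearer the focal plane."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    lap = cv2.Laplacian(gray, cv2.CV_64F)
    h, w = gray.shape
    depth = np.zeros((h // block, w // block))
    for by in range(depth.shape[0]):
        for bx in range(depth.shape[1]):
            tile = lap[by*block:(by+1)*block, bx*block:(bx+1)*block]
            depth[by, bx] = tile.var()       # higher variance = sharper block
    return depth / (depth.max() + 1e-6)      # normalized to [0, 1]
```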
No individual algorithm per se, but yes, it is possible. It is very hard. There are people working on this problem right now. The algorithms involved are very challenging to write, they don't always work right, and any complete solution would require a large amount of processing power. Any solution would be offline (rather than real time) at first.
3D perception isn't tied as closely to stereo optics as you might believe. If you think you need two eyes to see 3D, then try walking around with an eyepatch on. You'll do just fine. There are a (small) number of programs out there, including some commercial software packages, that create 3D models from sets of 2D pictures without a stereo camera. Some run online, constructing a more detailed model as more of it is seen.
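As a rough illustration of what such programs do internally (not any particular commercial package), here is a bare-bones two-view reconstruction with OpenCV: feature matching, essential-matrix estimation, and triangulation. The camera matrix K below is a made-up guess; real pipelines need calibration, many views, and bundle adjustment.

```python
import cv2
import numpy as np

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

# Assumed intrinsics: focal length in pixels, principal point at the centre.
h, w = img1.shape
K = np.array([[1000.0, 0, w / 2], [0, 1000.0, h / 2], [0, 0, 1]])

# Match ORB features between the two views.
orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Recover the relative camera motion, then triangulate a sparse point cloud.
E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
points_h = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
points_3d = (points_h[:3] / points_h[3]).T   # sparse 3D points, up to scale
print(points_3d.shape)
```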
Just thinking about it I can think of some problems you'd run into with movies in particular. For example, I could imagine mattes getting rendered at an incorrect depth. Videos with special effects from software like Apple Motion might end up with strange artifacts.
There are existing algorithms for extracting 3D shapes from 2D images, here, or here, for example. You can extract shapes from each frame of video, and even use multiple frames to gain better understanding of shapes by detecting their motion.
However, odds are the results will be nowhere near the standard quality of 3D movie content.
There was some research done on this at a place I once worked (although I wasn't involved with it at all). This paper, Automatic extraction of 3D models from an airborne video sequence, might be helpful.
Maybe there will be algorithms for emulating stereoscopic views, but the result cannot be the same.
The reason is quite simple. A 2D video is not only missing the depth information (and depth alone is not sufficient for getting a stereoscopic video); it is also missing the hidden surfaces that would be visible from another point of view.
One might think that the depth information could be extrapolated from the available information, and this is true. But the reconstructed information cannot be accurate enough to give a good stereoscopic effect.
Apart from that, I've heard about a system that could extract accurate 3D models from 8 (eight!) cameras pointing at the same target. It is accurate enough to reproduce even cloth movement correctly. However, this is done by processing 8 (eight!) 2D videos. How could it be possible to achieve the same result with only one 2D video?
Achieving the desired result depends essentially on the availability of information, and in this case (IMHO) there is not enough of it. It could be possible to try to emulate a stereoscopic effect from a 2D video, but essentially it requires hard work and long processing, and the result would be of lower quality than an original stereoscopic video.
I would like to point out that the 3D view is generated by our brain. The eyes capture only 2D images, and our brain, processing the two images, generates a sense of depth for the objects seen.
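A small depth-image-based rendering sketch makes the hidden-surface point concrete: even if a depth map for a single frame were somehow available, shifting pixels to synthesize the second eye view leaves holes where the disoccluded surfaces were never filmed. This is an illustrative sketch, not a production view-synthesis algorithm.

```python
import numpy as np

def synthesize_right_view(image, depth, max_disparity=16):
    """image: HxWx3 uint8; depth: HxW in [0, 1], where 1 = nearest.
    Returns the warped right-eye view and a mask of disoccluded holes."""
    h, w, _ = image.shape
    right = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    disparity = (depth * max_disparity).astype(int)
    for y in range(h):
        for x in range(w):
            nx = x - disparity[y, x]       # nearer pixels shift further
            if 0 <= nx < w:
                right[y, nx] = image[y, x]
                filled[y, nx] = True
    # Disoccluded areas have no source pixel: this is the missing information.
    # A real implementation would also have to resolve occlusion ordering and
    # invent (inpaint) plausible content for these holes.
    return right, ~filled
```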
At this year's CES show, Toshiba presented the Cell TV display, and they claim it is able to convert 2D TV signals into 3D. I don't know if it produces good results or what algorithm they are using, but if their claim is true, there must be an algorithm for it.
But unfortunately, I don't know how it could be done.