How does WebGL work?
I'm looking for a deep understanding of how WebGL works. I want knowledge at a level most people don't care about, because it isn't necessarily useful to the average WebGL programmer. For instance, what role does each part of the total rendering system (browser, graphics driver, etc.) play in getting an image on the screen?
Does each browser have to create a JavaScript/HTML engine or environment in order to run WebGL in the browser? Why is Chrome ahead of everyone else in terms of WebGL compatibility?
So, what are some good resources to get started? The Khronos specification is kind of lacking (from what I saw browsing it for a few minutes) for what I want. Mostly I want to know how this is accomplished/implemented in browsers, and what else needs to change on your system to make it possible.
Hopefully this little write-up is helpful to you. It overviews a big chunk of what I've learned about WebGL and 3D in general. BTW, if I've gotten anything wrong, somebody please correct me -- because I'm still learning, too!
Architecture
The browser is just that, a Web browser. All it does is expose the WebGL API (via JavaScript), which the programmer does everything else with.
As near as I can tell, the WebGL API is essentially just a set of (browser-supplied) JavaScript functions which wrap around the OpenGL ES specification. So if you know OpenGL ES, you can adopt WebGL pretty quickly. Don't confuse this with pure OpenGL, though. The "ES" is important.
The graphics driver supplies the implementation of OpenGL ES that actually runs your code. At this point, it's running on the machine hardware, below even the C code. While this is what makes WebGL possible in the first place, it's also a double-edged sword: bugs in the OpenGL ES driver (and I've noted quite a number of them already) will show up in your Web application, and you won't necessarily know it unless you can count on your user base to file coherent bug reports including OS, video hardware and driver versions. Here's what the debug process for such issues ends up looking like.
On Windows, there's an extra layer which exists between the WebGL API and the hardware: ANGLE, or "Almost Native Graphics Layer Engine". Because the OpenGL ES drivers on Windows generally suck, ANGLE receives those calls and translates them into DirectX 9 calls instead.
Drawing in 3D
Now that you know how the pieces fit together, let's look at a lower-level explanation of how they produce a 3D image.
JavaScript
First, the JavaScript code gets a 3D context from an HTML5 canvas element. Then it compiles and links a set of shaders, which are written in GLSL (the [Open]GL Shading Language) and essentially resemble C code.
The rest of the process is very modular. You need to get vertex data and any other information you intend to use (such as vertex colors, texture coordinates, and so forth) down to the graphics pipeline using uniforms and attributes which are defined in the shader, but the exact layout and naming of this information is very much up to the developer.
JavaScript sets up the initial data structures and sends them to the WebGL API, which sends them to either ANGLE or OpenGL ES, which ultimately sends them off to the graphics hardware.
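The setup steps above can be sketched roughly as follows. This is a minimal sketch, not the only way to do it: it assumes gl is a WebGL context obtained in a browser from canvas.getContext('webgl'), and the helper names (compileShader, createProgram, uploadVertices) and the GLSL names (aPosition, uModelViewProjection) are made up for illustration. The gl.* calls themselves are standard WebGL 1 functions.

```javascript
// Illustrative GLSL sources; the attribute and uniform names are assumptions.
const VERTEX_SOURCE = `
attribute vec4 aPosition;
uniform mat4 uModelViewProjection;
void main() {
  gl_Position = uModelViewProjection * aPosition;
}`;

const FRAGMENT_SOURCE = `
precision mediump float;
void main() {
  gl_FragColor = vec4(1.0, 0.5, 0.0, 1.0); // hard-coded orange
}`;

// Compiles one shader stage and throws with the driver's log on failure.
function compileShader(gl, type, source) {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
    const log = gl.getShaderInfoLog(shader);
    gl.deleteShader(shader);
    throw new Error('Shader compile failed: ' + log);
  }
  return shader;
}

// Links a vertex shader and a fragment shader into one program object.
function createProgram(gl, vertexSource, fragmentSource) {
  const program = gl.createProgram();
  gl.attachShader(program, compileShader(gl, gl.VERTEX_SHADER, vertexSource));
  gl.attachShader(program, compileShader(gl, gl.FRAGMENT_SHADER, fragmentSource));
  gl.linkProgram(program);
  if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
    const log = gl.getProgramInfoLog(program);
    gl.deleteProgram(program);
    throw new Error('Program link failed: ' + log);
  }
  return program;
}

// Copies vertex positions into a buffer that the graphics pipeline can read.
function uploadVertices(gl, positions) {
  const buffer = gl.createBuffer();
  gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
  gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(positions), gl.STATIC_DRAW);
  return buffer;
}
```

In a page you would call createProgram(gl, VERTEX_SOURCE, FRAGMENT_SOURCE) once at startup and then gl.useProgram on the result before drawing. Taking gl as a parameter keeps the helpers independent of how the canvas is found.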
Vertex Shaders
Once the information is available to the shaders, it must be transformed in 2 phases to produce 3D objects. The first phase is the vertex shader, which sets up the mesh coordinates. (This stage runs entirely on the video card, below all of the APIs discussed above.) Most commonly, the work done in the vertex shader looks something like this:
    gl_Position = PROJECTION_MATRIX * VIEW_MATRIX * MODEL_MATRIX * VERTEX_POSITION;
where VERTEX_POSITION is a 4D vector (x, y, z, and w, which is usually set to 1); VIEW_MATRIX is a 4x4 matrix representing the camera's view into the world; MODEL_MATRIX is a 4x4 matrix which transforms object-space coordinates (that is, coords local to the object before rotation or translation have been applied) into world-space coordinates; and PROJECTION_MATRIX is a 4x4 matrix which represents the camera's lens.
Clip Coordinates
When all of these have been applied, the gl_Position variable will hold a set of XYZ coordinates and a W component. These are called clip coordinates. After you supply clip coordinates to gl_Position, WebGL divides the result by gl_Position.w, producing what are called normalized device coordinates; for anything visible, these range within [-1, 1].
From there, projecting a pixel onto the screen is a simple matter of multiplying by 1/2 the screen dimensions and then adding 1/2 the screen dimensions.[1] Here are some examples of clip coordinates translated into 2D coordinates on an 800x600 display:
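As a rough sketch of that mapping, the JavaScript below applies the perspective divide and then the half-size scale and offset. The function name clipToScreen and the sample points are illustrative assumptions, not part of the WebGL API. It assumes the viewport covers the whole 800x600 canvas, and Y is measured from the bottom edge, as in WebGL window coordinates.

```javascript
// Hypothetical helper (not a WebGL call): maps a clip-space point to window
// coordinates, assuming the viewport covers the whole width x height canvas.
function clipToScreen(clip, width, height) {
  const [x, y, , w] = clip; // Z is ignored here; it only matters for depth
  // Perspective divide: clip coordinates -> normalized device coordinates.
  const ndcX = x / w;
  const ndcY = y / w;
  // Viewport transform: scale by half the size, then offset by half the size.
  return [
    ndcX * (width / 2) + width / 2,
    ndcY * (height / 2) + height / 2,
  ];
}

const samples = [
  [0, 0, 0, 1],    // centre of the view    -> 400, 300
  [-1, -1, 0, 1],  // bottom-left corner    -> 0, 0
  [1, 1, 0, 1],    // top-right corner      -> 800, 600
  [1, -0.5, 0, 2], // w = 2: NDC (0.5, -0.25) -> 600, 225
];
for (const clip of samples) {
  console.log(clip.join(', '), '->', clipToScreen(clip, 800, 600).join(', '));
}
```

A real implementation uses whatever rectangle was passed to gl.viewport rather than the full canvas, so treat the numbers above as the default case only.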
Pixel Shaders
Once it's been determined where a pixel should be drawn, the pixel is handed off to the pixel shader (called a fragment shader in WebGL and GLSL), which chooses the actual color the pixel will be. This can be done in a myriad of ways, ranging from simply hard-coding a specific color to texture lookups to more advanced normal and parallax mapping (which are essentially ways of "cheating" texture lookups to produce different effects).
Depth and the Depth Buffer
Now, so far we've ignored the Z component of the clip coordinates. Here's how that works out. When we multiplied by the projection matrix, the third clip component resulted in some number. After the divide by W, if that number is greater than 1.0 or less than -1.0, then it is beyond the view range of the projection matrix, corresponding to the matrix zFar and zNear values, respectively.
So if it's not in the range [-1, 1] then it's clipped entirely. If it is in that range, then the Z value is scaled to 0 to 1[2] and is compared to the depth buffer[3]. The depth buffer has the same dimensions as the screen, so that if a projection of 800x600 is used, the depth buffer is 800 pixels wide and 600 pixels high. We already have the pixel's X and Y coordinates, so they are plugged into the depth buffer to get the currently stored Z value. If the stored Z value is greater than the new Z value, then the new Z value is closer than whatever was previously drawn, and replaces it[4]. At this point it's safe to light up the pixel in question (or in the case of WebGL, draw the pixel to the canvas), and store the Z value as the new depth value.
If the new Z value is greater than the stored depth value, then it is deemed to be "behind" whatever has already been drawn, and the pixel is discarded.
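To make the comparison concrete, here is a toy software version of the depth test. It is only a sketch under some assumptions: the DepthBuffer class is hypothetical (the real test happens on the graphics hardware), the depth range is the default 0 to 1, the comparison is the default gl.LESS, and clipping is simplified to a per-pixel range check.

```javascript
// Toy software depth buffer (hypothetical class, not part of WebGL).
class DepthBuffer {
  constructor(width, height) {
    this.width = width;
    this.height = height;
    // Start every pixel at 1.0, the farthest possible depth.
    this.depth = new Float32Array(width * height).fill(1.0);
  }

  // ndcZ is the Z coordinate after the divide by W. Returns true if the
  // pixel passes the test (and its depth is stored), false if discarded.
  testAndWrite(x, y, ndcZ) {
    if (ndcZ < -1 || ndcZ > 1) return false; // outside zNear..zFar: clipped
    const z = ndcZ * 0.5 + 0.5;              // [-1, 1] -> [0, 1]
    const index = y * this.width + x;
    if (z < this.depth[index]) {             // smaller means closer
      this.depth[index] = z;
      return true;
    }
    return false;
  }
}

const buffer = new DepthBuffer(800, 600);
console.log(buffer.testAndWrite(400, 300, 0.2));  // true: nothing drawn here yet
console.log(buffer.testAndWrite(400, 300, 0.6));  // false: behind the stored depth
console.log(buffer.testAndWrite(400, 300, -0.4)); // true: closer, replaces it
console.log(buffer.testAndWrite(400, 300, 1.5));  // false: beyond zFar
```

Drawing order matters here only for which pixels get overwritten; the final image is the same whichever object is drawn first, which is the point of keeping a depth buffer.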
[1] The actual conversion uses the gl.viewport settings to convert from normalized device coordinates to pixels.
[2] It's actually scaled to the gl.depthRange settings. They default to 0 to 1.
[3] Assuming you have a depth buffer and you've turned on depth testing with gl.enable(gl.DEPTH_TEST).
[4] You can set how Z values are compared with gl.depthFunc.
I would read these articles
http://webglfundamentals.org/webgl/lessons/webgl-how-it-works.html
Assuming those articles are helpful, the rest of the picture is that WebGL runs in a browser. It renders to a canvas tag. You can think of a canvas tag as being like an img tag, except that you use the WebGL API to generate the image instead of downloading one.
Like other HTML5 tags, the canvas tag can be styled with CSS and can sit under or over other parts of the page. It is composited (blended) with other parts of the page, and it can be transformed, rotated, and scaled by CSS along with them. That's a big difference from OpenGL or OpenGL ES.