How does WebGL work?

I'm looking for a deep understanding of how WebGL works. I want knowledge at a level that most people care less about, because it isn't necessarily useful to the average WebGL programmer. For instance, what role does each part (browser, graphics driver, etc.) of the total rendering system play in getting an image onto the screen?
Does each browser have to create a JavaScript/HTML engine/environment in order to run WebGL in the browser? Why is Chrome ahead of everyone else in terms of WebGL compatibility?

So, what are some good resources to get started? The Khronos specification is kind of lacking (from what I saw browsing it for a few minutes) for what I'm after. Mostly, I want to know how this is accomplished/implemented in browsers, and what else needs to change on your system to make it possible.

Answers (2)

凑诗 2024-12-10 04:42:20

Hopefully this little write-up is helpful to you. It overviews a big chunk of what I've learned about WebGL and 3D in general. BTW, if I've gotten anything wrong, somebody please correct me -- because I'm still learning, too!

Architecture

The browser is just that, a Web browser. All it does is expose the WebGL API (via JavaScript), which the programmer does everything else with.

As near as I can tell, the WebGL API is essentially just a set of (browser-supplied) JavaScript functions which wrap around the OpenGL ES specification. So if you know OpenGL ES, you can adopt WebGL pretty quickly. Don't confuse this with pure OpenGL, though. The "ES" is important.
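
For a concrete sense of how thin that wrapper is, here's a minimal sketch of my own (not from the original answer); each call maps almost one-to-one onto an OpenGL ES 2.0 function:

const canvas = document.querySelector('canvas');
const gl = canvas.getContext('webgl');        // the browser-supplied wrapper object

gl.clearColor(0.0, 0.0, 0.0, 1.0);            // ~ glClearColor(0, 0, 0, 1) in OpenGL ES
gl.clear(gl.COLOR_BUFFER_BIT);                // ~ glClear(GL_COLOR_BUFFER_BIT)
console.log(gl.getParameter(gl.VERSION));     // e.g. "WebGL 1.0 (...)"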

The WebGL spec was intentionally left very low-level, leaving a lot to be re-implemented from one application to the next. It is up to the community to write frameworks for automation, and up to the developer to choose which framework to use (if any). It's not entirely difficult to roll your own, but it does mean a lot of overhead spent on reinventing the wheel. (FWIW, I've been working on my own WebGL framework called Jax for a while now.)

The graphics driver supplies the implementation of OpenGL ES that actually runs your code. At this point, it's running on the machine hardware, below even the C code. While this is what makes WebGL possible in the first place, it's also a double-edged sword, because bugs in the OpenGL ES driver (and I've noted quite a number of them already) will show up in your Web application, and you won't necessarily know it unless you can count on your user base to file coherent bug reports including OS, video hardware, and driver versions. Here's what the debug process for such issues ends up looking like.
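
As an aside, here is a hedged sketch of how an application might gather those driver details for bug reports. The WEBGL_debug_renderer_info extension is real, but browsers may hide or restrict it, so treat this as illustrative:

// gl is a WebGL context obtained as in the earlier sketch.
const dbgInfo = gl.getExtension('WEBGL_debug_renderer_info');
if (dbgInfo) {
  console.log('GPU vendor:  ', gl.getParameter(dbgInfo.UNMASKED_VENDOR_WEBGL));
  console.log('GPU renderer:', gl.getParameter(dbgInfo.UNMASKED_RENDERER_WEBGL));
}
console.log('GL version:  ', gl.getParameter(gl.VERSION));
console.log('GLSL version:', gl.getParameter(gl.SHADING_LANGUAGE_VERSION));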

On Windows, there's an extra layer which exists between the WebGL API and the hardware: ANGLE, or "Almost Native Graphics Layer Engine". Because the OpenGL ES drivers on Windows generally suck, ANGLE receives those calls and translates them into DirectX 9 calls instead.

Drawing in 3D

Now that you know how the pieces fit together, let's look at a lower-level explanation of how they combine to produce a 3D image.

JavaScript

First, the JavaScript code gets a 3D context from an HTML5 canvas element. Then it registers a set of shaders, which are written in GLSL ([Open] GL Shading Language) and essentially resemble C code.
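
Here is a sketch of that setup step; the helper and the vsSource/fsSource strings (shown in the shader sections below) are my own names, not something the original answer prescribes:

const gl = document.querySelector('canvas').getContext('webgl');

// Compile one shader stage from a GLSL source string.
function compile(type, source) {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
    throw new Error(gl.getShaderInfoLog(shader));
  }
  return shader;
}

// Link the vertex and fragment shaders into a program and make it active.
const program = gl.createProgram();
gl.attachShader(program, compile(gl.VERTEX_SHADER, vsSource));     // GLSL strings defined
gl.attachShader(program, compile(gl.FRAGMENT_SHADER, fsSource));   // in the sections below
gl.linkProgram(program);
gl.useProgram(program);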

The rest of the process is very modular. You need to get vertex data and any other information you intend to use (such as vertex colors, texture coordinates, and so forth) down to the graphics pipeline using uniforms and attributes which are defined in the shader, but the exact layout and naming of this information is very much up to the developer.

JavaScript sets up the initial data structures and sends them to the WebGL API, which sends them to either ANGLE or OpenGL ES, which ultimately sends them off to the graphics hardware.
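
Continuing the sketch above, this is roughly what that hand-off looks like for a single triangle. The attribute and uniform names (aPosition, uProjection, uView, uModel) are mine and match the illustrative shaders shown in the sections below:

// One triangle's worth of vertex positions (x, y, z per vertex).
const vertices = new Float32Array([
   0.0,  0.5, 0.0,
  -0.5, -0.5, 0.0,
   0.5, -0.5, 0.0,
]);
const buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, vertices, gl.STATIC_DRAW);

// Point the shader's position attribute at the buffer just filled.
const aPosition = gl.getAttribLocation(program, 'aPosition');
gl.enableVertexAttribArray(aPosition);
gl.vertexAttribPointer(aPosition, 3, gl.FLOAT, false, 0, 0);

// Upload the three matrices (identity here, just to keep the sketch runnable).
const identity = new Float32Array([1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1]);
for (const name of ['uProjection', 'uView', 'uModel']) {
  gl.uniformMatrix4fv(gl.getUniformLocation(program, name), false, identity);
}

gl.drawArrays(gl.TRIANGLES, 0, 3);   // hand everything off to the driver/hardware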

Vertex Shaders

Once the information is available to the shader, the shader must transform it in two phases to produce 3D objects. The first phase is the vertex shader, which sets up the mesh coordinates. (This stage runs entirely on the video card, below all of the APIs discussed above.) Most often, the process performed in the vertex shader looks something like this:

gl_Position = PROJECTION_MATRIX * VIEW_MATRIX * MODEL_MATRIX * VERTEX_POSITION

where VERTEX_POSITION is a 4D vector (x, y, z, and w, where w is usually set to 1); VIEW_MATRIX is a 4x4 matrix representing the camera's view into the world; MODEL_MATRIX is a 4x4 matrix which transforms object-space coordinates (that is, coords local to the object before rotation or translation have been applied) into world-space coordinates; and PROJECTION_MATRIX represents the camera's lens.
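
Written out as GLSL, that vertex shader might look like the following; the uniform and attribute names are my own, since the answer only gives the formula:

const vsSource = `
  attribute vec4 aPosition;   // VERTEX_POSITION (x, y, z, w; w defaults to 1)
  uniform mat4 uProjection;   // PROJECTION_MATRIX
  uniform mat4 uView;         // VIEW_MATRIX
  uniform mat4 uModel;        // MODEL_MATRIX

  void main() {
    gl_Position = uProjection * uView * uModel * aPosition;
  }
`;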

Most often, the VIEW_MATRIX and MODEL_MATRIX are precomputed into a single MODELVIEW_MATRIX. Occasionally, all three are precomputed into a MODELVIEW_PROJECTION_MATRIX, or just MVP. These are generally meant as optimizations, though I'd like to find time to do some benchmarks. It's possible that precomputing is actually slower in JavaScript if it's done every frame, because JavaScript itself isn't all that fast. In that case, the hardware acceleration afforded by doing the math on the GPU might well be faster than doing it on the CPU in JavaScript. We can of course hope that future JS implementations will resolve this potential gotcha by simply being faster.
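
To show what that precomputation looks like in practice, here is a sketch assuming the gl-matrix library; that choice is mine for illustration, the answer doesn't prescribe one:

import { mat4 } from 'gl-matrix';   // assumed dependency, purely for illustration

const projection = mat4.perspective(mat4.create(), Math.PI / 4, 800 / 600, 0.1, 100.0);
const view       = mat4.lookAt(mat4.create(), [0, 0, 5], [0, 0, 0], [0, 1, 0]);
const model      = mat4.create();   // identity for this sketch

// MVP = PROJECTION * VIEW * MODEL, computed once per object per frame on the CPU.
const mvp = mat4.create();
mat4.multiply(mvp, projection, view);
mat4.multiply(mvp, mvp, model);

// The vertex shader then needs only one matrix multiply per vertex:
//   gl_Position = uModelViewProjection * aPosition;
gl.uniformMatrix4fv(gl.getUniformLocation(program, 'uModelViewProjection'), false, mvp);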

Clip Coordinates

When all of these have been applied, the gl_Position variable holds XYZ coordinates plus a W component; for anything inside the view volume, X, Y, and Z fall within [-W, W] (and so within [-1, 1] once divided by W, as described below). These are called clip coordinates.

It's worth noting that clip coordinates are the only thing the vertex shader really needs to produce. You can completely skip the matrix transformations performed above, as long as you produce a clip-coordinate result. (I have even experimented with swapping out matrices for quaternions; it worked just fine, but I scrapped the project because I didn't get the performance improvements I'd hoped for.)

After you supply clip coordinates to gl_Position, WebGL divides the result by gl_Position.w, producing what's called normalized device coordinates.
From there, projecting a pixel onto the screen is a simple matter of multiplying by 1/2 the screen dimensions and then adding 1/2 the screen dimensions.[1] Here are some examples of clip coordinates (with W = 1, so they coincide with the normalized device coordinates) translated into 2D coordinates on an 800x600 display:

clip = [0, 0]
x = (0 * 800/2) + 800/2 = 400
y = (0 * 600/2) + 600/2 = 300

clip = [0.5, 0.5]
x = (0.5 * 800/2) + 800/2 = 200 + 400 = 600
y = (0.5 * 600/2) + 600/2 = 150 + 300 = 450

clip = [-0.5, -0.25]
x = (-0.5  * 800/2) + 800/2 = -200 + 400 = 200
y = (-0.25 * 600/2) + 600/2 = -75 + 300 = 225
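
Those examples boil down to a tiny helper (mine, for illustration). Note that these are GL window coordinates, where Y grows upward, unlike canvas pixel coordinates:

// Map normalized device coordinates in [-1, 1] to window coordinates,
// exactly as in the worked examples above (mirroring the gl.viewport transform).
function ndcToWindow(ndcX, ndcY, width, height) {
  return {
    x: ndcX * width / 2 + width / 2,
    y: ndcY * height / 2 + height / 2,
  };
}

console.log(ndcToWindow(0,    0,     800, 600));  // { x: 400, y: 300 }
console.log(ndcToWindow(0.5,  0.5,   800, 600));  // { x: 600, y: 450 }
console.log(ndcToWindow(-0.5, -0.25, 800, 600));  // { x: 200, y: 225 }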

Pixel Shaders

Once it's been determined where a pixel should be drawn, the pixel is handed off to the pixel shader (called a fragment shader in OpenGL ES and WebGL), which chooses the actual color the pixel will be. This can be done in a myriad of ways, ranging from simply hard-coding a specific color, to texture lookups, to more advanced normal and parallax mapping (which are essentially ways of "cheating" texture lookups to produce different effects).
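
The simplest case mentioned above, hard-coding a color, looks like this in GLSL; the fsSource name is mine and pairs with the compile/link sketch earlier:

const fsSource = `
  precision mediump float;   // fragment shaders in GLSL ES need a default float precision

  void main() {
    gl_FragColor = vec4(1.0, 0.5, 0.0, 1.0);   // every fragment becomes opaque orange
  }
`;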

Depth and the Depth Buffer

Now, so far we've ignored the Z component of the clip coordinates. Here's how that works out. When we multiplied by the projection matrix, the third clip component resulted in some number. If that number is greater than 1.0 or less than -1.0, then the number is beyond the view range of the projection matrix, corresponding to the matrix zFar and zNear values, respectively.

So if it's not in the range [-1, 1] then it's clipped entirely. If it is in that range, then the Z value is scaled to 0 to 1[2] and is compared to the depth buffer[3]. The depth buffer has the same dimensions as the drawing surface, so if an 800x600 viewport is used, the depth buffer is 800 pixels wide and 600 pixels high. We already have the pixel's X and Y coordinates, so they are plugged into the depth buffer to get the currently stored Z value. If the stored Z value is greater than the new Z value, then the new Z value is closer than whatever was previously drawn, and replaces it[4]. At this point it's safe to light up the pixel in question (or in the case of WebGL, draw the pixel to the canvas), and to store the Z value as the new depth value.

If the new Z value is greater than the stored depth value, then it is deemed to be "behind" whatever has already been drawn, and the pixel is discarded.
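
Tying this back to the API, the depth behavior described here corresponds to a handful of WebGL calls; this sketch just spells out the defaults explicitly (see the footnotes that follow):

gl.enable(gl.DEPTH_TEST);   // no depth comparison happens without this
gl.depthFunc(gl.LESS);      // "new Z < stored Z wins" -- the behavior described above
gl.depthRange(0.0, 1.0);    // how Z is scaled before it goes into the depth buffer
gl.clearDepth(1.0);         // reset every stored depth to "farthest away"
gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);   // typically done once per frame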

[1] The actual conversion uses the gl.viewport settings to convert from normalized device coordinates to pixels.

[2] It's actually scaled to the gl.depthRange settings. They default to 0 and 1.

[3] Assuming you have a depth buffer and you've turned on depth testing with gl.enable(gl.DEPTH_TEST).

[4] You can set how Z values are compared with gl.depthFunc.

枫以 2024-12-10 04:42:20

I would read these articles

http://webglfundamentals.org/webgl/lessons/webgl-how-it-works.html

Assuming those articles are helpful, the rest of the picture is that WebGL runs in a browser. It renders to a canvas tag. You can think of a canvas tag like an img tag, except you use the WebGL API to generate an image instead of downloading one.

Like other HTML5 tags, the canvas tag can be styled with CSS, placed under or over other parts of the page, composited (blended) with other parts of the page, and transformed, rotated, or scaled by CSS along with the rest of the page. That's a big difference from OpenGL or OpenGL ES.
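
For instance (a small sketch of my own), the very canvas WebGL renders into can be repositioned, blended, and transformed purely with CSS:

const canvas = document.querySelector('canvas');        // the same canvas WebGL draws into
canvas.style.opacity = '0.8';                           // composited (blended) with the page behind it
canvas.style.transform = 'rotate(10deg) scale(0.9)';    // transformed by CSS, not by WebGL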
