I am using OpenGL (fixed-function pipeline) and am drawing potentially hundreds of thousands of points, labeling each one with a text label. This question is about whether I'm doing this in a reasonable way, and about what speed I can expect.
The text labels are drawn by creating a texture coordinate rectangle for each character, and texturing the rectangles using a small font bitmap (each character is about 5x13 pixels in the texture).
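As an illustration of that lookup (my sketch only; the 16-column grid, cell size, and 128x128 atlas size are hypothetical values, not my real layout):
ATLAS_COLS = 16                 # glyphs per row in the hypothetical atlas
CELL_W, CELL_H = 5, 13          # pixel size of one glyph cell
ATLAS_W, ATLAS_H = 128.0, 128.0 # total texture size in pixels

def char_texcoord_rect( ch ):
    # Index into the grid, assuming the atlas starts at the space character.
    index = ord( ch ) - 32
    col = index % ATLAS_COLS
    row = index // ATLAS_COLS
    u0 = col * CELL_W / ATLAS_W
    u1 = ( col + 1 ) * CELL_W / ATLAS_W
    v1 = 1.0 - row * CELL_H / ATLAS_H           # top edge of the glyph cell
    v0 = 1.0 - ( row + 1 ) * CELL_H / ATLAS_H   # bottom edge of the glyph cell
    return u0, v0, u1, v1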
In a test file, I have about 158,000 points, given in longitude and latitude, so this lon/lat space is my "model space". I read those points in and create an opengl vertex buffer for them. Then each point gets a label that is typically three or four characters long, so let's say 3.5 characters on average. The points are drawn in screen coordinates (ortho projection mode). For each character I create a texture coordinate rect to grab the right pixels for that character, and a rectangle in screen coordinates into which the character will be drawn. These two sets of rects are each put into a vertex buffer. So that's 158k * 3.5 * 4 = 2.2 million vertices, i.e. 4.4 million individual coordinate numbers for the drawing rects, and another 4.4 million numbers for the texture coordinates.
When it comes time to render, I need to (at least I believe this is the only way to do it) update the screen coordinates of all those drawing rects to match the current screen positions of all the model points. That means for each of the 158k model points I have to compute the projected (screen) coordinates from the model (world) coordinates of the point, and then set the four corner coordinates for each of the three or four character rects for that point. So basically I'm updating all 4.4 million of those position numbers on each render. It's taking about 0.3 seconds per render to update them.
QUESTION NUMBER ONE: Does this sound like the right/necessary way to handle labeling of points in OpenGL? It would be ideal if there were some way to say, "automatically render this set of rect points, which are linked to that model point but treated as screen offsets from the projected model point". Then I wouldn't have to update the draw rects on each render. But no such thing exists, right?
QUESTION NUMBER TWO: In addition to the time spent updating all those screen rects before each render, the render itself takes about 1 full second when all 158k labels are shown on the screen (which is obviously not a useful user experience, but I'm just trying to understand the speeds here). As I zoom in, and fewer and fewer points/labels are actually drawn on the screen, the render time becomes proportionally shorter. I'm just trying to understand whether, on my average/modern laptop with an average/modern GPU, a full second sounds like a reasonable amount of time to render those 158k * 3.5 = 553k textured quads. I know people talk about "millions of triangles" not being an obstacle, but I'm wondering whether, with the texturing, the speed I'm seeing is reasonable/expected.
Thanks for any help.
Code added below. Note that it's the position_labels call on each render that I'd like to get rid of.
import numpy as np
import OpenGL.GL as gl
import OpenGL.arrays.vbo as gl_vbo

SCREEN_VERTEX_DTYPE = np.dtype(
[ ( "x_lb", np.float32 ), ( "y_lb", np.float32 ),
( "x_lt", np.float32 ), ( "y_lt", np.float32 ),
( "x_rt", np.float32 ), ( "y_rt", np.float32 ),
( "x_rb", np.float32 ), ( "y_rb", np.float32 ) ]
)
TEXTURE_COORDINATE_DTYPE = np.dtype(
[ ( "u_lb", np.float32 ), ( "v_lb", np.float32 ),
( "u_lt", np.float32 ), ( "v_lt", np.float32 ),
( "u_rt", np.float32 ), ( "v_rt", np.float32 ),
( "u_rb", np.float32 ), ( "v_rb", np.float32 ) ]
)
# screen_vertex_data is numpy array of SCREEN_VERTEX_DTYPE
# texcoord_data is numpy array of TEXTURE_COORDINATE_DTYPE
# not shown: code to fill initial vals of screen_vertex_data and texcoord_data
self.vbo_screen_vertexes = gl_vbo.VBO( screen_vertex_data )
self.vbo_texture_coordinates = gl_vbo.VBO( texcoord_data )
...
# then on each render:
def render( self ):
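# Reposition every label quad on the CPU, then draw all character quads
# in one glDrawArrays call under a pixel-aligned ortho projection.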
self.position_labels()
gl.glEnable( gl.GL_TEXTURE_2D )
gl.glBindTexture( gl.GL_TEXTURE_2D, self.font_texture )
gl.glEnableClientState( gl.GL_VERTEX_ARRAY )
self.vbo_screen_vertexes.bind()
gl.glVertexPointer( 2, gl.GL_FLOAT, 0, None )
gl.glEnableClientState( gl.GL_TEXTURE_COORD_ARRAY )
self.vbo_texture_coordinates.bind()
gl.glTexCoordPointer( 2, gl.GL_FLOAT, 0, None )
# set up an orthogonal projection
gl.glMatrixMode(gl.GL_PROJECTION)
gl.glPushMatrix()
gl.glLoadIdentity()
window_size = application.GetClientSize()
gl.glOrtho(0, window_size[ 0 ], 0, window_size[ 1 ], -1, 1)
gl.glMatrixMode(gl.GL_MODELVIEW)
gl.glPushMatrix()
gl.glLoadIdentity()
vertex_count = len( self.character_coordinates_data ) * 4  # 4 corners per quad
gl.glDrawArrays( gl.GL_QUADS, 0, vertex_count )
# undo the orthogonal projection
gl.glMatrixMode(gl.GL_PROJECTION)
gl.glPopMatrix()
gl.glMatrixMode(gl.GL_MODELVIEW)
gl.glPopMatrix()
self.vbo_texture_coordinates.unbind()
gl.glDisableClientState( gl.GL_TEXTURE_COORD_ARRAY )
self.vbo_screen_vertexes.unbind()
gl.glDisableClientState( gl.GL_VERTEX_ARRAY )
gl.glBindTexture( gl.GL_TEXTURE_2D, 0 )
gl.glDisable( gl.GL_TEXTURE_2D )
def position_labels( self ):
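# Project every model point from lon/lat to window pixels, then rebuild
# the four corners of each character quad from the per-character offsets.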
window_size = application.GetClientSize()
world_size = ( rect.width( application.world_rect ), rect.height( application.world_rect ) )
world_to_screen_factor_x = float( window_size[ 0 ] ) / float( world_size[ 0 ] )
world_to_screen_factor_y = float( window_size[ 1 ] ) / float( world_size[ 1 ] )
wr_lower_left = application.world_rect[ 0 ]
shift_pixels_x = ( wr_lower_left[ 0 ] + 180.0 ) * world_to_screen_factor_x
shift_pixels_y = ( wr_lower_left[ 1 ] + 90.0 ) * world_to_screen_factor_y
# map to screen coordinates
self.character_coordinates_data.screen_x = ( self.character_coordinates_data.world_x + 180.0 ) * world_to_screen_factor_x - shift_pixels_x
self.character_coordinates_data.screen_y = ( self.character_coordinates_data.world_y + 90.0 ) * world_to_screen_factor_y - shift_pixels_y
screen_vertex_data = self.vbo_screen_vertexes.data
screen_vertex_data.x_lb = self.character_coordinates_data.screen_x + self.character_coordinates_data.screen_offset_x
screen_vertex_data.y_lb = self.character_coordinates_data.screen_y + self.character_coordinates_data.screen_offset_y - self.character_coordinates_data.screen_height
screen_vertex_data.x_lt = screen_vertex_data.x_lb
screen_vertex_data.y_lt = screen_vertex_data.y_lb + self.character_coordinates_data.screen_height
screen_vertex_data.x_rt = screen_vertex_data.x_lb + self.character_coordinates_data.screen_width
screen_vertex_data.y_rt = screen_vertex_data.y_lb + self.character_coordinates_data.screen_height
screen_vertex_data.x_rb = screen_vertex_data.x_lb + self.character_coordinates_data.screen_width
screen_vertex_data.y_rb = screen_vertex_data.y_lb
self.vbo_screen_vertexes[ : len( screen_vertex_data ) ] = screen_vertex_data
1 Answer
The projection mode and screen coordinates are two distinct things. You can certainly choose the projection parameters so that OpenGL units match screen pixels, but that is not a necessity. Just for clarification.
To question one: OpenGL is merely a drawing API; there is no higher-level functionality. So yes, it's your burden to keep those fellas in sync. Luckily, you only have to do the math once: zooming, translating, rotating, etc. can all be done by manipulating the transformation matrices, although each change in the view still requires a full redraw.
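For example, a sketch of that idea applied here (mine, not the asker's code): build the character quads once in a fixed "reference pixel" space, and on each render map the currently visible sub-rectangle of that space to the window via the projection matrix, instead of rewriting the VBO. The view_* variables are hypothetical descriptions of the current viewport in reference-pixel units; note that the glyphs will then scale with the zoom, so this only works as-is if that is acceptable:
# Quads were filled into the VBO once, in reference-pixel coordinates.
gl.glMatrixMode( gl.GL_PROJECTION )
gl.glLoadIdentity()
gl.glOrtho( view_left, view_right, view_bottom, view_top, -1, 1 )
gl.glMatrixMode( gl.GL_MODELVIEW )
gl.glLoadIdentity()
# No CPU-side vertex updates: panning and zooming only change glOrtho.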
To question two: it all boils down to fill rate, and to not processing stuff that isn't visible. One interesting thing is that while GPUs can process millions of triangles per second, they do it best when served in easily digestible chunks, i.e. if the geometry comes in batches that all fit into the caches. I've found that batches of 1000 to 3000 vertices each work best. Some influence also comes from the total size of the accessed texture, not just the part you're actually accessing. Still, your figures sound reasonable for an unoptimized drawing method.
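If you want to try that batch size with the render code above, the single glDrawArrays call could be split roughly like this (a sketch; the batch size is just an example in the suggested range, kept a multiple of 4 because GL_QUADS consumes vertices four at a time):
BATCH_VERTICES = 2048  # multiple of 4, inside the suggested 1000-3000 range
for first in range( 0, vertex_count, BATCH_VERTICES ):
    count = min( BATCH_VERTICES, vertex_count - first )
    gl.glDrawArrays( gl.GL_QUADS, first, count )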