Tibco Rendezvous - size constraints

Posted 2024-08-10 22:18:32

I am attempting to put a potentially large string into a Rendezvous message and am curious about size constraints. I understand there is a physical limit (64 MB?) on the message as a whole, but I'm curious about how some other variables could affect it. Specifically:

  • How big are the keys?
  • How the string is stored (in one field vs. multiple fields)

Any advice on any of the above topics or anything else that could be relevant would be greatly appreciated.

Note: I would like to keep the message as a raw string (as opposed to bytecode, etc).

Comments (2)

木落 2024-08-17 22:18:32

From the Tibco docs on Very Large Messages:

Rendezvous software can transport very
large messages; it divides them into
small packets, and places them on the
network as quickly as the network can
accept them. In some situations, this
behavior can overwhelm network
capacity; applications can achieve
higher throughput by dividing large
messages into smaller chunks and
regulating the rate at which it sends
those chunks. You can use the
performance tool to evaluate chunk
sizes and send rates for optimal
throughput.

This example, sends one message
consisting of ten million bytes.
Rendezvous software automatically
divides the message into packets and
sends them. However, this burst of
packets might exceed network capacity,
resulting in poor throughput:

sender> rvperfm -size 10000000 -messages 1   

In this second example, the
application divides the ten million
bytes into one thousand smaller
messages of ten thousand bytes each,
and automatically determines the batch
size and interval to regulate the flow
for optimal throughput:

sender> rvperfm -size 10000 -messages 1000 -auto   

By varying the -messages and -size
parameters, you can determine the
optimal message size for your
applications in a specific network.
Application developers can use this
information to regulate sending rates
for improved performance.
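The chunking advice in the quoted documentation can be sketched in plain C. This is only an illustration of the arithmetic, not Rendezvous code: `chunk_count` and `chunk_at` are hypothetical helper names, and actually sending each chunk (and pacing the sends) is left to the application.

```c
#include <assert.h>
#include <stddef.h>

/* Number of chunks needed to carry `total` bytes in pieces of at most
 * `chunk_size` bytes (ceiling division). */
size_t chunk_count(size_t total, size_t chunk_size)
{
    assert(chunk_size > 0);
    return total / chunk_size + (total % chunk_size != 0);
}

/* Locate chunk number `index` inside `data`. Stores the chunk length in
 * `*len` and returns a pointer to its first byte, or NULL (with `*len`
 * set to 0) if `index` is past the last chunk. */
const char *chunk_at(const char *data, size_t total, size_t chunk_size,
                     size_t index, size_t *len)
{
    assert(chunk_size > 0);
    if (index >= chunk_count(total, chunk_size)) {
        *len = 0;
        return NULL;
    }
    size_t offset = index * chunk_size;      /* always < total here */
    size_t remaining = total - offset;
    *len = remaining < chunk_size ? remaining : chunk_size;
    return data + offset;
}
```

With the numbers from the rvperfm example, `chunk_count(10000000, 10000)` gives the one thousand messages mentioned above. A sender would loop over the chunk indices, place each slice in its own message, and pause between batches at whatever rate testing shows the network can absorb.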

As to actual limits, the add-string function takes a C-style (NUL-terminated) ANSI string, so it is theoretically unbounded. However, given the signature of tibrvMsg_AddOpaque

tibrv_status tibrvMsg_AddOpaque( 
   tibrvMsg       message, 
   const char*    fieldName, 
   const void*    value, 
   tibrv_u32      size); 

which takes a tibrv_u32 size, it would seem sensible to state that the limit is likely to be 4 GB rather than 64 MB.
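A defensive size check before building the message might look like the sketch below. It assumes `tibrv_u32` is a 32-bit unsigned integer (so standard `uint32_t` stands in for it here) and uses the 64 MB figure mentioned in the question as a second, stricter bound. `fits_u32`, `payload_size_ok` and `DOC_MESSAGE_LIMIT` are hypothetical names, not part of the Rendezvous API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64 MB, the whole-message limit mentioned in the question
 * (an assumption here, not a value read from the library). */
#define DOC_MESSAGE_LIMIT ((size_t)64 * 1024 * 1024)

/* Returns 1 if `n` can be passed as a 32-bit unsigned size argument. */
int fits_u32(size_t n)
{
    return n <= UINT32_MAX;
}

/* Returns 1 if a payload of `n` bytes fits the 32-bit size argument and
 * stays strictly below the 64 MB figure, leaving room for the header,
 * subjects and field metadata that share the same message. */
int payload_size_ok(size_t n)
{
    return fits_u32(n) && n < DOC_MESSAGE_LIMIT;
}
```

Checking against the smaller bound first is the cautious choice: a payload that passes the 32-bit test could still be far larger than anything worth sending as a single message.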

That said, using Rendezvous to transfer such large messages is likely to be a serious performance bottleneck, as it may have to buffer significant amounts of traffic while it tries to get these messages to all consumers. By default the rvd buffer only covers 60 seconds, so you may find yourself suffering message loss if such messages are a significant part of your traffic.

Message overhead within Tibco is largely as simple as:

  1. The fixed cost associated with each message (the header)
  2. The fixed cost of each field (type info and the field id)
  3. Plus the cost of all variable-length aspects, including:
    1. The send and reply subjects (effectively limited to 256 bytes each)
    2. The field names. I can find no limit on the length of field names in the docs, but the shorter they are the better; better still, don't use names at all and use the numerical identifiers
    3. The array/string/opaque/user-defined variable-length fields in the message

Note: If you use nested messages simply recurse the above.
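The cost breakdown above can be turned into a rough estimator. This is a sketch under stated assumptions: the real header and per-field costs are not given here, so `header_cost` and `per_field_cost` are caller-supplied placeholders, and `field_desc` and `estimate_message_size` are hypothetical names rather than Rendezvous types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One field of a hypothetical message description. */
struct field_desc {
    const char *name;     /* NULL when the field uses only a numeric id */
    size_t payload_len;   /* bytes of variable-length data in the field */
};

/* Rough size estimate: fixed header + subject lengths + for each field
 * a fixed cost, its name length and its payload length.
 * `header_cost` and `per_field_cost` are placeholders supplied by the
 * caller, not real Rendezvous wire figures. */
size_t estimate_message_size(size_t header_cost, size_t per_field_cost,
                             const char *send_subject,
                             const char *reply_subject,
                             const struct field_desc *fields,
                             size_t n_fields)
{
    assert(fields != NULL || n_fields == 0);
    size_t total = header_cost;
    if (send_subject)  total += strlen(send_subject);
    if (reply_subject) total += strlen(reply_subject);
    for (size_t i = 0; i < n_fields; i++) {
        total += per_field_cost;
        if (fields[i].name) total += strlen(fields[i].name);
        total += fields[i].payload_len;
    }
    return total;
}
```

For a nested message, the estimate for the inner message would be passed in as the `payload_len` of the field that holds it, which is the recursion described in the note.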

In your case the payload will be so vast in comparison to the names (so long as they are reasonable and simple) that there is little point attempting to optimize these at all.

You may find you can gain considerable efficiency on the wire and in buffers if you transmit the strings in compressed form, either through an rvrd with compression enabled or by changing your producer/consumer to use something fast but effective like deflate (or, if you're feeling esoteric, things like QuickLZ, FastLZ, LZO, etc., especially ones with fixed-memory-footprint compress/decompress engines).

You don't say which platform API you are targeting (.NET/Java/C++/C, for example), and this will colour things a little. On the wire all string data will be 1 byte per character, regardless of Java/.NET using UTF-16 by default. However, you will incur a significant translation cost placing strings into and reading them out of the message, because the underlying buffer cannot be reused in those cases and a copy (with compaction or expansion respectively) must be performed.
If you stick to opaque byte sequences you will still have the copy overhead in the naive implementations possible through the managed wrapper APIs, but this will at least be less overhead if you have no need to work with the data as a native string.
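The copy-with-compaction step described above can be illustrated in C. This is a sketch, not what the Java or .NET wrappers actually do internally: it assumes every character fits in a single byte (code units up to 0xFF), and `compact_utf16` and `expand_to_utf16` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `n` UTF-16 code units into `out` as one byte each.
 * Returns 0 on success, or -1 if a code unit does not fit in one byte
 * (`out` may then be partially written). */
int compact_utf16(const uint16_t *in, size_t n, unsigned char *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (in[i] > 0xFF)
            return -1;
        out[i] = (unsigned char)in[i];
    }
    return 0;
}

/* Reverse direction: widen each byte back to a 16-bit code unit. */
void expand_to_utf16(const unsigned char *in, size_t n, uint16_t *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = in[i];
}
```

Both directions touch every character and need a second buffer, which is why a multi-megabyte string pays a noticeable cost on each side, while an opaque byte field only needs the plain copy.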

一片旧的回忆 2024-08-17 22:18:32

The overall maximum size of a message is 64 MB, as was speculated in the OP. From the "Tibco Rendezvous Concepts" document:

Although the ability to exchange large data buffers is a feature of Rendezvous
software, it is best not to make messages too large. For example, to exchange data
up to 10,000 bytes, a single message is efficient. But to send files that could be
many megabytes in length, we recommend using multiple send calls, perhaps one
for each record, block or track. Empirically determine the most efficient size for
the prevailing network conditions. (The actual size limit is 64 MB, which is rarely
an appropriate size.)
