Transmission of float values over TCP/IP and data corruption
I have an extremely strange bug.
I have two applications that communicate over TCP/IP.
Application A is the server, and application B is the client.
Application A sends a bunch of float values to application B every 100 milliseconds.
The bug is the following: sometimes some of the float values received by application B are not the same as the values transmitted by application A.
Initially, I thought there was a problem with the Ethernet or TCP/IP drivers (some sort of data corruption). I then tested the code on other Windows machines, but the problem persisted.
I then tested the code on Linux (Ubuntu 10.04.1 LTS) and the problem is still there!
The values are logged just before they are sent and just after they are received.
The code is pretty straightforward. The message protocol has a 4-byte header like this:
    //message header
    struct MESSAGE_HEADER {
        unsigned short type;
        unsigned short length;
    };

    //orientation message
    struct ORIENTATION_MESSAGE : MESSAGE_HEADER
    {
        float azimuth;
        float elevation;
        float speed_az;
        float speed_elev;
    };

    //any message
    struct MESSAGE : MESSAGE_HEADER {
        char buffer[512];
    };
    //receive specific size of bytes from the socket
    static int receive(SOCKET socket, void *buffer, size_t size) {
        int r;
        do {
            r = recv(socket, (char *)buffer, size, 0);
            //0 means the peer closed the connection, SOCKET_ERROR means failure
            if (r == 0 || r == SOCKET_ERROR) break;
            buffer = (char *)buffer + r;
            size -= r;
        } while (size);
        return r;
    }

    //send specific size of bytes to a socket
    static int send(SOCKET socket, const void *buffer, size_t size) {
        int r;
        do {
            r = send(socket, (const char *)buffer, size, 0);
            if (r == 0 || r == SOCKET_ERROR) break;
            //keep the pointer const while advancing past the bytes already sent
            buffer = (const char *)buffer + r;
            size -= r;
        } while (size);
        return r;
    }
    //get message from socket
    static bool receive(SOCKET socket, MESSAGE &msg) {
        int r = receive(socket, &msg, sizeof(MESSAGE_HEADER));
        if (r == SOCKET_ERROR || r == 0) return false;
        const size_t length = ntohs(msg.length);
        if (length == 0) return true;
        //reject a body that would overflow msg.buffer
        if (length > sizeof(msg.buffer)) return false;
        r = receive(socket, msg.buffer, length);
        if (r == SOCKET_ERROR || r == 0) return false;
        return true;
    }

    //send message
    static bool send(SOCKET socket, const MESSAGE &msg) {
        int r = send(socket, &msg, ntohs(msg.length) + sizeof(MESSAGE_HEADER));
        if (r == SOCKET_ERROR || r == 0) return false;
        return true;
    }
When I receive the 'orientation' message, sometimes the 'azimuth' value is different from the one sent by the server!
Shouldn't the data be the same all the time? Doesn't TCP/IP guarantee delivery of the data uncorrupted? Could it be that an exception in the math co-processor affects the TCP/IP stack? Is it a problem that I receive a small number of bytes first (4 bytes) and then the message body?
EDIT:
The problem is in the endianness swapping routine. The following code swaps the byte order of a specific float, then swaps it back and prints the bytes before and after:
    #include <cstdio>
    using namespace std;

    //swaps the byte order of a float; note that the swapped bits are returned as a float
    float ntohf(float f)
    {
        float r;
        unsigned char *s = (unsigned char *)&f;
        unsigned char *d = (unsigned char *)&r;
        d[0] = s[3];
        d[1] = s[2];
        d[2] = s[1];
        d[3] = s[0];
        return r;
    }

    int main() {
        unsigned long l = 3206974079; //0xBF268A7F
        float f1 = (float &)l;
        float f2 = ntohf(ntohf(f1));
        unsigned char *c1 = (unsigned char *)&f1;
        unsigned char *c2 = (unsigned char *)&f2;
        printf("%02X %02X %02X %02X\n", c1[0], c1[1], c1[2], c1[3]);
        printf("%02X %02X %02X %02X\n", c2[0], c2[1], c2[2], c2[3]);
        getchar();
        return 0;
    }
The output is:
7F 8A 26 BF
7F CA 26 BF
That is, passing the swapped bytes around as a float apparently alters the bit pattern, producing a value different from the original.
Any input on this is welcome.
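The two byte dumps above can also be read as raw IEEE-754 bit patterns. The following is a minimal sketch, assuming 32-bit IEEE-754 floats on a little-endian machine; the helper names are hypothetical. The swapped bytes BF 26 8A 7F read back as 0x7F8A26BF, which has an all-ones exponent and a nonzero mantissa with the top mantissa bit clear, so it is a signaling NaN. The value implied by the second dump, 0x7FCA26BF, differs from it only in that top mantissa bit, which looks more like an FPU quieting a NaN than like rounding.

```cpp
#include <cstdint>

// Field masks for an IEEE-754 single-precision bit pattern.
constexpr std::uint32_t kExponentMask = 0x7F800000u; // 8 exponent bits
constexpr std::uint32_t kMantissaMask = 0x007FFFFFu; // 23 mantissa bits
constexpr std::uint32_t kQuietBit     = 0x00400000u; // top mantissa bit

// Reverse the four bytes of a 32-bit integer.
constexpr std::uint32_t byte_swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// NaN: exponent all ones and mantissa nonzero.
constexpr bool is_nan_bits(std::uint32_t bits) {
    return (bits & kExponentMask) == kExponentMask &&
           (bits & kMantissaMask) != 0;
}

// Signaling NaN: a NaN whose quiet bit is clear.
constexpr bool is_signaling_nan_bits(std::uint32_t bits) {
    return is_nan_bits(bits) && (bits & kQuietBit) == 0;
}

// The value from the test program and its byte-swapped form.
static_assert(byte_swap32(0xBF268A7Fu) == 0x7F8A26BFu, "swap check");
static_assert(is_signaling_nan_bits(0x7F8A26BFu), "swapped value is a signaling NaN");
```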
EDIT2:
Thank you all for your replies. It seems the problem is that the swapped float, when returned via the 'return' statement, is pushed onto the CPU's floating-point stack. The caller then pops the value from the stack, and the FPU may alter it on the way. Because it is the swapped float, its bytes no longer encode a meaningful number, so that alteration corrupts the value once the bytes are swapped back.
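One way to sidestep this, sketched under the assumption of 32-bit floats: keep the swapped bytes in an integer type and only convert to float after swapping back, so the swapped pattern never travels through a floating-point register. The function names below are hypothetical replacements for ntohf, and like the original they swap unconditionally.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Reverse the four bytes of a 32-bit integer using integer arithmetic only.
inline std::uint32_t swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Sending side: copy the float's bits into an integer, then swap.
// The swapped pattern is returned as an integer, never as a float.
inline std::uint32_t float_to_swapped_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return swap32(bits);
}

// Receiving side: swap back first, then reinterpret the bits as a float.
inline float swapped_bits_to_float(std::uint32_t swapped) {
    const std::uint32_t bits = swap32(swapped);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

With this shape, the fields of a wire struct would be declared as 32-bit integers rather than floats, and the conversion happens only at the boundary.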
Comments (2)
TCP tries to deliver unaltered bytes, but unless the machines have similar CPUs and operating systems, there's no guarantee that the floating-point representation on one system is identical to that on the other. You need a mechanism for ensuring this, such as XDR or Google's protobuf.
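As a rough illustration of what such a mechanism does, here is a hand-rolled sketch that writes each float as a big-endian IEEE-754 word into a byte buffer. It is not XDR or protobuf themselves, only the same idea in miniature. The `Orientation` struct and all function names are hypothetical, and the static assertions state the assumptions the sketch relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559, "assumes IEEE-754 floats");
static_assert(sizeof(float) == 4, "assumes a 32-bit float");

// Hypothetical plain struct mirroring the four payload fields in the question.
struct Orientation { float azimuth, elevation, speed_az, speed_elev; };

const std::size_t kOrientationWireSize = 16; // four 4-byte words

// Write one float as 4 big-endian bytes, independent of host byte order.
inline void put_float_be(float f, unsigned char *out) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    out[0] = static_cast<unsigned char>(bits >> 24);
    out[1] = static_cast<unsigned char>(bits >> 16);
    out[2] = static_cast<unsigned char>(bits >> 8);
    out[3] = static_cast<unsigned char>(bits);
}

// Read 4 big-endian bytes back into a float.
inline float get_float_be(const unsigned char *in) {
    const std::uint32_t bits = (std::uint32_t(in[0]) << 24) |
                               (std::uint32_t(in[1]) << 16) |
                               (std::uint32_t(in[2]) << 8) |
                                std::uint32_t(in[3]);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Encode the fields one by one, so struct padding never reaches the wire.
inline void encode_orientation(const Orientation &o, unsigned char *out) {
    assert(out != nullptr);
    put_float_be(o.azimuth, out);
    put_float_be(o.elevation, out + 4);
    put_float_be(o.speed_az, out + 8);
    put_float_be(o.speed_elev, out + 12);
}

inline Orientation decode_orientation(const unsigned char *in) {
    assert(in != nullptr);
    Orientation o;
    o.azimuth = get_float_be(in);
    o.elevation = get_float_be(in + 4);
    o.speed_az = get_float_be(in + 8);
    o.speed_elev = get_float_be(in + 12);
    return o;
}
```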
You're sending binary data over the network, using implementation-defined padding for the struct layout, so this will only work if you're using the same hardware, OS and compiler for both application A and application B.
If that's ok, though, I can't see anything wrong with your code. One potential issue is that you're using ntohs to extract the length of the message, and that length is the total length minus the header length, so you need to make sure you're setting it properly. It needs to be done as
but you don't show the code that sets up the message...
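The snippet this answer refers to did not survive on this page. As a guess at the kind of setup being described, and not the answer's own code, the sketch below stores the body length (total size minus header size) in network byte order so that it matches the ntohs calls in the question. `MSG_ORIENTATION` and `init_orientation_header` are hypothetical names, and the static assertions check the padding-free layout that the raw struct transfer depends on.

```cpp
#ifdef _WIN32
#include <winsock2.h>   // htons / ntohs on Windows
#else
#include <arpa/inet.h>  // htons / ntohs on POSIX systems
#endif

// Structs copied from the question.
struct MESSAGE_HEADER {
    unsigned short type;
    unsigned short length;
};

struct ORIENTATION_MESSAGE : MESSAGE_HEADER {
    float azimuth;
    float elevation;
    float speed_az;
    float speed_elev;
};

// Compilation fails if this compiler's layout differs from the assumed one.
static_assert(sizeof(MESSAGE_HEADER) == 4, "header is assumed to be 4 bytes");
static_assert(sizeof(ORIENTATION_MESSAGE) == 20, "no padding is assumed");

// Hypothetical message type code; the question does not show the real values.
const unsigned short MSG_ORIENTATION = 1;

// Fill in the header: the length field counts only the bytes after the header
// and is stored in network byte order.
inline void init_orientation_header(ORIENTATION_MESSAGE &msg) {
    msg.type = htons(MSG_ORIENTATION);
    msg.length = htons(static_cast<unsigned short>(
        sizeof(ORIENTATION_MESSAGE) - sizeof(MESSAGE_HEADER)));
}
```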