Converting an unsigned int + string to an unsigned char vector

Posted 2024-12-08 18:03:19 · 1482 characters · 6 views · 0 comments

I'm working with the NetLink socket library ( https://sourceforge.net/apps/wordpress/netlinksockets/ ), and I want to send some binary data over the network in a format that I specify.

The format I have planned is pretty simple and is as follows:

  • Bytes 0 and 1: an opcode of the type uint16_t (i.e., an unsigned integer always 2 bytes long)

  • Bytes 2 onward: any other data necessary, such as a string, an integer, or a combination of these. The other party will interpret this data according to the opcode. For example, if the opcode is 0, which represents "log in", this data will consist of a one-byte integer giving the length of the username, followed by a string containing the username, followed by a string containing the password. For opcode 1, "send a chat message", the entire data could be just a string for the chat message.

Here's what the library gives me to work with for sending data, though:

void send(const string& data);
void send(const char* data);
void rawSend(const vector<unsigned char>* data);

I'm assuming I want to use rawSend() for this, but rawSend() takes a vector of unsigned chars, not a void* pointer to memory. Isn't there going to be some loss of data if I try to cast certain types of data to an array of unsigned chars? Please correct me if I'm wrong, but if I'm right, does this mean I should be looking at another library that supports real binary data transfer?

Assuming this library does serve my purposes, how exactly would I cast and concatenate my various data types into one std::vector? What I've tried is something like this:

#define OPCODE_LOGINREQUEST 0

std::vector<unsigned char>* loginRequestData = new std::vector<unsigned char>();
uint16_t opcode = OPCODE_LOGINREQUEST;
loginRequestData->push_back(opcode);
// and at this point (not shown), I would push_back() the individual characters of the strings of the username and password.. after one byte worth of integer telling you how many characters long the username is (so you know when the username stops and the password begins)
socket->rawSend(loginRequestData);

I ran into some exceptions on the other end, though, when I tried to interpret the data. Am I approaching the casting all wrong? Am I going to lose data by casting to unsigned chars?

Thanks in advance.

Comments (5)

我的黑色迷你裙 2024-12-15 18:03:19

I like how they make you create a vector (which must use the heap and thus execute in unpredictable time) instead of just falling back to the C standard (const void* buffer, size_t len) tuple, which is compatible with everything and can't be beat for performance. Oh, well.

You could try this:

void send_message(uint16_t opcode, const void* rawData, size_t rawDataSize)
{
    vector<unsigned char> buffer;
    buffer.reserve(sizeof(uint16_t) + rawDataSize);
#if BIG_ENDIAN_OPCODE
    buffer.push_back(opcode >> 8);
    buffer.push_back(opcode & 0xFF);
#elif LITTLE_ENDIAN_OPCODE
    buffer.push_back(opcode & 0xFF);
    buffer.push_back(opcode >> 8);
#else
    // Native order opcode
    buffer.insert(buffer.end(), reinterpret_cast<const unsigned char*>(&opcode), 
        reinterpret_cast<const unsigned char*>(&opcode) + sizeof(uint16_t));
#endif
    const unsigned char* base(reinterpret_cast<const unsigned char*>(rawData));
    buffer.insert(buffer.end(), base, base + rawDataSize);
    socket->rawSend(&buffer); // Why isn't this API using a reference?!
}

This uses insert, which should optimize better than a hand-written loop with push_back(). Because the buffer is a local vector rather than one allocated with new, it also won't leak if rawSend throws an exception.

NOTE: Byte order must match for the platforms on both ends of this connection. If it does not, you'll need to either pick one byte order and stick with it (Internet standards usually do this, and you use the htonl and htons functions) or you need to detect byte order ("native" or "backwards" from the receiver's POV) and fix it if "backwards".
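One way to sidestep the byte-order question is to write the opcode with shifts, so the bytes on the wire come out the same on any host. The following is only a sketch, assuming big-endian (network) order is the chosen wire format; putU16BE and getU16BE are hypothetical helper names, not part of NetLink.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append a 16-bit value in big-endian (network) order.
// Shifts operate on the value, not on its memory layout, so the output is
// the same on little-endian and big-endian hosts.
void putU16BE(std::vector<unsigned char>& out, std::uint16_t value)
{
    out.push_back(static_cast<unsigned char>(value >> 8));   // high byte first
    out.push_back(static_cast<unsigned char>(value & 0xFF)); // then low byte
}

// Hypothetical helper: read a big-endian 16-bit value starting at `offset`.
// The caller must make sure that offset + 2 <= in.size().
std::uint16_t getU16BE(const std::vector<unsigned char>& in, std::size_t offset)
{
    return static_cast<std::uint16_t>((in[offset] << 8) | in[offset + 1]);
}
```

With helpers like these, neither side needs htons/ntohs or a compile-time endianness switch, at the cost of handling each integer width explicitly.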

内心旳酸楚 2024-12-15 18:03:19

I would use something like this:

#define OPCODE_LOGINREQUEST 0 
#define OPCODE_MESSAGE 1

void addRaw(std::vector<unsigned char> &v, const void *data, const size_t len)
{
    const unsigned char *ptr = static_cast<const unsigned char*>(data);
    v.insert(v.end(), ptr, ptr + len);
}

void addUint8(std::vector<unsigned char> &v, uint8_t val)
{
    v.push_back(val);
}

void addUint16(std::vector<unsigned char> &v, uint16_t val)
{
    val = htons(val);
    addRaw(v, &val, sizeof(uint16_t));
}

void addStringLen(std::vector<unsigned char> &v, const std::string &val)
{
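    uint8_t len = static_cast<uint8_t>(std::min<size_t>(val.length(), 255)); // explicit <size_t>: std::min needs both arguments to have the same type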
    addUint8(v, len);
    addRaw(v, val.c_str(), len);
}

void addStringRaw(std::vector<unsigned char> &v, const std::string &val)
{
    addRaw(v, val.c_str(), val.length());
}

void sendLogin(const std::string &user, const std::string &pass)
{
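    // Reserve capacity only: constructing the vector with a size would
    // pre-fill it with that many zero bytes ahead of the real data.
    std::vector<unsigned char> data;
    data.reserve(
        sizeof(uint16_t) +
        sizeof(uint8_t) + std::min<size_t>(user.length(), 255) +
        sizeof(uint8_t) + std::min<size_t>(pass.length(), 255)
    );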
    addUint16(data, OPCODE_LOGINREQUEST);
    addStringLen(data, user);
    addStringLen(data, pass);
    socket->rawSend(&data);
}

void sendMsg(const std::string &msg)
{
    // Reserve capacity only; a sized constructor would pre-fill with zero bytes.
    std::vector<unsigned char> data;
    data.reserve(sizeof(uint16_t) + msg.length());
    addUint16(data, OPCODE_MESSAGE);
    addStringRaw(data, msg);
    socket->rawSend(&data);
}
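For the receiving side, the same layout can be read back with bounds checks at every step, which may help with the exceptions mentioned in the question. This is only a sketch under the assumptions of the code above: a 2-byte big-endian opcode (what htons produces on the wire) followed by two strings, each with a one-byte length prefix. LoginRequest, readString and parseLogin are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type for a decoded login request.
struct LoginRequest {
    std::string user;
    std::string pass;
};

// Read one length-prefixed string at `pos`, advancing `pos` past it.
// Returns false instead of reading out of bounds if the buffer is too short.
static bool readString(const std::vector<unsigned char>& in, std::size_t& pos, std::string& out)
{
    if (pos >= in.size()) return false;        // no room for the length byte
    const std::size_t len = in[pos++];
    if (in.size() - pos < len) return false;   // fewer bytes left than announced
    out.assign(in.begin() + pos, in.begin() + pos + len);
    pos += len;
    return true;
}

// Parse the opcode (2 bytes, big-endian), then the user and password strings.
bool parseLogin(const std::vector<unsigned char>& in, std::uint16_t& opcode, LoginRequest& req)
{
    if (in.size() < 2) return false;
    opcode = static_cast<std::uint16_t>((in[0] << 8) | in[1]);
    std::size_t pos = 2;
    return readString(in, pos, req.user) && readString(in, pos, req.pass);
}
```

Returning false on a short buffer lets the caller wait for more data or drop the packet rather than crash.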
七颜 2024-12-15 18:03:19
std::vector<unsigned char>* loginRequestData = new std::vector<unsigned char>();
uint16_t opcode = OPCODE_LOGINREQUEST;
loginRequestData->push_back(opcode);

If unsigned char is 8 bits long, as it is on most systems, you will lose the upper 8 bits of opcode every time you push. You should be getting a warning for this.

The decision for rawSend to take a vector is quite odd; a general-purpose library would work at a different level of abstraction. I can only guess that it is this way because rawSend makes a copy of the passed data and guarantees its lifetime until the operation has completed. If not, then it is just a poor design choice, and on top of that it takes the argument by pointer. You should see this data as a container of raw memory. There are some quirks to get right, but here is how you would be expected to work with POD types in this scenario:

data->insert( data->end(), reinterpret_cast< char const* >( &opcode ), reinterpret_cast< char const* >( &opcode ) + sizeof( opcode ) );
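The difference between the two approaches can be seen side by side. This is a small sketch assuming 8-bit chars; pushBackTruncated and insertAllBytes are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <vector>

// push_back converts the uint16_t to a single unsigned char, so only the
// low 8 bits survive (assuming 8-bit chars). The explicit cast here mirrors
// the implicit conversion in the question's code.
std::vector<unsigned char> pushBackTruncated(std::uint16_t opcode)
{
    std::vector<unsigned char> out;
    out.push_back(static_cast<unsigned char>(opcode));
    return out;
}

// Copying the object representation keeps every byte of the value,
// in the host's native byte order.
std::vector<unsigned char> insertAllBytes(std::uint16_t opcode)
{
    std::vector<unsigned char> out;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&opcode);
    out.insert(out.end(), p, p + sizeof(opcode));
    return out;
}
```

For an opcode of 0x0102, the first function yields the single byte 0x02, while the second yields both bytes, in an order that depends on the host.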
累赘 2024-12-15 18:03:19

This will work:

#define OPCODE_LOGINREQUEST 0

std::vector<unsigned char>* loginRequestData = new std::vector<unsigned char>();
uint16_t opcode = OPCODE_LOGINREQUEST;
unsigned char *opcode_data = (unsigned char *)&opcode;
for(size_t i = 0; i < sizeof(opcode); i++)
    loginRequestData->push_back(opcode_data[i]);
socket->rawSend(loginRequestData);

This will also work for any POD type, as long as both ends agree on the type's size and byte order.
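The byte-by-byte copy can be generalized with a small template. This is a sketch rather than anything NetLink provides: appendPod is a hypothetical name, it needs C++11 for std::is_trivially_copyable, and the bytes go out in host byte order.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical helper: append the raw bytes of any trivially copyable value.
// The static_assert rejects types such as std::string, whose bytes are
// pointers and bookkeeping rather than the data itself.
template <typename T>
void appendPod(std::vector<unsigned char>& out, const T& value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "appendPod requires a trivially copyable type");
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    out.insert(out.end(), p, p + sizeof(T));
}
```

A call such as appendPod(buffer, opcode) then replaces the explicit loop, and passing a non-POD type fails at compile time instead of sending garbage.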

魂ガ小子 2024-12-15 18:03:19

Yeah, go with rawSend, since send probably expects a null-terminated string.

You don't lose anything by casting to char instead of void*. Memory is memory. Types are never stored in memory in C++ except for RTTI info. You can recover your data by casting to the type indicated by your opcode.

If you can decide the format of all your sends at compile time, I recommend using structs to represent them. I've done this before professionally, and this is simply the best way to clearly store the formats for a wide variety of messages. And it's super easy to unpack on the other side; just cast the raw buffer into the struct based on the opcode!

struct MessageType1 {
    uint16_t opcode;
    int myData1;
    int myData2;
};

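MessageType1 msg = {}; // zero-initialize, then fill in the fields

// Copy the struct's bytes into the vector type that rawSend() expects.
std::vector<unsigned char> vec;
const unsigned char* begin = reinterpret_cast<const unsigned char*>(&msg);
vec.insert( vec.end(), begin, begin + sizeof(msg) );

socket->rawSend(&vec);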

The struct approach is the best, neatest way to send and receive, but the layout is fixed at compile time, and both ends must agree on struct padding and byte order.
If the format of the messages is not decided until runtime, use a char array:

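unsigned char buffer[2048];

// std::memcpy (from <cstring>) avoids the alignment and aliasing problems
// of writing through a cast pointer
std::memcpy(buffer, &opcode, sizeof(opcode));
// now memcpy the rest of the payload into it
// or placement-new to construct objects in the buffer memory

size_t usedBufferSpace = 24; // or whatever

std::vector<unsigned char> vec;
vec.insert( vec.end(), buffer, buffer + usedBufferSpace );

socket->rawSend(&vec);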
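On the receiving side, copying the bytes into a struct with memcpy is safer than casting the buffer pointer, since a received buffer may not be suitably aligned for the struct. The following is a sketch that assumes both ends are built with the same compiler settings, so padding and byte order match. readMessage is a hypothetical name, and the fields use std::int32_t instead of int so the struct's size does not vary by platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Same shape as the struct above, with fixed-width fields (an assumption).
struct MessageType1 {
    std::uint16_t opcode;
    std::int32_t myData1;
    std::int32_t myData2;
};

// Hypothetical helper: copy a received buffer into the struct.
// Returns false if fewer bytes arrived than the struct needs.
bool readMessage(const std::vector<unsigned char>& in, MessageType1& out)
{
    static_assert(std::is_trivially_copyable<MessageType1>::value,
                  "memcpy is only valid for trivially copyable types");
    if (in.size() < sizeof(MessageType1)) return false;
    std::memcpy(&out, in.data(), sizeof(MessageType1));
    return true;
}
```

The receiver would typically read the opcode first, then call the reader that matches it.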