
Exercise 36: Safer Strings

Published 2025-03-08 19:42:08

I already introduced you to the Better String library in Exercise 26 when we made devpkg. This exercise is designed to get you using bstring from now on, explain why C's strings are an incredibly bad idea, and then have you change the liblcthw code to use bstring.

Why C Strings Were A Horrible Idea

When people talk about problems with C, its concept of a "string" is one of the top flaws. You've been using these extensively, and I've talked about the kinds of flaws they have, but there's not much out there that explains exactly why C strings are flawed and always will be. I'll try to explain that right now, but part of my explanation will just be that after decades of using C's strings there's enough evidence that they are just a bad idea.

It is impossible to confirm that any given C string is valid:

  • A C string is invalid if it does not end in '\0'.
  • Any loop that processes an invalid C string will loop infinitely (or just overflow a buffer).
  • A C string does not have a known length, so the only way to check that it's terminated correctly is to loop through it.
  • Therefore, it is not possible to validate a C string without possibly looping infinitely.

This is simple logic. You can't write a loop that checks if a C string is valid because invalid C strings cause loops to never terminate. That's it, and the only solution is to include the size. Once you know the size you can avoid the infinite loop problem. If you look at the two functions I showed you in Exercise 27 you can see this:

void copy(char to[], char from[])
{
    int i = 0;

    // while loop will not end if from isn't '\0' terminated
    while((to[i] = from[i]) != '\0') {
        ++i;
    }
}

int safercopy(int from_len, char *from, int to_len, char *to)
{
    int i = 0;
    int max = 0;

    // to_len must have at least 1 byte for the terminator
    if(from_len < 0 || to_len <= 0) return -1;

    // copy at most to_len - 1 bytes so the '\0' always fits
    max = from_len > to_len - 1 ? to_len - 1 : from_len;

    for(i = 0; i < max; i++) {
        to[i] = from[i];
    }

    // terminate right after the last copied byte
    to[i] = '\0';

    return i;
}

Imagine you want to add a check to the copy function to confirm that the from string is valid. How would you do that? Why, you'd write a loop that checks that the string ends in '\0'. Oh wait, if the string doesn't end in '\0' then how does the checking loop end? It doesn't. Checkmate.

No matter what you do, you can't check that a C string is valid without knowing the length of the underlying storage, and in this case safercopy includes those lengths. This function doesn't have the same problem because its loops will always terminate. Even if you lie to it about the size, you still have to give it a finite size.
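To make that concrete, here's a minimal sketch of a validity check that becomes possible once you know the size of the storage. The name is_terminated is hypothetical (it isn't from Exercise 27 or from bstrlib), and it assumes the caller passes the true size of the buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the exercise code: returns 1 if a
 * '\0' appears somewhere in the first `size` bytes of `buf`, and 0 if
 * it does not. memchr examines at most `size` bytes, so the search
 * always ends, terminated string or not. */
int is_terminated(const char *buf, size_t size)
{
    assert(buf != NULL && "buf must point at real storage");
    return memchr(buf, '\0', size) != NULL;
}
```

Given char buf[16], calling is_terminated(buf, sizeof(buf)) answers the question the copy function never could. If you lie about the size it can still read memory it shouldn't, but it can't loop forever.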

What the Better String library does is create a struct that always includes the length of the string's storage. Because the length is always available to a bstring, all of its operations can be safer. The loops will terminate, the contents can be validated, and it will not have this major flaw. The bstring library also comes with a ton of the operations you need with strings, like splitting, formatting, and searching, and they are most likely done right and safer.
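Here's a rough sketch of that idea, assuming nothing more than a struct that carries its own lengths. The names lenstr, lenstr_valid, lenstr_from_blk, and lenstr_destroy are made up for illustration. This is not bstrlib's code, although bstrlib's own struct tagbstring likewise keeps an allocated length (mlen), a used length (slen), and a data pointer:

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the idea, NOT bstrlib's real code. */
struct lenstr {
    int mlen;             /* bytes allocated for data */
    int slen;             /* bytes in use, not counting the '\0' */
    unsigned char *data;  /* storage, kept '\0' terminated as a courtesy */
};

/* A lenstr is valid if its fields agree with each other. Nothing here
 * scans for a terminator, so the check cannot run forever. */
int lenstr_valid(const struct lenstr *s)
{
    return s != NULL && s->data != NULL
        && s->slen >= 0 && s->mlen > s->slen;
}

/* Build a lenstr from a block of known length. Returns NULL on failure. */
struct lenstr *lenstr_from_blk(const void *blk, int len)
{
    if (blk == NULL || len < 0 || len == INT_MAX) return NULL;

    struct lenstr *s = malloc(sizeof(*s));
    if (s == NULL) return NULL;

    s->data = malloc((size_t)len + 1);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }

    memcpy(s->data, blk, (size_t)len);
    s->data[len] = '\0';
    s->slen = len;
    s->mlen = len + 1;

    assert(lenstr_valid(s));
    return s;
}

/* Free the storage and the struct; passing NULL is allowed. */
void lenstr_destroy(struct lenstr *s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

Every operation built on a struct like this can loop up to slen and stop, and can check mlen before it writes, which is exactly what a bare char pointer can't offer.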

There could be flaws in bstring, but it's been around a long time so those are probably minimal. They still find flaws in glibc, so what's a programmer to do, right?

Using bstrlib

There are quite a few improved string libraries, but I like bstrlib because it fits in one file for the basics and has most of the stuff you need to deal with strings. You've already used it a bit, so in this exercise you'll go get the two files bstrlib.c and bstrlib.h from the Better String library.

Here's me doing this in the liblcthw project directory:

$ mkdir bstrlib
$ cd bstrlib/
$ unzip ~/Downloads/bstrlib-05122010.zip
Archive:  /Users/zedshaw/Downloads/bstrlib-05122010.zip
...
$ ls
bsafe.c             bstraux.c       bstrlib.h       bstrwrap.h      license.txt     test.cpp
bsafe.h             bstraux.h       bstrlib.txt     cpptest.cpp     porting.txt     testaux.c
bstest.c    bstrlib.c       bstrwrap.cpp    gpl.txt         security.txt
$ mv bstrlib.h bstrlib.c ../src/lcthw/
$ cd ../
$ rm -rf bstrlib
## make the edits
$ vim src/lcthw/bstrlib.c
$ make clean all
...
$

On line 14 you see me edit the bstrlib.c file to account for its new location and to fix a bug on OSX. Here's the diff:

25c25
< #include "bstrlib.h"
---
> #include <lcthw/bstrlib.h>
2759c2759
< #ifdef __GNUC__
---
> #if defined(__GNUC__) && !defined(__APPLE__)

That is, change the include to be <lcthw/bstrlib.h>, and then fix one of the ifdefs at line 2759.

Learning The Library

This exercise is short and is simply getting you ready for the remaining exercises that use the library. In the next two exercises I'll use bstrlib.c to create a Hashmap data structure.

You should now get familiar with this library by reading the header file and the implementation, and then writing a tests/bstr_tests.c that tests out the following functions:

bfromcstr

Create a bstring from a C style constant.

blk2bstr

Same but give the length of the buffer.

bstrcpy

Copy a bstring.

bassign

Set one bstring to another.

bassigncstr

Set a bstring to a C string's contents.

bassignblk

Set a bstring to a C string but give the length.

bdestroy

Destroy a bstring.

bconcat

Concatenate one bstring onto another.

bstricmp

Compare two bstrings ignoring case, returning the same kind of result as strcmp.

biseq

Tests if two bstrings are equal.

binstr

Tells you if one bstring is in another, and at what position.

bfindreplace

Find one bstring in another then replace it with a third.

bsplit

How to split a bstring into a bstrList.

bformat

Doing a format string, super handy.

blength

Getting the length of a bstring.

bdata

Getting the data from a bstring.

bchar

Getting a char from a bstring.

Your test should try out all of these operations, and a few more that you find interesting from the header file. Make sure to run the test under valgrind to confirm you use the memory correctly.
