K& R 练习:我的代码可以工作,但感觉很臭; 清理建议?

发布于 2024-07-07 07:06:02 字数 2053 浏览 8 评论 0原文

我正在写《K&R》一书。 我读的书比做的练习多,主要是因为时间不够。 我正在赶上,并且已经完成了教程第一章中的几乎所有练习。

我的问题是练习 1-18。 练习的目的是:

编写一个程序来删除尾随空白并 从输入行中跳出制表符,并删除整个空白行

我的代码(如下)执行此操作,并且有效。 我的问题是我实施的修剪方法。 不知怎么的,感觉……不对劲。 就像如果我在代码审查中看到类似的 C# 代码,我可能会发疯。 (C# 是我的专长之一。)

任何人都可以提供一些关于清理这个问题的建议吗?问题是这些建议只能使用《K &#39》第 1 章中的知识。 R.(我知道有无数种方法可以使用完整的 C 库来清理这个问题;我们在这里只讨论第 1 章和基本的 stdio.h。)另外,在提供建议时,你能解释一下为什么它会有所帮助吗? (毕竟,我正在努力学习!还有谁比这里的专家更适合学习呢?)

#include <stdio.h>

#define MAXLINE 1000

int getline(char line[], int max);
void trim(char line[], char ret[]);

int main()
{
    char line[MAXLINE];
    char out[MAXLINE];
    int length;

    while ((length = getline(line, MAXLINE)) > 0)
    {
        trim(line, out);
        printf("%s", out);
    }

    return 0;
}

int getline(char line[], int max)
{
    int c, i;

    for (i = 0; i < max - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
        line[i] = c;

    if (c == '\n')
    {
        line[i] = c;
        ++i;
    }

    line[i] = '\0'; 
    return i;
}

void trim(char line[], char ret[])
{
    int i = 0;

    while ((ret[i] = line[i]) != '\0')
        ++i;

    if (i == 1)
    {
        // Special case to remove entirely blank line
        ret[0] = '\0';
        return;
    }

    for (  ; i >= 0; --i)
    {
        if (ret[i] == ' ' || ret[i] == '\t')
            ret[i] = '\0';
        else if (ret[i] != '\0' && ret[i] != '\r' && ret[i] != '\n')
            break;
    }

    for (i = 0; i < MAXLINE; ++i)
    {
        if (ret[i] == '\n')
        {
            break;
        }
        else if (ret[i] == '\0')
        {
            ret[i] = '\n';
            ret[i + 1] = '\0';
            break;
        }
    }
}

编辑:我很感谢我在这里看到的所有有用的提示。 我想提醒大家,我在 C 方面仍然是个菜鸟,特别是还没有达到指导的水平。 (记住 K&R 的第 1 章的内容——第 1 章不做指针。)我“有点”得到了其中一些解决方案,但它们仍然比我所处的位置先进。 。

我正在寻找的大部分内容是修剪方法本身 - 特别是我循环了 3 次(感觉很脏) 我觉得如果我再聪明一点(即使没有 C 语言的高级知识),这可能会更干净。

I'm working on the K&R book. I've read farther ahead than I've done exercises, mostly for lack of time. I'm catching up, and have done almost all the exercises from chapter 1, which is the tutorial.

My issue was exercise 1-18. The exercise is to:

Write a program to remove trailing blanks and
tabs from line of input, and to delete entirely blank lines

My code (below) does that, and works. My problem with it is the trim method I implemented. It feels ... wrong ... somehow. Like if I saw similar code in C# in a code review, I'd probably go nuts. (C# being one of my specialties.)

Can anyone offer some advice on cleaning this up -- with the catch that said advice has to only use knowledge from Chapter 1 of K & R. (I know there are a zillion ways to clean this up using the full C library; we're just talking Chapter 1 and basic stdio.h here.) Also, when giving the advice, can you explain why it will help? (I am, after all, trying to learn! And who better to learn from than the experts here?)

#include <stdio.h>

#define MAXLINE 1000

int getline(char line[], int max);
void trim(char line[], char ret[]);

int main()
{
    char line[MAXLINE];
    char out[MAXLINE];
    int length;

    while ((length = getline(line, MAXLINE)) > 0)
    {
        trim(line, out);
        printf("%s", out);
    }

    return 0;
}

int getline(char line[], int max)
{
    int c, i;

    for (i = 0; i < max - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
        line[i] = c;

    if (c == '\n')
    {
        line[i] = c;
        ++i;
    }

    line[i] = '\0'; 
    return i;
}

void trim(char line[], char ret[])
{
    int i = 0;

    while ((ret[i] = line[i]) != '\0')
        ++i;

    if (i == 1)
    {
        // Special case to remove entirely blank line
        ret[0] = '\0';
        return;
    }

    for (  ; i >= 0; --i)
    {
        if (ret[i] == ' ' || ret[i] == '\t')
            ret[i] = '\0';
        else if (ret[i] != '\0' && ret[i] != '\r' && ret[i] != '\n')
            break;
    }

    for (i = 0; i < MAXLINE; ++i)
    {
        if (ret[i] == '\n')
        {
            break;
        }
        else if (ret[i] == '\0')
        {
            ret[i] = '\n';
            ret[i + 1] = '\0';
            break;
        }
    }
}

EDIT: I appreciate all the helpful tips I'm seeing here. I would like to remind folks that I'm still a n00b with C, and specifically haven't gotten up to pointers yet. (Remember the bit about Ch.1 of K&R -- Ch.1 doesn't do pointers.) I "kinda" get some of those solutions, but they're still a touch advanced for where I'm at ...

And most of what I'm looking for is the trim method itself -- specifically the fact that I'm looping through 3 times (which feels so dirty). I feel like if I were just a touch more clever (even without the advanced knowledge of C), this could have been cleaner.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(9

毁梦 2024-07-14 07:06:03

就我个人而言,我会将这样的代码放入

ret[i] != '\0' && ret[i] != '\r' && ret[i] != '\n'

一个单独的函数(甚至定义宏)中

Personally I'd put code like this:

ret[i] != '\0' && ret[i] != '\r' && ret[i] != '\n'

into a separate function (or even a define macro)

小情绪 2024-07-14 07:06:03
  1. 修剪确实应该只使用 1 个缓冲区(如 @Ferruccio 所说)。
  2. 修剪需要被分解,因为@plinth说
  3. 修剪不需要返回任何值(如果你想检查空字符串,测试行[0] == 0)
  4. 以获得额外的C风格,使用指针而不是索引

-转到结束行数(终止于 0;
- 如果不在行首并且当前字符是空格,则将其替换为 0。
- 后退一个字符

char *findEndOfString(char *string) {
  while (*string) ++string;
  return string; // string is now pointing to the terminating 0
}

void trim(char *line) {
  char *end = findEndOfString(line);
   // note that we start at the first real character, not at terminating 0
  for (end = end-1; end >= line; end--) {
      if (isWhitespace(*end)) *end = 0;
      else return;
  }
}
  1. trim should indeed use 1 buffer only (as @Ferruccio says).
  2. trim needs to be broken up, as @plinth says
  3. trim needs not return any value (if you want to check for empty string, test line[0] == 0)
  4. for extra C flavor, use pointers rather than indexes

-go to end of line (terminating 0;
-while not at the start of line and current character is space, replace it with 0.
-back off one char

char *findEndOfString(char *string) {
  while (*string) ++string;
  return string; // string is now pointing to the terminating 0
}

void trim(char *line) {
  char *end = findEndOfString(line);
   // note that we start at the first real character, not at terminating 0
  for (end = end-1; end >= line; end--) {
      if (isWhitespace(*end)) *end = 0;
      else return;
  }
}
谎言月老 2024-07-14 07:06:03

做同样事情的另一个例子。 使用 C99 特定的东西造成了一些轻微的违规。 这在 K&R 中是找不到的。 还使用了assert()函数,它是标准库的一部分,但可能没有在K&R的第一章中介绍。

#include <stdbool.h> /* needed when using bool, false and true. C99 specific. */
#include <assert.h> /* needed for calling assert() */

typedef enum {
  TAB = '\t',
  BLANK = ' '
} WhiteSpace_e;

typedef enum {
  ENDOFLINE = '\n',
  ENDOFSTRING = '\0'
} EndofLine_e;

bool isWhiteSpace(
  char character
) {
  if ( (BLANK == character) || (TAB == character ) ) {
    return true;
  } else {
    return false;
  }
}

bool isEndOfLine( 
  char character
) {
 if ( (ENDOFLINE == character) || (ENDOFSTRING == character ) ) {
    return true;
  } else {
    return false;
  }
}   

/* remove blanks and tabs (i.e. whitespace) from line-string */
void removeWhiteSpace(
  char string[]
) {
  int i;
  int indexOutput;

  /* copy all non-whitespace character in sequential order from the first to the last.
    whitespace characters are not copied */
  i = 0;
  indexOutput = 0;
  while ( false == isEndOfLine( string[i] ) ) {
    if ( false == isWhiteSpace( string[i] ) ) {
      assert ( indexOutput <= i );
      string[ indexOutput ] = string[ i ];
      indexOutput++;
    }
    i++; /* proceed to next character in the input string */
  }

  assert( isEndOfLine( string[ i ] ) );
  string[ indexOutput ] = ENDOFSTRING;

}

Another example of doing the same thing. Did some minor violation by using C99-specific stuff. that will not be found in K&R. also used the assert() function which is part of the starndard library, but is probably not covered in chapter one of K&R.

#include <stdbool.h> /* needed when using bool, false and true. C99 specific. */
#include <assert.h> /* needed for calling assert() */

typedef enum {
  TAB = '\t',
  BLANK = ' '
} WhiteSpace_e;

typedef enum {
  ENDOFLINE = '\n',
  ENDOFSTRING = '\0'
} EndofLine_e;

bool isWhiteSpace(
  char character
) {
  if ( (BLANK == character) || (TAB == character ) ) {
    return true;
  } else {
    return false;
  }
}

bool isEndOfLine( 
  char character
) {
 if ( (ENDOFLINE == character) || (ENDOFSTRING == character ) ) {
    return true;
  } else {
    return false;
  }
}   

/* remove blanks and tabs (i.e. whitespace) from line-string */
void removeWhiteSpace(
  char string[]
) {
  int i;
  int indexOutput;

  /* copy all non-whitespace character in sequential order from the first to the last.
    whitespace characters are not copied */
  i = 0;
  indexOutput = 0;
  while ( false == isEndOfLine( string[i] ) ) {
    if ( false == isWhiteSpace( string[i] ) ) {
      assert ( indexOutput <= i );
      string[ indexOutput ] = string[ i ];
      indexOutput++;
    }
    i++; /* proceed to next character in the input string */
  }

  assert( isEndOfLine( string[ i ] ) );
  string[ indexOutput ] = ENDOFSTRING;

}
逐鹿 2024-07-14 07:06:03

这是我在不知道第 1 章或 K & 2 中内容的情况下对练习的尝试。 R. 我假设有指针?

#include "stdio.h"

size_t StrLen(const char* s)
{
    // this will crash if you pass NULL
    size_t l = 0;
    const char* p = s;
    while(*p)
    {
        l++;
        ++p;
    }
    return l;
}

const char* Trim(char* s)
{
    size_t l = StrLen(s);
    if(l < 1)
        return 0;

    char* end = s + l -1;
    while(s < end && (*end == ' ' || *end == '\t'))
    {
        *end = 0;
        --end;
    }

    return s;
}

int Getline(char* out, size_t max)
{
    size_t l = 0;
    char c;
    while(c = getchar())
    {
        ++l;

        if(c == EOF) return 0;
        if(c == '\n') break;

        if(l < max-1)
        {
            out[l-1] = c;
            out[l] = 0;
        }
    }

    return l;
}

#define MAXLINE 1024

int main (int argc, char * const argv[]) 
{
    char line[MAXLINE];
    while (Getline(line, MAXLINE) > 0)
    {
        const char* trimmed = Trim(line);
        if(trimmed)
            printf("|%s|\n", trimmed);

        line[0] = 0;
    }

    return 0;
}

Here's my stab at the exercise without knowing what is in Chapter 1 or K & R. I assume pointers?

#include "stdio.h"

size_t StrLen(const char* s)
{
    // this will crash if you pass NULL
    size_t l = 0;
    const char* p = s;
    while(*p)
    {
        l++;
        ++p;
    }
    return l;
}

const char* Trim(char* s)
{
    size_t l = StrLen(s);
    if(l < 1)
        return 0;

    char* end = s + l -1;
    while(s < end && (*end == ' ' || *end == '\t'))
    {
        *end = 0;
        --end;
    }

    return s;
}

int Getline(char* out, size_t max)
{
    size_t l = 0;
    char c;
    while(c = getchar())
    {
        ++l;

        if(c == EOF) return 0;
        if(c == '\n') break;

        if(l < max-1)
        {
            out[l-1] = c;
            out[l] = 0;
        }
    }

    return l;
}

#define MAXLINE 1024

int main (int argc, char * const argv[]) 
{
    char line[MAXLINE];
    while (Getline(line, MAXLINE) > 0)
    {
        const char* trimmed = Trim(line);
        if(trimmed)
            printf("|%s|\n", trimmed);

        line[0] = 0;
    }

    return 0;
}
悲欢浪云 2024-07-14 07:06:02

如果你坚持阅读第一章,那对我来说看起来相当不错。 从代码审查的角度来看,我建议如下:

在 C 中检查相等性时,始终将常量放在第一位

if (1 == myvar)

这样,您就永远不会意外地执行以下操作:

if (myvar = 1)

在 C# 中您无法摆脱这一点,但它在 C# 中编译得很好C 并能成为真正的魔鬼调试者。

If you are sticking with chapter 1, that looks pretty good to me. Here's what I would recommend from a code-review standpoint:

When checking equality in C, always put the constant first

if (1 == myvar)

That way you will never accidentally do something like this:

if (myvar = 1)

You can't get away with that in C#, but it compiles fine in C and can be a real devil to debug.

思慕 2024-07-14 07:06:02

没有理由有两个缓冲区,您可以就地修剪输入行

int trim(char line[])
{
    int len = 0;
    for (len = 0; line[len] != 0; ++len)
        ;

    while (len > 0 &&
           line[len-1] == ' ' && line[len-1] == '\t' && line[len-1] == '\n')
        line[--len] = 0;

    return len;
}

通过返回行长度,您可以通过测试非零长度行来消除空白行

if (trim(line) != 0)
    printf("%s\n", line);

编辑:您可以使 while 循环更加简单,假设 ASCII 编码。

while (len > 0 && line[len-1] <= ' ')
    line[--len] = 0;

There is no reason to have two buffers, you can trim the input line in place

int trim(char line[])
{
    int len = 0;
    for (len = 0; line[len] != 0; ++len)
        ;

    while (len > 0 &&
           line[len-1] == ' ' && line[len-1] == '\t' && line[len-1] == '\n')
        line[--len] = 0;

    return len;
}

By returning the line length, you can eliminate blank lines by testing for non-zero length lines

if (trim(line) != 0)
    printf("%s\n", line);

EDIT: You can make the while loop even simpler, assuming ASCII encoding.

while (len > 0 && line[len-1] <= ' ')
    line[--len] = 0;
寂寞花火° 2024-07-14 07:06:02

修剪()太大了。

我认为你需要的是一个 strlen-ish 函数(继续写 int stringlength(const char *s))。

然后你需要一个名为 int scanback(const char *s, const char *matches, int start) 的函数,它从 start 开始,只要在 s id 处扫描的字符包含在匹配项中,就向下到 z,返回最后一个索引,其中找到匹配项。

然后,您需要一个名为 int scanfront(const char *s, const char *matches) 的函数,该函数从 0 开始,只要在 s 处扫描的字符包含在匹配项中就向前扫描,返回找到匹配项的最后一个索引。

然后你需要一个名为 int charinstring(char c, const char *s) 的函数,如果 c 包含在 s 中,则该函数返回非零,否则返回 0。

您应该能够根据这些来编写修剪。

trim() is too big.

What I think you need is a strlen-ish function (go ahead and write it int stringlength(const char *s)).

Then you need a function called int scanback(const char *s, const char *matches, int start) which starts at start, goes down to z as long as the character being scanned at s id contained in matches, return the last index where a match is found.

Then you need a function called int scanfront(const char *s, const char *matches) which starts at 0 and scans forward as long as the character being scanned at s is contained in matches, returning the last index where a match is found.

Then you need a function called int charinstring(char c, const char *s) which returns non-zero if c is contained in s, 0 otherwise.

You should be able to write trim in terms of these.

坦然微笑 2024-07-14 07:06:02

就个人而言,对于 while 构造:

我更喜欢以下内容:

while( (ret[i] = line[i]) )
        i++;

to:

while ((ret[i] = line[i]) != '\0')
        ++i;

它们都检查 != 0 但第一个看起来更干净一些。 如果 char 不是 0,则循环体将执行,否则它将跳出循环。

同样,对于“for”语句,虽然在语法上是有效的,但我发现以下内容:

for (  ; i >= 0; --i)

对我来说看起来“奇怪”,而且确实是潜在错误的潜在噩梦解决方案。 如果我正在审查这段代码,它会像一个发光的红色警告一样。 通常,您希望使用 for 循环来迭代已知次数,否则请考虑使用 while 循环。 (一如既往,规则也有例外,但我发现这通常是正确的)。 上面的 for 语句可以变成:

while (i)
{
        if (ret[i] == ' ' || ret[i] == '\t')
        {
            ret[i--] = '\0';
        }
        else if (ret[i] != '\0' && ret[i] != '\r' && ret[i] != '\n')
        {
            break;
        }
}

Personally for while constructs:

I prefer the following:

while( (ret[i] = line[i]) )
        i++;

to:

while ((ret[i] = line[i]) != '\0')
        ++i;

They both check against != 0 but the first looks a little cleaner. If the char is anything other thah 0, then the loop body will execute else it will break out of the loop.

Also for 'for' statements, whilst being syntatically valid, I find that the following:

for (  ; i >= 0; --i)

just looks 'odd' to me and indeed is a potential nightmare solution for potential bugs. If I was reviewing this code, it would be like a glowing red warning like. Typically you want to use for loops for iterating a known number of times, otherwise cosider a while loop. (as always there are exceptions to the rule but Ive found that this generally holds true). The above for statement could become:

while (i)
{
        if (ret[i] == ' ' || ret[i] == '\t')
        {
            ret[i--] = '\0';
        }
        else if (ret[i] != '\0' && ret[i] != '\r' && ret[i] != '\n')
        {
            break;
        }
}
寂寞清仓 2024-07-14 07:06:02

首先:

int main(void)

您知道 main() 的参数。 他们什么都不是。 (或者 argc&argv,但我不认为那是第 1 章的内容。)

在风格上,您可能想尝试 K&R 风格的括号。 它们在垂直空间上更容易:(

void trim(char line[], char ret[])
{
    int i = 0;

    while ((ret[i] = line[i]) != '\0')
        ++i;

    if (i == 1) { // Special case to remove entirely blank line
        ret[0] = '\0';
        return;
    }

    for (; i>=0; --i) { //continue backwards from the end of the line
        if ((ret[i] == ' ') || (ret[i] == '\t')) //remove trailing whitespace
            ret[i] = '\0';

        else if ((ret[i] != '\0') && (ret[i] != '\r') && (ret[i] != '\n')) //...until we hit a word character
            break;
    }

    for (i=0; i<MAXLINE-1; ++i) { //-1 because we might need to add a character to the line
        if (ret[i] == '\n') //break on newline
            break;

        if (ret[i] == '\0') { //line doesn't have a \n -- add it
            ret[i] = '\n';
            ret[i+1] = '\0';
            break;
        }
    }
}

还添加了注释并修复了一个错误。)

一个大问题是 MAXLINE 常量的使用 - main() 专门将它用于 和 < em>输出变量; Trim() 只对它们起作用,不需要使用常量。 您应该将大小作为参数传递,就像在 getline() 中所做的那样。

First of all:

int main(void)

You know the parameters to main(). They're nothing. (Or argc&argv, but I don't think that's Chapter 1 material.)

Stylewise, you might want to try K&R-style brackets. They're much easier on the vertical space:

void trim(char line[], char ret[])
{
    int i = 0;

    while ((ret[i] = line[i]) != '\0')
        ++i;

    if (i == 1) { // Special case to remove entirely blank line
        ret[0] = '\0';
        return;
    }

    for (; i>=0; --i) { //continue backwards from the end of the line
        if ((ret[i] == ' ') || (ret[i] == '\t')) //remove trailing whitespace
            ret[i] = '\0';

        else if ((ret[i] != '\0') && (ret[i] != '\r') && (ret[i] != '\n')) //...until we hit a word character
            break;
    }

    for (i=0; i<MAXLINE-1; ++i) { //-1 because we might need to add a character to the line
        if (ret[i] == '\n') //break on newline
            break;

        if (ret[i] == '\0') { //line doesn't have a \n -- add it
            ret[i] = '\n';
            ret[i+1] = '\0';
            break;
        }
    }
}

(Also added comments and fixed one bug.)

A big issue is the usage of the MAXLINE constant -- main() exclusively uses it for the line and out variables; trim(), which is only working on them doesn't need to use the constant. You should pass the size(s) as a parameter just like you did in getline().

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文