Different string lengths when using strtok
void redact_words(const char *text_filename, const char *redact_words_filename){
    FILE *fp = fopen(text_filename,"r");
    FILE *f2p = fopen(redact_words_filename,"r");
    FILE *f3p = fopen("result.txt", "w");
    char buffer1[1000];
    char buffer2[1000];
    char *word;
    char *redact;
    char **the_words;
    //if ((fgets(buffer1, 1000 ,fp) == NULL) || (fgets(buffer2,1000 ,f2p) == NULL))
    fgets(buffer1,1000,fp);
    fgets(buffer2,1000,f2p);
    rewind(fp);
    rewind(f2p);
    int word_count = 0;
    while (!feof(f2p)){
        char c = fgetc(f2p);
        if (c == ' '){
            word_count += 1;
        }
    }
    word_count += 1;
    the_words = malloc(3 * sizeof(char*));
    redact = strtok(buffer2, ", ");
    for (int i = 0; i < word_count; i++){
        the_words[i] = malloc(100);
        the_words[i] = redact;
        redact = strtok(NULL, ", ");
    }
    char result[256] = "";
    word = strtok(buffer1, " ");
    while (word != NULL){
        for (int i = 0; i < word_count; i++){
            if (strcasecmp(the_words[i],word) == 0){
                for (int i = 0; i < strlen(word); i++){
                    strcat(result,"*");
                }
                strcat(result, " ");
                break;
            }
            else{
                if (i==(word_count-1)){
                    strcat(result, word);
                    strcat(result, " ");
                }
            }
        }
        word = strtok(NULL," ");
    }
    fputs(result, f3p);
    fclose(fp);
    fclose(f2p);
    fclose(f3p);
    free(the_words);
}
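This is my C code to replace words in the file named text_filename with asterisks if the word also appears in the file named redact_words_filename. However, I noticed a problem in the comparison of the two strings
    if (strcasecmp(the_words[i],word) == 0){
        for (int i = 0; i < strlen(word); i++){
            strcat(result,"*");
        }
When the word quick, for example, appears in both text files, the_words[i] holds a string of length 6 while word holds a string of length 5. Both appear to contain quick, yet they do not compare as equal. Why is one of the strings longer than the other?
(P.S. I apologise for the bad code quality.)
Edit 1: OK, I found out it has to do with the \n at the end of every line. Trying to find a way to solve this.
Edit 2: I managed to get rid of the \n with a simple for loop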
    for (int i = 0; i < word_count; i++){
        the_words[i] = malloc(100);
        the_words[i] = redact;
        for (int j = 0; j < strlen(redact); j++){
            if (redact[j] == '\n'){
                redact[j] = '\0';
            }
        }
        redact = strtok(NULL, ", ");
    }
Comments (1)
Two obvious problems just here:

First, you allocate space for 3 pointers in the_words, but then you go and put word_count words into it. So if word_count > 3, you'll overflow and get undefined behavior.

Second, for each word you call malloc(100) and then immediately overwrite the returned pointer with redact, so the allocated memory is leaked and the_words[i] just points into buffer2. Use the_words[i] = strdup(redact); to both allocate the right amount of memory, and copy the string into the allocated memory.