How to count assignment operators in a text file?
My task is to create a program in C++ that processes a text file in sequential mode. The data must be read from the file one line at a time; the entire contents of the file must not be loaded into RAM. The text file contains syntactically correct C++ code, and I have to count how many assignment operators it contains.
The only thing I could think of was writing a function that searches for a pattern and counts how many times it appears. I pass every assignment operator as a pattern and then sum all the counts. But this does not work, because if I search for the pattern "=", operators such as "%=" or "+=" are counted a second time. Even operators like "!=" or "==" get counted, although they shouldn't be, because they are comparison operators.
My code gives the answer 7, but the real answer should be 5.
#include <iostream>
#include <fstream>
using namespace std;
int patternCounting(string pattern, string text){
    int x = pattern.size();
    int y = text.size();
    int rez = 0;
    for(int i=0; i<=y-x; i++){
        int j;
        for(j=0; j<x; j++)
            if(text[i+j] !=pattern[j]) break;
        if(j==x) rez++;
    }
    return rez;
}
int main()
{
    fstream file ("test.txt", ios::in);
    string rinda;
    int skaits=0;
    if(!file){cout<<"Nav faila!"<<endl; return 47;}
    while(file.good()){
        getline(file, rinda);
        skaits+=patternCounting("=",rinda);
        skaits+=patternCounting("+=",rinda);
        skaits+=patternCounting("*=",rinda);
        skaits+=patternCounting("-=",rinda);
        skaits+=patternCounting("/=",rinda);
        skaits+=patternCounting("%=",rinda);
    }
    cout<<skaits<<endl;
    return 0;
}
Contents of the text file:
#include <iostream>
using namespace std;
int main()
{
    int z=3;
    int x=4;
    for(int i=3; i<3; i++){
        int f+=x;
        float g%=3;
    }
}
Note that as a torture test, the following code has 0 assignments on older C++ standards and one on newer ones, due to the abolition of trigraphs.
// = Torture test
int a = 0; int b = 1;
int main()
{
    // The next line is part of this comment until C++17 ??/
    a = b;
    struct S
    {
        virtual void foo() = 0;
        void foo(int, int x = 1);
        S& operator=(const S&) = delete;
        int m = '==';
        char c = '=';
    };
    const char* s = [=]{return "=";}();
    sizeof(a = b);
    decltype(a = b) c(a);
}
Comments (3)
There are multiple issues with the code.
The first, rather mundane issue is your handling of file reading. A loop such as while (file.good()) … is virtually always an error: you need to test the return value of getline instead!
Next, your patternCounting function fundamentally won't work, since it doesn't account for comments and strings (nor any of C++'s other peculiarities, but these seem to be out of scope for your assignment). It also doesn't really make sense to count the different assignment operators separately.
The third issue is that your test case misses lots of edge cases (and is invalid C++). Here's a better test case that (I think) exercises all interesting edge cases from your assignment:
I've annotated each line with a comment indicating how many occurrences we should have counted up to that point.
Now that we have a test case, we can start implementing the actual counting logic. Note that, for readability, function names generally follow the pattern "verb subject". So instead of patternCounting, a better function name would be countPattern. But we won't count arbitrary patterns, we will count assignments. So I'll use countAssignments (or, using my preferred C++ naming convention: count_assignments).
Now, what does this function need to do? It needs to count assignments while discounting occurrences of = that are not assignments, such as those inside comments, strings, and comparison operators.
Without a dedicated C++ parser, that's a rather tall order! You will need to implement a rudimentary lexical analyser (short: lexer) for C++.
First off, you will need to represent each of the situations we care about with its own state:
With this in hand, we can start writing the outline of the count_assignments function:
As you can see, we iterate over the characters of the string (for (c : str)). Next, we handle each state we could currently be in.
The prev_char is necessary because some of our lexical tokens are more than one character in length (e.g. comments start with //, but /= is an assignment that we want to count!). This is a bit of a hack; for a real lexer I would split such cases into distinct states.
So much for the function skeleton. Now we need to implement the actual logic, i.e. we need to decide what to do depending on the current (and previous) character and the current state.
To get you started, here's the case state::start:
Be very careful: the above will over-count the comparison ==, so we will need to adjust that count once we're inside case state::comparison and see that the current and previous character are both =.
I'll let you take a stab at the rest of the implementation.
Note that, unlike your initial attempt, this implementation doesn't distinguish the separate assignment operators (=, +=, etc.) because there's no need to do so: they're all counted automatically.

The clang compiler has a feature to dump the syntax tree (also called AST). If you have syntactically correct C++ code (which you don't have), you can count the number of assignment operators, for example with the following command line (on a unixoid OS):
Note, however, that this will only match real assignments, not copy initializations, which can also use the = character but are syntactically different (for example, an overloaded = operator is not called in that case).
If you want to count compound assignments and/or copy initializations as well, you can look for the corresponding lines in the AST output and add them to the egrep search pattern.

In practice, your task is incredibly difficult.
Think for example of C++ raw string literals (you could have one spanning dozens of source lines, with arbitrary = inside them). Or of asm statements doing some addition...
Think also of increment operators like (for some declared int x;) x++ (which is equivalent to x = x+1; for a simple variable, and semantically is an assignment, but not syntactically).
My suggestion: choose one open source C++ compiler. I happen to know GCC internals. With GCC, you can write your own GCC plugin which would count the number of Gimple assignments.
Think also of Quine programs coded in C++...
NB: budget months of work.
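To make the raw string literal point concrete, here is a small sketch. The names raw_text and naive_equals_count are invented for this illustration. The literal spans three source lines, and a character-based search sees four = characters inside it, none of which is an operator.

```cpp
#include <string>

// A raw string literal spanning several source lines. Every '=' between the
// parentheses is plain text; the only real '=' token is the initializer.
const std::string raw_text = R"(a = b;
c += d;
e == f;)";

// Counts '=' characters the way a plain pattern search would.
int naive_equals_count(const std::string& text) {
    int count = 0;
    for (char c : text)
        if (c == '=') ++count;
    return count;
}
```

A line-by-line scanner has no way to know, from the second or third source line alone, that it is still inside the literal; it has to remember the opening R"( and its delimiter from an earlier line.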