What is an efficient way to write a program in C (or just an algorithm) that searches for and replaces characters in a file?

The user will provide two strings at run time, such as "asdf" and "qwer". Every occurrence of 'a' should then be replaced by 'q', 's' by 'w', 'd' by 'e', and 'f' by 'r'. The length of the strings may vary.

The catch is that the file to be operated on is huge, 3-4 terabytes, so we need a program that runs in O(n) or O(n log n) time; a sequence of if...else won't help.

The hints given are:
1. The file has no special characters or white space. It consists only of lowercase characters.
2. The program should use the fact that there are only 26 distinct characters in the file.
3. Finally, the solution somehow uses the ASCII values of the characters.

Additional details

The file is supposed to be a thesis on a person, so it's not a sequence.
And yes, we have to read the whole file sequentially. The only thing that should not be done is a comparison for every character, that is, if(a)then(q)elseif(s)then(w)... Is there something more efficient?

Please help.
Comments (2)
Create an array of 26 characters at the beginning of the program, one entry per lowercase letter, each initially mapping to itself. Then overwrite the entries you want to replace. Then pass through the whole file, replacing every character with its table value.

I've skipped the reading and writing of the file, as well as the chunking, since the file is huge.
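
A minimal sketch of the table idea in C, not the answerer's own code. The function names `build_table`, `translate`, and `translate_stream`, the 64 KiB chunk size, and the choice of a 256-entry table are all assumptions made for illustration. The table is widened from the 26 entries described above to 256 so that any stray byte (a trailing newline, for instance) maps to itself without a bounds check.

```c
#include <stdio.h>
#include <string.h>

/* Build a translation table: identity for every byte value, then
 * from[i] -> to[i] for each position of the two user strings.
 * Returns 0 on success, -1 if the strings differ in length. */
int build_table(unsigned char table[256], const char *from, const char *to)
{
    size_t n = strlen(from);
    if (n != strlen(to))
        return -1;
    for (int c = 0; c < 256; c++)
        table[c] = (unsigned char)c;
    for (size_t i = 0; i < n; i++)
        table[(unsigned char)from[i]] = (unsigned char)to[i];
    return 0;
}

/* Translate a buffer in place: one array lookup per byte, using the
 * character's numeric (ASCII) value as the index. No if/else chain. */
void translate(unsigned char *buf, size_t len, const unsigned char table[256])
{
    for (size_t i = 0; i < len; i++)
        buf[i] = table[buf[i]];
}

/* Hypothetical chunked driver: read a block, translate it, write it out.
 * Returns 0 on success, -1 on a read or write error. */
int translate_stream(FILE *in, FILE *out, const unsigned char table[256])
{
    unsigned char buf[1 << 16]; /* 64 KiB chunk; the size is an arbitrary choice */
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        translate(buf, n, table);
        if (fwrite(buf, 1, n, out) != n)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

Building the table costs time proportional to the length of the two strings, and each byte of the file is then handled with a single lookup, so the whole pass is linear in the file size. Because every character is replaced by exactly one character, the output has the same length as the input, and memory use stays at one chunk regardless of file size.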
You'd start by searching for the first character of the 'to be replaced' string. Once you find an instance, you work through the 'to be replaced' string, checking each subsequent character; if a complete match is found, you make the replacement.

If the strings are not always the same length, you are going to need to read the file in and write the modified file out. I'd suggest doing this in chunks, unless you can hold 4 TB in memory.

The basic pseudo code would be:
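
One way the search described in this answer might look in C, as a rough sketch rather than the answerer's own pseudocode. Note that it treats the two strings as whole substrings ("asdf" becomes "qwer" only where all four letters appear together), which is this answer's reading and differs from the per-character mapping in the question. The function name `replace_all` and its buffer-based interface are assumptions, and matches that straddle a chunk boundary are not handled here.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst, replacing each non-overlapping occurrence of `from`
 * with `to`. Returns the length written (excluding the terminating NUL),
 * or (size_t)-1 if `from` is empty or dst (capacity `cap`) is too small. */
size_t replace_all(const char *src, const char *from, const char *to,
                   char *dst, size_t cap)
{
    size_t flen = strlen(from), tlen = strlen(to), out = 0;

    if (flen == 0 || cap == 0)
        return (size_t)-1;

    while (*src) {
        /* Cheap first-character test, then check the rest of the pattern. */
        if (*src == from[0] && strncmp(src, from, flen) == 0) {
            if (out + tlen >= cap)      /* leave room for the NUL */
                return (size_t)-1;
            memcpy(dst + out, to, tlen);
            out += tlen;
            src += flen;                /* skip past the matched text */
        } else {
            if (out + 1 >= cap)
                return (size_t)-1;
            dst[out++] = *src++;        /* no match: copy one character */
        }
    }
    dst[out] = '\0';
    return out;
}
```

When the two strings differ in length the output size changes, which is why this approach has to write a new file instead of editing in place.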