seekg does not go to the beginning of the file

Posted on 2025-02-01 14:44:43

I'm trying to make a random name generator. The problem is that in order to get the count of lines in the file, I have to loop through it.

So when I need to loop through it again in getRandomName() to get a name, the stream has already reached the end of the file.

I tried solving the issue with seekg(0, std::ios::beg) but it doesn't work for some reason.

int getLineCount(std::fstream &names) {
  int count{};
  while (names) {
    std::string name;
    getline(names, name);
    ++count;
  };
  // last line is empty
  return count - 1;
}

std::string getRandomName(std::fstream &names, int lineCount) {
  int randomNum{getRandomNumber(1, lineCount)};
  std::string name;
  names.seekg(1, std::ios::beg); // here i try to go to the beginning but it doesnt work
  for (int i{0}; i < randomNum; ++i) {
    names >> name;
  };
  return name;
};

int main() {
  std::srand(static_cast<unsigned int>(std::time(nullptr)));
  std::rand();
  std::fstream names{"names.txt"};

  int lineCount{getLineCount(names)};
  std::cout << getRandomName(names, lineCount);
}

Comments (4)

昵称有卵用 2025-02-08 14:44:43

The problem is that a file stream treats an attempt to read past the end of the file as an error condition and sets the corresponding state bits, both failbit and eofbit. As long as this state persists, any further operation on the stream fails. You can put the stream back into its normal operating state by clearing the error state with clear(), though; once you do, seekg works as intended.
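As a minimal sketch of that fix: the helper names countLines, rewindStream and firstLineAfterRewind below are hypothetical, and a std::istringstream stands in for names.txt so the snippet runs without a file. The same calls apply to a std::fstream.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Counts lines by reading until getline fails. The final, failing
// getline leaves eofbit and failbit set on the stream.
int countLines(std::istream &in) {
  int count{};
  std::string line;
  while (std::getline(in, line))
    ++count;
  return count;
}

// Rewinds a stream that may be in a failed state (hypothetical helper).
void rewindStream(std::istream &in) {
  in.clear();                  // reset the error state first
  in.seekg(0, std::ios::beg);  // now the seek can succeed
}

// In-memory stand-in for names.txt.
std::string firstLineAfterRewind() {
  std::istringstream names{"Alice\nBob\nCarol\n"};
  int count = countLines(names);
  assert(count == 3);
  assert(!names);              // the stream is unusable at this point
  rewindStream(names);
  std::string line;
  std::getline(names, line);   // reads the first line again
  return line;
}
```

Without the clear() call, the seekg in rewindStream would be a no-op on the failed stream and the following getline would read nothing.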

If you need those lookups frequently, it might be worth buffering the lines in a std::vector<std::string>. Unless you have to handle extremely large data (and thus provoke paging effects), this would be far more efficient. Even with paging, provided enough disk space is available, each lookup still comes out ahead, as you'd have to load at most one memory page back from disk.
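A rough sketch of the buffering idea, under a few assumptions: loadNames, pickRandom and demoPick are hypothetical names, empty lines are skipped (the question mentions an empty last line), and std::mt19937 is used as the generator, as in one of the other answers.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <random>
#include <sstream>
#include <string>
#include <vector>

// Reads every non-empty line of the stream into memory once.
std::vector<std::string> loadNames(std::istream &in) {
  std::vector<std::string> names;
  std::string line;
  while (std::getline(in, line))
    if (!line.empty())
      names.push_back(line);
  return names;
}

// Picks one entry uniformly at random; no file access is needed anymore.
const std::string &pickRandom(const std::vector<std::string> &names,
                              std::mt19937 &gen) {
  assert(!names.empty());
  std::uniform_int_distribution<std::size_t> dist(0, names.size() - 1);
  return names[dist(gen)];
}

// Usage sketch with an in-memory stream standing in for names.txt.
std::string demoPick() {
  std::istringstream file{"Bjarne Stroustrup\n\nB.W.Kernighan\n"};
  const std::vector<std::string> names = loadNames(file);
  std::mt19937 gen{std::random_device{}()};
  return pickRandom(names, gen);
}
```

After the single load, every further lookup is an index into the vector, so the stream position and error state no longer matter.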

If you need the lookup just once, you might get along without the getLineCount function entirely: select a random value from the entire maximum range and count lines until you reach the desired line or the end of the file. If you hit the end of the file first, recalculate the random index based on the number of lines found and iterate over the file again. The larger your file is, the greater the chance that you only need to iterate once, and if you still need to iterate twice, nothing is lost anyway. Note, though, that this approach requires your random number generator to produce uniformly distributed numbers!
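One possible reading of this approach, as a sketch rather than a definitive implementation: randomLineSinglePass, demoSinglePass and the maxGuess parameter are hypothetical, maxGuess is assumed to be an upper bound on the real line count (otherwise lines beyond it can never be picked), and the retry draws a fresh uniform index within the number of lines actually found.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <random>
#include <sstream>
#include <string>

// Returns a random line, or an empty string if the stream has no lines.
// maxGuess is an assumed upper bound on the number of lines.
std::string randomLineSinglePass(std::istream &in, std::size_t maxGuess,
                                 std::mt19937 &gen) {
  assert(maxGuess > 0);
  using Dist = std::uniform_int_distribution<std::size_t>;
  Dist guess(0, maxGuess - 1);
  std::size_t target = guess(gen);

  std::string line;
  std::size_t seen = 0;
  while (std::getline(in, line)) {
    if (seen == target)
      return line;             // the guess was inside the file: one pass only
    ++seen;
  }
  if (seen == 0)
    return {};                 // empty input

  // The guess was past the end: redraw within the real count and rescan.
  Dist retry(0, seen - 1);
  target = retry(gen);
  in.clear();                  // the failed getline set eofbit and failbit
  in.seekg(0, std::ios::beg);
  for (std::size_t i = 0; i <= target; ++i)
    std::getline(in, line);
  return line;
}

// Usage sketch with an in-memory stream standing in for names.txt.
std::string demoSinglePass() {
  std::istringstream file{"Alice\nBob\nCarol\n"};
  std::mt19937 gen{std::random_device{}()};
  return randomLineSinglePass(file, 1000, gen);
}
```

The tighter maxGuess is to the real line count, the more often the first pass succeeds; with a very loose bound the function almost always needs the second pass.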

This would work for multiple calls as well, though the chance of a benefit gets smaller, since with every further call the chance of having read beyond the end of the file at least once increases.

无人接听 2025-02-08 14:44:43

Your getLineCount() function reads through the file with getline() until nothing more can be read. When it reaches the end, an error state is set, including eofbit.

All subsequent operations on the stream will fail, including seekg(0, std::ios::beg);, until you reset the error state with names.clear();.

By the way, looping on getline() itself avoids a failed getline() in the loop body and makes the -1 unnecessary. Another thing you could do is make your function neutral with respect to the read position of the file. It's optional, but it would be more consistent with the name of your function, which suggests that it just gets something, not that it consumes the stream to the end.

int getLineCount(std::fstream &names) {
    int count{};
    std::string name;
    auto old_pos = names.tellg();   // backup current position
    while (getline(names, name))
        ++count;
    names.clear();                  // reset eof error caused by loop
    names.seekg(old_pos);           // restore position (absolute-position overload)
    return count;
}

An unrelated point:

Your random position might lead to inconsistencies if the names on a line can include whitespace, because >> reads whitespace-separated strings and not full lines. E.g. if your file has two lines:

 Bjarne Stroustrup
 B.W.Kernighan

Your random read could return Bjarne or Stroustrup but never B.W.Kernighan, because there are 2 lines but 3 whitespace-separated strings. So it is better to read the random line with getline() as well, counting lines as you go.
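To make the difference concrete, here is a small sketch; nthLine, nthToken and the two demo functions are hypothetical helpers, and a std::istringstream holds the two example lines in place of the file.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Returns the n-th line (1-based), read as a whole line with getline.
std::string nthLine(std::istream &in, int n) {
  assert(n >= 1);
  std::string line;
  for (int i = 0; i < n; ++i)
    if (!std::getline(in, line))
      return {};               // fewer than n lines
  return line;
}

// Returns the n-th whitespace-separated token (1-based), as operator>> sees it.
std::string nthToken(std::istream &in, int n) {
  assert(n >= 1);
  std::string token;
  for (int i = 0; i < n; ++i)
    if (!(in >> token))
      return {};               // fewer than n tokens
  return token;
}

// Second "name" when reading line by line.
std::string demoLine() {
  std::istringstream file{"Bjarne Stroustrup\nB.W.Kernighan\n"};
  return nthLine(file, 2);
}

// Second "name" when reading with operator>>.
std::string demoToken() {
  std::istringstream file{"Bjarne Stroustrup\nB.W.Kernighan\n"};
  return nthToken(file, 2);
}
```

With this input, demoLine() yields the full second line, while demoToken() yields only the second word of the first line, which is the mismatch described above.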

迟到的我 2025-02-08 14:44:43

Your first step should be to add logging (I've also fixed minor issues, like the inconsistent reading of data, 1 instead of 0 in seekg, and so on).

#include <fstream>
#include <iostream>
#include <istream>
#include <random>
#include <string>

#define LOG(x) std::cerr << __LINE__ << " " #x " = "<< x << '\n'

int getRandomNumber(int a, int b)
{
    static std::random_device rd; 
    static std::mt19937 gen(rd());
    std::uniform_int_distribution<int> distrib(a, b);

    return distrib(gen);
}

int getLineCount(std::istream &names) {
  int count{};
  std::string name;
  while (getline(names, name)) {
    ++count;
    LOG(count);
    LOG(names.tellg());
  };
  return count - name.empty();
}

std::string getRandomName(std::istream &names, int lineCount) {
  int randomNum{getRandomNumber(1, lineCount)};
  LOG(randomNum);
  std::string name;
  LOG(names.tellg());
  names.seekg(0, std::ios::beg);
  LOG(names.tellg());
  for (int i{0}; i < randomNum; ++i) {
    getline(names, name);
  };
  return name;
};

int main() {
  std::ifstream names{"names.txt"};
  LOG(names.tellg());

  int lineCount{getLineCount(names)};
  LOG(lineCount);
  std::cout << getRandomName(names, lineCount);
}

This produces this output https://wandbox.org/permlink/EPLHqMBg1s8Awe9F :

44 names.tellg() = 0
23 count = 1
24 names.tellg() = 13
23 count = 2
24 names.tellg() = 27
23 count = 3
24 names.tellg() = 40
23 count = 4
24 names.tellg() = 53
23 count = 5
24 names.tellg() = -1
47 lineCount = 5
31 randomNum = 4
33 names.tellg() = -1
35 names.tellg() = -1

A tellg() result of -1 indicates that the stream is in an error state.

This is expected: you have read the file to the end, so there was an attempt to read beyond the end of the file and the error flags were set.

Once an error flag is set, the stream is unusable until the flag is cleared. So just adding names.clear(); in the proper place fixes the issue: https://wandbox.org/permlink/Us6b3Jw3v6JFwlpX

傲鸠 2025-02-08 14:44:43

In your getLineCount() function use while (names.peek() != EOF) instead.
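A sketch of what that loop could look like; countLinesWithPeek and demoPeekThenRewind are hypothetical names, and std::istream::traits_type::eof() is used here in place of the EOF macro, which is equivalent for char streams. Because the loop stops before attempting a read at the end, the last getline does not fail. The demo still calls clear() before seeking, which is harmless and avoids depending on how a set eofbit is treated.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Counts lines by checking for end of input before each read.
int countLinesWithPeek(std::istream &in) {
  int count{};
  std::string line;
  while (in.peek() != std::istream::traits_type::eof()) {
    std::getline(in, line);
    ++count;
  }
  return count;
}

// Counts the lines, rewinds, and returns the last line.
// An in-memory stream stands in for names.txt.
std::string demoPeekThenRewind() {
  std::istringstream names{"Alice\nBob\n"};
  int count = countLinesWithPeek(names);
  names.clear();                 // drop eofbit set by the final peek
  names.seekg(0, std::ios::beg);
  std::string line;
  for (int i = 0; i < count; ++i)
    std::getline(names, line);
  return line;
}
```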
