Segmentation fault on the nth iteration through a std::map

Posted 2024-11-25 06:45:25

Solution: See Bo Persson's post and my comment below.

I am getting a segmentation fault with my map. What confuses me is that the first n-1 iterations over the keys work, but the nth iteration seg faults. To add to the confusion, the key that the iteration seg faults on is found (with an iterator) in the previous iteration and even earlier in the code.

I tried to profile the seg fault with valgrind, but I get an endless stream of the message "signal 11 being dropped from thread 0", so valgrind is not very useful here.

The map that is seg faulting is called site_depths. Here is how the values are inserted:

map<string,unsigned short*> site_depths;
map<string,unsigned int>::iterator it;
map<string,unsigned short*>::iterator insert_it;
unsigned int size = 0;
string key = "";

// go through each key pair of CHROM_SIZES to build the site_depth map
for (it=CHROM_SIZES.begin(); it != CHROM_SIZES.end(); it++) {
    key = it->first;
    size = it->second;

    unsigned short *array = new unsigned short[size];

    insert_it = site_depths.end();
    site_depths.insert(insert_it, pair<string,unsigned short*>(key,array));
}

I've checked that every key and value is added: both the key and the size print to the console.

Immediately after, I test whether find() and [] access work on the key that seg faults (this also works):

cout << "schill found: " << site_depths.find("lcl_NM_000999")->first << endl;
unsigned short* test_array = site_depths["lcl_NM_000999"];

Then, when I parse the text file, it seg faults on find() or, if I comment that out, on the [] access:

string line;
string chromosome;
unsigned int start;
unsigned int end;
unsigned int i;
char* values[3];
unsigned short* sites;
map<string,unsigned short*>::iterator iter_end = site_depths.end();

while (getline(in,line)) {
    //use C strtok to tokenize the line
    char cstr[line.size()+1];
    strcpy(cstr,line.c_str());

    char *pch = strtok(cstr, "  ");

    // tokenize three columns
    for (i=0; i<3 || pch != NULL; i++) {
        values[i] = pch;
        pch = strtok(NULL, "    ");
    }

    chromosome = values[0];
    start = atoi(values[1])-1;  //must subtract 1 to correspond to 0 index
    end = atoi(values[2])-1;

    // get appropriate array pointer
    if (site_depths.find(chromosome) == iter_end) {
        cerr << "WARNING: Chromosome name in Input file does not match .len file." << endl;
        cerr << " Exiting script." << endl;
        exit(EXIT_FAILURE);
    }
    sites = site_depths[chromosome];

    // increment over range
    for (i=start; i<end; i++) {
        sites[i]++;
    }
}

The case where it segfaults is on key "lcl_NM_000998", trying to find key "lcl_NM_000999". This doesn't make sense as the previous getline() iteration finds the key value "lcl_NM_000998". I've checked to be sure this is the case by manually iterating through the map.

I've also checked whether earlier code, such as the tokenizing, is what seg faults, but it looks fine. My code always segfaults at this location in my test case. Does anyone have any ideas?

Comments (1)

This isn't just a test for presence

unsigned short* test_array = site_depths["lcl_NM_000999"];

but, when the key is not already present, also inserts a node into site_depths with a null pointer in the second member.
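To make that side effect concrete, here is a minimal sketch contrasting find() with operator[]. The alias DepthMap and the helpers lookup and growth_after_subscript are hypothetical names introduced only for this illustration; they are not part of the question's code.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical alias matching the map type used in the question.
using DepthMap = std::map<std::string, unsigned short*>;

// Look up a key without modifying the map.
// Returns nullptr when the key is absent.
unsigned short* lookup(const DepthMap& depths, const std::string& key) {
    DepthMap::const_iterator it = depths.find(key);
    return it == depths.end() ? nullptr : it->second;
}

// Reports how many entries operator[] added for this key:
// 1 if the key was missing (a null pointer is stored), 0 if it existed.
std::size_t growth_after_subscript(DepthMap& depths, const std::string& key) {
    const std::size_t before = depths.size();
    unsigned short* value = depths[key];  // inserts a null pointer if key is missing
    (void)value;                          // only the side effect matters here
    return depths.size() - before;
}
```

Because lookup takes the map by const reference, it cannot insert by accident, so a misspelled or unexpected key shows up as a nullptr result rather than as a silently added entry that is dereferenced later.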

The code also trusts that start and end are always within range of the array size given by size. Wouldn't hurt to validate that!
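One way such validation might look is sketched below. It is not the asker's code: Interval, parse_interval and increment_range are hypothetical names, std::istringstream is swapped in for strtok, and a std::vector stands in for the new[] array so that the size travels with the data. It is also worth noting that the question's tokenizing loop uses `i<3 || pch != NULL`, which keeps running while either condition holds, so a line with more or fewer than three tokens is not handled safely either; the sketch rejects such lines up front.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one input line: a name plus a half-open,
// 0-based range [start, end), mirroring the question's "-1" adjustment.
struct Interval {
    std::string chromosome;
    std::size_t start;
    std::size_t end;
};

// Parse "name start end" separated by whitespace (tabs or spaces).
// Returns false, leaving `out` untouched, if a field is missing or
// not numeric, if start < 1, or if end < start.
bool parse_interval(const std::string& line, Interval& out) {
    std::istringstream fields(line);
    std::string name;
    long start = 0;
    long end = 0;
    if (!(fields >> name >> start >> end)) return false;
    if (start < 1 || end < start) return false;
    out.chromosome = name;
    out.start = static_cast<std::size_t>(start - 1);
    out.end = static_cast<std::size_t>(end - 1);
    return true;
}

// Increment depths[start, end) only if the whole range fits in the array.
// Returns false and changes nothing when the range is out of bounds.
bool increment_range(std::vector<unsigned short>& depths,
                     std::size_t start, std::size_t end) {
    if (start > end || end > depths.size()) return false;
    for (std::size_t i = start; i < end; ++i) {
        ++depths[i];
    }
    return true;
}
```

In the parsing loop, a false return from either function could be reported with the offending line, which would point directly at the input record that runs past the array instead of crashing somewhere later.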
