Dynamically selecting a specific column from a CSV file

Posted on 2025-01-23 15:53:27

I have this CSV file:

id,name,mark
20203923380,Lisa Hatfield,62
20200705173,Jessica Johnson,59
20205415333,Adam Harper,41
20203326467,Logan Nolan,77

And I'm trying to process it with this code:

try (Stream<String> stream = Files.lines(Paths.get(String.valueOf(csvPath)))) {
    DoubleSummaryStatistics statistics = stream
            .map(s -> s.split(",")[index]).skip(1)
            .mapToDouble(Double::valueOf)
            .summaryStatistics();
} catch (IOException e) {
    // more code
}

I want to get the column by its name.

I guess I need to validate the index to be the index of the column the user enters as an integer, like this:

int index = Arrays.stream(stream).indexOf(columnNS);

But it doesn't work.

The stream is supposed to have the following values, for example:

Column: "mark"

62, 59, 41, 77


成熟的代价 2025-01-30 15:53:27

I need to validate the index to be the index of the column the user enters as an integer ... But it doesn't work.

Arrays.stream(stream).indexOf(columnNS)

There is no indexOf method in the Stream API, and Arrays.stream() expects an array, not a Stream, so I'm not sure what you meant by stream(stream). Either way, this approach is wrong.

To obtain a valid index, you need the name of the column, and based on that name you have to analyze the very first line retrieved from the file. As in your example with the column name "mark", you need to find out whether this name is present in the first row and what its index is.
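
As a small illustration of that header lookup, here is a minimal sketch. The class name HeaderIndexDemo and the helper columnIndex are made up for this example, and it assumes a plain comma-separated header with no quoted fields:

```java
import java.util.Arrays;

class HeaderIndexDemo {
    // Hypothetical helper: returns the zero-based position of columnName
    // in the header line, or -1 if the header has no such column.
    static int columnIndex(String headerLine, String columnName) {
        return Arrays.asList(headerLine.split(",")).indexOf(columnName);
    }

    public static void main(String[] args) {
        String header = "id,name,mark"; // first line of the sample file
        System.out.println(columnIndex(header, "mark"));  // prints 2
        System.out.println(columnIndex(header, "grade")); // prints -1
    }
}
```

Note that indexOf here is a method of List, which is why the header array is wrapped with Arrays.asList first.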

What I want is to get the column by its name ... The stream is supposed ...

Stream operations are intended to be stateless. Streams were introduced in Java to provide an expressive and clear way of structuring code. Even if you manage to cram stateful conditional logic into a stream, you'll lose this advantage and end up with convoluted code that is less clear and less performant than a plain loop (reminder: an iterative solution almost always performs better).
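
To make the point concrete, this is roughly what cramming the header detection into the stream might look like. It is shown only as something to avoid, not as a recommendation. The class name StatefulStreamDemo and the method stats are invented for this sketch, and it takes the lines as a List instead of reading a file:

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;

class StatefulStreamDemo {
    // Anti-pattern, for illustration only: both lambdas share mutable state.
    static DoubleSummaryStatistics stats(List<String> lines, String columnName) {
        int[] index = {-1}; // mutable holder captured by the lambdas
        return lines.stream()
                .filter(line -> {
                    if (index[0] == -1) {
                        // side effect: the first line seen is treated as the header
                        index[0] = Arrays.asList(line.split(",")).indexOf(columnName);
                        return false; // drop the header line
                    }
                    return true;
                })
                .mapToDouble(line -> Double.parseDouble(line.split(",")[index[0]]))
                .summaryStatistics();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "id,name,mark",
                "20203923380,Lisa Hatfield,62",
                "20200705173,Jessica Johnson,59");
        System.out.println(stats(lines, "mark").getAverage()); // prints 60.5
    }
}
```

This only works for a sequential stream, because it relies on the header being the first element the filter sees. It also fails silently: if the column name is missing, index stays at -1, every line is dropped as a "header", and the result is an empty statistics object.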

So if you want to keep your code clean, you can either solve this problem with an iterative approach or relinquish the requirement to determine the column index dynamically inside the stream.

Here is how you can read the file data dynamically based on the column name using a loop:

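import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;

// the class wrapper and its name are added only so that the snippet compiles on its own
public class Main {

    // collects the values of the column named columnName, skipping the header line
    public static List<String> readFile(Path path, String columnName) {
        List<String> result = new ArrayList<>();
        try (var reader = Files.newBufferedReader(path)) {
            int index = -1;
            String line;
            while ((line = reader.readLine()) != null) {
                // split on the comma only: splitting on any punctuation would also
                // break values such as "62.5" or names containing an apostrophe
                String[] arr = line.split(",");
                if (index == -1) {
                    index = getIndex(arr, columnName);
                    continue; // skipping the first line
                }
                result.add(arr[index]);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return result;
    }

    // validation logic resides here
    public static int getIndex(String[] arr, String columnName) {
        int index = Arrays.asList(arr).indexOf(columnName);
        if (index == -1) {
            throw new IllegalArgumentException("Given column name '" + columnName + "' wasn't found");
        }
        return index;
    }

    // extracting statistics from the file data
    public static DoubleSummaryStatistics getStat(List<String> list) {
        return list.stream()
            .mapToDouble(Double::parseDouble)
            .summaryStatistics();
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics stat = getStat(readFile(Path.of("test.txt"), "mark"));
        System.out.println(stat);
    }
}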
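
The other option mentioned above, giving up on resolving the index inside the stream, could look roughly like the sketch below: the header is read and validated first, and only the remaining lines are streamed. The class name HeaderFirstStats and the method columnStats are assumptions made for this example, as is the comma-only format without quoted fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;

class HeaderFirstStats {
    // Hypothetical helper: resolves the column index from the header line,
    // then streams the rest of the file with a stateless pipeline.
    static DoubleSummaryStatistics columnStats(Path path, String columnName) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(path)) {
            String header = reader.readLine(); // consume the header before streaming
            if (header == null) {
                throw new IllegalArgumentException("File is empty: " + path);
            }
            int index = Arrays.asList(header.split(",")).indexOf(columnName);
            if (index == -1) {
                throw new IllegalArgumentException("Given column name '" + columnName + "' wasn't found");
            }
            // reader.lines() continues from the current position, i.e. after the header
            return reader.lines()
                    .mapToDouble(line -> Double.parseDouble(line.split(",")[index]))
                    .summaryStatistics();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("marks", ".csv"); // temporary file for the demo
        Files.write(file, List.of(
                "id,name,mark",
                "20203923380,Lisa Hatfield,62",
                "20200705173,Jessica Johnson,59",
                "20205415333,Adam Harper,41",
                "20203326467,Logan Nolan,77"));
        System.out.println(columnStats(file, "mark").getAverage()); // prints 59.75
        Files.deleteIfExists(file);
    }
}
```

Because index is assigned once before the pipeline is built, the lambda captures an effectively final value and no skip(1) or shared mutable state is needed.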
