How can I record frequencies faster with a HashMap in Java?
I'm working on a decision tree, and part of the algorithm counts how often each string occurs in a file. The file has 30,000 cases and is 1.68 MB in size.
I tried using a HashMap for this. In my main algorithm code, the replace method runs about 900 million times and takes about 30 seconds. Is there any way to make it faster?
Below is a simplified version of my main algorithm code; it takes about 10 seconds.
    Map<String, Integer> classesCount = new HashMap<>();
    int target = 900000000;
    classesCount.put("a", 0);
    classesCount.put("b", 0);
    for (int i = 0; i < target; i++) {
        if (i % 2 == 0) {
            classesCount.replace("a", classesCount.get("a") + 1);
        }
        else {
            classesCount.replace("b", classesCount.get("b") + 1);
        }
    }
To make my actual code clearer: I have a class Value, and an array of Value in my main method. This is the Value class:
    public class Value<T extends Comparable<T>> implements Comparable<Value<T>> {
        public T value;
        public String result;
        public Value(T value, String result) {
            this.value = value;
            this.result = result;
        }
        public int compareTo(Value<T> v) {
            return value.compareTo(v.value);
        }
    }
This is the main method code. Assume arrayOfValue already has many elements and every Value's result is either "a" or "b":
    Map<String, Integer> classesCountA = new HashMap<>();
    Map<String, Integer> classesCountB = new HashMap<>();
    Value[] arrayOfValue = new Value[];
    int splitIndex = 55;
    classesCountA.put("a", 0);
    classesCountA.put("b", 0);
    classesCountB.put("a", 0);
    classesCountB.put("b", 0);
    for (int i = 0; i < arrayOfValue.length; i++) {
        if (i < splitIndex) {
            // Read from the same map that is being updated
            classesCountA.replace(arrayOfValue[i].result, classesCountA.get(arrayOfValue[i].result) + 1);
        }
        else {
            classesCountB.replace(arrayOfValue[i].result, classesCountB.get(arrayOfValue[i].result) + 1);
        }
    }
Comments (1)
You don't need to replace the map's value at all. In contrast to keys, map values are allowed to be mutable, so all you need is a mutable structure to hold the frequency for each value.
Thus you could do it like this (simplified):
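A minimal sketch of that idea, assuming a hypothetical mutable holder class named Counter (the class name, the method name countAlternating, and the loop size are illustrative and not taken from the answer):

```java
import java.util.HashMap;
import java.util.Map;

public class FrequencyCount {
    // Hypothetical mutable holder; the map stores one instance per key.
    static final class Counter {
        int count;
    }

    // Mirrors the question's simplified loop: even i counts "a", odd i counts "b".
    static Map<String, Counter> countAlternating(int iterations) {
        Map<String, Counter> classesCount = new HashMap<>();
        classesCount.put("a", new Counter());
        classesCount.put("b", new Counter());
        for (int i = 0; i < iterations; i++) {
            String key = (i % 2 == 0) ? "a" : "b";
            // One lookup, then an in-place increment: no replace() call and no Integer boxing.
            classesCount.get(key).count++;
        }
        return classesCount;
    }

    public static void main(String[] args) {
        Map<String, Counter> counts = countAlternating(1_000_000);
        System.out.println("a=" + counts.get("a").count + ", b=" + counts.get("b").count);
    }
}
```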
On my machine that's about 5x faster than the replace(key, get(key) + 1) approach.
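Applied to the split counting from the question, the same idea might look roughly like the sketch below. It is only an illustration under assumptions: the nested Value class is a simplified stand-in for the question's class, countRange is a hypothetical helper, and a one-element int[] plays the role of the mutable holder.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitCount {
    // Simplified stand-in for the question's Value class; only the label is needed here.
    static final class Value {
        final String result;
        Value(String result) {
            this.result = result;
        }
    }

    // Counts the labels of values[from, to), using one-element int arrays as mutable holders.
    static Map<String, int[]> countRange(Value[] values, int from, int to) {
        Map<String, int[]> counts = new HashMap<>();
        for (int i = from; i < to; i++) {
            // computeIfAbsent returns the existing holder or creates one, so each
            // element costs a single map lookup followed by an in-place increment.
            counts.computeIfAbsent(values[i].result, k -> new int[1])[0]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        Value[] values = {
            new Value("a"), new Value("b"), new Value("a"), new Value("a"), new Value("b")
        };
        int splitIndex = 2;
        Map<String, int[]> classesCountA = countRange(values, 0, splitIndex);
        Map<String, int[]> classesCountB = countRange(values, splitIndex, values.length);
        System.out.println("A: a=" + classesCountA.get("a")[0] + ", b=" + classesCountA.get("b")[0]);
        System.out.println("B: a=" + classesCountB.get("a")[0] + ", b=" + classesCountB.get("b")[0]);
    }
}
```

Whether this helps in the real program depends on where the time actually goes, so it is worth measuring both versions on the actual data.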