Understanding multithreading in Java with a for loop
I have some code that I don't think can be multithreaded, but perhaps I'm wrong. I'd like to execute this code on a clustered system, but I'm unsure how to scale it for such a deployment.
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.PrintStream;
    import java.text.DecimalFormat;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Scanner;

    public class Coord {
        public int a, b, c, d, e, f;

        public static void main(String[] args) throws IOException {
            FileOutputStream out = new FileOutputStream("/Users/evanlivingston/2b.txt");
            PrintStream pout = new PrintStream(out);
            Scanner sc = new Scanner(new File("/Users/evanlivingston/1.txt"));
            List<Coord> coords = new ArrayList<Coord>();
            // for each line in the file
            while (sc.hasNextLine()) {
                String[] numstrs = sc.nextLine().split("\\s+");
                Coord c = new Coord();
                c.a = Integer.parseInt(numstrs[1]);
                c.b = Integer.parseInt(numstrs[2]);
                c.c = Integer.parseInt(numstrs[3]);
                c.d = Integer.parseInt(numstrs[4]);
                c.e = Integer.parseInt(numstrs[5]);
                c.f = Integer.parseInt(numstrs[6]);
                coords.add(c);
            }
            // now you have all coords in memory
            for (int i = 0; i < coords.size(); i++) {
                for (int j = 0; j < coords.size(); j++) {
                    Coord c1 = coords.get(i);
                    Coord c2 = coords.get(j);
                    double foo = ((c1.a - c2.a) * (c1.a - c2.a)) * 1;
                    double goo = ((c1.b - c2.b) * (c1.b - c2.b)) * 1;
                    double hoo = ((c1.c - c2.c) * (c1.c - c2.c)) * 2;
                    double joo = ((c1.d - c2.d) * (c1.d - c2.d)) * 2;
                    double koo = ((c1.e - c2.e) * (c1.e - c2.e)) * 4;
                    double loo = ((c1.f - c2.f) * (c1.f - c2.f)) * 4;
                    double zoo = Math.sqrt(foo + goo + hoo + joo + koo + loo);
                    DecimalFormat df = new DecimalFormat("#.###");
                    pout.println(i + " " + j + " " + df.format(zoo));
                    System.out.println(i);
                }
            }
            pout.flush();
            pout.close();
        }
    }
I appreciate any help anyone can offer.
Comments (2)
Splitting the inner for loop into separate tasks looks like a good candidate for making this process multithreaded. Here is one way this could be done with an ExecutorService and Futures.
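A minimal sketch of that idea, under some assumptions of mine: each value of i becomes one task that runs the whole inner loop, and the main thread collects the Futures in submission order so the output order matches the sequential version. The class name RowTaskSketch, the helper methods, and the use of int[] in place of the Coord class are hypothetical choices made for brevity, and file I/O is left out. Each task creates its own DecimalFormat because that class is not thread-safe.

```java
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one task per row i, results gathered in order.
class RowTaskSketch {
    // Weighted distance using the question's weights 1, 1, 2, 2, 4, 4.
    static double distance(int[] p, int[] q) {
        int[] weights = {1, 1, 2, 2, 4, 4};
        double sum = 0;
        for (int k = 0; k < weights.length; k++) {
            double diff = p[k] - q[k];
            sum += weights[k] * diff * diff;
        }
        return Math.sqrt(sum);
    }

    // The body of the original inner loop: all output lines for row i.
    static List<String> computeRow(List<int[]> coords, int i) {
        DecimalFormat df = new DecimalFormat("#.###"); // one instance per task
        List<String> lines = new ArrayList<>();
        for (int j = 0; j < coords.size(); j++) {
            double d = distance(coords.get(i), coords.get(j));
            lines.add(i + " " + j + " " + df.format(d));
        }
        return lines;
    }

    // Submits one task per row and concatenates the results in row order.
    static List<String> computeAll(List<int[]> coords, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < coords.size(); i++) {
                final int row = i;
                Callable<List<String>> task = () -> computeRow(coords, row);
                futures.add(pool.submit(task));
            }
            List<String> out = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                out.addAll(f.get()); // blocks until that row is finished
            }
            return out;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Made-up sample points standing in for the parsed input file.
        List<int[]> coords = Arrays.asList(
                new int[] {0, 0, 0, 0, 0, 0},
                new int[] {3, 4, 0, 0, 0, 0},
                new int[] {1, 2, 3, 4, 5, 6});
        int threads = Runtime.getRuntime().availableProcessors();
        for (String line : computeAll(coords, threads)) {
            System.out.println(line);
        }
    }
}
```

Collecting the Futures in order keeps all row results in memory until they are written, so for very large inputs it may be preferable to write each row to the file as soon as its Future completes.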
For clustering, I think Hazelcast offers a good solution that will allow you to define a shared ExecutorService and shared Collections. You would need two flavors of nodes: a single node responsible for all I/O, for creating the list of Coords, and for submitting the tasks, and processing nodes that simply execute the tasks. That is just my opinion on how I might do it. However, if your dataset is small enough to fit in memory, it is likely not worth the effort to split up the processing this much.
It looks very parallelizable to me. Why don't you have threads process one row of data at a time? You could use an AtomicInteger to keep a count of how many rows have been claimed by worker threads. Each thread would do a counter.getAndIncrement to get a row to work on (if it returns coords.size() or higher, the thread should terminate), then do all the math for that row, and repeat. The printing would be out of order, but you could instead fill some buffers with the results, then quickly print everything at the end.
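The row-claiming scheme could look roughly like the sketch below. It is only an illustration under my own assumptions: the class name RowClaimSketch, the int[][] input, and the double[][] result buffer are hypothetical stand-ins for the question's Coord list and output file. Each worker claims rows through a shared AtomicInteger, writes only into the rows it claimed, and the main thread prints the buffer in order after joining all workers.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: workers claim rows through a shared counter.
class RowClaimSketch {
    // Weighted distance using the question's weights 1, 1, 2, 2, 4, 4.
    static double distance(int[] p, int[] q) {
        int[] weights = {1, 1, 2, 2, 4, 4};
        double sum = 0;
        for (int k = 0; k < weights.length; k++) {
            double diff = p[k] - q[k];
            sum += weights[k] * diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Fills an n x n buffer; each row is computed by whichever thread claims it.
    static double[][] computeAll(int[][] coords, int threadCount) {
        final int n = coords.length;
        final double[][] result = new double[n][n];
        final AtomicInteger counter = new AtomicInteger(0);

        Runnable worker = () -> {
            int i;
            // Claim the next unclaimed row; stop once all rows are taken.
            while ((i = counter.getAndIncrement()) < n) {
                for (int j = 0; j < n; j++) {
                    result[i][j] = distance(coords[i], coords[j]);
                }
            }
        };

        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(worker);
            threads[t].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // join also makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample points standing in for the parsed input file.
        int[][] coords = {
            {0, 0, 0, 0, 0, 0},
            {3, 4, 0, 0, 0, 0},
            {1, 2, 3, 4, 5, 6}
        };
        double[][] result = computeAll(coords, 4);
        // Printing happens once, in order, after all workers are done.
        for (int i = 0; i < result.length; i++) {
            for (int j = 0; j < result.length; j++) {
                System.out.println(i + " " + j + " " + result[i][j]);
            }
        }
    }
}
```

Because no two threads ever write to the same row, the buffer itself needs no locking; the counter is the only shared mutable state.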