Parsing a large text file efficiently
For simplicity, I have a text file with entries separated by ; and I parse every line into an object. The problem is that the text file contains almost 10,000 rows.
I also need to create keys for each object I'm parsing so I can filter the results in a search interface.
It takes almost 16 seconds in the emulator to parse the text and add the keys. Am I doing something wrong here? Or is there a more efficient way?
Here is my database singleton:
public class Database {
    private static Database instance = null;
    private final Map<String, List<Stop>> mDict = new ConcurrentHashMap<String, List<Stop>>();
    private boolean mLoaded = false;

    public static Database getInstance() {
        if (instance == null) {
            instance = new Database();
        }
        return instance;
    }

    public List<Stop> getMatches(String query) {
        List<Stop> list = mDict.get(query);
        return list == null ? Collections.EMPTY_LIST : list;
    }

    /**
     * Loads the words and definitions if they haven't been loaded already.
     *
     * @param resources Used to load the file containing the words and definitions.
     */
    public synchronized void ensureLoaded(final Resources resources) {
        if (mLoaded) return;
        new Thread(new Runnable() {
            public void run() {
                try {
                    loadStops(resources);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        }).start();
    }

    private synchronized void loadStops(Resources resources) throws IOException {
        if (mLoaded) return;
        Log.d("database", "loading stops");
        InputStream inputStream = resources.openRawResource(R.raw.allstops);
        BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] strings = TextUtils.split(line, ";");
                addStop(strings[0], strings[1], strings[2]);
            }
        } finally {
            reader.close();
        }
        Log.d("database", "loading stops completed");
        mLoaded = true;
    }

    // Index every prefix of the stop name, so any leading substring of a
    // name can be used as a search key.
    private void addStop(String name, String district, String id) {
        Stop stop = new Stop(id, name, district);
        int len = name.length();
        for (int i = 0; i < len; i++) {
            String prefix = name.substring(0, len - i).toLowerCase();
            addMatch(prefix, stop);
        }
    }

    private void addMatch(String query, Stop stop) {
        List<Stop> matches = mDict.get(query);
        if (matches == null) {
            matches = new ArrayList<Stop>();
            mDict.put(query, matches);
        }
        matches.add(stop);
    }
}
Here is some sample data:
Mosseporten Senter;Norge;9021014089003000;59.445422;10.701055;273
Oslo Bussterminal;Norge;9021014089004000;59.911369;10.759665;273
Långegärde;Strömstad;9021014026420000;58.891462;11.007767;68
Västra bryggan;Strömstad;9021014026421000;58.893080;11.009997;7
Vettnet;Strömstad;9021014026422000;58.903184;11.020739;7
Ekenäs;Strömstad;9021014026410000;58.893610;11.048821;7
Kilesand;Strömstad;9021014026411000;58.878472;11.052983;7
Ramsö;Strömstad;9021014026430000;58.831531;11.067402;7
Sarpsborg;Norge;9021014089002000;59.280937;11.111763;273
Styrsö;Strömstad;9021014026180000;58.908110;11.115818;7
Capri/Källviken;Strömstad;9021014026440000;58.965200;11.124384;63
Lindholmens vändplan;Strömstad;9021014026156000;58.890212;11.128393;64
Öddö;Strömstad;9021014026190000;58.923490;11.130767;7
Källviksdalen;Strömstad;9021014026439000;58.962414;11.131962;64
Husevägen;Strömstad;9021014026505000;58.960094;11.133535;274
Caprivägen;Strömstad;9021014026284000;58.958404;11.134281;64
Stensviks korsväg;Strömstad;9021014026341000;59.001499;11.137203;63
Kungbäck;Strömstad;9021014026340000;59.006056;11.140313;63
Kase;Strömstad;9021014026173000;58.957649;11.141904;274
2 Answers
You should put the information into a SQLite database and ship the app with the database in res/raw.
Additionally, the db file can often be effectively compressed into a zip file.
See this for more information: Ship an application with a database
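To sketch what that could look like in practice (the file name, resource id, and table/column names below are my assumptions, not from the answer): copy the prebuilt database out of res/raw on first launch, open it read-only, and answer prefix searches with a LIKE query instead of building keys in memory.

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import android.content.Context;
import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;

public class StopsDb {
    private static final String DB_NAME = "stops.db"; // hypothetical file name

    /** Copies the bundled database on first launch, then opens it read-only. */
    public static SQLiteDatabase open(Context context) throws IOException {
        File dbFile = context.getDatabasePath(DB_NAME);
        if (!dbFile.exists()) {
            dbFile.getParentFile().mkdirs();
            // R.raw.stops_db is assumed to be the shipped database file.
            InputStream in = context.getResources().openRawResource(R.raw.stops_db);
            OutputStream out = new FileOutputStream(dbFile);
            try {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            } finally {
                out.close();
                in.close();
            }
        }
        return SQLiteDatabase.openDatabase(dbFile.getPath(), null,
                SQLiteDatabase.OPEN_READONLY);
    }

    /** Prefix search replaces the in-memory prefix map entirely. */
    public static Cursor findStops(SQLiteDatabase db, String prefix) {
        return db.query("stops", new String[] {"id", "name", "district"},
                "name LIKE ?", new String[] {prefix + "%"}, null, null, "name");
    }
}

If the name column is indexed (with NOCASE collation, to match LIKE's default case folding), SQLite can serve the prefix query from the index, so the only startup cost is the one-time file copy.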
The fastest way to load that data into memory is to place it right into a .java file, e.g.
stopNames={"Stop1", "Stop2", ...}; latitudes={...};
I do this in my public transit app; loading a similar amount of data this way takes under a second, and filtering is instant.
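A minimal sketch of that generated-source idea, assuming a build-time script emits the arrays from the semicolon-separated file (the class and field names here are mine, not from the answer):

import java.util.ArrayList;
import java.util.List;

public final class StopData {
    // Parallel arrays: entry i of each array describes the same stop.
    public static final String[] NAMES = {
        "Mosseporten Senter", "Oslo Bussterminal", // ... ~10 000 entries
    };
    public static final String[] DISTRICTS = {
        "Norge", "Norge", // ...
    };
    public static final String[] IDS = {
        "9021014089003000", "9021014089004000", // ...
    };

    private StopData() {}

    // Filtering is a plain linear scan; over 10 000 names this is
    // effectively instant, so no prefix map has to be built at startup.
    public static List<Integer> findByPrefix(String query) {
        String q = query.toLowerCase();
        List<Integer> matches = new ArrayList<Integer>();
        for (int i = 0; i < NAMES.length; i++) {
            if (NAMES[i].toLowerCase().startsWith(q)) {
                matches.add(i);
            }
        }
        return matches;
    }
}

The trade-off is a large generated class and no ad-hoc queries, but it removes both the file parsing and the key generation from startup.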