Rebol: Not enough memory problem
I ran the program below with a symbol_list of a few hundred symbols, and at some point it reports "not enough memory", even though I reuse the same variables. Why?
base-url: http://www.google.com/finance/historical
download-directory: "askpoweruser/stock-download/google/files/"
column-header: "Time;Open;High;Low;Close;Volume"
#debug: true
symbol_list: parse/all {GOOG AAPL MSFT INDEXDJX:.DJI} " "
ans: ask {symbols by default "GOOG AAPL MSFT INDEXDJX:.DJI": }
if (ans <> "") [symbol_list: parse/all ans " "]
;do code-block/2
foreach symbol symbol_list [
    url0: rejoin [base-url "?q=" symbol]
    dir: make-dir/deep to-rebol-file download-directory
    either none? filename: find symbol ":" [
        filename: symbol
        url: rejoin [url0 "&output=csv"]
        either not error? try [content: read url][
            out-string: copy rejoin [column-header newline]
            quotes: parse/all content ",^/"
            reversed-quotes: reverse quotes
            foreach [v c l h o d] reversed-quotes [
                either not (error? try [d: to-date d]) [
                    d: rejoin [d/year "-" d/month "-" d/day]
                    append out-string rejoin [d ";" o ";" h ";" l ";" c ";" v newline]
                ][
                    ;print [d "is not a date"]
                    ;input
                ]
            ]
            filename: rejoin [filename "_" "1440"]
            write to-rebol-file rejoin [dir filename ".csv"] out-string
            print filename
        ][
            print ["Error for symbol" symbol]
        ]
    ][
        filename: replace/all replace/all filename ":" "" "." ""
        out: copy []
        for i 0 1 1 [
            p: i
            url: rejoin [url0 "&start=" (p * 200) "&num=" ((p + 1) * 200)]
            content: read url
            rule: [to "<table" thru "<table" to ">" thru ">"
                to "<table" thru "<table" to ">" thru ">"
                to "<table" thru "<table" to ">" thru ">"
                copy quotes to </table> to end
            ]
            parse content rule
            parse quotes [
                some [to "<td" thru "<td" to ">" thru ">" [copy x to "<" | copy x to end] (append out replace/all x "^/" "")]
                to end
            ]
            if #debug [
                write/lines to-rebol-file rejoin [dir filename "_" p ".html"] quotes
            ]
        ]
        if #debug [
            write to-rebol-file rejoin [dir filename "_temp" ".txt"] mold out
        ]
        out-string: copy rejoin [column-header newline]
        out: reverse out
        foreach [v c l h o d] out [
            d: parse/all d " ,"
            d: to-date rejoin [d/4 "-" d/1 "-" d/2]
            d: rejoin [d/year "-" d/month "-" d/day]
            append out-string replace/all rejoin [d ";" o ";" h ";" l ";" c ";" v newline] "," ""
        ]
        filename: rejoin [filename "_" "1440"]
        write to-rebol-file rejoin [dir filename ".csv"] out-string
        print filename
    ]
]
To get the list of symbols, you can use this (Rebol crashed before reaching the letter H):
alphabet: [A B C D E F G H I J K L M N O P Q R S T U V W X Y Z]
symbol-list: copy []
rule: [
    to <table class="quotes">
    some [ to {<A href="/stockquote} to ">" thru ">" copy symbol to "<" (append symbol-list symbol)]
    to </table>
]
foreach letter alphabet [
    content: read to-url rejoin ["http://www.eoddata.com/stocklist/NYSE/" letter ".htm"]
    parse content rule
    probe symbol-list
    write/append %askpoweruser/stock-download/symbol-list-nyse.txt mold symbol-list
]
You can put a call to the STATS function inside one of your loops to try to figure out whether and where a memory leak is occurring.
Out-of-memory errors usually occur in one of these situations, or something similar:
Appending to a series that is not cleared or copied at the start of a function containing a loop.
A complete tree of data is not reset to none (at each leaf and branch) while some (a single?) sub-elements are still referenced from outside the tree, so the whole data block stays caught in RAM and cannot be freed.
Printing a very large string or a nested tree of large objects (for example, a VID face contains a reference to the complete stylesheet, so printing the window of a big app usually fails).
Some stack overflows (endless recursion or loops) are sometimes incorrectly reported as memory errors.
The allocation of a single item grows exponentially, for example an image! whose size is multiplied by 10 on both axes on each pass. That increases the allocation by two orders of magnitude per pass and usually fails once the numbers reach the n*10k range.
The largest item in the GC sometimes never gets deallocated because of the suboptimal R2 GC (large images may show this symptom).
Recursive parse rules create data and a single rule is infinite. This happens very rapidly with rules like [rule | none], where none is effectively equivalent to forever.
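As a rough illustration of the first suggestion and the first situation in the list, the sketch below contrasts a function that appends to a literal block with one that copies it first, and prints the change in STATS on each pass of a loop. It assumes Rebol 2; the names leaky, safe, process-symbol, and baseline are hypothetical and not taken from the question's script, and process-symbol only stands in for the real download and parse work.

```rebol
REBOL [Title: "Memory tracking sketch (assumes Rebol 2)"]

; Pitfall: the literal block [] is part of the function body, so every
; call appends to the same series and it keeps growing between calls.
leaky: func [item /local acc] [
    acc: []
    append acc item
    length? acc
]

; Fix: copy the literal so each call starts with a fresh, empty block.
safe: func [item /local acc] [
    acc: copy []
    append acc item
    length? acc
]

; Hypothetical stand-in for the per-symbol download and parse work.
process-symbol: func [symbol [string!] /local buffer] [
    buffer: make string! 100000   ; new buffer on every call
    append buffer symbol
    length? buffer
]

baseline: stats                   ; memory in use before the loop starts
foreach symbol ["GOOG" "AAPL" "MSFT"] [
    leaky symbol
    safe symbol
    process-symbol symbol
    recycle                       ; ask the garbage collector to run now
    print [symbol "memory delta in bytes:" stats - baseline]
]

; leaky's count keeps growing across calls; safe always reports 1.
print ["leaky length:" leaky "x" "safe length:" safe "x"]
```

If the printed delta keeps climbing from one symbol to the next even after recycle, some series is probably still referenced and growing. Moving the print around inside the loop narrows down which step is responsible.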
It works without problems, but some notes: