Vim: a technical approach
I have a 2,500-line document exported from a spreadsheet (Excel). Its structure is consistent and repeats about 100 times within the document, though not identically line for line, because the data in each cycle varies slightly. From each 25-line cycle, I could gather at least 30 pieces of information to feed a custom-built uploader that fills databases behind a website.
My first thought was to use search/replace with submatch(*) to restructure the captured data for upload into my own db, like:
data1:data2,data3a|data3b|data3c,data4:data5
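For reference, here is a minimal sketch of what the substitute-with-submatch() route could look like on a single line. The input layout (two whitespace-separated fields per line) is an assumption for illustration only, not the real datasheet format.

```vim
" Hypothetical input line:  data1   data2
" \v switches on 'very magic' so the groups need no backslashes.
" \= evaluates the replacement as an expression; submatch(N) is group N.
:%s/\v^(\S+)\s+(\S+)$/\=submatch(1) . ':' . submatch(2)/
```

Each extra field means another capture group and another delimiter to count, which is the maintenance concern raised next.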
At first glance there are enough Vim registers to start a value, append to it, then substitute and dump the result, overwriting the registers before the next cycle, and then repeat. But in the future I might want to capture more data, which could exhaust my registers (a-z, 0-9) and make it difficult to keep track of what's what (counting delimiters). So I am contemplating a function that takes the captured text along with a name for it (bypassing the replace/submatch idea), so the value can be set (let) and later retrieved and formatted at the end of each cycle. I picture a function like:
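" Store varval in a global variable named varname (e.g. g:varname1).
function! SetVar(varname, varval)
  " string() quotes the value safely, even if it contains a single quote.
  execute 'let g:' . a:varname . ' = ' . string(a:varval)
endfunction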
I would capture the data like:
:/sectionHeader/,/sectionFooter/g/pieceOfInfo/call SetVar('varname1', matchstr(getline('.'), 'pieceOfInfo'))
where sectionHeader and sectionFooter delimit the repeating section (the range) within the document. I would probably use regular expressions to capture these section names and use part of each name to label the variable (instead of varname1), or maybe an incrementing variable like "i".
I would then format the final output like:
varname1:varname2,varname3|varname4|varname5,varname6:varname7
I think this would be much easier to maintain, because the variable names can be made meaningful, which makes it easier to keep track of the data through the upload process (and any future expansion).
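One way to sketch the named-value idea is a dictionary per cycle instead of one variable or register per field. Everything below is an assumption for illustration: the function names CollectCycle, FormatCycle and ExportCycles, the 'name<Tab>value' line layout, and the output file name are hypothetical; only the varnameN keys and the output shape come from the description above.

```vim
" Collect 'name<Tab>value' lines (assumed layout) between two line numbers.
function! CollectCycle(first, last) abort
  let l:rec = {}
  for l:lnum in range(a:first, a:last)
    let l:m = matchlist(getline(l:lnum), '^\(\w\+\)\t\(.*\)$')
    if !empty(l:m)
      let l:rec[l:m[1]] = l:m[2]
    endif
  endfor
  return l:rec
endfunction

" Build one output line; get() with a default tolerates missing fields.
function! FormatCycle(rec) abort
  let l:multi = []
  for l:key in ['varname3', 'varname4', 'varname5']
    call add(l:multi, get(a:rec, l:key, ''))
  endfor
  let l:head = get(a:rec, 'varname1', '') . ':' . get(a:rec, 'varname2', '')
  let l:tail = get(a:rec, 'varname6', '') . ':' . get(a:rec, 'varname7', '')
  return join([l:head, join(l:multi, '|'), l:tail], ',')
endfunction

" Walk the buffer section by section and return one formatted line per cycle.
function! ExportCycles(header, footer) abort
  let l:out = []
  call cursor(1, 1)
  " 'c' accepts a match at the cursor; 'W' stops at the end of the buffer.
  let l:start = search(a:header, 'cW')
  while l:start > 0
    let l:stop = search(a:footer, 'W')
    if l:stop == 0
      break
    endif
    call add(l:out, FormatCycle(CollectCycle(l:start, l:stop)))
    let l:start = search(a:header, 'W')
  endwhile
  return l:out
endfunction
```

With that in place, something like `:call writefile(ExportCycles('sectionHeader', 'sectionFooter'), 'upload.txt')` would dump the result to a separate file. Adding a field later means adding a key, not another register or delimiter to count.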
Questions:
Does this make sense, and is it a reasonable architectural approach to this problem?
Can you suggest a better approach?
Comments (1)
Wouldn't an awk script be a better choice for this? You could do the same sort of search & replace and write to a separate output file, and awk's line-by-line operation should avoid some of the issues you might encounter in trying to do this in Vim.
Of course, if you'll never repeat this process, Vim might not be a bad call.