Best language for scripting large-scale file management
The National Park Service's Natural Sounds Program collects multiple terabytes of data each year measuring soundscapes. In your opinion, what is the best available scripting language for managing massive numbers of files and file types? We would like to easily design and run efficient, user-friendly scripts that search for files, which may be located in different directories, and retrieve them or create copies of them according to a single static hierarchy. The OS will most likely be Windows. Thanks!
Comments (3)
Use the one your developers are most familiar with. The productivity gains you'll get from that will almost certainly beat out any advantages that one language may have over another.
Use Python. It's easy to learn, and everyone can switch to it easily.
The size of the files doesn't much matter when you're searching directories or searching for metadata outside the files. Even so, you rarely need to read an entire sound sample file to strip off the metadata.
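As a rough illustration of the point above, here is a minimal sketch that uses only the Python standard library. It assumes the recordings are standard WAV files, and the helper names (`find_files`, `wav_info`, `copy_into_hierarchy`) are made up for this example. It walks a directory tree, reads only the WAV header, and copies a file into a destination tree that mirrors the source layout.

```python
import shutil
import wave
from pathlib import Path


def find_files(root, suffixes):
    """Yield files under root whose extension is in suffixes (case-insensitive)."""
    wanted = {s.lower() for s in suffixes}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in wanted:
            yield path


def wav_info(path):
    """Return basic metadata from the WAV header; no audio frames are read."""
    with wave.open(str(path), "rb") as w:
        return {
            "channels": w.getnchannels(),
            "rate": w.getframerate(),
            "frames": w.getnframes(),
        }


def copy_into_hierarchy(src, source_root, dest_root):
    """Copy src under dest_root, keeping its path relative to source_root."""
    target = Path(dest_root) / Path(src).relative_to(source_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, target)  # copy2 also preserves timestamps
    return target
```

Because only directory entries and file headers are touched, the cost of a search like this depends on the number of files rather than on how large each recording is.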
Also, if you're doing this frequently, you might want to consider the following:
1. Extract all the metadata to a relational database.
2. Use the relational database as a complex "index" to the sound sample files.
3. Route each file addition or change through an application that synchronizes file changes with database updates, to ensure that the database index actually matches the filesystem.
The bulk of your searches might become SQL queries.
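The database-as-index idea could be sketched roughly as follows, using SQLite from the Python standard library as a stand-in for whatever relational database you choose. The table name `recordings`, its columns, and the `site` label are hypothetical; a real schema would hold whatever metadata your recordings actually carry.

```python
import sqlite3
from pathlib import Path


def open_index(db_path=":memory:"):
    """Open (or create) the index database. The schema here is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recordings ("
        " path TEXT PRIMARY KEY, site TEXT, size_bytes INTEGER, modified REAL)"
    )
    return conn


def index_file(conn, path, site):
    """Insert or refresh the index row for one file on disk."""
    st = Path(path).stat()
    conn.execute(
        "INSERT OR REPLACE INTO recordings (path, site, size_bytes, modified)"
        " VALUES (?, ?, ?, ?)",
        (str(path), site, st.st_size, st.st_mtime),
    )
    conn.commit()


def find_by_site(conn, site):
    """Return the indexed paths for one site, sorted by path."""
    rows = conn.execute(
        "SELECT path FROM recordings WHERE site = ? ORDER BY path", (site,)
    )
    return [row[0] for row in rows]
```

Calling `index_file` from the same code path that adds or changes a file is one simple way to keep the index and the filesystem in step. Searches then run as SQL queries against the table rather than as walks over the directory tree.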
I don't really know what you are going to be looking for in a scripting language, but Eric is right that you should use something all your developers are familiar with. However, if you don't have developers (yet) and are designing the project (and team) from the ground up, consider C++ or .NET (C# or VB).
While C++ offers more powerful programming and better performance, C# and VB.NET offer quicker development. Regardless of .NET's productivity advantage, I would think that for massive numbers of files and file types, you will have the best overall satisfaction with C++. In my opinion, the best user-friendly design requires very little user input other than clicking buttons or selecting options from a list.