Getting a UNIX directory tree structure as a JSON object

Posted 2024-10-04 17:11:40


I'm trying to build a browser application that visualizes file structures, so I want to print the file structure into a JSON object.

I've tried many variations of 'ls' piped to sed, but it seems like find works best.

Right now I'm just trying to use the command

find ~ -maxdepth ? -name ? -type d -print

And tokenize the path variables

I've tried simple AJAX with PHP exec'ing this, but the array-walking is really slow.
I was thinking of doing it straight from a bash script, but I can't figure out how to pass associative arrays by reference so I can recursively add all the tokenized path variables to the tree.

Is there a better or established way to do this?

Thanks!
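For reference, the tokenize-and-nest step the question describes can be sketched in a few lines of Python (a sketch under my own naming, not the asker's code: os.walk stands in for the find call, and a nested dict stands in for the bash associative array):

```python
import json
import os

def dir_tree(root, max_depth=2):
    """Nest the directories under root into a dict of dicts,
    the same shape the tokenized `find ... -type d` paths would give."""
    root = root.rstrip(os.sep)
    base_depth = root.count(os.sep)
    tree = {}
    for dirpath, dirnames, _ in os.walk(root):
        if dirpath.count(os.sep) - base_depth >= max_depth:
            dirnames[:] = []  # prune: don't descend past max_depth
            continue
        node = tree
        rel = os.path.relpath(dirpath, root)
        if rel != ".":
            for part in rel.split(os.sep):  # tokenize the path
                node = node.setdefault(part, {})
        for d in dirnames:
            node.setdefault(d, {})
    return tree

# e.g. json.dumps(dir_tree(os.path.expanduser("~")), indent=2)
```

Serializing the result with json.dumps then yields the JSON object directly, with no sed pipeline or slow array-walking in between.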


3 Answers

时间海 2024-10-11 17:11:40


9 years later...
Using tree should do the job.

tree ~ -J -L ? -P '?' -d --noreport

where:

  • -J output as JSON
  • -L max display depth (equivalent to find -maxdepth)
  • -P pattern to include (equivalent to find -name)
  • -d directories only (equivalent to find -type d)
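If the browser app consumes this server-side in e.g. Python, the -J output parses with a plain json.loads. A sketch of walking that structure, assuming tree -J's shape of objects with "type", "name", and "contents" keys (the sample below is hand-written to illustrate the shape, not captured from a real run):

```python
import json

# Hand-written sample in the shape `tree -J -d` produces
sample = """
[
  {"type": "directory", "name": ".", "contents": [
    {"type": "directory", "name": "src", "contents": [
      {"type": "directory", "name": "lib", "contents": []}
    ]},
    {"type": "directory", "name": "docs", "contents": []}
  ]}
]
"""

def count_dirs(nodes):
    """Recursively count directory entries in tree -J style JSON."""
    total = 0
    for node in nodes:
        if node.get("type") == "directory":
            total += 1
            total += count_dirs(node.get("contents", []))
    return total

print(count_dirs(json.loads(sample)))  # ., src, lib, docs -> 4
```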
浅紫色的梦幻 2024-10-11 17:11:40


I don't know what your application's requirements are, but one solution that solves your problem (and a number of other problems) is to hide the actual file system layout behind an abstraction layer.

Essentially, you write two threads. The first scrapes the file structures and creates a database representation of their contents. The second responds to browser requests, queries the database created by the first thread, and generates your JSON (i.e. a normal web request handler thread).

By abstracting the underlying storage structure (the file system), you create a layer that can add concurrency, deal with IO errors, etc. When someone changes a file within the structure, it's not visible to web clients until the "scraper" thread detects the change and updates the database. However, because web requests are not tied to reading the underlying file structure and merely query a database, response time should be fast.
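A minimal sketch of that two-sided design, with SQLite as the database (the function names and schema here are illustrative, not from the answer): one side walks the disk and persists what it saw, the other answers requests purely from the database.

```python
import json
import os
import sqlite3

def scrape(conn, root):
    """Scraper side: walk the file system once and persist the layout."""
    conn.execute("CREATE TABLE IF NOT EXISTS dirs (path TEXT PRIMARY KEY, parent TEXT)")
    conn.execute("DELETE FROM dirs")
    for dirpath, dirnames, _ in os.walk(root):
        for d in dirnames:
            conn.execute("INSERT INTO dirs VALUES (?, ?)",
                         (os.path.join(dirpath, d), dirpath))
    conn.commit()

def handle_request(conn, parent):
    """Request side: never touches the disk, only queries the database."""
    rows = conn.execute("SELECT path FROM dirs WHERE parent = ?", (parent,))
    return json.dumps(sorted(r[0] for r in rows))
```

In a real deployment the scraper would run on its own schedule (or watch for changes), while the request handler stays fast because it only ever issues a database query.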

HTH,
nate.

贪了杯 2024-10-11 17:11:40


Walking the disk is always going to be slower than ideal, simply because of all the seeking that needs to be done. If that's not a problem for you, my advice would be to work to eliminate overhead... starting with minimizing the number of fork() calls. Then you can just cache the result for however long you feel is appropriate.

Since you've already mentioned PHP, my suggestion is to write your entire server-side system in PHP and use the DirectoryIterator or RecursiveDirectoryIterator classes. Here's an SO answer that implements something similar to what you're asking for using the former.

If disk I/O overhead is a problem, my advice is to implement a system along the lines of mlocate, which caches the directory listing along with the directory ctimes, using stat() to compare ctimes and re-reading only directories whose contents have changed.
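A rough Python sketch of that ctime comparison (the names are mine, and real mlocate additionally persists its cache to disk and handles deletions):

```python
import os

def updated_dirs(root, ctime_cache):
    """Yield directories whose ctime changed since the last scan,
    updating the cache in place (mlocate-style skip of unchanged dirs)."""
    for dirpath, dirnames, _ in os.walk(root):
        ctime = os.stat(dirpath).st_ctime
        if ctime_cache.get(dirpath) == ctime:
            continue  # listing unchanged since last scan; skip re-reading it
        ctime_cache[dirpath] = ctime
        yield dirpath
```

On the first scan everything is "changed"; on later scans only directories whose entries were added or removed come back, so the expensive re-reads shrink to the parts of the tree that actually moved.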

I don't do much filesystem work in PHP, but, if it'd help, I can offer you a Python implementation of the basic mlocate-style updatedb process. (I use it to index files which would have to be restored from DVD+R manually if my drive ever fails, because they're too big to fit comfortably on my rdiff-backup target drive.)
