How to verify that an uploaded file is complete
I have a system that lets users upload a CSV file via an FTP server or an HTML form. On my end, a script polls the uploads directory and processes any new files it finds. Some users will create the CSV by exporting it from Excel, while others will generate it programmatically with scripts of their own.
My concern at the moment is: how can I be 100% certain that the file my processing script acts on is complete - in other words, that it isn't a partial file (upload in progress, failed upload, etc.)?
If the file format were something more structured, like XML, I'd be 100% confident that the file is complete by checking that the XML structure is valid (i.e. the closing tags are present).
Is there a good way to ensure that the uploaded CSV file is complete without burdening and confusing less technical users who are simply uploading a file exported from a spreadsheet program (i.e. providing an md5 of the file contents would be beyond them)?
Comments (2)
When designing CSV file formats in the past, I've always added a header and footer line as follows:
Most CSV file formats have a header to label the columns; the purpose of the footer is to indicate that the file is complete. The footer contains a simple line count, which is easy to verify while looping through the file's contents. It's not too complicated for users.
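The exact header and footer layout from the answer is not preserved here, so the following is only a rough sketch of how such a check might look. It assumes a hypothetical footer of the form `FOOTER,<row count>` as the last line, where the count covers the data rows between header and footer; the tag name and the function name `is_complete` are illustrative, not part of the original answer.

```python
import csv
import io


def is_complete(text, footer_tag="FOOTER"):
    """Return True if the CSV text ends with a footer whose count matches the data rows.

    Assumed (hypothetical) layout: first row is the column header,
    last row is `FOOTER,<number of data rows>`.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) < 2:
        return False  # need at least a header and a footer

    header, *data, footer = rows
    if len(footer) != 2 or footer[0] != footer_tag:
        return False  # last row is not a footer, so the file was probably cut off

    try:
        declared = int(footer[1])
    except ValueError:
        return False  # footer present but its count is missing or garbled

    return declared == len(data)


complete = "id,name\n1,Ann\n2,Bob\nFOOTER,2\n"
truncated = "id,name\n1,Ann\n2,B"
print(is_complete(complete))   # True
print(is_complete(truncated))  # False
```

A file that is still being uploaded, or whose upload failed partway, will normally lack the footer or carry a count that does not match, so the polling script can skip it and retry later.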
You could cross-check whether the file size of the uploaded file matches the file size of the original file.
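As a minimal sketch of that comparison: the answer does not say how the server learns the original size, so `expected_size` below is an assumption (for example, a byte count that the uploading script or form handler reports separately). The function name `size_matches` is hypothetical.

```python
import os


def size_matches(path, expected_size):
    """Return True if the file on disk has exactly the expected number of bytes.

    `expected_size` is assumed to come from the uploader by some other
    channel; this function only performs the comparison.
    """
    return os.path.getsize(path) == expected_size
```

The processing script would skip any file for which this returns False and check it again on the next poll.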