How about just copying the first 1 GB of data from the source into a new file, then searching backward for the last newline and truncating the new file at that point? Then you know how large the first file is, and you repeat the process for a second new file, from that point to 1 GB later. Seems straightforward to me in just about any language (you mentioned C#, which I haven't used recently, but it can certainly do the job easily).
You didn't make it clear whether you need to copy the header line (if any) into each of the resulting files. Again, that should be straightforward: just write the header out before copying data into each file.
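A minimal C# sketch of that idea, since the question mentions C#. Everything specific here is an assumption rather than something from the question: the file names, the 1 GB limit, the part-naming scheme, and one-record-per-line UTF-8 data. Rather than copying a full gigabyte and truncating backward to the last newline, it stops each part at the last complete line that fits, which comes to the same thing, and it writes the header line into every part:

```csharp
using System;
using System.IO;
using System.Text;

class CsvSplitter
{
    // ~1 GB per part; the exact limit is an assumption, not from the question.
    const long MaxPartBytes = 1L * 1024 * 1024 * 1024;

    static void Main()
    {
        const string source = "big.csv"; // hypothetical input file name

        using var reader = new StreamReader(source, Encoding.UTF8);

        string header = reader.ReadLine(); // repeated at the top of every part
        if (header == null) return;        // empty file, nothing to split

        StreamWriter writer = null;
        long written = 0;
        int part = 0;
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            // Start a new part when the next line would push us past the limit.
            // line.Length counts chars, which equals bytes only for ASCII data;
            // for multi-byte UTF-8, use Encoding.UTF8.GetByteCount(line) instead.
            if (writer == null || written + line.Length + 1 > MaxPartBytes)
            {
                writer?.Dispose();
                writer = new StreamWriter($"part{++part}.csv", append: false, Encoding.UTF8);
                writer.WriteLine(header);
                written = header.Length + 1;
            }

            writer.WriteLine(line);
            written += line.Length + 1; // +1 approximates the newline bytes
        }

        writer?.Dispose();
    }
}
```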
You could also take the approach of just generically splitting the files using tar on Unix or some Zip-like utility on Windows, then telling your large-file-challenged partner to reconstruct the file from that format. Or maybe simply compressing the CSV file would work and get you under the limit in practice.
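For the tar route, one common way to do the round trip on Unix, sketched with assumed file names and an assumed 1000 MB piece size (piping through gzip compresses as it splits):

```bash
# Sender: compress and cut into ~1000 MB pieces named big.tar.gz.aa, big.tar.gz.ab, ...
tar czf - big.csv | split -b 1000m - big.tar.gz.

# Receiver: stitch the pieces back together and unpack
cat big.tar.gz.* | tar xzf -
```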
There are just a few things you need to take care of:
In a bash/terminal prompt, write:
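For instance, something like the following to get the line count (the file name is just a placeholder):

```bash
wc -l bigfile.csv
```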
.. then
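..run split with the computed lines-per-part. For example, say the file is 5.5 GB with 100,000,000 lines: X = 5.5 / 1.1 = 5 parts, so 20,000,000 lines per part (the numbers and the part_ prefix are placeholders):

```bash
split -l 20000000 bigfile.csv part_
```

This writes part_aa, part_ab, and so on, each at most 20,000,000 lines.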
.. simply count the number of lines in the file, divide it by X, feed that number to split, and you have X files each under 1.1 GB (where X = filesize / 1.1, with the size in GB).