Help reading line breaks in a text file
I have a TXT file that I need to import via an application, but for some reason I have to open it in WordPad and save it before the import works. I'm guessing it has to do with line breaks, because if I open it in Notepad first there are no line breaks, but if I open it in WordPad the lines are separated.
Does anyone know why this occurs and how I can avoid having to manually open the file and save it with WordPad?
The app is written in VB6 (Yikes!).
Thanks for any help
Comments (3)
This is a line ending problem. Your code (and Notepad) want to see Carriage Return (CR)/Line Feed (LF) pairs, and this is probably a CR-only (classic Macintosh) or LF-only (Unix) file. WordPad is more forgiving, and upon save it is apparently (I haven't tested it) writing CR/LF pairs for you.
You can change the code in your application to accept any of the endings: just stop looking for vbCrLf as a pair and treat either character as the end of a line. My own strategy is to scan for a CR or LF and then consume all CR/LF characters that follow; this clears blank lines as well.
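The scan-and-consume strategy described above could be sketched in VB6 roughly as follows. This is only an illustration: SplitAnyLineEnding is a hypothetical helper name, and it assumes the whole file has already been read into a String.

```vb
Option Explicit

' Hypothetical helper: splits Text into lines, treating any run of
' CR and/or LF characters as one line break. Blank lines are dropped
' as a side effect, as the answer notes.
Public Function SplitAnyLineEnding(ByVal Text As String) As Collection
    Dim Result As New Collection
    Dim Pos As Long
    Dim StartPos As Long
    Dim Ch As String

    StartPos = 1
    Pos = 1
    Do While Pos <= Len(Text)
        Ch = Mid$(Text, Pos, 1)
        If Ch = vbCr Or Ch = vbLf Then
            ' End of a line: keep it only if it is not empty
            If Pos > StartPos Then Result.Add Mid$(Text, StartPos, Pos - StartPos)
            ' Consume every CR/LF character that follows
            Do While Pos <= Len(Text)
                Ch = Mid$(Text, Pos, 1)
                If Ch <> vbCr And Ch <> vbLf Then Exit Do
                Pos = Pos + 1
            Loop
            StartPos = Pos
        Else
            Pos = Pos + 1
        End If
    Loop

    ' Final line when the file does not end with a line break
    If StartPos <= Len(Text) Then Result.Add Mid$(Text, StartPos)

    Set SplitAnyLineEnding = Result
End Function
```

Because runs of line-ending characters are collapsed, this approach is not suitable if empty lines in the file carry meaning.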
The file probably has only a Carriage Return (CR) or a Line Feed (LF) character at the end of each line.
Windows text files conventionally end each line with both a CR and an LF character. When writing a file from VB6, this is easily done with the constant vbCrLf.
On the flip side, if you are the one reading the file, you can determine which character is missing and add it as you read the file (for example, by using the Replace function to convert CR into CRLF or LF into CRLF).
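A minimal sketch of the Replace approach, assuming the file contents are already in a String. NormalizeToCrLf is a hypothetical name, and the order of the three calls matters so that pairs already present are not doubled.

```vb
Option Explicit

' Hypothetical helper: returns Text with every line ending
' (CR/LF, lone CR, or lone LF) converted to a CR/LF pair.
Public Function NormalizeToCrLf(ByVal Text As String) As String
    ' Collapse existing CR/LF pairs first so they are not doubled
    Text = Replace(Text, vbCrLf, vbLf)
    ' Turn any remaining lone CRs into LFs
    Text = Replace(Text, vbCr, vbLf)
    ' Expand every LF into a CR/LF pair
    NormalizeToCrLf = Replace(Text, vbLf, vbCrLf)
End Function
```

The result can then be passed to Split(NormalizeToCrLf(Buffer), vbCrLf), where Buffer is whatever variable holds the file contents, to get an array of lines.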
Unless these files are very large and performance is critical, reading them by line can be accomplished easily via the ADODB.Stream object.
Not only will this handle several line delimiters (Stream.LineSeparator = adCR, adCRLF, or adLF), it can also be used to process files containing Unicode (UTF-16), UTF-8, system codepage ANSI, and alternative "ANSI" encodings for other locales.
For example, if you have a text file that contains "ANSI" text from a Russian language locale, you can set Stream.Charset = "koi8-r" and read the data with proper translation into VB6 Unicode (UTF-16).
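Something along these lines should work. It is a sketch rather than tested code: it assumes a project reference to the Microsoft ActiveX Data Objects library (2.5 or later), it assumes the file is LF-delimited, and ReadLinesWithStream is a hypothetical name. The stream uses the one separator it is given and does not detect it, so pick the value that matches the file.

```vb
Option Explicit

' Hypothetical example: reads a KOI8-R encoded, LF-delimited text file
' line by line and prints each line to the Immediate window.
Public Sub ReadLinesWithStream(ByVal Path As String)
    Dim Stm As ADODB.Stream
    Dim TextLine As String

    Set Stm = New ADODB.Stream
    With Stm
        .Type = adTypeText          ' text mode, so Charset applies
        .Charset = "koi8-r"         ' encoding of the file on disk
        .LineSeparator = adLF       ' assumption: Unix-style line endings
        .Open
        .LoadFromFile Path
        Do Until .EOS
            ' adReadLine reads up to the next LineSeparator
            TextLine = .ReadText(adReadLine)
            Debug.Print TextLine
        Loop
        .Close
    End With
End Sub
```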
Charset defaults to the value "unicode" (UTF-16), but to read or write the Stream in ANSI with the default codepage you can set it to "ascii" instead.
HKCR\MIME\Database\Charset contains the available values.