Parsing lines of a text file where the values are separated by a varying number of whitespace characters
I need to get the company name and its ticker symbol in different arrays. Here is my data which is stored in a txt file:
3M Company MMM
99 Cents Only Stores NDN
AO Smith Corporation AOS
Aaron's, Inc. AAN
and so on
How would I do this using regex or some other techniques?
Comments (4)
Iterate over each line, and collect the data with a regular expression:
The backreference $1 will contain the company name, and $2 will contain the ticker symbol.
You can also split the string in two on a delimiter of two or three spaces and trim the two resulting strings. This only works if you are sure the company name and ticker symbol are always separated by enough spaces, and the company name itself never contains that many consecutive spaces.
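The regular expression from this answer did not survive in the text above, so the following is only a minimal sketch of the idea, not the answerer's original pattern. It assumes the ticker is a run of uppercase letters at the end of the line; the sample array and the variable names are made up for illustration.

```php
<?php
// Hypothetical sample lines; in practice they could come from file() on the txt file.
$lines = [
    "3M Company              MMM",
    "99 Cents Only Stores    NDN",
    "AO Smith Corporation    AOS",
    "Aaron's, Inc.           AAN",
];

$names = [];
$tickers = [];

foreach ($lines as $line) {
    // Group 1: company name (lazy match).
    // Group 2: trailing run of uppercase letters (assumed ticker shape).
    if (preg_match('/^(.+?)\s+([A-Z]+)\s*$/', $line, $m)) {
        $names[] = $m[1];
        $tickers[] = $m[2];
    }
}

// Alternative from the same answer: split on a run of two or more spaces, then trim.
// Assumes the name itself never contains two consecutive spaces.
$splitParts = array_map('trim', preg_split('/\s{2,}/', trim($lines[0]), 2));

var_export($names);
var_export($tickers);
```

Lines that do not match the pattern are skipped silently here; depending on the data, it may be better to log them.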
Is the format of the text file imposed on you? If you have the choice, I'd suggest you don't use spaces to separate the fields in the text file. Instead, use | or $$ or something you can be assured won't appear in the content, then just split it to an array.
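As a rough illustration of that suggestion, assuming the file could be rewritten with a pipe between the two fields (the sample lines below are hypothetical), the split becomes a plain explode():

```php
<?php
// Hypothetical pipe-delimited version of the same data.
$lines = [
    "3M Company|MMM",
    "99 Cents Only Stores|NDN",
    "Aaron's, Inc.|AAN",
];

$names = [];
$tickers = [];

foreach ($lines as $line) {
    // Limit of 2 keeps any later pipe characters inside the second field.
    $fields = explode('|', $line, 2);
    if (count($fields) === 2) {
        $names[] = trim($fields[0]);
        $tickers[] = trim($fields[1]);
    }
}

var_export($names);
var_export($tickers);
```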
Try this regular expression:
Perhaps someone with more PHP experience could flesh out a code example using preg_split or something similar.
With variable whitespace as the delimiter between your two columns of text, there are several ways to do this.
You could process the text file line by line with file() and use preg_split() to separate the text on variable spaces that are followed by a sequence of uppercase letters and then the end of the string, or you could use file_get_contents() with preg_match_all() and then extract the two captured columns with array_column(). While the latter may be a little faster since it makes only one preg_ function call, the decision is likely to come down to the developer's coding tastes and the complexity of the input text.
Code:
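The original snippet is missing from this copy of the answer, so what follows is a sketch of the file() plus preg_split() approach as described, not the answerer's exact code. The temporary file stands in for the asker's txt file, and the pattern assumes tickers consist only of uppercase letters.

```php
<?php
// A temp file stands in for the real txt file (hypothetical sample content).
$path = tempnam(sys_get_temp_dir(), 'tickers');
file_put_contents(
    $path,
    "3M Company              MMM\n" .
    "99 Cents Only Stores    NDN\n" .
    "AO Smith Corporation    AOS\n" .
    "Aaron's, Inc.           AAN\n"
);

$names = [];
$tickers = [];

foreach (file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
    // Split only on the whitespace run that is followed by uppercase letters
    // and then the end of the string; spaces inside the name are left alone.
    $parts = preg_split('/\s+(?=[A-Z]+$)/', rtrim($line));
    if (count($parts) === 2) {
        $names[] = $parts[0];
        $tickers[] = $parts[1];
    }
}

unlink($path);

var_export($names);
var_export($tickers);
```

Because the lookahead anchors the split to the end of the line, this also works when only a single space separates the name from the ticker.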
Or:
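The second snippet is also missing here. A possible reconstruction of the file_get_contents(), preg_match_all() and array_column() variant is sketched below; the inline string stands in for the file contents, and the pattern assumes LF line endings and uppercase-only tickers.

```php
<?php
// In practice: $text = file_get_contents('companies.txt'); (hypothetical file name)
$text = "3M Company              MMM\n"
      . "99 Cents Only Stores    NDN\n"
      . "AO Smith Corporation    AOS\n"
      . "Aaron's, Inc.           AAN\n";

// The m flag makes ^ and $ match at each line; \h matches horizontal whitespace only,
// so a single match can never run across a line break.
preg_match_all('/^(.+?)\h+([A-Z]+)\h*$/m', $text, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER each match is [full line, name, ticker],
// so the capture groups can be pulled out as columns.
$names = array_column($matches, 1);
$tickers = array_column($matches, 2);

var_export($names);
var_export($tickers);
```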