Using jQuery to split a complex punctuated string into regular words greater than 2 characters
As the title suggests:
I'm trying to split sentences into either a comma-separated string or array consisting of sanitized words greater than 2 characters in length and unique (duplicates removed).
An example string might be:
$sString = "Stackoverflow's users are awesome!!! Stackoverflow, is the \"best\" technical questions and answers website on the interwebnet!";
Finished article:
$sStringAfterProcessing = 'stackoverflow, users, are, awesome, the, best, technical, questions, and, answers, website, interwebnet';
Note that the first "Stackoverflow" has its 's removed, and that punctuation and duplicates are removed.
This seems like it could get very complicated.
Suggestions welcome and all help is much appreciated.
Here goes...
will yield:
Example: http://jsfiddle.net/ktFj2/1/
Or, in array format:
Example: http://jsfiddle.net/nnKV8/
Update: To remove duplicates from the array (and items with length < 2), something like this:
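A minimal sketch of that clean-up step, offered as an illustration rather than the answerer's original code. The helper name `uniqueWords` and its `minLength` parameter are hypothetical, and it assumes plain ASCII words; it uses the question's "greater than 2 characters" rule and needs no jQuery.

```javascript
// Hypothetical helper (not from the original answer): returns the unique,
// lower-cased words of `sentence` that are longer than `minLength` characters.
function uniqueWords(sentence, minLength) {
  var seen = {};    // words already emitted
  var result = [];  // unique words in order of first appearance
  var words = sentence
    .toLowerCase()
    .replace(/'s\b/g, '')          // drop the possessive 's
    .replace(/[^a-z0-9\s]/g, ' ')  // turn remaining punctuation into spaces
    .split(/\s+/);                 // split on runs of whitespace

  for (var i = 0; i < words.length; i++) {
    var word = words[i];
    // Skip short (or empty) tokens and anything already seen.
    if (word.length > minLength && !seen.hasOwnProperty(word)) {
      seen[word] = true;
      result.push(word);
    }
  }
  return result;
}

var sString = "Stackoverflow's users are awesome!!! Stackoverflow, is the \"best\" technical questions and answers website on the interwebnet!";
var words = uniqueWords(sString, 2);           // array format
var sStringAfterProcessing = words.join(', '); // comma-separated string
```

With the question's sample sentence, `sStringAfterProcessing` comes out as the "finished article" string shown in the question. Tracking seen words in an object keeps the first occurrence of each word and preserves the original order.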
Here is a basic way (edited):
Put any other punctuation in the regex, or use \W for non-alphanumeric characters.
This will yield an array of actual words in str.
Then, iterate through newStrings and print only the elements whose length >= 2!
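The steps just described can be sketched roughly as follows. This is an assumed reconstruction, not the answerer's exact code: `str` and `newStrings` follow the names used above, while `longEnough` and the sample sentence are made up for illustration.

```javascript
var str = "Stackoverflow's users are awesome!!! Stackoverflow, is the \"best\"!";

// Split on runs of non-word characters. \W matches anything that is not
// a letter, a digit, or an underscore.
var newStrings = str.split(/\W+/);

// Keep only the elements whose length is at least 2. This also drops the
// stray "s" left over from "Stackoverflow's" and the empty string produced
// by the trailing punctuation.
var longEnough = [];
for (var i = 0; i < newStrings.length; i++) {
  if (newStrings[i].length >= 2) {
    longEnough.push(newStrings[i]);
  }
}

console.log(longEnough.join(', '));
// prints "Stackoverflow, users, are, awesome, Stackoverflow, is, the, best"
```

Note that this version does not lower-case the words or remove duplicates, so "Stackoverflow" still appears twice; those steps would have to be added separately.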