How do I convert text into a grouped list of words with counts in Kotlin?

Posted on 2025-02-01 22:02:47, 900 characters, 4 views, 0 comments

I have a String like this:

val text: String = "aa bb cc aa bb aa aa / <" 

I first want to strip special characters such as <*/&^$, then group the words into a list of Word objects like this:

data class Word(val id: Int, val text: String, val count: Int)

listOf(Word(1, "aa", 4), Word(2, "bb", 2), Word(3, "cc", 1))

This is my approach, but it requires 3 loops, which is inefficient, plus a lot of boilerplate code:

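 // `regex` is not defined in the original post; assumed here to match the special characters to strip.
 val regex = Regex("[^A-Za-z0-9]")

 val wordWithCountMap = mutableMapOf<String, Int>()
 text.trim().split(" ").forEach { word ->
     val key = regex.replace(word, "")
     // Skip tokens that are empty once the special characters are removed (e.g. "/" and "<").
     if (key.isNotBlank()) {
         // Look up the cleaned key, not the raw word, so the count matches the stored entry.
         wordWithCountMap[key] = wordWithCountMap[key]?.plus(1) ?: 1
     }
 }

 val wordList = arrayListOf<Word>()
 wordWithCountMap.onEachIndexed { index, entry ->
     // index is 0-based; the desired ids start at 1.
     wordList.add(Word(id = index + 1, text = entry.key, count = entry.value))
 }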

Comments (1)

深居我梦 2025-02-08 22:02:47
val text: String = "aa bb cc aa bb aa aa / <"

data class Word(
  val id: Int,
  val text: String,
  val count: Int
)

val result = text
  .split("\\b".toRegex())
  .filter { it.any { char -> char.isLetterOrDigit() } }
  .groupingBy { it }
  .eachCount()
  .entries
  .sortedByDescending { it.value }   // maybe remove this line (see @mattFreake's comment below)
  .mapIndexed { index, textCount -> Word(index + 1, textCount.key, textCount.value) }

result.forEach(::println)
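
A possible variation on the same idea, offered as a sketch and not as part of the answer above: instead of splitting on word boundaries and filtering out the non-word pieces, match the words directly with `Regex.findAll`. The helper name `countWords` and the pattern are assumptions made for illustration. The pattern treats a "word" as a run of letters or digits, which mirrors the `isLetterOrDigit()` filter. Ids follow first-seen order, which relies on the map returned by `eachCount()` iterating in insertion order, as it does on the JVM.

```kotlin
data class Word(val id: Int, val text: String, val count: Int)

// Hypothetical helper, not from the original post.
fun countWords(text: String): List<Word> =
    Regex("[\\p{L}\\p{N}]+")      // assumption: a word is one or more letters/digits
        .findAll(text)            // lazily yields one match per word
        .map { it.value }         // keep only the matched text
        .groupingBy { it }        // group identical words
        .eachCount()              // word -> number of occurrences
        .entries
        .mapIndexed { index, (word, count) -> Word(index + 1, word, count) }
```

For the sample string in the question, `countWords(text)` should produce the three `Word` entries the question asks for. If the list should be ordered by frequency instead, a `sortedByDescending { it.value }` step can be placed before `mapIndexed`, as in the answer above.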