Generating word counts for all words in a *.txt file in R
I generated 10,000 random words of Lorem Ipsum and saved them as a .txt file. Then I wrote the following code:
R Code:
art <- read.delim(file.choose()) # selecting the txt file from local machine
art_u <- unlist(art) # unlisting the words from a single string
art_split <- strsplit(art_u, split = " ", fixed = TRUE) # splitting the words
art_sep <- c() # creating an empty vector to store the split words
for (i in art_split){art_sep=c(art_sep, i)} # storing the words into the vector
art_fac <- factor(art_sep) # factorizing the words from the vector
art_sum <- summary(art_fac) # getting result with counts
art_wc_df <- as.data.frame(art_sum) # turning the result into a dataframe
In the resulting data frame, after 99 observations/rows, the 100th observation/row appears as "(Other)" with a large count. I tried this in both RStudio and RGui and got the same result. I can't figure out what went wrong. Is there a way to fix it, or is my code wrong?
NB: Tried on RStudio 2021.09.1 Build 372, RGui x64 4.1.2
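A likely cause, sketched here under stated assumptions: `summary()` on a factor has a `maxsum` argument that defaults to 100, so when there are more than 100 distinct levels it reports the 99 most frequent ones and lumps the rest into a single `(Other)` entry. The sample vector `words` below is hypothetical and stands in for `art_sep`. Either raising `maxsum` or counting with `table()` should give one entry per word.

```r
# Hypothetical sample data: 150 distinct "words", each appearing twice,
# standing in for the vector of words split out of the text file.
words <- rep(paste0("word", 1:150), times = 2)
art_fac <- factor(words)

# Default behaviour: summary() on a factor shows at most 100 entries,
# the last of which is "(Other)" holding the combined remaining counts.
default_sum <- summary(art_fac)

# Option 1: raise the cap so that every level gets its own entry.
full_sum <- summary(art_fac, maxsum = nlevels(art_fac))

# Option 2: count with table(), which has no such cap, and convert
# the result to a data frame with one row per distinct word.
art_wc_df <- as.data.frame(table(words), stringsAsFactors = FALSE)
names(art_wc_df) <- c("word", "count")
```

With the real data, replacing `words` with `art_sep` should give the full table, and `art_wc_df[order(-art_wc_df$count), ]` would sort it by frequency.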