My problem is that I don't understand how to move to an SVM. I currently get about 20% errors with KNN, so I want to improve that figure. I work on HTML files that I load into a VCorpus, clean, and convert to a DTM; I then find the most frequent words and use about 1,000 files to predict the correct class for one file (I have 7 classes). Code below:
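library(tm)
# Load the training corpus and clean it; nettoyage() is the poster's own cleaning function, not shown.
corpusEntrainement <- VCorpus(DirSource("training", recursive = TRUE))
corpusCleanEntrainement <- nettoyage(corpusEntrainement)
# Assumption: corpusMatrice, undefined in the original post, is the full DTM of the cleaned corpus.
corpusMatrice <- DocumentTermMatrix(corpusCleanEntrainement)
# Keep terms occurring between 400 and 1200 times, then rebuild the DTM on that dictionary.
motsFrequentsEntrainement <- findFreqTerms(corpusMatrice, lowfreq = 400, highfreq = 1200)
corpusDocReduitEntrainement <- DocumentTermMatrix(corpusCleanEntrainement, list(dictionary = motsFrequentsEntrainement))
dataReduitEntrainement <- as.matrix(corpusDocReduitEntrainement[, motsFrequentsEntrainement])
# Class labels: 7 classes of 150 documents each, assuming the documents are ordered by class.
classesEntrainement <- rep(1:7, each = 150)
matriceFinaleEntrainement <- cbind(dataReduitEntrainement, "classes" = classesEntrainement)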
So this is how I clean my corpus and build the final matrix. How can I go from here to an SVM? I think the other parts of the code will be simple; I just want to feed the documents to an SVM.
Thanks!
Comments (1)
I'm assuming that you're looking for how to train an SVM model (it's not very clear from the question).
Note that you may need to convert the class column to a factor first, so that the labels are treated as categories rather than as numbers.
See for instance this tutorial for details.
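A minimal sketch of what training could look like, assuming the e1071 package (the question does not name an SVM library) and a small synthetic count matrix standing in for matriceFinaleEntrainement. The names matriceFinale, predicteurs, etiquettes and modele are illustrative, and the linear kernel and the 20% hold-out are arbitrary choices, not recommendations:

```r
library(e1071)  # assumption: the e1071 package is installed

set.seed(42)

# Hypothetical stand-in for matriceFinaleEntrainement: term counts plus a "classes" column.
nParClasse <- 30
nTermes <- 20
classes <- rep(1:7, each = nParClasse)
x <- matrix(rpois(length(classes) * nTermes, lambda = 1), ncol = nTermes)
# Give each class higher counts on two of its own terms so the toy data is learnable.
for (k in 1:7) {
  lignes <- classes == k
  cols <- c(2 * k - 1, 2 * k)
  x[lignes, cols] <- x[lignes, cols] + rpois(sum(lignes) * 2, lambda = 8)
}
colnames(x) <- paste0("terme", seq_len(nTermes))
matriceFinale <- cbind(x, "classes" = classes)

# Separate predictors from labels; passing a factor makes svm() fit a classifier.
predicteurs <- matriceFinale[, colnames(matriceFinale) != "classes"]
etiquettes <- factor(matriceFinale[, "classes"])

# Hold out 20% of the documents to estimate the error rate.
idxTest <- sample(nrow(predicteurs), size = round(0.2 * nrow(predicteurs)))
modele <- svm(x = predicteurs[-idxTest, ], y = etiquettes[-idxTest],
              kernel = "linear", scale = FALSE)

# Predict the held-out documents and compare with the true classes.
predictions <- predict(modele, predicteurs[idxTest, ])
tauxErreur <- mean(predictions != etiquettes[idxTest])
print(table(predit = predictions, reel = etiquettes[idxTest]))
print(tauxErreur)
```

With the real data, predicteurs would be dataReduitEntrainement and etiquettes would be factor(classesEntrainement); new documents need a DTM built with the same dictionary so that the columns match those used for training.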