FCM clustering numeric data and csv/excel file

Posted on 2024-12-09 01:42:43

Hi, I asked a previous question (Fuzzy c-means tcp dump clustering in matlab) that got a reasonable answer, and I thought I was back on track. The problem now is the preprocessing stage for the TCP/UDP data below, which I would like to run through MATLAB's fcm clustering algorithm. My question:

1) What is the best way to convert the text data in the cells to numeric values, and what should those numeric values be?

Edit: My data in Excel now looks like this:

0,tcp,http,SF,239,486,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00,19,19,1.00,0.00,0.05,0.00,0.00,0.00,0.00,0.00,normal.

Comments (1)

挽梦忆笙歌 2024-12-16 01:42:43

Here is an example of how I would read the data into MATLAB. You need two things: the data itself, which is in comma-separated format, and the list of features along with their types (numeric or nominal).

%# read the list of features
fid = fopen('kddcup.names','rt');
C = textscan(fid, '%s %s', 'Delimiter',':', 'HeaderLines',1);
fclose(fid);

%# determine type of features
C{2} = regexprep(C{2}, '\.$','');              %# remove "." at the end
attribNom = [ismember(C{2},'symbolic');true];  %# nominal features

%# build format string used to read/parse the actual data
frmt = cell(1,numel(C{1}));
frmt( ismember(C{2},'continuous') ) = {'%f'};  %# numeric features: read as number
frmt( ismember(C{2},'symbolic') ) = {'%s'};    %# nominal features: read as string
frmt = [frmt{:}];
frmt = [frmt '%s'];                            %# add the class attribute

%# read dataset
fid = fopen('kddcup.data','rt');
C = textscan(fid, frmt, 'Delimiter',',');
fclose(fid);

%# convert nominal attributes to numeric
ind = find(attribNom);
G = cell(numel(ind),1);
for i=1:numel(ind)
    [C{ind(i)},G{i}] = grp2idx( C{ind(i)} );
end

%# all numeric dataset
M = cell2mat(C);

You could also look into the DATASET class from the Statistics Toolbox.
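
To make the question of "what should the numeric value be" more concrete, here is a minimal sketch on a tiny made-up sample (the rows, the variable names, and the choice of two clusters are assumptions for illustration only, not part of the KDD data or of the answer's code). It shows the integer codes that grp2idx produces, an optional 0/1 encoding via dummyvar that avoids implying an order between categories, and a call to fcm. It assumes the Statistics Toolbox (grp2idx, dummyvar) and the Fuzzy Logic Toolbox (fcm) are installed.

```matlab
%# toy sample, made up for illustration
proto = {'tcp'; 'udp'; 'tcp'; 'icmp'; 'udp'; 'tcp'};   %# nominal feature
bytes = [239; 105; 486; 1032; 146; 235];               %# numeric feature

%# for a cell array of strings, grp2idx numbers the groups 1,2,3,...
%# in order of first appearance and returns the group names
[protoIdx, protoNames] = grp2idx(proto);

%# optional: one 0/1 column per category, so that no artificial
%# ordering (tcp < udp < icmp) leaks into the distance computation
protoDummy = dummyvar(protoIdx);

%# scale the numeric column to [0,1] so it does not dominate distances
bytesScaled = (bytes - min(bytes)) ./ (max(bytes) - min(bytes));

%# all-numeric feature matrix: one row per observation
X = [protoDummy bytesScaled];

%# fuzzy c-means with 2 clusters (number of clusters is an assumption)
[centers, U] = fcm(X, 2);

%# U is (clusters x observations); take the cluster with the highest
%# membership as a hard label for each observation
[~, label] = max(U, [], 1);
```

With the matrix M built above, the last column holds the encoded class label, so you would normally leave it out of the input to fcm and keep it only for evaluating the clusters afterwards.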
