MATLAB clustering and data format

Posted 2024-12-09 09:20:04

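Following on from my previous question, "FCM Clustering numeric data and csv/excel file", I am now trying to work out how to take the output and create a usable .dat file for clustering in MATLAB.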

%# read the list of features
fid = fopen('kddcup.names','rt');
C = textscan(fid, '%s %s', 'Delimiter',':', 'HeaderLines',1);
fclose(fid);

%# determine type of features
C{2} = regexprep(C{2}, '\.$','');             %# remove "." at the end (dot escaped so only a literal period matches)
attribNom = [ismember(C{2},'symbolic');true]; %# nominal features

%# build format string used to read/parse the actual data
frmt = cell(1,numel(C{1}));
frmt( ismember(C{2},'continuous') ) = {'%f'}; %# numeric features: read as number
frmt( ismember(C{2},'symbolic') ) = {'%s'};   %# nominal features: read as string
frmt = [frmt{:}];
frmt = [frmt '%s'];                           %# add the class attribute

%# read dataset
fid = fopen('kddcup.data','rt');
C = textscan(fid, frmt, 'Delimiter',',');
fclose(fid);

%# convert nominal attributes to numeric
ind = find(attribNom);
G = cell(numel(ind),1);
for i=1:numel(ind)
    [C{ind(i)},G{i}] = grp2idx( C{ind(i)} );
end

%# all numeric dataset
M = cell2mat(C);

I have several types of data, which look like this:

[image not preserved]

I tried the following to create a .dat file, but got errors:

>> a = load('matlab.mat');
>> save 'matlab.dat' a -ascii
Warning: Attempt to write an unsupported data type
to an ASCII file.
    Variable 'a' not written to file. 
>> a = load('data.mat');
>> save 'matlab.dat' a -ascii
Warning: Attempt to write an unsupported data type
to an ASCII file.
    Variable 'a' not written to file. 
>> save 'matlab.dat' a 
>> findcluster('matlab.dat')
??? Error using ==> load
Number of columns on line 1 of ASCII file
C:\Users\Garrith\Documents\MATLAB\matlab.dat
must be the same as previous lines.

Error in ==> findcluster>localloadfile at 471
       load(filename);

Error in ==> findcluster at 160
       localloadfile(filename, param);

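MATLAB's clustering tool works on multi-dimensional data sets but only displays two dimensions at a time. You then pick the dimensions to compare on the x and y axes, but I am not sure whether I can produce a 2-D clustering analysis from my current data.

What I need to do is normalize the output of the m-file from my previous post, "FCM Clustering numeric data and csv/excel file".

To normalize the data, I need:

  1. The minimum and maximum of the data set

  2. The minimum and maximum of the normalized scale

  3. A number in the data set

  4. Its normalized value

So the first question is: how do I find the minimum and maximum numbers in my data set (the matrix M)?

Step 1: Find the largest and smallest values in the data set and represent them with the variables capital A and capital B:

Let's say the minimum is A = 92000
and the maximum is B = 64525000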
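
For Step 1, here is a minimal sketch in base MATLAB. It assumes the data set is already an all-numeric matrix such as M from the code above. The 2-by-2 matrix is a made-up stand-in that happens to contain the example values, and colMin and colMax are illustrative names for a per-column variant, which may suit clustering better when the features use different units.

```matlab
% Made-up stand-in for the all-numeric matrix M (an assumption, not real data)
M = [92000 2214000; 64525000 500000];

% Global extremes: M(:) flattens the matrix into a single column first
A = min(M(:));
B = max(M(:));

% Per-column extremes: one value per feature (column)
colMin = min(M, [], 1);
colMax = max(M, [], 1);
```

Note that min(M) on its own returns one minimum per column, not the overall minimum, which is why the sketch flattens with M(:) for the global values.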

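Step 2: Choose the smallest and largest values of the normalized scale and assign them to the lowercase variables a and b. I am unsure how to do this in MATLAB (I am not sure how to normalize the data to begin with).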

set the minimum = a = 1
set the maximum = b = 10

Step 3: Calculate the normalized value of any number x using the following equation:

A = 92000
B = 64525000
a = 1
b = 10
x = 2214000

a + (x - A)*(b - a)/(B - A)
1 + (2214000 - 92000)*(10 - 1)/(64525000 - 92000)
≈ 1.30
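
The three steps can be sketched in base MATLAB as follows, using A, B, a, b and x exactly as stated in the question. The matrix M is again a made-up stand-in, and Mn, Mc, colMin and colRange are illustrative names. bsxfun is used so that the per-column version should also run on older releases; this is one possible approach, not the only way to rescale.

```matlab
% Values as stated in the question
A = 92000;  B = 64525000;   % data minimum and maximum
a = 1;      b = 10;         % target scale minimum and maximum
x = 2214000;

% Step 3 for a single number
xn = a + (x - A) * (b - a) / (B - A);

% Same formula applied to a whole matrix with global min and max
M  = [92000 2214000; 64525000 500000];   % made-up stand-in data
Mn = a + (M - min(M(:))) * (b - a) / (max(M(:)) - min(M(:)));

% Per-column variant: each feature is scaled to [a, b] independently
colMin   = min(M, [], 1);
colRange = max(M, [], 1) - colMin;
colRange(colRange == 0) = 1;             % guard against constant columns
Mc = a + bsxfun(@rdivide, bsxfun(@minus, M, colMin), colRange) * (b - a);
```

With global scaling, one feature with very large values can compress all the others toward a; per-column scaling avoids that, at the cost of discarding the relative magnitudes between features.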



Comments (1)

过气美图社 2024-12-16 09:20:04

Look at the errors in the middle of your question. a = load(matfile) returns a structure, which save -ascii cannot write: the ASCII format only supports numeric (and character) arrays. Try reading the documentation.
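
A minimal sketch of one way around this, assuming the MAT-file holds a single numeric matrix. The file data.mat is created here only so the example runs on its own, and S, names, X and Y are illustrative names. The idea is to pull the matrix out of the structure returned by load and save that matrix, not the structure, with -ascii.

```matlab
% Stand-in MAT-file so the sketch is self-contained (an assumption)
M = [1 2 3; 4 5 6];
save('data.mat', 'M');

% load with an output argument returns a struct: the matrix is S.M
S = load('data.mat');
names = fieldnames(S);
X = S.(names{1});                  % first (here: only) variable in the file

% -ascii can only write numeric or character arrays
if ~isnumeric(X)
    error('Expected a numeric matrix, got %s.', class(X));
end

% Write the matrix itself as text; every row has the same number of columns
save('matlab.dat', 'X', '-ascii');

% Reading it back gives a plain numeric matrix
Y = load('matlab.dat');
```

In the session from the question, the final save 'matlab.dat' a call has no -ascii flag, so it writes a binary MAT-file under a .dat name. load then tries to parse that file as text, which is probably what triggers the column-count error in findcluster.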
