Have a look at https://archive.ics.uci.edu/ml/datasets.php
and see if you find a topic that you like.
If you have experience with C++ and C, learning Matlab will be easier.
As for your topic, I suggest browsing the link above for something you like that can be applied to NN. Also search ACM, IEEE, or other repositories for papers about NN, and see whether you can find studies or reports on the topic you have in mind.
Good luck.
There is a really good Google Tech Talk about neural networks.
youtube.com/watch?v=AyzOUbkUf3M
If you're serious about using a neural network for your culminating project, it's well worth the hour.
As for text vs. music: neural networks are great classifiers. They are fairly easy to teach with static data that has a true/false or on/off classification. It gets a little more challenging when the network needs to classify the input into several sets.
Neural networks have the most trouble with streaming data. There are some well-known techniques to get this to work, but your intuition as to which will work well is not enough. You'll need to look at what other scientists have done and duplicate their technique. Otherwise you run a big risk of creating a problem space that NNs are poorly suited to learn from.
I don't think you'll get interesting results streaming the music's waveform through a neural network. You'll need to preprocess the data into a usable format.
The last thing you'll need is LOTS of data. The more the better. You need the baked data and its classification, hundreds of thousands of examples. You will not be able to classify enough by hand to create a learning data set.
So, considering all this, text classification is much more doable than music.
Neural networks need a HUGE corpus of data. Wikipedia is huge, and has lots of meta information about each page (popularity, quality, edit counts, etc.). Google can also get you a large set of data that has a particular classification, say "happy dogs" vs "sad dogs", or just "dogs", where Google's rank is its classification.
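To make the "static data with a true/false classification" case concrete, here is a minimal sketch in Python. It turns each text into a bag-of-words vector and trains a single perceptron, which is the simplest possible stand-in for a neural network. The four training sentences and all function names are made up for illustration; a real project would need the large corpus described above.

```python
# Hypothetical toy data: 1 = "happy dogs", 0 = "sad dogs".
TRAINING = [
    ("happy dog wags tail", 1),
    ("joyful happy puppy plays", 1),
    ("sad dog whimpers alone", 0),
    ("sad lonely puppy cries", 0),
]


def tokenize(text):
    return text.lower().split()


def build_vocabulary(examples):
    # Sorted so that every run uses the same vector layout.
    return sorted({word for text, _ in examples for word in tokenize(text)})


def vectorize(text, vocab):
    # Bag of words: one count per vocabulary term, unknown words are ignored.
    words = tokenize(text)
    return [words.count(term) for term in vocab]


def score(vector, weights, bias):
    return sum(w * v for w, v in zip(weights, vector)) + bias


def train_perceptron(examples, vocab, epochs=20, rate=1.0):
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:
            vector = vectorize(text, vocab)
            predicted = 1 if score(vector, weights, bias) > 0 else 0
            error = label - predicted  # -1, 0 or +1
            if error:
                # Classic perceptron rule: nudge weights towards the right answer.
                weights = [w + rate * error * v for w, v in zip(weights, vector)]
                bias += rate * error
    return weights, bias


def classify(text, vocab, weights, bias):
    return 1 if score(vectorize(text, vocab), weights, bias) > 0 else 0


if __name__ == "__main__":
    vocab = build_vocabulary(TRAINING)
    weights, bias = train_perceptron(TRAINING, vocab)
    for phrase in ("happy puppy", "sad dog"):
        print(phrase, "->", classify(phrase, vocab, weights, bias))
```

With only four sentences the perceptron just learns which words go with which label. The point is the pipeline shape (text, then fixed-length vector, then classifier), not the accuracy.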
Sorry, I may be oversimplifying things, but I wish to disperse the mist a little bit. A simple neural network is a way to approximate a function, call it f, from (usually) R^n (the real vector space of dimension n) to R^m or the like. Suppose m=1. Instead of seeking a polynomial P(x_1,...,x_n) that approximates your function based on a set of samples (p,f(p)), you seek the parameters a_k, b_ik in something like s(a_1*s(b_11*x_1+...+b_n1*x_n)+...+a_t*s(b_1t*x_1+...+b_nt*x_n)), where s is, for example, the "sigmoid" function, so that this strange function matches your samples well.
The motivation is supposedly biological. The "training algorithm" consists of successively adjusting the values of a_k, b_ik above, via some variant of steepest descent, so that the values of the resulting function at the sample points p get closer "on average" to f(p); this, it is claimed, behaves well in some cases. NNs were surrounded by a lot of hype in the '90s. But their real objective was to approximate an unknown function based on its samples (as opposed to the hyped objective, which was to "imitate the human brain" or something like that), and many other approximation schemes were suggested for the same purpose. One example is SVM ("support vector machines"), which have a more appealing justification (often misleading as well, once you see the black magic of seeking the "right kernel" for the job in research articles).
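As a rough illustration of the formula and the steepest-descent training just described, here is a small Python sketch. It follows the formula literally (no bias terms), with b[k][i] playing the role of b_ik, the weight from input i to hidden unit k. The target function (1 if x_1 > x_2, else 0), the number of hidden units t=3, the learning rate, the seeds and the squared-error loss are all my own assumptions, chosen only to keep the example tiny.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(a, b, x):
    """Evaluate s(a_1*s(b_1 . x) + ... + a_t*s(b_t . x)).

    Returns the network output and the list of hidden-unit values.
    """
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in b]
    return sigmoid(sum(ak * hk for ak, hk in zip(a, hidden))), hidden


def mean_squared_error(a, b, samples):
    return sum((predict(a, b, x)[0] - y) ** 2 for x, y in samples) / len(samples)


def train(samples, t=3, rate=0.5, epochs=300, seed=0):
    """Per-sample gradient descent on the squared error (an assumed loss)."""
    rng = random.Random(seed)
    n = len(samples[0][0])
    a = [rng.uniform(-0.5, 0.5) for _ in range(t)]
    b = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(t)]
    for _ in range(epochs):
        for x, y in samples:
            out, hidden = predict(a, b, x)
            # Derivative of (out - y)^2 / 2 with respect to the output unit's input.
            delta_out = (out - y) * out * (1.0 - out)
            for k in range(t):
                # Back-propagate through hidden unit k before a[k] is changed.
                delta_hidden = delta_out * a[k] * hidden[k] * (1.0 - hidden[k])
                a[k] -= rate * delta_out * hidden[k]
                for i in range(n):
                    b[k][i] -= rate * delta_hidden * x[i]
    return a, b


def make_samples(count=40, seed=1):
    """Hypothetical samples (p, f(p)) with f(p) = 1 if x_1 > x_2 else 0."""
    rng = random.Random(seed)
    points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(count)]
    return [(p, 1.0 if p[0] > p[1] else 0.0) for p in points]


if __name__ == "__main__":
    samples = make_samples()
    start_a, start_b = train(samples, epochs=0)  # untrained weights, same seed
    a, b = train(samples)
    print("error before training:", mean_squared_error(start_a, start_b, samples))
    print("error after training: ", mean_squared_error(a, b, samples))
```

Real libraries add bias terms, better optimisers and batching, but the loop above is the whole idea: evaluate, measure the error at a sample, move each parameter a little downhill.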
The point, however, is that as long as you choose the right "features" for the job (i.e. find a good way to translate your music samples into points in a, say, 100-dimensional vector space), so that points of genre X lie "close" to other points of genre X, points of genre Y lie "close" to other points of genre Y, and points of genre X lie far apart from points of genre Y, you may use NN, SVM, decision trees or whatever else you like to separate the genres (the precision and efficiency may vary, though). The key is to find the right set of features, at least if we understand AI in this sense (but if this were the only sense, I think that IBM Watson would not be possible).
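To show why well-separated features make the choice of classifier almost secondary, here is a sketch of about the simplest classifier there is, nearest centroid, in Python. The two-dimensional feature values below are invented for illustration (real features would come from signal processing and would have many more dimensions), and the function names are my own.

```python
import math

# Hypothetical feature vectors, already well separated by genre.
TOY_FEATURES = {
    "X": [(0.9, 0.1), (0.8, 0.2), (1.0, 0.15)],
    "Y": [(0.1, 0.9), (0.2, 0.8), (0.15, 1.0)],
}


def centroid(points):
    # Component-wise mean of a list of equal-length vectors.
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def distance(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def fit(labelled):
    """Map each genre to the mean of its feature vectors."""
    return {genre: centroid(vectors) for genre, vectors in labelled.items()}


def classify(centroids, vector):
    """Return the genre whose centroid is closest to the vector."""
    return min(centroids, key=lambda genre: distance(centroids[genre], vector))


if __name__ == "__main__":
    centroids = fit(TOY_FEATURES)
    print(classify(centroids, (0.85, 0.1)))
    print(classify(centroids, (0.1, 0.95)))
```

If the genres overlap in feature space, no classifier will rescue you, which is why the feature design deserves most of the effort.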
In my experience, Music Genre Classification would be too hard for an undergraduate project. The problem is that before you can get to the "fun stuff" of applying a neural net classifier, you'll need to do all kinds of signal-processing groundwork to produce meaningful feature vectors for the net. Consider beats per minute: it's reliable for certain types of music, but far from all. If you still want to go ahead, look at using something like libxtract as a base tool.
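For a feel of what that groundwork looks like, here is a sketch in Python that cuts a waveform into frames and computes two very basic features per frame: RMS energy and zero-crossing rate. This is not libxtract's API; the function names, the frame size and the synthetic sine-wave input are assumptions for illustration, and genre classification would need far richer features than these.

```python
import math


def sine(freq, rate, seconds, amplitude=1.0):
    """Synthetic test tone standing in for real audio samples."""
    return [amplitude * math.sin(2 * math.pi * freq * n / rate)
            for n in range(int(rate * seconds))]


def rms(frame):
    # Root-mean-square energy: a rough measure of loudness.
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def zero_crossing_rate(frame):
    # Fraction of adjacent sample pairs where the sign changes;
    # higher-pitched or noisier signals cross zero more often.
    crossings = sum(1 for prev, cur in zip(frame, frame[1:])
                    if (prev < 0) != (cur < 0))
    return crossings / (len(frame) - 1)


def frame_features(signal, frame_size):
    """Split the signal into non-overlapping frames; one (rms, zcr) pair each."""
    features = []
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        frame = signal[start:start + frame_size]
        features.append((rms(frame), zero_crossing_rate(frame)))
    return features


if __name__ == "__main__":
    low = frame_features(sine(220, 8000, 1.0), 1000)
    high = frame_features(sine(880, 8000, 1.0), 1000)
    print("220 Hz tone, first frame:", low[0])
    print("880 Hz tone, first frame:", high[0])
```

Each frame becomes one small feature vector. Stacking or averaging them gives the fixed-size input a neural net expects.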
http://www.heatonresearch.com/encog
This is where I learned all about AI algorithms pertaining to neural networks. This API is a very good way to learn what you need in order to understand what's going on. I have since created an API for myself that I use frequently when simulating Boid choices. Not really necessary, but it works.