Merging arrays and word frequency
So I'm cycling through a document with 41 paragraphs. For each paragraph I'm trying to first break the string into an array and then get the word frequency of that paragraph. I then want to combine the data from all paragraphs and get the word frequency of the whole document.
I'm able to get an array that gives me each "word" and its "frequency" for a given paragraph, but I'm having trouble merging the results from each paragraph so as to get the word frequency of the whole document. Here is what I have:
function sectionWordFrequency($sectionFS)
{
    $section_frequency = array();
    $filename = $sectionFS . ".xml";
    $xmldoc = simplexml_load_file('../../editedtranscriptions/' . $filename);
    $xmldoc->registerXPathNamespace("tei", "http://www.tei-c.org/ns/1.0");
    $paraArray = $xmldoc->xpath("//tei:p");
    foreach ($paraArray as $p)
    {
        $para_frequency = (array_count_values(str_word_count(strtolower($p), 1)));
        $section_frequency[] = $para_frequency;
    }
    return array_merge($section_frequency);
}

// now I call the function, sort it, and try to display it
$section_frequency = sectionWordFrequency($fs);
ksort($section_frequency);
foreach ($section_frequency as $word=>$frequency)
{
    echo $word . ": " . $frequency . "</br>";
}
Right now the result I get is:
1: Array
2: Array
3: Array
4: Array
Any help is greatly appreciated.
Try to replace this line
with this
and then
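For illustration, here is a minimal sketch of one way to combine the per-paragraph counts. It is not necessarily the change this answer had in mind. It assumes the goal is to sum each word's count across paragraphs; `mergeWordFrequencies` and the sample paragraphs are hypothetical, and plain strings stand in for the `tei:p` nodes so the example runs without the XML file.

```php
<?php
// Hypothetical helper: sums per-paragraph word counts into one word => count map.
function mergeWordFrequencies(array $paragraphs): array
{
    $total = [];
    foreach ($paragraphs as $text) {
        // Cast to string so SimpleXMLElement nodes can be passed as well as plain strings.
        $counts = array_count_values(str_word_count(strtolower((string) $text), 1));
        foreach ($counts as $word => $count) {
            // Add this paragraph's count to the running total for the word.
            $total[$word] = ($total[$word] ?? 0) + $count;
        }
    }
    ksort($total); // sort alphabetically by word
    return $total;
}

// Sample data standing in for the paragraphs returned by xpath("//tei:p").
$paragraphs = ["The cat sat", "the dog sat down"];
$frequency = mergeWordFrequencies($paragraphs);

foreach ($frequency as $word => $count) {
    echo $word . ": " . $count . "<br>\n";
}
```

The counts are summed by hand because `array_merge` keeps only the last value for a repeated string key, so it would overwrite a word's earlier count rather than add to it. The original output shows `Array` because `$section_frequency[] = $para_frequency;` builds a list of per-paragraph arrays, and echoing an array prints the word `Array`.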