Merging arrays and word frequency

Posted 2024-12-07 18:08:07

I'm cycling through a document with 41 paragraphs. For each paragraph I'm trying to first break the string into an array of words and then get the word frequency of that paragraph. I then want to combine the data from all paragraphs to get the word frequency of the whole document.

I'm able to get an array that gives me each "word" and its "frequency" for a given paragraph, but I'm having trouble merging the results from each paragraph so as to get the "word frequency" of the whole document. Here is what I have:

function sectionWordFrequency($sectionFS)
{
    $section_frequency = array();
    $filename = $sectionFS . ".xml";
    $xmldoc = simplexml_load_file('../../editedtranscriptions/' . $filename);
    $xmldoc->registerXPathNamespace("tei", "http://www.tei-c.org/ns/1.0");
    $paraArray = $xmldoc->xpath("//tei:p");

    foreach ($paraArray as $p)
    {
        $para_frequency = (array_count_values(str_word_count(strtolower($p), 1)));
        $section_frequency[] = $para_frequency;
    }

    return array_merge($section_frequency);
}

/// now I call the function, sort it, and try to display it
$section_frequency = sectionWordFrequency($fs);
ksort($section_frequency);

foreach ($section_frequency as $word=>$frequency)
{
    echo $word . ": " . $frequency . "</br>";
}

Right now the result I get is:

1: Array
2: Array
3: Array
4: Array

Any help is greatly appreciated.


Comments (1)

薄荷梦 2024-12-14 18:08:07

Try replacing this line

$section_frequency[] = $para_frequency;

with this

$section_frequency = array_merge($section_frequency, $para_frequency);

and then

return $section_frequency;
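
One caveat worth checking before relying on this: for string keys, array_merge keeps only the later value, so a word that appears in several paragraphs would end up with the count from the last paragraph it appears in rather than the sum. If document-wide totals are the goal, the counts probably need to be added per word. Below is a minimal sketch of that idea, assuming PHP 7 or later. The function name countWords and the sample sentences are made up for illustration, and the function takes plain strings; the (string) cast is there on the assumption that SimpleXML elements from xpath() would be passed in the same way.

```php
<?php
// Hypothetical helper (not from the original post): sums per-paragraph
// word counts into a single tally for the whole document.
function countWords(array $paragraphs): array
{
    $total = [];
    foreach ($paragraphs as $text) {
        // Same per-paragraph counting as in the question.
        $counts = array_count_values(str_word_count(strtolower((string) $text), 1));
        foreach ($counts as $word => $n) {
            // Add to the running total instead of overwriting it.
            $total[$word] = ($total[$word] ?? 0) + $n;
        }
    }
    ksort($total); // sort alphabetically by word
    return $total;
}

// Made-up sample input: "the" and "cat" occur in both paragraphs.
$demo = countWords(['The cat sat', 'the cat ran']);
foreach ($demo as $word => $n) {
    echo $word . ': ' . $n . "\n";
}
```

In sectionWordFrequency() the same inner loop could replace the array_merge line, with the function returning $section_frequency directly.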