Stripping Hebrew formatting characters from a string

Posted 2024-12-03 11:22:13

I have a problem that has been kicking my ass for a couple of days now.

I have an array of strings, and each string contains a single Hebrew word.

These words were ripped from a PDF and appear in the array in the same order as shown in the PDF.

I want to take these words and reconstruct them into a sentence, in the order they have in the array and the PDF. Seems very simple.

Edit: here is the code. It's actually XML I'm looping through; I think that's irrelevant, but since I'm showing the code I'd better have it right :)

$sentence = '';
foreach ($text->TOKEN as $word) {
    // Append each TOKEN element in document order
    $sentence .= ' ' . $word;
}

/*
This sentence will sometimes (not always) not have the same order as the XML.
Hebrew is read right to left but that's not the issue, I just want to make a
string in the same order as the words.
*/
echo $sentence;

It's as if the words have a mind of their own: the order gets jumbled into something that does not look logical to a non-Hebrew reader. Commas even move to different words. This does not always happen, though.

I do not read or speak Hebrew, but from what I can gather there are some special characters in the language that might be affecting the order. My question is: what do I have to do to strip them out?

I'm using PHP for this.

Comments (1)

拥有 2024-12-10 11:22:13

Without seeing your code, here are two suggestions:

  1. Print out the array of Hebrew words with print_r and see what order they're in.
  2. Keep in mind that Hebrew is read right to left and not left to right.

Otherwise, please provide more of your code for further assistance.
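
As a rough sketch of the first suggestion, combined with the stripping the question asks about: the snippet below dumps each word as hex so that invisible characters become visible, then removes the Unicode bidirectional control characters (U+061C, U+200E, U+200F, U+202A to U+202E, U+2066 to U+2069) before joining the words. The helper name strip_bidi_controls and the sample words are made up for illustration, and it is only an assumption that the PDF extraction left such characters in the data. If none show up in the dump, the reordering is more likely produced at display time by the Unicode bidirectional algorithm, which this sketch does not address.

```php
<?php
// Hypothetical helper (not from the original post): removes Unicode
// bidirectional control characters from a UTF-8 string.
function strip_bidi_controls(string $s): string
{
    // U+061C ALM, U+200E LRM, U+200F RLM,
    // U+202A-U+202E embeddings and overrides, U+2066-U+2069 isolates.
    $pattern = '/[\x{061C}\x{200E}\x{200F}\x{202A}-\x{202E}\x{2066}-\x{2069}]/u';

    // preg_replace returns null on failure (e.g. invalid UTF-8);
    // fall back to the unchanged input in that case.
    return preg_replace($pattern, '', $s) ?? $s;
}

// Made-up sample data standing in for the TOKEN elements.
// Hebrew letters are written as escapes so the editor cannot reorder them.
$words = [
    "\u{202B}\u{05E9}\u{05DC}\u{05D5}\u{05DD}\u{202C}", // RLE + "shalom" + PDF
    "\u{05E2}\u{05D5}\u{05DC}\u{05DD},\u{200F}",        // "olam" + comma + RLM
];

// Step 1: inspect the raw bytes; control characters are invisible in print_r.
foreach ($words as $i => $w) {
    echo $i, ': ', bin2hex($w), PHP_EOL;
}

// Step 2: strip the controls and join in array order.
$sentence = implode(' ', array_map('strip_bidi_controls', $words));
echo bin2hex($sentence), PHP_EOL;
```

Comparing the hex dumps before and after stripping shows whether anything was actually removed. The joined string keeps the logical (array) order either way; how it is drawn on screen is decided by whatever renders it.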
