Stripping Hebrew formatting characters from a string
I have a problem that has been kicking my ass for a couple of days now.
I have an array of strings, and each string contains a single Hebrew word.
These words were ripped from a PDF and appear in the array in the same order as shown in the PDF.
I want to take these words and reconstruct them into a sentence in the order they are in the array and the PDF. Seems very simple.
Edit: Here is the code. It's actually XML I'm looping through; I think that's irrelevant, but since I'm showing the code I'd better have it right :)
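// Start from an empty string so the first append does not hit an undefined variable.
$sentence = '';
foreach ($text->TOKEN as $word) {
    // Cast the XML element to a plain string before appending it.
    $sentence .= ' ' . (string) $word;
}
/*
This sentence will sometimes (not always) not have the same order as the XML.
Hebrew is read right to left, but that's not the issue; I just want to build a
string in the same order as the words.
*/
echo $sentence;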
It's as if the words have a mind of their own: the order gets jumbled into something that does not look logical to a non-Hebrew reader. Commas even move to different words. But this is not always the case.
I do not read or speak Hebrew, but from what I can gather there are some special characters in the language that might be affecting the order. My question is: what do I have to do to strip them out?
I'm using PHP for this.
Comments (1)
Without seeing your code, here are two suggestions:
Otherwise, please provide more of your code for further assistance.
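On the question of stripping special characters, here is a minimal sketch. It assumes the characters in question are Unicode bidirectional control marks (such as U+200F RIGHT-TO-LEFT MARK), which the original post does not confirm. The helper name `stripBidiControls` and the sample tokens are made up for illustration. If the tokens contain none of these marks, the apparent reordering is most likely just how the bidirectional display algorithm lays out right-to-left text, and the stored order already matches the array order.

```php
<?php
// Hypothetical helper (not from the original post): removes Unicode
// bidirectional control characters from a UTF-8 string.
function stripBidiControls(string $text): string
{
    // U+061C ALM, U+200E LRM, U+200F RLM,
    // U+202A..U+202E (LRE, RLE, PDF, LRO, RLO),
    // U+2066..U+2069 (LRI, RLI, FSI, PDI).
    $pattern = '/[\x{061C}\x{200E}\x{200F}\x{202A}-\x{202E}\x{2066}-\x{2069}]/u';
    $clean = preg_replace($pattern, '', $text);
    // preg_replace returns null on failure (for example, invalid UTF-8);
    // fall back to the untouched input in that case.
    return $clean ?? $text;
}

// Sample tokens are invented for illustration; the second one carries a stray RLM.
$tokens = ["שלום", "\u{200F}עולם", ","];
$words = array_map('stripBidiControls', $tokens);
$sentence = implode(' ', $words);

// A hex dump shows the stored (logical) byte order, independent of how
// a browser or terminal chooses to display right-to-left text.
echo bin2hex($sentence), PHP_EOL;
```

Comparing `bin2hex()` of a token before and after the call shows whether any control character was actually present. If nothing changes, stripping is not the fix and the display context is the more likely place to look.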