Java: how do I add an accented "e" to a string?

Posted on 2024-08-30 00:11:22

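With the help of tucuxi's answer in the existing post "Java remove HTML from String without regular expressions", I have built a method that strips any basic HTML tags from a string. Sometimes, however, the original string contains HTML hexadecimal character references such as &#x00E9 (which is an accented e). I have started adding functionality that translates these escaped characters into real characters.

You're probably asking: why not use regular expressions, or a third-party library? Unfortunately I cannot: I am developing on a BlackBerry platform, which does not support regular expressions, and I have never managed to add a third-party library to my project.

So I have gotten to the point where any &#x00E9 is replaced with "e". My question now is: how do I add an actual accented e to a string?

Here is my code: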

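public static String removeHTML(String synopsis) {
    char[] cs = synopsis.toCharArray();
    String sb = "";
    boolean tag = false;
    for (int i = 0; i < cs.length; i++) {
        switch (cs[i]) {
        case '<':
            tag = true;          // entering a tag: skip everything up to '>'
            break;
        case '>':
            if (tag) {
                tag = false;     // leaving the tag
            } else {
                sb += cs[i];     // a stray '>' outside a tag is ordinary text
            }
            break;
        case '&':
            // Compare the next 7 characters with the entity, without
            // reading past the end of the array.
            if (!tag && i + 7 <= cs.length
                    && new String(cs, i, 7).equals("&#x00E9")) {
                sb += "e";       // placeholder: this should be the accented e
                i += 6;          // skip the rest of the entity
                if (i + 1 < cs.length && cs[i + 1] == ';') {
                    i++;         // also skip a trailing ';' if present
                }
            } else if (!tag) {
                sb += cs[i];     // an ordinary ampersand
            }
            break;
        default:
            if (!tag) {
                sb += cs[i];
            }
        }
    }
    return sb;
}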

Thanks!

北城半夏 2024-09-06 00:11:22

Java strings are Unicode.

sb += '\u00E9';   // lowercase e with an acute accent (é)
sb += '\u00C9';   // uppercase E with an acute accent (É)
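As a minimal sketch of the same idea in a complete class: the helper name `appendAccent` is hypothetical, and `StringBuffer` is used on the assumption that it is the string builder available on older mobile Java profiles. The escape is an ordinary single `char`, so it can be appended like any other character.

```java
class AccentAppend {
    // Hypothetical helper: appends a lowercase e with an acute accent.
    static String appendAccent(String text) {
        StringBuffer sb = new StringBuffer(text);
        sb.append('\u00E9'); // one char, code point U+00E9
        return sb.toString();
    }

    public static void main(String[] args) {
        String word = appendAccent("caf");
        System.out.println(word.length()); // 4: the accented e counts as one char
    }
}
```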


懵少女 2024-09-06 00:11:22

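You can print just about any character you like in Java, since it uses the Unicode character set.

To find the character you want, take a look at the charts here:

http://www.unicode.org/charts/

In the Latin-1 Supplement chart you'll see the Unicode code points for the accented characters. For example, you should see the hex number 00E9 listed for é. The codes for the most common Latin accented characters are in this chart, so you should find it pretty useful.

To put the character in a String, just use the Unicode escape sequence \u followed by the four-digit hex character code, like so:

System.out.print("Let's go to the caf\u00E9");

This produces: "Let's go to the café"

Depending on which version of Java you're using, you might also find a StringBuilder (or a StringBuffer, if you're multi-threaded) more efficient than concatenating Strings with the + operator.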
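Since the entity already carries the hex code that the chart lists, one possible approach is to parse the digits instead of matching each entity by hand. The sketch below is only an illustration under stated assumptions: `EntityDecoder` and `decodeHexEntities` are hypothetical names, entities are assumed to look like `&#x` + 1 to 4 hex digits + `;`, and only characters that fit in a single `char` are handled. It uses no regular expressions and no third-party code.

```java
class EntityDecoder {
    // Hypothetical helper: replaces entities such as &#x00E9; with the
    // character they name. Anything that is not a well-formed entity
    // (assumed form: "&#x", 1 to 4 hex digits, ';') is copied unchanged.
    static String decodeHexEntities(String text) {
        StringBuffer out = new StringBuffer();
        int i = 0;
        while (i < text.length()) {
            if (text.startsWith("&#x", i)) {
                int end = text.indexOf(';', i + 3);
                // Require at least 1 and at most 4 digits before the ';'.
                boolean ok = end > i + 3 && end - i <= 7;
                int code = 0;
                for (int k = i + 3; ok && k < end; k++) {
                    int digit = Character.digit(text.charAt(k), 16);
                    if (digit < 0) {
                        ok = false;              // not a hex digit
                    } else {
                        code = code * 16 + digit;
                    }
                }
                if (ok) {
                    out.append((char) code);     // at most 0xFFFF, fits a char
                    i = end + 1;                 // continue after the ';'
                    continue;
                }
            }
            out.append(text.charAt(i));          // ordinary character
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeHexEntities("Let's go to the caf&#x00E9;"));
    }
}
```

Building the result in a `StringBuffer` rather than with `+=` avoids creating a new String for every appended character.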


撕心裂肺的伤痛 2024-09-06 00:11:22

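Try this:

    if (result.equals("&#x00E9")) {
        sb += (char) 233;   // é is U+00E9, decimal 233
    }

instead of

    if (result.equals("&#x00E9")) {
        sb += "e";
    }

The thing is that you're not adding an accent on top of the 'e' character; the accented e is a separate character altogether. This site lists the ASCII codes for characters.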
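To illustrate that the accented letter is its own character rather than an 'e' plus a mark, here is a small sketch that compares the numeric codes. The class `SingleCharDemo` and the helper `firstCode` are hypothetical names used only for this example.

```java
class SingleCharDemo {
    // Hypothetical helper: numeric code of the first char in the string.
    static int firstCode(String s) {
        return s.charAt(0);
    }

    public static void main(String[] args) {
        String accented = "\u00E9";
        System.out.println(firstCode(accented));   // 233
        System.out.println(firstCode("e"));        // 101
        System.out.println(accented.equals("e"));  // false
    }
}
```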
