How to parse specific text?

Posted 2024-12-05 01:50:08 · 1305 characters · 0 views · 0 comments

I want to use jsoup to parse just the middle text of the page, the part that describes the title.

http://www.upcominggames.com/2113/Halo+Combat+Evolved+Anniversary/
http://www.upcominggames.com/478/Gears+of+War+3/

What jsoup selector would parse this and extract just the article?

What would be a common selector for the two articles above?

EDIT:

What I want to do is parse this part:

Gears of War 3 Facts
Gears of War 3 is a third-person shooter published by Microsoft and developed by Epic Games, and it is set to be released on September 20, 2011 in the US, Australia and Europe and on September 22 in Japan.

Gears of War 3 Synopsis
This Xbox 360 exclusive conclusion to the Gears of War trilogy, Gears of War 3 places players in the middle of an exciting experience and story of survival, hope and brotherhood. This third-person shooter dramatically leads players through the exciting world with more color and detail than ever before. Plus, its exciting multiplayer mode will leave players wanting more even after they’ve finished the campaign.

Gears of War 3 Gameplay
Anyone who has played a Gears of War game will feel familiar when they play Gears of War 3, but that doesn’t mean they won’t be faced with a few new surprises. The environments are much more detailed and immersive, adding to the excitement and thrill the Gears of War franchise is known for. Featuring more enemies than previous installments of the Gears of War series, Gears of War 3 will offer players a brand new challenge as they try to save the human race from complete destruction. If players own a 3D TV, they’ll be able to play this new installment in 3D to have a completely immersive experience.

Gears of War 3 Multiplayer
The multiplayer additions to Gears of War 3 make the game a big step up from Gears of War 2. Starting with dedicated servers to handle matchmaking, Epic Games has put a lot of effort into making this the best Gears experience yet. With Capture the Leader, King of the Hill and other multiplayer modes, players will be able to take their game online against other players in exciting deathmatches.

I want to parse each bold heading into a separate TextView and load its content underneath, basically just as it appears above.

If you highlight the text and click "View Selection Source", you'll see what I am trying to parse.

I am familiar with jsoup. Just need some help on this part.

Comments (2)

じ违心 2024-12-12 01:50:08

Yes, I get what you're saying. I think jsoup can extract this easily once you study the page source and find the tags and attributes the two pages have in common. Filters to try:

  • select the elements with the tag "div"
  • keep those whose "id" attribute is "game-desc"

The text returned by just these two filters will likely be what you want.

e.g.,

Edit: code simplified to use select(...)

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HaloStuff {
   private static final String TEST_URL_1 = "http://www.upcominggames.com/" +
        "2113/Halo+Combat+Evolved+Anniversary/";
   // Selector for the <div> whose "id" attribute is "game-desc"
   private static final String GAME_DESC_SELECTOR = "div[id=game-desc]";

   public static void main(String[] args) {
      try {
         // Fetch the page and parse it into a Document
         Document jsDoc = Jsoup.connect(TEST_URL_1).get();

         // Print the combined text of every matching element
         Elements textEles = jsDoc.select(GAME_DESC_SELECTOR);
         for (Element ele : textEles) {
            System.out.println(ele.text());
         }
      } catch (IOException e) {
         e.printStackTrace();
      }
   }
}
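
Note that ele.text() returns the whole description as one string, so the headings and their bodies run together. If each bold heading needs to be kept apart from its content, as the question asks, something along these lines might work. This is only a sketch: I have not checked the real page's markup, so SAMPLE_HTML is hypothetical and assumes each section is a <p> whose heading sits in a <b> (or <strong>) tag. The class and method names are made up for illustration, and it parses an inline string rather than the live URL so it can run offline.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class GameDescSections {
    // Hypothetical markup standing in for the real page, whose structure may differ.
    static final String SAMPLE_HTML =
        "<div id=\"game-desc\">"
        + "<p><b>Gears of War 3 Facts</b><br>Gears of War 3 is a third-person shooter.</p>"
        + "<p><b>Gears of War 3 Synopsis</b><br>This Xbox 360 exclusive concludes the trilogy.</p>"
        + "</div>";

    // Maps each bold heading to the text that follows it, in document order.
    static Map<String, String> parseSections(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, String> sections = new LinkedHashMap<>();
        for (Element p : doc.select("div[id=game-desc] p")) {
            Element bold = p.select("b, strong").first();
            if (bold == null) {
                continue; // paragraph without a bold heading: skipped in this sketch
            }
            // ownText() is the text directly inside <p>, excluding the <b> child.
            sections.put(bold.text().trim(), p.ownText().trim());
        }
        return sections;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> section : parseSections(SAMPLE_HTML).entrySet()) {
            System.out.println(section.getKey());   // heading
            System.out.println(section.getValue()); // body
            System.out.println();
        }
    }
}
```

On Android, each key could then go into one TextView and its value into the TextView below it. If the real page wraps headings differently, only the "b, strong" selector should need to change.
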
烟若柳尘 2024-12-12 01:50:08

You should just be able to do it with:

div#game-desc p

What have you tried that's not working?
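
For reference, a rough sketch of how that selector could be used, assuming the description really is a div with id "game-desc" that contains <p> elements. The sample HTML, class name and method name below are made up for illustration, and the input is an inline string so the example runs without network access.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class GameDescParagraphs {
    // Hypothetical markup; the second <div> shows that unrelated paragraphs are not matched.
    static final String SAMPLE_HTML =
        "<div id=\"game-desc\">"
        + "<p>First paragraph of the description.</p>"
        + "<p>Second paragraph of the description.</p>"
        + "</div>"
        + "<div id=\"footer\"><p>Not part of the description.</p></div>";

    // Returns the text of every <p> nested inside the div with id "game-desc".
    static List<String> paragraphTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element p : doc.select("div#game-desc p")) {
            texts.add(p.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        for (String text : paragraphTexts(SAMPLE_HTML)) {
            System.out.println(text);
        }
    }
}
```

For a live page, Jsoup.connect(url).get() would replace Jsoup.parse(html); the selector itself stays the same, so it should work for both articles if they share this structure.
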
