Fetching data from the Internet with Java

Posted 2024-12-02 06:15:48

I am planning to build the following application in Java for my college project. I know core Java. Since time is short, I want to know what I should read specifically for this project.

The application will have an interface where you enter a query. That string is sent as a query to an Internet search engine, and the search engine finds the data. For now, the data my application needs is the first web page we see, that is, the results page.
I do not want to display the data. I just want the HTML file, or the source code, of the generated web page. Does this sound like the Common Gateway Interface (CGI)? I do not know anything about it.

But I think it serves the same purpose. If it does, please guide me on how to implement this.
Whatever the right approach is, please specify:

  • Problem 1: What should I read? I am not looking for a ready-made solution at this point; I want to implement it myself.
  • Problem 2: Does connecting to the Internet require some JNLP knowledge too?

For example, when we search for something on Google, it shows us links to websites. I can view the source code of this generated web page. I just want that page for my application to work on.

EDIT:
I do not want to rely only on Google or on any particular web server. I want my application to decide that.
Please also refer to my problem 2.

Since I have discovered that websites have terms and conditions, should I still try to write my own crawler? Would my application then be breaking the rules? This is important to me.

Comments (5)

无声静候 2024-12-09 06:15:48

Ashish,
Here is what I would recommend.

  1. Learn the basics of JSON (introduction, library download).
  2. Then look at the Google Web Search JSON API.
  3. Learn how to GET data from servers using the HttpClient library.
  4. Now all you have to do is fire a GET request for the search, read the JSON response, and parse it using the JSON library from #1. That gives you the search results.
  5. Most search engines (Bing etc.) offer JSON/REST APIs, so you can do the same for other search engines.

Note: JSON APIs are normally called from JavaScript on the UI side, but since JSON is very easy and quick to learn, I suggest it here. You can also explore the XML-based APIs if time permits.
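
The GET step above can be sketched roughly as follows. This is a minimal sketch, not the setup from the answer: it uses the JDK's built-in `java.net.http.HttpClient` (Java 11+) in place of an external HttpClient library, and the endpoint `https://api.example.com/search` and its `q` parameter are hypothetical placeholders for whatever the chosen search API documents. Parsing the returned JSON is left to the JSON library from step 1.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class SearchRequest {
    // Hypothetical endpoint: substitute the documented URL of the search API you choose.
    static final String ENDPOINT = "https://api.example.com/search";

    /** Builds the request URI, URL-encoding the user's query string. */
    static URI buildSearchUri(String query) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(ENDPOINT + "?q=" + encoded);
    }

    /** Sends the GET request and returns the raw response body (expected to be JSON). */
    static String fetchJson(String query) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(buildSearchUri(query))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Only prints the URI; fetchJson needs a real endpoint to be useful.
        System.out.println(buildSearchUri("java url connection"));
    }
}
```

Keeping the URI construction separate from the network call makes it easy to point the same code at a different search engine, which matches the asker's wish not to depend on a single provider.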

想你的星星会说话 2024-12-09 06:15:48
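import java.io.*;
import java.net.URL;

// Place inside a method that declares "throws IOException".
URL url = new URL("http://fooooo.com");
// try-with-resources closes the reader (and the underlying stream) when done
try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
    String inputLine;
    // Print the page's HTML source line by line
    while ((inputLine = in.readLine()) != null) {
        System.out.println(inputLine);
    }
}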

That should be enough to get you started.

And yes, do check that you are not violating a website's terms of use. Search engines don't really like you trying to access them via a program.

Many of them, including Google, have APIs specifically designed for this purpose.
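
One machine-readable hint a crawler can check before fetching a page is the site's `robots.txt` file. It is not the same thing as the terms of use, which still have to be read by a person. The following is a deliberately simplified sketch: the class and method names are made up for illustration, and it only honors `Disallow` prefixes in the `User-agent: *` group, ignoring `Allow` lines, wildcards, and groups that list several user agents.

```java
import java.util.ArrayList;
import java.util.List;

class RobotsCheck {
    /** Collects the Disallow prefixes that apply to all crawlers ("User-agent: *"). */
    static List<String> disallowedForAll(String robotsTxt) {
        List<String> rules = new ArrayList<>();
        boolean inStarGroup = false;
        for (String raw : robotsTxt.split("\\r?\\n")) {
            String line = raw.split("#", 2)[0].trim(); // drop trailing comments
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase();
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                inStarGroup = value.equals("*");
            } else if (field.equals("disallow") && inStarGroup && !value.isEmpty()) {
                rules.add(value);
            }
        }
        return rules;
    }

    /** Returns false if the path starts with any prefix disallowed for all crawlers. */
    static boolean isAllowed(String robotsTxt, String path) {
        for (String prefix : disallowedForAll(robotsTxt)) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

The text of `robots.txt` itself can be downloaded with the same `URL.openStream()` loop shown above and then passed to `isAllowed`.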

百合的盛世恋 2024-12-09 06:15:48

You can do everything you want using HtmlUnit. It's like a web browser, but for Java. Check out the examples on their website.

最单纯的乌龟 2024-12-09 06:15:48

Read "Working with URLs" in the Java tutorial to get an idea of what is going on behind the available libraries such as HtmlUnit, HttpClient, etc.
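
As a small taste of what that material covers, here is a hedged sketch that takes a URL apart with the JDK's own `java.net.URL` class. The class name `UrlParts` and the example address are made up for illustration, and the URL is created via `URI.create(...).toURL()` only to avoid the constructor that newer JDKs mark as deprecated.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class UrlParts {
    /** Returns the main components of a URL as one readable line. */
    static String describe(String address) throws MalformedURLException {
        URL url = URI.create(address).toURL();
        return "protocol=" + url.getProtocol()
                + " host=" + url.getHost()
                + " port=" + url.getPort()
                + " path=" + url.getPath()
                + " query=" + url.getQuery()
                + " ref=" + url.getRef();
    }

    public static void main(String[] args) throws MalformedURLException {
        System.out.println(describe("http://example.com:8080/docs/index.html?name=x#top"));
    }
}
```

Libraries such as HttpClient build on these same pieces: a parsed address, a connection opened to the host and port, and a stream that carries the response.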

花想c 2024-12-09 06:15:48

I do not want to display the data. I just want the HTML file or the source code of the generated web page.

You probably don't need the HTML either. Google provides its search results as a web service through this API. The same goes for other search engines (GIYF). You get the search results as XML, which is far easier for you to parse. Plus, the XML won't contain any unwanted data such as ads.
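
To illustrate why XML results are easy to handle, here is a minimal sketch using the DOM parser that ships with the JDK. The element names `result` and `link` are assumptions made for this example; a real search API defines its own schema, so the tag names would need to be adjusted to its documentation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class XmlResults {
    /** Extracts the text of the first <link> inside every <result> element (hypothetical schema). */
    static List<String> extractLinks(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList results = doc.getElementsByTagName("result");
        List<String> links = new ArrayList<>();
        for (int i = 0; i < results.getLength(); i++) {
            Element result = (Element) results.item(i);
            links.add(result.getElementsByTagName("link").item(0).getTextContent());
        }
        return links;
    }
}
```

Compared with scraping an HTML results page, there is no layout markup or advertising to filter out; the code only walks the elements it cares about.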
