Sending an HTTP request with ScriptTransformer in Solr

Posted on 2024-11-25 19:43:24

I am using Solr to index RSS feeds, with DataImportHandler parsing the feed URLs and indexing the items. I have also implemented a web service that takes a URL, creates a thumbnail image, and stores it in a local directory.

Here is what I want to do: after a URL is parsed, I want to send an HTTP request to the web service with that URL. ScriptTransformer seemed the way to go, and here is how my data-config.xml file looks:

    <dataConfig>
      <script><![CDATA[
        function sendURLRequest(row) {
          var url = new java.net.URL("http://***********/GenerateThumbnail?url=http://money.cnn.com/2011/07/20/news/economy/debt_ceiling_deal/index.htm?cnn=yes");
          url.openConnection().connect();
          return row;
        }
      ]]></script>

      <dataSource type="JdbcDataSource" name="dbSource" driver="com.mysql.jdbc.Driver"
                  url="jdbc:mysql://localhost/solr_sources" user="root" password="******"/>

      <document>
        <entity name="rssFeedItems" rootEntity="false" dataSource="dbSource" query="select url from rss_feeds">
          <entity name="rssFeeds" dataSource="urlSource" url="${rssFeedItems.url}" transformer="script:sendURLRequest" processor="XPathEntityProcessor" forEach="/rss/channel/item">
            <field column="title"          xpath="/rss/channel/item/title"/>
            <field column="link"           xpath="/rss/channel/item/link"/>
            <field column="description"    xpath="/rss/channel/item/description"/>
            <field column="date_published" xpath="/rss/channel/item/pubDate"/>
          </entity>
        </entity>
    .................
    ................

As you can see from the data-config file, I am currently testing whether this works by hard-coding a dummy URL.
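Once the dummy URL gives way to the link parsed from each feed item, that link would need to be percent-encoded before being appended as the `url` parameter; otherwise a link containing `&` or `#` would be split or truncated by the receiving service. Here is a minimal Java sketch of that step. The host `thumbnails.example`, the class name, and the method name are hypothetical (the real host is masked in the config above), and Java 10 or later is assumed for the `Charset` overload of `URLEncoder.encode`.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class ThumbnailRequestUrl {
    // Hypothetical endpoint: the real host is masked in the original post.
    static final String SERVICE = "http://thumbnails.example/GenerateThumbnail";

    /** Builds the request URL, percent-encoding the page URL so it survives as one query value. */
    static String forPage(String pageUrl) {
        return SERVICE + "?url=" + URLEncoder.encode(pageUrl, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(forPage(
            "http://money.cnn.com/2011/07/20/news/economy/debt_ceiling_deal/index.htm?cnn=yes"));
    }
}
```

Inside the transformer script the same idea could probably be written as `java.net.URLEncoder.encode(row.get('link'), 'UTF-8')`, assuming the `link` column holds a single string.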

I expected url.openConnection().connect(); to make the HTTP request, but the image is not generated.

I see no compile errors. I also tried the example script that prints a message:

    var v = new java.lang.Runnable() {
        run: function() { print('********************PRINTING************************'); }
    }
    v.run();

And it worked.

I even played around with the function names to force it to throw some compile errors, and it did throw them, which suggests that it is able to create objects of type URL and URLConnection.

Any suggestions?

Comments (1)

向日葵 2024-12-02 19:43:24

I think you need to do more than just connect() to the URL to issue an HTTP GET. Maybe try:

    var url = new java.net.URL("http://***********/GenerateThumbnail?url=http://money.cnn.com/2011/07/20/news/economy/debt_ceiling_deal/index.htm?cnn=yes");
    var connection = url.openConnection();
    connection.connect();
    connection.getContent();
    return row;

I was curious, so I ran a small experiment and found that url.openConnection().connect() did not register as a connection on my test server at all. It wasn't until I called getContent() that the client issued an HTTP request. Perhaps for HTTP the Java URL library sees no need to open a stateful connection and therefore doesn't send anything until data is requested (unlike, say, a URL that points to an FTP address).
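The experiment can be reproduced outside Solr. The sketch below is not the answerer's original test, only a stand-in under stated assumptions: it starts a throwaway local server with the JDK's built-in `com.sun.net.httpserver.HttpServer`, counts how many requests its handler sees after `connect()`, and counts again after `getResponseCode()`, which, like `getContent()`, forces the request to be sent. The class name, method name, and the `/GenerateThumbnail` handler are hypothetical stand-ins for the real service.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.util.concurrent.atomic.AtomicInteger;

class LazyHttpRequestDemo {

    /** Returns {requests seen after connect(), requests seen after getResponseCode()}. */
    static int[] countRequests() {
        AtomicInteger hits = new AtomicInteger();
        HttpServer server = null;
        try {
            // Throwaway server on a free local port; it stands in for the thumbnail service.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/GenerateThumbnail", exchange -> {
                hits.incrementAndGet();                // count every request that reaches the handler
                exchange.sendResponseHeaders(204, -1); // reply without a body
                exchange.close();
            });
            server.start();

            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort()
                    + "/GenerateThumbnail?url=test");
            HttpURLConnection connection = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);

            connection.connect();
            Thread.sleep(300); // give the server time to register a request, had one been sent
            int afterConnect = hits.get();

            connection.getResponseCode(); // sends the GET and waits for the status line
            int afterResponse = hits.get();
            connection.disconnect();

            return new int[] {afterConnect, afterResponse};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        int[] counts = countRequests();
        System.out.println("requests seen after connect(): " + counts[0]);
        System.out.println("requests seen after getResponseCode(): " + counts[1]);
    }
}
```

If the handler count only rises after the second call, that matches the behavior described above: within the transformer, something that reads the response (`getContent()`, `getResponseCode()`, or `getInputStream()`) is what actually triggers the thumbnail service.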
