How do I add data to dynamic fields when using Solr's extract feature?

Posted on 2024-12-10 07:32:43

I'm using a PHP library called solr-php-client (http://code.google.com/p/solr-php-client/) to interface with my Solr server. I can extract data from a document, store it, and search on it, but I can't get my own data into the index through the literal.* parameters:

$aParams = array
(
    "literal.ClassName_ms" => "File",
    "literal.SS_ID_i" => 73,
    "literal.Name_ms" => "OverviewOfBenefits.pdf",
    "literal.title" => "Overview Of Benefits",
    "literal.Created_dt" => "2011-09-19T13:50:30Z",
    "literal.last_modified_dt" => "2011-10-12T19:33:59Z",
    "literal.SS_Stage_ms" => "Live",
    "literal.ClassNameHierarchy_ms" => array("Object","ViewableData","DataObject","File"),
    "literal.id" => "File_73_Live",
    "fmap.content" => "text",
);

try {

    $oResponse = $oSOLR->extract($sFilePath, $aParams);
    $oSOLR->commit();
    $oSOLR->optimize();

}
catch(Exception $e) {
    var_dump($e);
}

I can query "text" and get results:

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">0</int>
 <lst name="params">
  <str name="indent">on</str>
  <str name="start">0</str>
  <str name="q">text:Overview</str>
  <str name="rows">10</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="1" start="0">
 <doc>
  <arr name="content_type"><str>application/pdf</str></arr>
  <str name="id">File_73_Live</str>
  <date name="last_modified">2011-02-07T16:21:10Z</date>
 </doc>
</result>
</response>

But I can't query any of the dynamic fields, e.g. "SS_Stage_ms":

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">0</int>
 <lst name="params">
  <str name="indent">on</str>
  <str name="start">0</str>
  <str name="q">SS_Stage_ms:Live</str>
  <str name="rows">10</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="0" start="0"/>
</response>

Here are the applicable schema definitions:

<field name="id" type="string" indexed="true" stored="true" required="true" /> 
<field name="title" type="text" indexed="true" stored="true" multiValued="true"/>
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
<dynamicField name="*_i"  type="int"    indexed="true"  stored="false"/>
<dynamicField name="*_ms"  type="string"  indexed="true"  stored="false" multiValued="true"/>
<dynamicField name="*_dt" type="date"    indexed="true"  stored="false"/>
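
For readers less familiar with dynamic fields, here is a minimal sketch of how a field name is checked against the three patterns above. The function name matchDynamicField is hypothetical, it only handles patterns with a leading "*", and it is an illustration of suffix matching rather than Solr's own resolution logic.

```php
<?php
// Hypothetical helper: returns the first dynamicField pattern whose suffix
// matches the field name, or null when nothing matches.
// Only leading-"*" (suffix) patterns such as "*_ms" are handled here.
function matchDynamicField(string $fieldName, array $patterns): ?string
{
    foreach ($patterns as $pattern) {
        $suffix = substr($pattern, 1); // drop the leading "*"
        if ($suffix !== '' && substr($fieldName, -strlen($suffix)) === $suffix) {
            return $pattern;
        }
    }
    return null;
}

// The dynamicField patterns from the schema above.
$patterns = array('*_i', '*_ms', '*_dt');

var_dump(matchDynamicField('SS_Stage_ms', $patterns));
var_dump(matchDynamicField('SS_ID_i', $patterns));
var_dump(matchDynamicField('title', $patterns)); // not a dynamic field
```

Under this reading, every literal.* name in the parameters except "title" and "id" should land in one of the dynamic fields, which is why the empty result is surprising.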

Comments (1)

把时间冻结 2024-12-17 07:32:43

I switched the schema definitions to store the data so I could see how the fields were being interpreted by Solr:

<field name="id" type="string" indexed="true" stored="true" required="true" /> 
<field name="title" type="text" indexed="true" stored="true" multiValued="true"/>
<field name="text" type="text" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_i"  type="int"    indexed="true"  stored="true"/>
<dynamicField name="*_ms"  type="string"  indexed="true"  stored="true" multiValued="true"/>
<dynamicField name="*_dt" type="date"    indexed="true"  stored="true"/>

After doing this, I found that all of the field names were being converted to lowercase. The ExtractingRequestHandler wiki page (http://wiki.apache.org/solr/ExtractingRequestHandler) explained why:

lowernames=true|false - Map all field names to lowercase with underscores. For example, Content-Type would be mapped to content_type.
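
To make the effect concrete, here is a rough PHP approximation of that mapping. It is a sketch based only on the description quoted above, not Solr's actual code, and the function name lowerName is made up for illustration. Since Solr field names are case-sensitive, a value indexed under ss_stage_ms would not be found by a query against SS_Stage_ms.

```php
<?php
// Approximation of the documented lowernames mapping (an assumption based on
// the wiki description, not Solr's implementation): letters are lowercased
// and every character that is not a letter or digit becomes "_".
function lowerName(string $name): string
{
    return strtolower(preg_replace('/[^A-Za-z0-9]/', '_', $name));
}

foreach (array('Content-Type', 'SS_Stage_ms', 'ClassNameHierarchy_ms') as $name) {
    echo $name, ' => ', lowerName($name), PHP_EOL;
}
```

Note that the lowercased names still end in _ms, so they would still match the *_ms dynamic field; only the name being queried is different.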

By default "lowernames" is set to true. I added "lowernames" to the parameters, set it to false, and voila, it worked!
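
For reference, a trimmed sketch of what the adjusted parameter array might look like. The value is passed as the string "false" on the assumption that the client forwards parameter values to Solr as-is; the http_build_query call is only there to show roughly what gets sent and is not necessarily how solr-php-client builds its request.

```php
<?php
// Same literal.* parameters as in the question, trimmed for brevity.
$aParams = array(
    "literal.id"          => "File_73_Live",
    "literal.SS_Stage_ms" => "Live",
    "literal.SS_ID_i"     => 73,
    "fmap.content"        => "text",
    // Keep field names exactly as sent. Passed as the string "false"
    // (assumption: the client forwards parameter values verbatim).
    "lowernames"          => "false",
);

// Illustration only: roughly how these parameters look as a query string.
echo http_build_query($aParams), PHP_EOL;

// With a configured client, the call itself is unchanged from the question:
// $oResponse = $oSOLR->extract($sFilePath, $aParams);
```

After re-indexing with this parameter, a query such as SS_Stage_ms:Live should match using the original mixed-case field name.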
