Solr DataImportHandler does not index all records

Posted on 2024-11-27 18:12:22

When I run a full-import it is only indexing 1 document. In the logs I see it processing most of the records (~300 records). I don't see any errors in the logs. Why won't this index all of the results from the query?

Here is my data-config.xml

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
    <dataSource type="JdbcDataSource"
        driver="oracle.jdbc.driver.OracleDriver"
        url="URL"
        user="USER"
        password="PASSWORD"
        name="ds1" />
    <dataSource type="JdbcDataSource"
        driver="oracle.jdbc.driver.OracleDriver"
        url="URL"
        user="USER"
        password="PASSWORD"
        name="ds2" />
    <document name="content">
        <entity name="schema" dataSource="ds2" query="select VALUE from app_system_parameters where key = 'atg.current.catalog.schema' and expiration_date is null"> 
            <entity name="apps" dataSource="ds1" query="select CS_APPS_ID, package_name, market_url, price, min_os, supported_form_factor from ${schema.VALUE}.cs_apps">
                <entity name="nonSupportedProducts" dataSource="ds1" query="select product_id from cs_product_not_supported where cs_apps_id = '${apps.CS_APPS_ID}'"/>
                <entity name="rating" dataSource="ds1" query="select avg_overall_rating from cs_rating_summary where product_id = '${apps.CS_APPS_ID}'"/>
                <entity name="product" dataSource="ds1" query="select PARENT_CAT_ID, display_name, description, long_description from ${schema.VALUE}.dcs_product where product_id = '${apps.CS_APPS_ID}'">
                    <entity name="category" dataSource="ds1" query="select display_name as category_name from ${schema.VALUE}.dcs_category where category_id = '${product.PARENT_CAT_ID}'"/>
                </entity>
            </entity>
        </entity>
    </document>
</dataConfig>

Schema snippet:

<field name="VALUE" type="string" indexed="true" stored="true"/>
<field name="CS_APPS_ID" type="string" indexed="true" stored="true" required="true"/>
<field name="package_name" type="text" indexed="true" stored="true"/> 
<field name="display_name" type="text" indexed="true" stored="true"/>
<field name="market_url" type="text" indexed="true" stored="true"/>
<field name="category_name" type="text" indexed="true" stored="true"/>
<field name="avg_overall_rating" type="tdouble" indexed="true" stored="true"/>
<field name="description" type="text" indexed="true" stored="true"/>
<field name="long_description" type="text" indexed="true" stored="true"/>
<field name="price" type="text" indexed="true" stored="true"/>
<field name="min_os" type="text" indexed="true" stored="true"/>
<field name="supported_form_factor" type="text" indexed="true" stored="true"/>
<field name="product_id" type="text" indexed="true" stored="true"/>

<uniqueKey>CS_APPS_ID</uniqueKey>

<defaultSearchField>display_name</defaultSearchField>

Here is the result from the full-import:

<response>

<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>

<lst name="initArgs">

<lst name="defaults">
<str name="config">C:\solr/conf/data-config.xml</str>
</lst>
</lst>
<str name="command">status</str>
<str name="status">idle</str>
<str name="importResponse"/>

<lst name="statusMessages">
<str name="Total Requests made to DataSource">2634</str>
<str name="Total Rows Fetched">1335</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2011-08-02 19:35:21</str>

<str name="">
Indexing completed. Added/Updated: 1 documents. Deleted 0 documents.
</str>
<str name="Committed">2011-08-02 19:42:36</str>
<str name="Optimized">2011-08-02 19:42:36</str>
<str name="Total Documents Processed">1</str>
<str name="Time taken ">0:7:14.131</str>
</lst>

<str name="WARNING">
This response format is experimental.  It is likely to change in the future.
</str>
</response>

Here is the end of the logs, after all of the query output:

Aug 2, 2011 7:42:36 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
Aug 2, 2011 7:42:36 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true,expungeDeletes=false)
Aug 2, 2011 7:42:36 PM org.apache.solr.update.SolrIndexWriter close
FINE: Closing Writer DirectUpdateHandler2
Aug 2, 2011 7:42:36 PM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
        commit{dir=C:\solr\data\index,segFN=segments_2,version=1312332478694,generation=2,filenames=[_0.tis, _0.nrm, _0.fnm, _0.tii, _0.frq, segments_2, _0.fdx, _0.prx, _0.fdt]
        commit{dir=C:\solr\data\index,segFN=segments_3,version=1312332478697,generation=3,filenames=[_1.prx, _1.fdx, _1.tis, _1.frq, _1.fdt, _1.tii, _1.fnm, _1.nrm, segments_3]
Aug 2, 2011 7:42:36 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1312332478697
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@48164feb main
Aug 2, 2011 7:42:36 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@48164feb main from Searcher@1f9fd541 main
        fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@48164feb main
        fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@48164feb main from Searcher@1f9fd541 main
        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@48164feb main
        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@48164feb main from Searcher@1f9fd541 main
        queryResultCache{lookups=1,hits=0,hitratio=0.00,inserts=1,evictions=0,size=1,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=1,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@48164feb main
        queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=1,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@48164feb main from Searcher@1f9fd541 main
        documentCache{lookups=2,hits=1,hitratio=0.50,inserts=1,evictions=0,size=1,warmupTime=0,cumulative_lookups=2,cumulative_hits=1,cumulative_hitratio=0.50,cumulative_inserts=1,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@48164feb main
        documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=2,cumulative_hits=1,cumulative_hitratio=0.50,cumulative_inserts=1,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to Searcher@48164feb main
Aug 2, 2011 7:42:36 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Aug 2, 2011 7:42:36 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [] Registered new searcher Searcher@48164feb main
Aug 2, 2011 7:42:36 PM org.apache.solr.search.SolrIndexSearcher close
INFO: Closing Searcher@1f9fd541 main
        fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
        queryResultCache{lookups=1,hits=0,hitratio=0.00,inserts=1,evictions=0,size=1,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=1,cumulative_evictions=0}
        documentCache{lookups=2,hits=1,hitratio=0.50,inserts=1,evictions=0,size=1,warmupTime=0,cumulative_lookups=2,cumulative_hits=1,cumulative_hitratio=0.50,cumulative_inserts=1,cumulative_evictions=0}
Aug 2, 2011 7:42:36 PM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
INFO: Read dataimport.properties
Aug 2, 2011 7:42:36 PM org.apache.solr.handler.dataimport.SolrWriter persist
INFO: Wrote last indexed time to dataimport.properties
Aug 2, 2011 7:42:36 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {deleteByQuery=*:*,add=[prod27350148],optimize=} 0 1
Aug 2, 2011 7:42:36 PM org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:7:14.131

Comments (1)

乖不如嘢 2024-12-04 18:12:22

To answer my own question, I needed to add a flag (rootEntity="false") to the root entity element. This is because that query pulls a property to inject into the nested entities but isn't tied to the results of the nested entities.
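
As a rough sketch of what that change might look like, here is the outer part of the data-config.xml from the question with the flag added. It assumes the "root entity element" in the answer means the outermost `schema` entity; the nested entities are left out for brevity and would stay as posted. By default, DataImportHandler treats the entities directly under `<document>` as root entities and creates one Solr document per row they return. Setting `rootEntity="false"` passes that role to the entity directly beneath.

```xml
<document name="content">
    <!-- rootEntity="false": rows from this entity do not create documents;
         the entity directly below it (apps) is treated as the root instead -->
    <entity name="schema" dataSource="ds2" rootEntity="false"
            query="select VALUE from app_system_parameters where key = 'atg.current.catalog.schema' and expiration_date is null">
        <!-- each row returned by apps should now produce one Solr document -->
        <entity name="apps" dataSource="ds1"
                query="select CS_APPS_ID, package_name, market_url, price, min_os, supported_form_factor from ${schema.VALUE}.cs_apps">
            <!-- nonSupportedProducts, rating, product and category entities stay as in the question -->
        </entity>
    </entity>
</document>
```

This reading would be consistent with the status output in the question: the `schema` query presumably returns a single row, so only one document was created even though 1335 rows were fetched.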
