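Geospatial and full-text search for a Rails application hosted on Heroku
I am planning a Rails application that will be hosted on Heroku and needs both geospatial and full-text search.
I know Heroku offers WebSolr and IndexTank, which sound like they would do the job, but can this be done in MySQL and/or PostgreSQL without paying for any add-ons?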
Depending on the scale of your application, you should be able to handle both FULLTEXT and SPATIAL indexes in MySQL with ease. Once your application gets massive (hundreds of millions of rows, high concurrency, and thousands of requests per second), you might need to move to another solution for FULLTEXT or SPATIAL queries. But I wouldn't recommend optimizing for that early on, since it can be very hard to do properly. For the foreseeable future, MySQL should suffice.
Read up on both spatial indexes and full-text indexes in MySQL, then take the extra steps needed to make your schema.rb file and rake tasks work with these two index types.
I have only used MySQL for both, but my understanding is that PostgreSQL has a good geospatial index solution as well.
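For modest datasets, a radius search can also be approximated on plain latitude/longitude columns, without a spatial index. This is a different technique from the SPATIAL indexes recommended above: a bounding-box prefilter followed by an exact great-circle (haversine) distance check. The sketch below is plain Ruby; the helper names and the spherical-Earth radius are assumptions made for illustration, and the box math breaks down near the poles and across the 180° meridian.

```ruby
# Sketch only: to_rad, haversine_km and bounding_box are hypothetical helpers,
# not part of Rails, MySQL, or PostgreSQL.

EARTH_RADIUS_KM = 6371.0 # mean radius; assumes a spherical Earth

def to_rad(degrees)
  degrees * Math::PI / 180.0
end

# Great-circle distance in kilometres between two points given in degrees.
def haversine_km(lat1, lng1, lat2, lng2)
  dlat = to_rad(lat2 - lat1)
  dlng = to_rad(lng2 - lng1)
  a = Math.sin(dlat / 2)**2 +
      Math.cos(to_rad(lat1)) * Math.cos(to_rad(lat2)) * Math.sin(dlng / 2)**2
  # Clamp to 1.0 so floating-point error cannot push asin out of its domain.
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt([a, 1.0].min))
end

# Latitude/longitude window that fully contains the search circle.
# Rows outside this window can be skipped using ordinary column indexes.
def bounding_box(lat, lng, radius_km)
  dlat = radius_km / EARTH_RADIUS_KM * 180.0 / Math::PI
  dlng = dlat / Math.cos(to_rad(lat))
  { min_lat: lat - dlat, max_lat: lat + dlat,
    min_lng: lng - dlng, max_lng: lng + dlng }
end

box = bounding_box(52.52, 13.40, 10.0)
puts box.inspect
puts haversine_km(52.52, 13.40, 52.50, 13.45).round(2)
```

The box values would feed a condition such as `lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?` (column names assumed), and the surviving rows would then be filtered with haversine_km in the application.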
If you have a database at Heroku, you can use Postgres's support for full-text search: http://www.postgresql.org/docs/8.3/static/textsearch.html. The oldest servers Heroku runs (for shared databases) are on 8.3 and 8.4. The newest are on 9.0.
A blog post noticing this little fact can be seen here: https://tenderlovemaking.com/2009/10/17/full-text-search-on-heroku.html
Apparently, that "texticle" (heh. cute.) addon works...pretty well. It will even create the right indexes for you, as I understand it.
Here's the underlying story: Postgres full-text search is pretty fast and fuss-free (although Rails integration may not be great), but it does not offer the bells and whistles of Solr or IndexTank. Make sure you read about how to properly set up GIN and/or GiST indexes and how to use the tsvector/tsquery types.
The short version:
Create an index:
CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector('english', body));
In this case "body" is the field being indexed.
Then query with the @@ operator:
SELECT * FROM ... WHERE to_tsvector('english', pgweb.body) @@ to_tsquery('hello & world') LIMIT 30
The hard part may be mapping things back into application land; the blog post cited above is trying to do that.
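As a rough illustration of that mapping step, the sketch below turns free-form user input into the "word & word" syntax that to_tsquery expects and pairs it with a parameterized version of the query above. The names build_tsquery and FULLTEXT_SQL are hypothetical, the two-argument to_tsquery('english', ...) form is a choice made here rather than something the answer prescribes, and the naive tokenizer splits words such as "don't" into two terms.

```ruby
# Sketch only: build_tsquery and FULLTEXT_SQL are hypothetical names, not part
# of Rails, texticle, or PostgreSQL.

# Turn free-form user input into an AND-ed to_tsquery string.
# Punctuation is dropped so stray operators (&, |, !, :) cannot break the query.
# Returns nil when the input contains no searchable words.
def build_tsquery(input)
  terms = input.to_s.downcase.scan(/[[:alnum:]]+/).uniq
  terms.empty? ? nil : terms.join(" & ")
end

# Parameterized form of the query shown above; $1 is bound by the driver,
# so the search string is never interpolated into the SQL text.
FULLTEXT_SQL = <<~SQL
  SELECT * FROM pgweb
  WHERE to_tsvector('english', body) @@ to_tsquery('english', $1)
  LIMIT 30
SQL

puts build_tsquery("Hello, world!")
puts FULLTEXT_SQL
```

With the pg gem this would be run along the lines of `conn.exec_params(FULLTEXT_SQL, [build_tsquery(params_q)])`, where conn and params_q are assumed names; skip the query entirely when build_tsquery returns nil.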
Dedicated databases can also be provisioned with PostGIS, a very powerful and fully featured system for indexing and querying geographical data. OpenStreetMap uses the built-in PostgreSQL geometry types extensively, and many people combine them with PostGIS to great effect.
Both of these (full-text search and PostGIS) take advantage of the extensible data type and indexing infrastructure in Postgres, so you should expect them to perform well across many, many records (if performance looks bad, spend a little time carefully reviewing your setup). You can also take advantage of the fact that these features work in combination with transactions and structured data. For example:
CREATE TABLE products (pk bigserial, price numeric, quantity integer, description text);
can just as easily be used with full-text search: any text field will do, and it can be combined with regular attributes (price and quantity in this case).
I'd use Thinking Sphinx, which connects Rails to the Sphinx full-text search engine and is also deployable on Heroku.
It has geo search built in: http://freelancing-god.github.com/ts/en/geosearching.html
EDIT:
Sphinx is almost ready for Heroku, see here: http://flying-sphinx.com/
IndexTank is now free for up to 100k documents on Heroku; we just haven't updated the documentation. This may not be enough for your needs, but I thought I'd let you know just in case.
For full-text search via PostgreSQL I recommend pg_search; I am using it myself on Heroku at the moment. I have not used texticle, but from what I can see pg_search has had more development activity lately, and it was built upon texticle. Note that it will not add indexes for you; you have to do that yourself.
I cannot find the thread now, but I saw that Heroku offered an option for PostgreSQL geo search, though it was in beta.
My advice, if you cannot find a PostgreSQL solution, is to host your own Solr instance (on an EC2 instance) and use the Sunspot Solr gem to integrate it with Rails.
I have implemented my own solution and used WebSolr as well. Basically, what they give you is a hassle-free Solr instance of your own. Is it worth the money? In my opinion, no. For integration you use the Sunspot Solr client there as well, so the only question is whether you are going to pay somebody $20/$40/... to host Solr for you. I know you also get backups, maintenance, etc., but call me cheap: I prefer my own instance. Also, WebSolr is locked to the 1.4.x version of Solr.