MySQL database search

Posted on 2024-08-13 08:37:56

Comments (1)

流绪微梦 2024-08-20 08:37:56

The speed of your "search engine" depends mainly on three things:

  1. Your SQL queries
  2. Your database design
  3. Your MySQL configuration

So there is no "flip this switch and you will get super duper performance". You will need to tackle all of these areas. On top of that, many other things can affect performance, for example the operating system, the hard drive and the amount of memory.

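Let's start with the MySQL configuration. You should try out the MySQL query cache first. If you mainly do read operations, it can boost your performance, since repeated identical queries are answered straight from the cache without reading the tables again. Be aware that the query cache is deprecated as of MySQL 5.7.20 and was removed in MySQL 8.0, so this only applies to older versions.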

Read here: MySQL Documentation on Query Cache
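To see whether the query cache is available and actually being hit, something like the following can be run from the mysql client. This is only a sketch and assumes a server running MySQL 5.7 or earlier, where these variables and counters still exist.

```sql
-- Only meaningful on MySQL 5.7 and earlier; the query cache does not exist in 8.0.

-- Configuration: query_cache_type, query_cache_size, query_cache_limit, ...
SHOW VARIABLES LIKE 'query_cache%';

-- Runtime counters: Qcache_hits, Qcache_inserts, Qcache_lowmem_prunes, ...
SHOW GLOBAL STATUS LIKE 'Qcache%';

-- Number of SELECT statements that were executed instead of served from the cache
SHOW GLOBAL STATUS LIKE 'Com_select';
```

If Qcache_hits stays small compared to Com_select, the cache is probably not helping much for your workload.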

Another important area is your database design and which storage engine you choose. Basically you have three options: InnoDB, MyISAM and Memory (there are others, but I don't really know them).

As far as I know, MyISAM and Memory only support table locking and not row locking. But again, if you mainly do read operations this won't affect you much. They are often reported to be faster than InnoDB for simple read-heavy workloads, although that depends on your MySQL version and your data. If I were you I would start with Memory, since everything is held in memory. But be aware of the implications: you may need more RAM, and a Memory table loses all of its rows when the server restarts or crashes, so you need to be able to reload the data from somewhere else.
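As a rough sketch of how to check and switch the engine: the table name dictionary is taken from the queries further down, and the statements assume you are connected to the schema that contains it.

```sql
-- Which engine does the table use right now?
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'dictionary';

-- Rebuild the table with a different engine.
-- This copies all rows, so try it on a copy of the table first.
ALTER TABLE dictionary ENGINE = InnoDB;
```

One caveat if you want to try the Memory engine: Memory tables cannot contain BLOB or TEXT columns, so the ALTER TABLE may fail if the definition column is a TEXT type.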

InnoDB, on the other hand, gives you a lot of data safety and can also be pretty fast if you configure it correctly. Unfortunately this is a wide area, so I won't cover it all. One thing to start with is to set innodb_buffer_pool_size to around 80% of your memory, assuming the machine is a dedicated database server. So if you have 10GB of RAM you could set it to 8GB.
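For the 10GB example, a minimal sketch of checking and changing the buffer pool size could look like this. It assumes MySQL 5.7.5 or later, where the buffer pool can be resized while the server is running; on older versions the value has to go into the configuration file and needs a restart.

```sql
-- Current buffer pool size in GiB
SELECT @@innodb_buffer_pool_size / 1024 / 1024 / 1024 AS buffer_pool_gib;

-- Set it to 8 GiB (8 * 1024 * 1024 * 1024 bytes)
SET GLOBAL innodb_buffer_pool_size = 8589934592;
```

A value set with SET GLOBAL is lost on restart, so once you are happy with it you still need to put it into your MySQL configuration file.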

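If your server has more than 8 CPUs you might also want to set innodb_thread_concurrency to a higher number. A rule of thumb is 2 * number of CPUs. Check the current value first, though: in recent MySQL versions the default is 0, which means no limit, and then there is nothing to raise.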
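A small sketch of that rule of thumb, assuming a hypothetical machine with 12 CPU cores:

```sql
-- Current limit; 0 means InnoDB does not limit concurrent threads
SELECT @@innodb_thread_concurrency;

-- 2 * 12 cores = 24 (the core count is only an example)
SET GLOBAL innodb_thread_concurrency = 24;
```

Treat the number as a starting point and compare the performance before and after the change.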

If you want to know more about MySQL performance you should grab a cup of coffee and read this blog: MySQL performance blog

Another important thing could be to use indices on some of your columns. But I can't really tell if it will pay off in your case, since my knowledge of the Chinese dictionary is limited ;)

Generally speaking your primary key field should have an index (MySQL creates one automatically when you declare a primary key). In addition to that you can use indices for fields you often query and that rarely change. Every change to an indexed field means the index has to be updated as well, so indices on frequently changing fields slow down writes.

As far as I know an index also only pays off when the column holds a lot of different values. If you have, for example, a column "gender" which only holds "male" or "female", the index can at best cut the table in half: with 100 users, a lookup still leaves you with about 50 rows to read. But if you use an index on their phone number, which is unique in most cases, a lookup leaves you with only one row, which is much more effective.
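To get a feeling for how many different values a column holds, you can count them. This is just a sketch using the dictionary table and ch_smpl column from the queries further down; the column aliases are made up for readability.

```sql
SELECT COUNT(DISTINCT ch_smpl)            AS distinct_values,
       COUNT(*)                           AS total_rows,
       COUNT(DISTINCT ch_smpl) / COUNT(*) AS selectivity
FROM dictionary;
```

A selectivity close to 1 is the phone number case, where almost every value is unique. A value close to 0 is the gender case, where an index will not buy you much.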

So maybe you should use an index for the column ch_smpl.
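A minimal sketch of what that could look like. The index name idx_ch_smpl is made up, and the statement assumes ch_smpl is a CHAR or VARCHAR column (a TEXT column would need a prefix length in the index definition).

```sql
-- Create the index on the lookup column
CREATE INDEX idx_ch_smpl ON dictionary (ch_smpl);

-- Check whether MySQL plans to use it for the lookup
EXPLAIN SELECT definition FROM dictionary WHERE ch_smpl = '我';
```

If the index is picked up, the key column of the EXPLAIN output should show idx_ch_smpl and the rows estimate should be small.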

Last but not least: your query. My first piece of advice is to select as little data as possible. That means avoiding queries like this:

select * from ...

In your case: If you only want to have the definition for 我 you should use this query:

select definition from dictionary where ch_smpl = '我'

and not

select * from dictionary where ch_smpl = '我'

Also avoid LIKE patterns with the percent symbol in front of the search term, since MySQL then cannot use the index on this column and has to scan every row.

For example:

select * from dictionary where ch_smpl like '%我'

You should use the percent symbol only after the term:

select * from dictionary where ch_smpl like '我%'
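You can see the difference for yourself with EXPLAIN. The sketch below assumes that an index on ch_smpl already exists; what exactly is printed depends on your MySQL version and your data.

```sql
-- Percent symbol after the term: MySQL can use the index for a range scan
EXPLAIN SELECT definition FROM dictionary WHERE ch_smpl LIKE '我%';

-- Percent symbol in front of the term: the index cannot narrow the search
EXPLAIN SELECT definition FROM dictionary WHERE ch_smpl LIKE '%我';
```

Typically the first plan shows type "range" with the index in the key column, while the second one shows type "ALL", which is a full table scan.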

One last piece of advice: as I said before, there is no special switch you can flip. There are a lot of things you can do to achieve better performance. Try a few things out and measure the results.
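One way to measure a single query, as a sketch: EXPLAIN ANALYZE runs the query and reports the time actually spent in each step. It assumes MySQL 8.0.18 or later; on older versions you would have to time the query from your application or the mysql client instead.

```sql
-- Executes the query and prints the actual row counts and timings per step
EXPLAIN ANALYZE
SELECT definition FROM dictionary WHERE ch_smpl = '我';
```

Run it before and after each change (new index, different engine, different settings) so you are comparing numbers and not impressions.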
