MySQL: select records that share at least X matching characters

Posted on 2024-07-27 20:17:01

I am trying to accomplish the following. Let's say we have a table that contains these fields (ID, content):

1 | apple

2 | pineapple

3 | application

4 | nation

Now, I am looking for a function that will tell me all possible common matches. For example, if the argument is "3", the function will return every 3-character string that appears in more than one record.

In this case, I get "app","ppl","ple","ati","tio","ion"

If the argument is "4", I get: "appl","pple","atio","tion"

If the argument is "5", I get: "apple","ation"

If the argument is "6", nothing is returned.

Until now, I have not found a function that accomplishes this.

Thx!

Some extra information:
I am using this in a PHP script with a MySQL database. I really just want to give the number of characters as an argument and, of course, the table to search in.

Comments (3)

眼藏柔 2024-08-03 20:17:01

Well, this is kind of ugly, but it does work fine. It's generic SQL and should work in most environments. Simply generate one select per substring start position, using more positions than the maximum length of the field that you're reading. Change the number 50 in the function to a number that exceeds your field length. It may return a really long query, but like I said, it'll work fine. Here is an example in Python:

import sqlite3

# In-memory database so the example can be rerun without leftovers.
c = sqlite3.connect(':memory:')

c.execute('create table myTable (id integer, content varchar(50))')
for id, content in ((1,'apple'),(2,'pineapple'),(3,'application'),(4,'nation')):
    c.execute('insert into myTable values (?,?)', [id,content])

c.commit()

def GenerateSQL(substrSize):
    # One subquery per start position; SQL string positions are 1-based, so start at 1.
    template = ("select substr(content,%i,%i) AS part, count(*) AS myCount from myTable "
                "where length(substr(content,%i,%i))=%i group by substr(content,%i,%i) ")
    subqueries = [template % (i,substrSize,i,substrSize,substrSize,i,substrSize) for i in range(1, 51)]
    # The derived table is given an alias (t) because MySQL requires one.
    sql = 'select part FROM \n\t(' + '\n\tunion all '.join(subqueries) + ') AS t \nGROUP BY part HAVING sum(myCount) > 1 ORDER BY part'
    return sql

print(GenerateSQL(3))

print(c.execute(GenerateSQL(3)).fetchall())

The query generated looks like:

select part FROM
    (select substr(content,1,3) AS part, count(*) AS myCount from myTable where length(substr(content,1,3))=3 group by substr(content,1,3)
    union all select substr(content,2,3) AS part, count(*) AS myCount from myTable where length(substr(content,2,3))=3 group by substr(content,2,3)
    union all select substr(content,3,3) AS part, count(*) AS myCount from myTable where length(substr(content,3,3))=3 group by substr(content,3,3)
    union all select substr(content,4,3) AS part, count(*) AS myCount from myTable where length(substr(content,4,3))=3 group by substr(content,4,3)
    union all select substr(content,5,3) AS part, count(*) AS myCount from myTable where length(substr(content,5,3))=3 group by substr(content,5,3)
    ... ) AS t
GROUP BY part HAVING sum(myCount) > 1 ORDER BY part

And the results it produces are:

[('app',), ('ati',), ('ion',), ('ple',), ('ppl',), ('tio',)]
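
The per-position idea above can also be written without generating one subquery for every start position. The following is only a sketch under a few assumptions: it uses a recursive common table expression to produce the positions, it is exercised against SQLite only (MySQL 8.0 also has recursive CTEs, but this exact statement is untested there), and the names `pos`, `part` and `common_substrings_sql` are made up for the illustration. It counts distinct ids, so a substring repeated inside a single row is counted once.

```python
import sqlite3

# Same sample rows as in the question, kept in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER, content VARCHAR(50))")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(1, "apple"), (2, "pineapple"), (3, "application"), (4, "nation")],
)

# pos(n) yields 1..max_len; the join keeps only start positions that leave
# room for a full substring of the requested size.
QUERY = """
WITH RECURSIVE pos(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM pos WHERE n < :max_len
)
SELECT substr(content, n, :size) AS part
FROM myTable
JOIN pos ON n <= length(content) - :size + 1
GROUP BY part
HAVING count(DISTINCT id) > 1
ORDER BY part
"""

def common_substrings_sql(conn, size):
    """Return substrings of length `size` that occur in more than one row."""
    max_len = conn.execute(
        "SELECT max(length(content)) FROM myTable"
    ).fetchone()[0] or 0
    rows = conn.execute(QUERY, {"max_len": max_len, "size": size})
    return [part for (part,) in rows]

for size in (3, 4, 5, 6):
    print(size, common_substrings_sql(conn, size))
# 3 ['app', 'ati', 'ion', 'ple', 'ppl', 'tio']
# 4 ['appl', 'atio', 'pple', 'tion']
# 5 ['apple', 'ation']
# 6 []
```

Computing the maximum length first removes the need to hard-code a number such as 50.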
清晰传感 2024-08-03 20:17:01

Sorry, I haven't played with PHP for a while and I don't have a proper test environment for it, but I quickly devised a way of doing this in C# 3.5.

Pseudocode: build a table of the strings of the specified length, with a count next to each one of the records it occurs in. Select where count > 1:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            string[] data = { "apple", "pineapple", "application", "nation" };
            string[] result = my_func(3, data);

            foreach (string str in result)
            {
                Console.WriteLine(str);
            }
            Console.ReadKey();
        }

        // Returns every substring of length l that occurs in more than one entry of data.
        private static string[] my_func(int l, string[] data)
        {
            Dictionary<string,int> dict = new Dictionary<string,int>();
            foreach (string str in data)
            {
                // Collect the distinct substrings of this entry first, so that
                // a repeat inside one entry is counted only once.
                HashSet<string> seen = new HashSet<string>();
                for (int i = 0; i < str.Length - l + 1; i++)
                {
                    seen.Add(str.Substring(i, l));
                }
                foreach (string part in seen)
                {
                    if (dict.ContainsKey(part))
                    {
                        dict[part]++;
                    }else {
                        dict.Add(part,1);
                    }
                }
            }
            // Keep substrings seen in more than one entry, most frequent first.
            var result = from k in dict.Keys
                    where dict[k] > 1
                    orderby dict[k] descending
                    select k;

            return result.ToArray();
        }
    }
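
For comparison, here is a rough Python sketch of the same counting idea, in the language the first answer uses. The function name `common_substrings` is made up for the illustration, and the sketch assumes the rows have already been fetched into a list of strings. Each substring is counted at most once per record, which matches the "more than one record" wording of the question.

```python
from collections import Counter

def common_substrings(records, size):
    """Return the substrings of length `size` found in more than one record."""
    counts = Counter()
    for text in records:
        # A set keeps each substring once per record, so repeats inside
        # a single record do not inflate the count.
        parts = {text[i:i + size] for i in range(len(text) - size + 1)}
        counts.update(parts)
    return sorted(part for part, n in counts.items() if n > 1)

records = ["apple", "pineapple", "application", "nation"]
for size in (3, 4, 5, 6):
    print(size, common_substrings(records, size))
# 3 ['app', 'ati', 'ion', 'ple', 'ppl', 'tio']
# 4 ['appl', 'atio', 'pple', 'tion']
# 5 ['apple', 'ation']
# 6 []
```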
坦然微笑 2024-08-03 20:17:01

One obvious option is to use REGEXP. I have no prior experience with it, but this might be of help to you:
http://dev.mysql.com/doc/refman/5.1/en/regexp.html

You'll need to find a suitable expression to match what you need.
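
As an illustration of what such an expression could look like, here is a small sketch using Python's `re` module rather than MySQL (the `REGEXP` operator only reports whether a row matches, so it cannot list the shared substrings on its own). The helper name `ngrams` is made up for the example. A lookahead with a capturing group lists every overlapping substring of a given length, which could then be counted across records.

```python
import re

def ngrams(text, size):
    """List every overlapping substring of length `size` in `text`."""
    # The lookahead consumes no characters, so consecutive matches may
    # overlap; the inner group captures the substring itself.
    pattern = re.compile(r"(?=(.{%d}))" % size)
    return pattern.findall(text)

print(ngrams("apple", 3))   # ['app', 'ppl', 'ple']
print(ngrams("nation", 5))  # ['natio', 'ation']
```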
