Parsing and formatting search results

Posted on 2024-07-14 04:52:19

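Search:
Scripting+Language Web+Pages Applications
Results:
...scripting language originally...producing dynamic web pages. It has...graphical applications...purpose scripting language that is...creating web pages as output...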

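Suppose I want one value that sets how many characters of padding are allowed on either side of a match, and another value that sets how many matches are shown in the result (i.e., I only want to see the first 5 matches and nothing more).

How exactly would you go about doing this?

This is fairly language-agnostic, but I will be implementing the solution in a PHP environment, so please restrict answers to options that do not require a specific language or framework.

Here is my thought process: create an array from the search terms. Determine which search term has the lowest index in the article body. Collect that portion of the body into another variable, then remove that portion from the article body. Return to step 1. You could even add a counter for each term and skip the term once its counter reaches 3 or so.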
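The loop described above could be sketched roughly as follows in Python. The function name, its parameters, and the case-insensitive matching are illustrative assumptions, not part of the question. Instead of physically removing the consumed portion of the body, the sketch advances a cursor past it, which has the same effect.

```python
def excerpt_by_earliest_match(body, terms, pad_chars=20, max_per_term=3):
    """Repeatedly take the earliest remaining match of any term (hypothetical helper)."""
    counts = {term: 0 for term in terms}
    lowered = body.lower()
    snippets = []
    cursor = 0  # everything before this offset counts as already consumed

    while True:
        # Among terms still under their limit, find the one that occurs first.
        best = None
        for term in terms:
            if counts[term] >= max_per_term:
                continue
            pos = lowered.find(term.lower(), cursor)
            if pos != -1 and (best is None or pos < best[0]):
                best = (pos, term)
        if best is None:
            break

        pos, term = best
        start = max(cursor, pos - pad_chars)
        end = min(len(body), pos + len(term) + pad_chars)
        snippets.append(body[start:end])
        counts[term] += 1
        cursor = end  # "remove" the collected portion by moving past it

    if not snippets:
        return ""
    return "..." + "...".join(snippets) + "..."
```

Because every pass searches for all terms from the current cursor, the order of the terms in the query does not affect the order in which they are found.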

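Important:

The solution must match ALL search terms in a non-linear fashion. That is, if term 1 occurs after term 2, it should be found after term 2. Likewise, it should be found after term 3 as well. If term 3 happens to occur before terms 1 and 2, it should be found before terms 1 and 2.

The solution should allow me to declare "allow up to three matches for each term, then terminate the summary."

Extra credit:

Make the padding variable optionally pad by words rather than by characters.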
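One possible way to support both padding units is a small helper that widens a match span by either characters or whole words. This is only a sketch: `pad_bounds` and its `unit` parameter are hypothetical names, and a "word" is assumed to be any run of non-whitespace characters.

```python
import re

def pad_bounds(text, start, end, padding, unit="chars"):
    """Return (left, right) slice bounds widened around text[start:end]."""
    if padding <= 0:
        return start, end
    if unit == "chars":
        return max(0, start - padding), min(len(text), end + padding)
    if unit != "words":
        raise ValueError("unit must be 'chars' or 'words'")

    # Start offsets of the words before the match, end offsets of the words after it.
    before = [m.start() for m in re.finditer(r"\S+", text[:start])]
    after = [end + m.end() for m in re.finditer(r"\S+", text[end:])]

    # Fall back to the text boundaries when fewer words are available than requested.
    left = before[-padding] if len(before) >= padding else 0
    right = after[padding - 1] if len(after) >= padding else len(text)
    return left, right
```

For example, with `text = "one two three four five"` and the match `three` at offsets 8 to 13, one word of padding gives the bounds `(4, 18)`, i.e. `"two three four"`.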


止于盛夏 2024-07-21 04:52:20

My thought process:

Create a results array that can hold non-unique name/value pairs (in PHP, a list of [term, position] pairs works; a plain associative array keeps only one value per key)
Loop through each search term and find its character starting positions in the search text
For each position found, add an item to the results array that stores the character position together with the search term
When you've found all the search terms, sort the array ascending by value (the character position of the search term)
Now the search results are in the order in which they appear in the search text
Loop through the results array and use the specified word padding to get the words on each side of the search term, while keeping track of the number of matches per term in a separate name/value array

Pseudocode, or my best attempt at it:

function string GetSearchExcerpt(searchText, searchTerms, wordPadding = 0, searchLimit = 3)
{
  results = new array() // list of (searchTerm, charIndex) pairs; the same term may appear more than once
  foreach (searchTerm in searchTerms)
  {
    startIndex = 0 // restart from the beginning for every term so matching does not depend on term order
    charIndex = searchText.FindByIndex(searchTerm, startIndex) // finds 1st position of searchTerm at or after startIndex, -1 if none
    while (charIndex != -1)
    {
      results.Add(searchTerm, charIndex)
      startIndex = charIndex + 1
      charIndex = searchText.FindByIndex(searchTerm, startIndex)
    }
  }
  results = results.SortByValue() // ascending by charIndex
  searchTermCount = new array() // matches seen so far per term; a missing entry counts as 0
  outputText = ""
  foreach (searchTerm => charIndex in results)
  {
    searchTermCount[searchTerm]++
    if (searchTermCount[searchTerm] <= searchLimit)
    {
      // WordPadding is a simple function that moves left or right a given number of words in searchText, starting at a specified character index, and returns those words
      outputText += "..." + WordPadding(searchText, -wordPadding, charIndex) + "<strong>" + searchTerm + "</strong>" + WordPadding(searchText, wordPadding, charIndex + searchTerm.Length)
    }
  }

  return outputText
}
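
For anyone who wants something runnable, here is a rough Python sketch of the same idea. It is an interpretation under a few assumptions, not a definitive implementation: matching is case-insensitive, every occurrence of every term is collected (not just the first), a "word" is anything separated by whitespace, and overlapping excerpts are not merged. The name `get_search_excerpt` simply mirrors the pseudocode.

```python
import re

def get_search_excerpt(search_text, search_terms, word_padding=0, search_limit=3):
    # Collect (position, matched text, term) for every occurrence of every term.
    hits = []
    for term in search_terms:
        for m in re.finditer(re.escape(term), search_text, re.IGNORECASE):
            hits.append((m.start(), m.group(0), term))
    hits.sort()  # ascending by character position, regardless of term order

    counts = {}
    parts = []
    for pos, matched, term in hits:
        counts[term] = counts.get(term, 0) + 1
        if counts[term] > search_limit:
            continue  # this term has already used up its allowed matches
        # Take up to word_padding whole words on each side of the match.
        left = search_text[:pos].split()[-word_padding:] if word_padding else []
        right = search_text[pos + len(matched):].split()[:word_padding]
        parts.append("..." + " ".join(left + ["<strong>" + matched + "</strong>"] + right))

    return "".join(parts) + ("..." if parts else "")
```

Calling `get_search_excerpt(text, ["web", "scripting"], word_padding=1)` on a text where "scripting" comes before "web" returns the "scripting" excerpt first, since hits are sorted by position rather than by query order. The text is inserted into the output as-is, so HTML-escape it separately if it can contain markup.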
