Published 2024-06-17 01:03:15

1698. Number of Distinct Substrings in a String

Description

Given a string s, return _the number of distinct substrings of_ s.

A substring of a string is obtained by deleting any number of characters (possibly zero) from the front of the string and any number (possibly zero) from the back of the string.

 

Example 1:

Input: s = "aabbaba"
Output: 21
Explanation: The set of distinct strings is ["a","b","aa","bb","ab","ba","aab","abb","bab","bba","aba","aabb","abba","bbab","baba","aabba","abbab","bbaba","aabbab","abbaba","aabbaba"]

Example 2:

Input: s = "abcdefg"
Output: 28

 

Constraints:

  • 1 <= s.length <= 500
  • s consists of lowercase English letters.

 

Follow up: Can you solve this problem in O(n) time complexity?
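
One known way to reach linear time (for a fixed alphabet) is a suffix automaton. The source does not give such a solution, so the following is only a minimal sketch under that assumption. Every state `v` other than the root accounts for `length[v] - length[link[v]]` distinct substrings, and summing this over all states gives the answer. The name `count_distinct_sam` and the array names are illustrative, not from the original.

```python
def count_distinct_sam(s: str) -> int:
    # Parallel arrays describing the automaton states (state 0 is the root).
    nxt = [{}]      # outgoing transitions: character -> state
    link = [-1]     # suffix links
    length = [0]    # length of the longest substring in each state
    last = 0        # state that represents the whole prefix read so far
    for ch in s:
        cur = len(nxt)
        nxt.append({})
        length.append(length[last] + 1)
        link.append(-1)
        p = last
        # Add the new transition along the suffix-link chain.
        while p != -1 and ch not in nxt[p]:
            nxt[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = nxt[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split q: the clone keeps the shorter substrings of q.
                clone = len(nxt)
                nxt.append(dict(nxt[q]))
                length.append(length[p] + 1)
                link.append(link[q])
                while p != -1 and nxt[p].get(ch) == q:
                    nxt[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    # Each non-root state contributes length[v] - length[link[v]] substrings.
    return sum(length[v] - length[link[v]] for v in range(1, len(nxt)))
```

On the two examples above, `count_distinct_sam("aabbaba")` and `count_distinct_sam("abcdefg")` should reproduce the stated outputs 21 and 28.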

Solutions

Solution 1: Brute Force Enumeration

Enumerate all substrings and insert each one into a hash set. The answer is the size of the set.

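The time complexity is $O(n^3)$, because there are $O(n^2)$ substrings and extracting and hashing each one costs up to $O(n)$. The space complexity is $O(n^2)$ when the substrings are stored as views into the original string (as in the C++ and Go code), and can reach $O(n^3)$ when each substring is copied (as in the Python and Java code). Here, $n$ is the length of the string.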

Python3:

class Solution:
  def countDistinct(self, s: str) -> int:
    n = len(s)
    return len({s[i:j] for i in range(n) for j in range(i + 1, n + 1)})

Java:

class Solution {
  public int countDistinct(String s) {
    Set<String> ss = new HashSet<>();
    int n = s.length();
    for (int i = 0; i < n; ++i) {
      for (int j = i + 1; j <= n; ++j) {
        ss.add(s.substring(i, j));
      }
    }
    return ss.size();
  }
}

C++:

class Solution {
public:
  int countDistinct(string s) {
    unordered_set<string_view> ss;
    int n = s.size();
    string_view t, v = s;
    for (int i = 0; i < n; ++i) {
      for (int j = i + 1; j <= n; ++j) {
        t = v.substr(i, j - i);
        ss.insert(t);
      }
    }
    return ss.size();
  }
};

Go:

func countDistinct(s string) int {
  ss := map[string]struct{}{}
  for i := range s {
    for j := i + 1; j <= len(s); j++ {
      ss[s[i:j]] = struct{}{}
    }
  }
  return len(ss)
}

Solution 2: String Hashing

String hashing maps a string of arbitrary length to a non-negative integer in such a way that collisions are very unlikely. Comparing two hash values then lets us decide quickly whether two strings are (almost certainly) equal.

We pick a fixed value BASE, treat the string as a number written in radix BASE, and assign each character a value greater than 0. These values are usually much smaller than BASE; for a string of lowercase letters, for example, we can set a=1, b=2, …, z=26. We then pick a fixed value MOD and use the remainder of that radix-BASE number modulo MOD as the hash value of the string.
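
The definition above can be sketched in a few lines of Python. This is an illustration only: the function name `poly_hash` is hypothetical, and BASE=131 with MOD=$2^{64}$ are just the example values discussed in this section.

```python
BASE = 131
MOD = 1 << 64  # assumption: reduce modulo 2^64 explicitly, since Python ints never overflow


def poly_hash(s: str) -> int:
    """Hash s as a radix-BASE number modulo MOD, with a=1, b=2, ..., z=26."""
    h = 0
    for c in s:
        value = ord(c) - ord("a") + 1  # character value in 1..26, never 0
        h = (h * BASE + value) % MOD
    return h
```

For instance, `poly_hash("ab")` is `1 * 131 + 2 = 133`, while `poly_hash("ba")` is `2 * 131 + 1 = 263`, so the order of the characters matters. The solution code in this section uses the raw character code instead of 1..26, which works equally well because those codes are also nonzero and smaller than BASE.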

Common choices are BASE=131 or BASE=13331, for which collisions are extremely rare in practice. If two strings have the same hash value, we treat them as equal. MOD is usually taken to be $2^{64}$: in C++ we can store the hash in an unsigned long long and simply let the arithmetic overflow, which is equivalent to reducing modulo $2^{64}$ and avoids slow modulus operations.

Except on specially constructed adversarial inputs, this hash is unlikely to produce collisions, so it is generally acceptable in a reference solution. To make it harder still to construct inputs that break the hash, we can choose several suitable (BASE, MOD) pairs (for example, with large prime moduli), compute one hash per pair, and treat two strings as equal only when all of the hashes agree.
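
A minimal sketch of the multi-hash idea, assuming two (BASE, MOD) pairs. The specific moduli `1_000_000_007` and `998_244_353` and the name `count_distinct_double_hash` are illustrative choices, not values from the original. The substring length is added to the key as an extra safeguard.

```python
def count_distinct_double_hash(s: str) -> int:
    # Two (base, modulus) pairs; both moduli are primes chosen for illustration.
    params = [(131, 1_000_000_007), (13331, 998_244_353)]
    n = len(s)
    tables = []  # one (powers, prefix_hashes, modulus) triple per pair
    for base, mod in params:
        p = [1] * (n + 1)  # p[k] = base^k mod `mod`
        h = [0] * (n + 1)  # h[k] = hash of the prefix s[:k]
        for i, c in enumerate(s):
            p[i + 1] = p[i] * base % mod
            h[i + 1] = (h[i] * base + ord(c)) % mod
        tables.append((p, h, mod))
    seen = set()
    for i in range(n):
        for j in range(i + 1, n + 1):
            # Two substrings count as equal only if the length and both hashes agree.
            hashes = tuple((h[j] - h[i] * p[j - i]) % mod for p, h, mod in tables)
            seen.add((j - i, *hashes))
    return len(seen)
```

Calling `count_distinct_double_hash("aabbaba")` should give 21, matching Example 1. Compared with a single hash modulo $2^{64}$, this trades a little extra work per substring for a much smaller chance that two different substrings share a key.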

The time complexity is $O(n^2)$, and the space complexity is $O(n^2)$. Here, $n$ is the length of the string.
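
The code for this solution obtains the hash of any substring in $O(1)$ from prefix hashes. A short derivation of why that works, writing $B$ for BASE, $s_1 \dots s_n$ for the character values (1-indexed), and with all arithmetic taken modulo MOD:

$$h_k = \sum_{t=1}^{k} s_t \, B^{k-t}, \qquad p_k = B^{k}$$

$$h_j - h_{i-1} \cdot p_{j-i+1} = \sum_{t=1}^{j} s_t \, B^{j-t} - \sum_{t=1}^{i-1} s_t \, B^{j-t} = \sum_{t=i}^{j} s_t \, B^{j-t}$$

The right-hand side is exactly the hash of the substring $s_i \dots s_j$ on its own, so equal substrings get equal values wherever they occur in the string.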

Python3:

class Solution:
  def countDistinct(self, s: str) -> int:
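    base = 131
    mod = 1 << 64  # Python ints do not overflow, so reduce modulo 2^64 explicitly
    n = len(s)
    p = [0] * (n + 10)  # p[i] = base^i mod 2^64
    h = [0] * (n + 10)  # h[i] = hash of the prefix s[:i]
    p[0] = 1
    for i, c in enumerate(s):
      p[i + 1] = p[i] * base % mod
      h[i + 1] = (h[i] * base + ord(c)) % mod
    ss = set()
    for i in range(1, n + 1):
      for j in range(i, n + 1):
        # hash of the substring at 1-indexed positions i..j
        t = (h[j] - h[i - 1] * p[j - i + 1]) % mod
        ss.add(t)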
    return len(ss)

Java:

class Solution {
  public int countDistinct(String s) {
    int base = 131;
    int n = s.length();
    long[] p = new long[n + 10];
    long[] h = new long[n + 10];
    p[0] = 1;
    for (int i = 0; i < n; ++i) {
      p[i + 1] = p[i] * base;
      h[i + 1] = h[i] * base + s.charAt(i);
    }
    Set<Long> ss = new HashSet<>();
    for (int i = 1; i <= n; ++i) {
      for (int j = i; j <= n; ++j) {
        long t = h[j] - h[i - 1] * p[j - i + 1];
        ss.add(t);
      }
    }
    return ss.size();
  }
}

C++:

class Solution {
public:
  int countDistinct(string s) {
    using ull = unsigned long long;
    int n = s.size();
    // std::vector instead of variable-length arrays, which are not standard C++
    vector<ull> p(n + 10);
    vector<ull> h(n + 10);
    int base = 131;
    p[0] = 1;
    for (int i = 0; i < n; ++i) {
      p[i + 1] = p[i] * base;
      h[i + 1] = h[i] * base + s[i];
    }
    unordered_set<ull> ss;
    for (int i = 1; i <= n; ++i) {
      for (int j = i; j <= n; ++j) {
        ss.insert(h[j] - h[i - 1] * p[j - i + 1]);
      }
    }
    return ss.size();
  }
};

Go:

func countDistinct(s string) int {
  n := len(s)
  p := make([]int, n+10)
  h := make([]int, n+10)
  p[0] = 1
  base := 131
  for i, c := range s {
    p[i+1] = p[i] * base
    h[i+1] = h[i]*base + int(c)
  }
  ss := map[int]struct{}{}
  for i := 1; i <= n; i++ {
    for j := i; j <= n; j++ {
      ss[h[j]-h[i-1]*p[j-i+1]] = struct{}{}
    }
  }
  return len(ss)
}
