Distinct Subsequences
Source
- leetcode: Distinct Subsequences | LeetCode OJ
- lintcode: (118) Distinct Subsequences
Given a string S and a string T, count the number of distinct subsequences of T in S.
A subsequence of a string is a new string
which is formed from the original string by deleting some (can be none) of the characters
without disturbing the relative positions of the remaining characters.
(ie, "ACE" is a subsequence of "ABCDE" while "AEC" is not).
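The definition above can be checked mechanically. The following is a minimal sketch in Python; the helper name `is_subsequence` is hypothetical and not part of the original problem.

```python
def is_subsequence(t, s):
    """Return True if t can be formed from s by deleting characters
    without reordering the remaining ones."""
    it = iter(s)
    # `ch in it` advances the iterator until it finds ch, so the
    # characters of t must appear in s in the same relative order.
    return all(ch in it for ch in t)


print(is_subsequence("ACE", "ABCDE"))  # True
print(is_subsequence("AEC", "ABCDE"))  # False
```

This only answers whether a subsequence exists; the problem itself asks how many distinct ways it can be formed.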
Example
Given S = "rabbbit", T = "rabbit", return 3.
Challenge
Do it in $O(n^2)$ time and $O(n)$ memory.
$O(n^2)$ memory is also acceptable if you do not know how to optimize memory.
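Solution 1
First, distinguish a subsequence from a substring: a subsequence does not have to be contiguous. The problem asks how many times T occurs in S as a subsequence. Setting the implementation aside, a natural first idea is to compare the leading characters of S and T one by one: when they match, delete both; when they differ, delete only the leading character of S; then keep comparing until every character of T has been consumed. This simple idea has a flaw: the problem asks for the number of subsequences, not whether one exists, so a character of S cannot be discarded lightly when the characters differ. How, then, can we obtain the count?
To count the distinct subsequences, we can no longer simply remove the leading character of S when the leading characters of S and T differ. Instead, we first make a copy of S, then compare the new string obtained by removing the leading character of S against T. This is somewhat similar to how the pruning function is handled in depth-first search.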
Python
class Solution:
    # @param S, T: Two string.
    # @return: Count the number of distinct subsequences
    def numDistinct(self, S, T):
        if S is None or T is None:
            return 0
        if len(S) < len(T):
            return 0
        if len(T) == 0:
            return 1

        num = 0
        for i, Si in enumerate(S):
            if Si == T[0]:
                num += self.numDistinct(S[i + 1:], T[1:])

        return num
C++
class Solution {
public:
    /**
     * @param S, T: Two string.
     * @return: Count the number of distinct subsequences
     */
    int numDistinct(string &S, string &T) {
        if (S.size() < T.size()) return 0;
        if (T.empty()) return 1;

        int num = 0;
        for (int i = 0; i < S.size(); ++i) {
            if (S[i] == T[0]) {
                string Si = S.substr(i + 1);
                string t = T.substr(1);
                num += numDistinct(Si, t);
            }
        }

        return num;
    }
};
Java
public class Solution {
    /**
     * @param S, T: Two string.
     * @return: Count the number of distinct subsequences
     */
    public int numDistinct(String S, String T) {
        if (S == null || T == null) return 0;
        if (S.length() < T.length()) return 0;
        if (T.length() == 0) return 1;

        int num = 0;
        for (int i = 0; i < S.length(); i++) {
            if (S.charAt(i) == T.charAt(0)) {
                // T.length() >= 1, T.substring(1) will not throw index error
                num += numDistinct(S.substring(i + 1), T.substring(1));
            }
        }

        return num;
    }
}
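Source code analysis
- Handle the null case (in C++, assigning NULL to a string is an error, and the function cannot handle that situation internally)
- If S is shorter than T, T cannot be a subsequence of S, so return 0
- If T has length 0, T is a subsequence of S, so return 1
Because the for loop is entered only when T.length() >= 1, when T has length 1 the Java call T.substring(1) yields the empty string "" rather than throwing an index-out-of-bounds exception.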
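Complexity analysis
In the best case, S shares no character with T, and the time complexity is $O(n)$. In the worst case, all characters of S and T are identical. Drawing the recursive call stack for this case shows that it closely resembles a depth-first search, with the relation $f(n) = \sum_{i=1}^{n-1} f(i)$. Since this recurrence implies $f(n) = 2 f(n-1)$, the cost grows exponentially, which is far higher than the complexity of the naive Fibonacci recursion.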
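To get a feel for this growth, one can count how many times the brute-force recursion is invoked. The sketch below is only an illustration: the helper `count_calls` and the all-`a` test inputs are assumptions made here, not part of the original solution.

```python
def count_calls(S, T):
    """Run the brute-force recursion and return (answer, number of calls)."""
    calls = 0

    def rec(s, t):
        nonlocal calls
        calls += 1
        if len(s) < len(t):
            return 0
        if not t:
            return 1
        # Branch on every position of s that matches the first char of t.
        return sum(rec(s[i + 1:], t[1:])
                   for i, ch in enumerate(s) if ch == t[0])

    return rec(S, T), calls


for n in (4, 8, 12, 16):
    # Every character of S matches T, so every branch is explored.
    answer, calls = count_calls("a" * n, "a" * (n // 2))
    print(n, answer, calls)
```

The call count rises steeply as n grows, which is what motivates the dynamic programming approach.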
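Solution 2 - Dynamic Programming
The complexity analysis of Solution 1 shows that there are many overlapping subproblems (the same substrings are compared many times), which suggests optimizing with dynamic programming. But how should the three elements of a DP be set up? Since the problem concerns the relationship between two strings, we can try the two-sequence DP approach (DP_Two_Sequence).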
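Before building a table, the overlap can also be removed by caching the recursion. The following is a minimal top-down sketch, assuming a hypothetical helper `num_distinct_memo`; it works on index pairs instead of string slices and uses a skip-or-pair recursion rather than the loop of Solution 1.

```python
from functools import lru_cache


def num_distinct_memo(S, T):
    """Top-down count of subsequences of S equal to T (memoized sketch)."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways to match T[j:] inside S[i:].
        if j == len(T):
            return 1          # all of T has been matched
        if len(S) - i < len(T) - j:
            return 0          # not enough characters left in S
        ways = count(i + 1, j)            # leave S[i] unused
        if S[i] == T[j]:
            ways += count(i + 1, j + 1)   # pair S[i] with T[j]
        return ways

    return count(0, 0)


print(num_distinct_memo("rabbbit", "rabbit"))  # 3
```

Each (i, j) pair is computed at most once, so the work is bounded by len(S) * len(T). For very long inputs the recursion depth may exceed Python's default limit, which is one reason to prefer an iterative table.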
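Define f[i][j] as the number of subsequences of S[0:i] that equal T[0:j]. Next we look for the state transition, which should come from f[i-1][j], f[i-1][j-1], and f[i][j-1]. The key is the relationship between S[i] and T[j].
- S[i] == T[j]: the last characters of the two strings are equal. We can pair S[i] with T[j], which gives f[i][j] = f[i-1][j-1]. If we do not pair S[i] with T[j] and instead pair T[j] with some character in S[0:i-1], then f[i][j] = f[i-1][j]. Combining the two choices, when S[i] == T[j] we have f[i][j] = f[i-1][j-1] + f[i-1][j]
- S[i] != T[j]: the last characters differ, so S[i] cannot be paired with T[j], hence f[i][j] = f[i-1][j]
To make the state where the first characters match easy to handle (so the counts can accumulate), initialize f[i][0] to 1 and everything else to 0. When S or T is an empty string, returning 0 is one convention, and returning 1 can also be justified; the code below returns 1 when T is empty.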
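As a worked example of the recurrence, the table for S = "rabbbit" and T = "rabbit" can be printed row by row. This sketch assumes a hypothetical helper `build_table` and indexes the table by prefix lengths, so f[i][j] refers to the first i characters of S and the first j characters of T.

```python
def build_table(S, T):
    # f[i][j]: number of subsequences of the first i characters of S
    # that equal the first j characters of T.
    f = [[0] * (len(T) + 1) for _ in range(len(S) + 1)]
    for i in range(len(S) + 1):
        f[i][0] = 1  # the empty prefix of T is matched in exactly one way
    for i in range(1, len(S) + 1):
        for j in range(1, len(T) + 1):
            f[i][j] = f[i - 1][j]              # do not use S[i-1]
            if S[i - 1] == T[j - 1]:
                f[i][j] += f[i - 1][j - 1]     # pair S[i-1] with T[j-1]
    return f


table = build_table("rabbbit", "rabbit")
for row in table:
    print(row)
print(table[-1][-1])  # 3
```

The bottom-right cell holds the answer, matching the example in the problem statement.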
Python
class Solution:
    # @param S, T: Two string.
    # @return: Count the number of distinct subsequences
    def numDistinct(self, S, T):
        if S is None or T is None:
            return 0
        if len(S) < len(T):
            return 0
        if len(T) == 0:
            return 1

        # range instead of xrange so the code also runs on Python 3
        f = [[0 for i in range(len(T) + 1)] for j in range(len(S) + 1)]
        for i, Si in enumerate(S):
            f[i][0] = 1
            for j, Tj in enumerate(T):
                if Si == Tj:
                    f[i + 1][j + 1] = f[i][j + 1] + f[i][j]
                else:
                    f[i + 1][j + 1] = f[i][j + 1]

        return f[len(S)][len(T)]
C++
class Solution {
public:
    /**
     * @param S, T: Two string.
     * @return: Count the number of distinct subsequences
     */
    int numDistinct(string &S, string &T) {
        if (S.size() < T.size()) return 0;
        if (T.empty()) return 1;

        vector<vector<int> > f(S.size() + 1, vector<int>(T.size() + 1, 0));
        for (int i = 0; i < S.size(); ++i) {
            f[i][0] = 1;
            for (int j = 0; j < T.size(); ++j) {
                if (S[i] == T[j]) {
                    f[i + 1][j + 1] = f[i][j + 1] + f[i][j];
                } else {
                    f[i + 1][j + 1] = f[i][j + 1];
                }
            }
        }

        return f[S.size()][T.size()];
    }
};
Java
public class Solution {
    /**
     * @param S, T: Two string.
     * @return: Count the number of distinct subsequences
     */
    public int numDistinct(String S, String T) {
        if (S == null || T == null) return 0;
        if (S.length() < T.length()) return 0;
        if (T.length() == 0) return 1;

        int[][] f = new int[S.length() + 1][T.length() + 1];
        for (int i = 0; i < S.length(); i++) {
            f[i][0] = 1;
            for (int j = 0; j < T.length(); j++) {
                if (S.charAt(i) == T.charAt(j)) {
                    f[i + 1][j + 1] = f[i][j + 1] + f[i][j];
                } else {
                    f[i + 1][j + 1] = f[i][j + 1];
                }
            }
        }

        return f[S.length()][T.length()];
    }
}
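Source code analysis
The exception handling is the same as in Solution 1. Each dimension is allocated one extra element at initialization to simplify the boundary handling.
Complexity analysis
Because the overlapping subproblems are no longer recomputed, the double for loop gives a time complexity of $O(n^2)$. A two-dimensional matrix stores the states, so the space complexity is $O(n^2)$. The space can be reduced with a rolling array; see the Dynamic Programming chapter for details.
The code after the space optimization is as follows: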
Java
public class Solution {
    /**
     * @param S, T: Two string.
     * @return: Count the number of distinct subsequences
     */
    public int numDistinct(String S, String T) {
        if (S == null || T == null) return 0;
        if (S.length() < T.length()) return 0;
        if (T.length() == 0) return 1;

        int[] f = new int[T.length() + 1];
        f[0] = 1;
        for (int i = 0; i < S.length(); i++) {
            for (int j = T.length() - 1; j >= 0; j--) {
                if (S.charAt(i) == T.charAt(j)) {
                    f[j + 1] += f[j];
                }
            }
        }

        return f[T.length()];
    }
}
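For comparison, the same rolling-array idea can be sketched in Python. The function name `num_distinct_1d` is hypothetical; the point of the sketch is the right-to-left inner loop, which guarantees that f[j] still holds the value from the previous character of S when f[j + 1] is updated.

```python
def num_distinct_1d(S, T):
    """Count subsequences of S equal to T using O(len(T)) extra space."""
    f = [0] * (len(T) + 1)
    f[0] = 1  # the empty prefix of T is always matched once
    for ch in S:
        # Iterate j from right to left so that f[j] has not yet been
        # overwritten for the current character of S.
        for j in range(len(T) - 1, -1, -1):
            if ch == T[j]:
                f[j + 1] += f[j]
    return f[len(T)]


print(num_distinct_1d("rabbbit", "rabbit"))  # 3
```

Iterating j from left to right instead would let a single character of S be paired with several positions of T in the same pass, which overcounts.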