Hash Function
Source
- lintcode: (128) Hash Function
Problem
In the Hash data structure, a hash function converts a string (or a key of any other type) into an integer that is at least zero and smaller than the hash size. The goal of designing a hash function is to "hash" the key as unpredictably as possible, so that a good hash function produces as few collisions as possible. A widely used hash function relies on the magic number 33 and treats any string as a big integer in base 33, as follows:
hashcode("abcd") = (ascii(a) * 33^3 + ascii(b) * 33^2 + ascii(c) * 33 + ascii(d)) % HASH_SIZE
                 = (97 * 33^3 + 98 * 33^2 + 99 * 33 + 100) % HASH_SIZE
                 = 3595978 % HASH_SIZE
Here HASH_SIZE is the capacity of the hash table (you can think of a hash table as an array with indices 0 to HASH_SIZE-1).
Given a string as a key and the size of the hash table, return the hash value of this key.
Example
For key="abcd" and size=100, return 78
Clarification
For this problem, you do not need to design your own hash algorithm or consider collisions; you only need to implement the algorithm as described.
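As a sanity check on the definition, the formula can be transcribed literally with arbitrary-precision integers. The sketch below is only a reference for small inputs, not one of the solutions in this article; the class name `HashReference` and the parameter name `hashSize` are illustrative assumptions.

```java
import java.math.BigInteger;

// Hypothetical reference implementation: evaluates the definition term by term.
class HashReference {
    static int hashCode(char[] key, int hashSize) {
        BigInteger base = BigInteger.valueOf(33);
        BigInteger sum = BigInteger.ZERO;
        for (int i = 0; i < key.length; i++) {
            // Weight of position i is 33^(length - i - 1), computed without overflow.
            BigInteger weight = base.pow(key.length - i - 1);
            sum = sum.add(weight.multiply(BigInteger.valueOf(key[i])));
        }
        // The remainder is taken once at the end, exactly as the formula is written.
        return sum.mod(BigInteger.valueOf(hashSize)).intValueExact();
    }

    public static void main(String[] args) {
        System.out.println(hashCode("abcd".toCharArray(), 100)); // prints 78
    }
}
```

`BigInteger` never overflows, so this version is useful for cross-checking faster implementations, but it is much slower than working with primitive types.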
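Solution 1
This is a basic implementation problem. Most people's first instinct is to evaluate the definition directly, but there is a catch: for longer strings, computing powers of 33 in a long overflows. The key to this problem is therefore how to handle integer overflow. For modular arithmetic, the identity (a * b) % m = ((a % m) * (b % m)) % m is essential. Reducing after every multiplication keeps each intermediate value below m squared, which avoids overflow and keeps every arithmetic step cheap.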
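To make the overflow concrete, here is a small sketch that finds the first power of 33 that no longer fits in a signed 64-bit `long`, together with a helper that applies the modular multiplication identity. The names `OverflowDemo`, `firstOverflowExponent`, and `mulMod` are hypothetical, and `mulMod` assumes non-negative operands and a modulus that fits in an `int`.

```java
// Hypothetical demo class; not part of the solutions in this article.
class OverflowDemo {
    // Returns the smallest n for which 33^n overflows a signed 64-bit long.
    static int firstOverflowExponent() {
        long power = 1;
        for (int n = 1; ; n++) {
            try {
                // multiplyExact throws ArithmeticException instead of silently wrapping.
                power = Math.multiplyExact(power, 33L);
            } catch (ArithmeticException e) {
                return n;
            }
        }
    }

    // (a * b) % m computed from reduced operands, so the product stays below m * m.
    // Assumes a >= 0, b >= 0 and m > 0.
    static long mulMod(long a, long b, int m) {
        return (a % m) * (b % m) % m;
    }

    public static void main(String[] args) {
        System.out.println(firstOverflowExponent()); // prints 13
    }
}
```

A key of only 14 characters already needs 33^13 for its first character, so a direct evaluation in `long` arithmetic goes wrong even for short strings.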
Java
class Solution {
    /**
     * @param key: A String you should hash
     * @param HASH_SIZE: An integer
     * @return an integer
     */
    public int hashCode(char[] key, int HASH_SIZE) {
        if (key == null || key.length == 0) return -1;

        long hashSum = 0;
        for (int i = 0; i < key.length; i++) {
            // key[i] contributes key[i] * 33^(length - i - 1), already reduced mod HASH_SIZE
            hashSum += key[i] * modPow(33, key.length - i - 1, HASH_SIZE);
            hashSum %= HASH_SIZE;
        }

        return (int) hashSum;
    }

    // Computes base^n % mod recursively; long holds the intermediate products
    private long modPow(int base, int n, int mod) {
        if (n == 0) {
            return 1;
        } else if (n == 1) {
            return base % mod;
        } else if (n % 2 == 0) {
            // even exponent: base^n = (base^(n/2))^2
            long temp = modPow(base, n / 2, mod);
            return (temp % mod) * (temp % mod) % mod;
        } else {
            // odd exponent: base^n = base * base^(n-1)
            return (base % mod) * modPow(base, n - 1, mod) % mod;
        }
    }
}
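Source code analysis
Solution 1 is the more intuitive approach; the only twist is that powers of 33 are computed with the private method modPow, which runs in logarithmic time and helps prevent TLE (Time Limit Exceeded). Note that multiplying two int values may overflow, so intermediate results must be stored in a long.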
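Complexity analysis
The loop makes one modPow call per character, so the time complexity is $O(n \log n)$. The auxiliary space is $O(1)$, apart from the $O(\log n)$ recursion stack of modPow. Alternatively, the modular powers can be cached in a hash table, which raises the space complexity to $O(n)$ but lowers the time complexity to $O(n)$.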
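The caching idea can be sketched as follows. This version is an assumption about how such a cache might look: it stores the modular powers in a plain `long[]` indexed by exponent rather than in a hash table, and the class name `HashWithPowerTable` and parameter `hashSize` are illustrative. The `-1` return for an empty key mirrors the solution code above.

```java
// Hypothetical variant: precompute 33^k % hashSize for every exponent the key needs.
class HashWithPowerTable {
    static int hashCode(char[] key, int hashSize) {
        if (key == null || key.length == 0) return -1;

        int n = key.length;
        long[] pow = new long[n];
        pow[0] = 1 % hashSize;
        for (int k = 1; k < n; k++) {
            // Each power is derived from the previous one in constant time.
            pow[k] = pow[k - 1] * 33 % hashSize;
        }

        long sum = 0;
        for (int i = 0; i < n; i++) {
            // Both factors are small: key[i] < 2^16 and pow[...] < hashSize < 2^31.
            sum = (sum + key[i] * pow[n - i - 1]) % hashSize;
        }
        return (int) sum;
    }
}
```

For example, `HashWithPowerTable.hashCode("abcd".toCharArray(), 100)` evaluates to 78, matching the example in the problem statement.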
Solution 2 - Using the modulo identities
Solution 1 shows that the time complexity is still relatively high, which makes it inefficient as a basic library routine. Take hashcode("abc") as an example:
$$
\begin{array}{rcl}
hashcode(abc) & = & (a \times 33^{2} + b \times 33 + c) \% M \\
 & = & (33(33 \times a + b) + c) \% M \\
 & = & (33(33(33 \times 0 + a) + b) + c) \% M
\end{array}
$$
Combined with $(a \times b) \% M = ((a \% M) \times (b \% M)) \% M$ and $(a + b) \% M = ((a \% M) + (b \% M)) \% M$, the remainder can be taken after every step of the nested expression, so an iterative implementation is straightforward.
Java
class Solution {
    /**
     * @param key: A String you should hash
     * @param HASH_SIZE: An integer
     * @return an integer
     */
    public int hashCode(char[] key, int HASH_SIZE) {
        if (key == null || key.length == 0) return -1;

        long hashSum = 0;
        for (int i = 0; i < key.length; i++) {
            // shift the running value one base-33 digit left, then append key[i]
            hashSum = 33 * hashSum + key[i];
            // reduce immediately so hashSum stays below HASH_SIZE
            hashSum %= HASH_SIZE;
        }

        return (int) hashSum;
    }
}
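Source code analysis
The essence is the update hashSum = 33 * hashSum + key[i]; followed by the modulo. Because hashSum is reduced below HASH_SIZE after every iteration, the next update always fits in a long.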
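To see the iteration at work, the sketch below records the running value after each character. The class name `HornerTrace` and the method `trace` are hypothetical helpers added for illustration only.

```java
import java.util.Arrays;

// Hypothetical helper: same update rule as above, but keeps every intermediate value.
class HornerTrace {
    static long[] trace(char[] key, int hashSize) {
        long[] steps = new long[key.length];
        long hashSum = 0;
        for (int i = 0; i < key.length; i++) {
            hashSum = (33 * hashSum + key[i]) % hashSize;
            steps[i] = hashSum;
        }
        return steps;
    }

    public static void main(String[] args) {
        // 'a' -> 97, 'b' -> (97 * 33 + 98) % 100 = 99,
        // 'c' -> (99 * 33 + 99) % 100 = 66, 'd' -> (66 * 33 + 100) % 100 = 78
        System.out.println(Arrays.toString(trace("abcd".toCharArray(), 100)));
        // prints [97, 99, 66, 78]
    }
}
```

The last entry, 78, agrees with the example in the problem statement, and no intermediate value ever exceeds 33 * (HASH_SIZE - 1) plus one character code.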
Complexity analysis
Time complexity $O(n)$, space complexity $O(1)$.