List
ListNode
1 | class ListNode { |
temp.next = temp.next.next // is okay what if: current.next = current.next.next // NullPointerException
Examples of Linked List Operations:
1 | // length() 1 -> 2 -> 3 -> null |
head could be changed - we need to return the new head if changed usually need to consider
- head == null
- head.next = null
while loop - we need to understand what is the terminate condition
List Interface
An ordered collection (sometimes called a sequence).
Lists can contain duplicate elements.
The user of a List generally has precise control over where in the list each element is inserted and can access elements by their index.
Provided functionalities(interfaces)
- Random Access: set(int index, E e) get(int index) add(int index, E e), add(E e) remove(int index)
- Search
- Iterator
- Range-View
- Other useful methods: isEmpty(); size();
Most popular implementation class: ArrayList, LinkedList.
Abstract class, interface
Both can have methods declared without implementation (abstract methods).
1 | interface Dictionary { |
Declare into a general type, initialize into a concrete type (Interfaces are also types):
1 | List<Node> myList = new LinkedList<Node>(); |
Preserve the general semantic of a List
- Can only use methods in a List interface
- Not to make any assumptions on implementation
- Make sure interchangeable in future
Provide concrete implementation by using LinkedList
- Choose the best implementation for the current use case
list: size() is O(1) time complexity because of eager computation
1 | isEmpty() { |
节省代码量,同时使逻辑更清晰。
Comparison
Operation | ArrayList | LinkedList |
---|---|---|
get(int index) at head/tail | O(1) | O(1) - maintain both head and tail |
get(int index) in middle | O(1) | O(n) |
set(int index) at head/tail | O(1) | O(1) |
set(int index) in middle | O(1) | O(n) |
add(int index, E e) in middle | O(n) | O(n) |
add(int index, E e) at head | O(n) | O(1) |
add(E e) at tail | amortized: O(1) | O(1) |
remove(int index) at head | O(n) | O(1) |
remove(int index) at tail | O(1) | O(1) |
remove(int index) at middle | O(n) | O(n) |
size() | O(1) | O(1) |
isEmpty() | O(1) | O(1) |
When to choose what? ArrayList or LinkedList?
If you have a lot of random access operations, use ArrayList.
If you always add/remove at the end, use ArrayList
- When the time complexity is similar for using ArrayList and LinkedList, use ArrayList. (ArrayList use memory more efficiently - For each element LinkedList need to create different ListNode in memory, there is extra cost of storing the "next", "prev" reference, and the allocation is not contiguous).
- Stack and Vector class are not recommended, whenever you need a Vector, use ArrayList, whenever you need a Stack, use LinkedLIst.
- E.g. reason: http://stackoverflow.com/questions/25922201/linkedlist-vs-stack
ArrayList
ArrayList is regarded as a resizable array.
- maintain an array with potential unused cells
- will expand the array if there is no unused cells available
- by replacing the maintained array with a new larger array (1.5 times larger by default)
1 | class ArrayList<E> { |
get(int index): array[index]
set(int index, E e): array[index] = e
size(): // size means the number of used cells, not the capacity.
when call add(), remove(), size is modified as well
call add(11): array = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, size = 10, capacity = 10 (array.length). call add(11):
- check if it is full (size = capacity)
- if it is full, create a new array1, array1's size= 1.5 * array's size, copy all the element in array to array1, and change the maintained array to array1. array1 = new Integer[size * 1.5] array1 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, null, null, null, null, null} size = 10, capacity= 15 array = array1
- set the 11th element's value to 11, size++. array = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, null, null, null, null} size= 11, capacity= 15
call add(int index, E e):
expand if necessary
then right move all the elements after index by 1,
set the element at index.
Example: array= {1, 3, 4, 5, 6, null, null, null, null, null}, size = 5, capacity = 10 call add(1, 2) - add Integer 2 at index 1
do not need to expand since size < capacity
move all the elements of {3, 4, 5, 6} right by 1 position, size++ array= {1, 2, 3, 4, 5, 6, null, null, null, null}, size = 6, capacity = 10
set index 1 as Integer 2 array = {1, 2, 3, 4, 5, 6, null, null, null, null}
call remove(1): remove the element at index 1
check if index out of bound first.
move all the elements of {3, 4, 5, 6} left by 1 position, size--
LinkedList
backend by a doubly linked list, wrap the size, head and tail information.
1 | class ListNode<E> { |
get(int index) - from the head/tail, traverse the list to get the element at index
set(int index, E e) - from the head/tail, traverse the list to the get the element at index, and change it
add(int index, E e), add(E e)
- at tail - append new node after tail and update tail to the new added node
- at head - insert in front of the head, and update head to the new added node
- at middle
remove(int index)
- at head - update the head to head.next
- at tail - update the tail to tail.prev
- at middle · from head/tail, traverse the list to get the element at index, and remove it.
size() - O(1) - eager computation.
isEmpty()
Something about coding
null & empty collections, array
1 | int[] array = null; |
They are DIFFERENT!
null - there is no array object associated with the reference empty array/list - there is an array/list object, but the array/list object does not contain any element, yet (length/size == 0).
Example
1 | // 1 |
Check corner cases & boundary conditions if (array.length == 0 || array == null)
X NullPointerException if (array == null || array.length == 0)
Remember: always do null check before any ofher checks. The logic expressions are evaluated from left to right (short circuit)
The things to keep in mind
- Check null - check at very first place
- Check length/size/head == null/head.next == null/other initial check conditions.
- Check index out of bound/NullPointer (even inside the for/while loop)
- keep in mind any index can not be out of range(< 0 ll >= array.length).
- keep in mind some ListNode reference might be null.
- What is the termination condition of for/while loop Make some concrete examples to determine if the logic is correct.
About dummy node It is not a must, but can make your life much easier, there are several suitable conditions using dummy will be very handy.
- the head could be changed when solving the problem
Example - insert a value into a sorted list 1-> 3 -> 5
insert 4
dummy -> 0 -> 1 -> 3 -> 5. prev while (prev.next != null && prev.next.val < val){ prev = prev.next; }
1). With dummy node's help
1 | public ListNode insert(ListNode head, int val) { |
2). Without dummy node's help
1 | public ListNode insert(ListNode head, int val) { |
- not sure yet which node will be head when constructing the list
3 -> 5
4 -> 6
dummy -> 1 -> 2 -> ……
Example - merge two sorted list
1). With dummy node's help
1 | public ListNode merge(ListNode h1, ListNode h2) { |
2). Without dummy node's help
1 | public ListNode merge(ListNode h1, ListNode h2) { |
Tips: 尽量不使用global variable
Queue & Stack & Deque
In Java In Java, we usually refer it to Queue and Deque interface. A collection to hold multiple elements prior to processing. Contains a buffer and specifies a mechanism to choose which is the next element to process.
Queue - FIFO (exception: PriorityQueue) http://docs.oracle.com/javase/7/docs/api/java/util/Queue.html
offer() - offer at the tail
poll()- poll at the head
peek() - look at the head without polling it out
- Deque - FIFO & LIFO (both queue and stack) http://docs.oracle.com/javase/7/docs/api/java/util/Deque.html
- provide both offer at head/tail, poll at head/tail, and peek at head/tail
- so it can be used as a queue, or a stack.
Common APIs
- Use the same set of APIs
- What's the point: java Queue中 add/offer,element/peek,remove/poll区别
Other useful methods:
- isEmpty();
- size();
All the operations' cost is O(1). Most popular implementation class: LinkedList, ArrayDeque(Java 6 onwards).
When you need a queue or stack, you can just use LinkedList, as it implements both Queue and Deque interfaces. (Do not use Stack class. )
ArrayDeque: no null values in deque.
1 | import java.util.ArrayList; |
Our Own Implementation Stack, Queue and Deque are not premier data structures, their implementation depend on other data structures - what could be the candidates? - sequence type(线性的)
In general, two ways:
use array
use Linked list
Let's see how to use Linked List to do so. Usually, when we use Linked List, we might want to determine
- singly linked list?
- doubly linked list?
Stack implemented by Linked List
1 | class ListNode { |
Queue implemented by Linked List
1 | public class Queue { |
Queue implemented by array
Let's make a simpler case first, assume the capacity of the queue is bounded.
First, we need to maintain the position of head and tail - use two pointers.
offer = put one element, tail++ (tail points to the next available position) poll = grab element at head, head++ (head points to the next element in queue)
What is the problem here? what if tail == array.length - 1 and we need to do tail++?
Circular Array - we can connect the start and end of the array, so that it is a cycle. index of array.length <=> index of 0
Quick tip: head = head + 1 == array.length ? 0 : head + 1; // (head + 1) % array.length;
Second, we need to know when empty, when full. head == tail -> empty head == tail -> full two solutions:
- record size, size = 0 (empty), size == array.length (full)
- head + 1 == tail -> empty, head == tail -> full, capacity =array.length- 1(next of head points to the head of the queue)
1 | public class BoundedQueue { |
Stack implemented by array
1 | public class BoundedStack { |
Exercise We have done Queue and Stack, how about Deque?
- Use Linked List
- can we still use singly linked list or we need doubly linked list?
- Use Circular Array
PriorityQueue
It is a heap with the same Queue interface with offer(), peek(), poll(). But, it is not FIFO, when poll() or peek() we always look at the smallest/largest element (min heap/max heap).
The PriorityQueue will arrange the elements based on the order of the elements (who is smaller/larger by comparing any two of them) and it is optimized for problems about "who is the smallest/largest element'.
Internally it is implemented using an array.
- offer(E e) - insert one element into the heap
- peek() - peek the top element in the heap (smallest/largest based on how to order the elements)
- poll() - remove the top element from the heap, and return it (smallest/largest based on how to order the elements)
- size()
- isEmpty()
time complexity:
- offer(E e) - O(log n)
- peek() - O(1)
- poll() - O(log n)
- remove() - O(log n), from AbstractQueue, remove(Object) - O(n) 删除任意一个元素
- size()- O(1)
- isEmpty · O(1)
Order - The PriorityQueue need to know how to compare the elements and determine which one is smaller/larger.
There are two ways to do so: The element type implementing Comparable interface
Example:
1 | PriorityQueue<Integer> minHeap = new PriorityQueue<Integer>(); |
How does the priorityqueue know how to compare the Integer objects?
The element's class can implement Comparable interface and thus implement the required method compare To(), PriorityQueue will use this method to compare any two elements.
1 | interface Comparable<E> { |
The return value of compareTo(another) method determines the order of this and another: 0 - this and another are of the same priority -1 (< 0) - this has higher priority than another 1 (> 0) - this has less priority than another
Another Example using custom class: Suppose you have an integer matrix, each row is sorted by ascending order and each column is also sorted by ascending order, we need a class to representing each cell in the matrix, and we need to compare two cells with their value in the matrix.
1 | class Cell implements Comparable<Cell> { |
Provide an extra Comparator object to compare the elements There is another interface Comparator, it is used to compare two elements with same type E.
1 | interface Comparator<E> { |
1 | import java.util.Comparator; |
If class E already implements Comparable interface, and we need another way of ordering the elements in our PriorityQueue, we can pass the new Comparator.
1 | // I want a max heap here... |
Or, there is a utility method Collections.reverseOrder()
, it will return a comparator that reverses the natural order.
equals() method of a comparator: if the incoming comparator equals the current comparator, not about orders!
Conclusion: To make PriorityQueue work, there are two ways:
- The element class E implements Comparable interface
- Provide a Comparator< E > class to compare elements. and pass an comparator object to the PriorityQueue.
if E already implements Comparable< E >, but still you provide a Comparator, PriorityQueue will choose the order specified in Comparator.
Most frequently used constructors of PriorityQueue 1. PriorityQueue<Cell> heap = new PriorityQueue<Cell>();
initialize the internal array with default capacity(11) class Cell must implements Comparable<Cell>
! 2. PriorityQueue<Cell> heap = new PriorityQueue<Cell>(16);
initialize the internal array with specified capacity(16) class Cell implements Comparable<Cell>
! 3. PriorityQueue<Cell> heap = new PriorityQueue<Cell>(16, new MyComparator();
initialize the internal array with specified capacity(16) class MyComparator implements Comparator<Cell>
!
Be Careful the initial capacity has to be > 0: // if k <= 0, IllegalArgumentException will be thrown in the constructor. PriorityQueue<Cell> minHeap = new PriorityQueue<Cell>(k, new MyComparator();
Method 1 and 2, the object inside the PQ should be comparable! Otherwise, Java will throw runtime exception (it will cast anything inside into a Comparable to do the comparison).
heapify() - convert an array to heap in O(n) time ArrayList
initialize the internal array with heapify() internally
class Cell implements Comparable
! |
it is not exposed to outside from PriorityQueue class, it is a private method. the only way to utilize the heapify() is to use the constructor
1 | // Part of the PriorityQueue implementation…… |
Nested class: define a class within a class. Static nested class vs. non-static nested class: belong to class, or belong to instance. Anonymous class: nested class with no name. Often replaceable by lambda expressions in Java8.
Something about coding: nested classes Nested class: A class within another class. Why? (Official tutorial)
- It is a way of logically grouping classes that are only used in one place: If a class is useful to only one other class, then it is logical to embed it in that class and keep the two together. Nesting such "helper classes" makes their package more streamlined.
- It increases encapsulation: Consider two top-level classes, A and B, where B needs access to members of A that would otherwise be declared private. By hiding class B within class A, A's members can be declared private and B can access them. In addition, B itself can be hidden from the outside world.
- It can lead to more readable and maintainable code: Nesting small classes within top-level classes places the code closer to where it is used.
Static nested class: a nested class associated with the outside class. Can access class variables and methods. Inner class: a nested class associated with an instance of the class. Can access instance variables and methods. Anonymous inner class (defined in a method with just new and no definition)
1 | PriorityQueue<Cell> pQueue = new PriorityQueue<>(16, new Comparator<Cell>() { |
Heap
Implement PriorityQueue (Heap) Definition
- Heap is a (binary) tree based data structure
- Across the entire tree, the relation between a parent node and a child node stays consistent.
- Example: min heap
- the parent node is always <= its two child nodes (parent node is the smallest node in the subtree rooted at itself).
- the relation between the two child nodes can differ.
- Example: min heap
- The common implementation of a heap is using a complete binary tree. * A "complete binary tree" is a binary tree in which every level, except possibly the last level, is completely filled, and all nodes are as far left as possible.
Representation
it can also be represented as an array (by the level order traversal of the binary tree), since it is a complete binary tree.
- Why? - if it is a complete binary tree, the matching between the nodes and the index of the array is determined, and the relation of parent and child nodes can be well transferred to the relation between two indices.
Example:
index of parent = i, what is the index of the two child nodes?
left child of index i = 2 * i + 1
right child of index i = 2 * i + 2
parent of index i = (i - 1) / 2
the root of the tree is at index 0.
Basic Heap Internal Operations
percolateUp()
- when to use?
- the element need to be moved up to maintain the heap's property, for example, when offering a new element into the heap.
- how?
- compare the element with its parent, move it up when necessary. do this until the element does not need to be moved.
percolateDown()
- when to use?
- the element need to be moved down to maintain the heap's property, for example, when poll the root element from the heap.
- how?
- compare the element with its two children, if the smallest one of the two children is smaller than the element, swap the element with that child. do this until the element does not need to be moved.
Heapify()
- convert an array into a heap in O(n) time.(percolateDown)
- how?
- for each node that has at least one child, we perform percolateDown() action, in the order of from the nodes on the deepest level to the root.(从底向上,从右向左)
- the time complexity of heapify() is O(n)
{10, 11, 7, 2, 8, 4, 6, 13, 3}
all the red nodes are the nodes has at least one child. perform percolateDown() on each of them following the order: 2, 7, 11, 10:
if the heap has n elements, n = 9 : range [0, 3] n = 8 : range [0, 3] n = 7 : range [0, 2] .... the range of indices need to perform percolateDown() is: [0, n / 2 - 1] last index's parent = ((n - 1) - 1) / 2 = (n - 2) / 2 = n / 2 - 1
update()
- if you know the position of the element you want to update, it will take O(logn).
- how?
- either you need percolateUp(), or percolateDown() on that element.
- what if you do not know the position of the element?
- you need to find the position of the element first, if not asking for help with other additional data structures, this operation is O(n)
Time Complexity: percolateUp() - O(height) == O(logn) percolateDown() - O(height) == O(logn)
Key points
- child index -> parent index
- parent = (child - 1) / 2
- parent index -> children index
- left child = parent * 2 + 1
- right child = parent * 2 + 2
- basic internal operations of heap:
- percolateUp(int index)
- offer()
- update()
- percolateDown(int index)
- poll()
- update()
- heapify()
Sorting algorithms (and heap)
1 | int[] array = {4, 5, 0, 1, 2, 3} |
quicksort:
- wоrѕt саѕе О(n^2), аvеrаgе О(nlоgn), ѕрасе wоrѕt саѕе О(n), аvеrаgе О(lоgn)
- unstable sort
mergesort:
- worst case O(nlogn), average O(nlogn), space worst case/average O(n)
- stable sort
Stable sorting algorithms maintain the relative order of records with equal keys (i.e. values). That is, a sorting algorithm is stable if whenever there are two records R and S with the same key and with R appearing before S in the original list, R will appear before S in the sorted list.
Why object sorting needs to be stable? Example: sorting by scores, then by gender.
two male students m1 and m2, if m1's score is higher than m2, then after the sort the order is preserved.
https://www.youtube.com/watch?v=ZZuD6iUe3Pc
Sorting algorithm: time O(nlogn), space O(1) ? heapSort (not stable):
- heapify the input array to maxHeap. - O(n)
- n times poll operation. - O(nlogn)
1 | // max heap |
count sort/bucket sort/radix sort/rainbow sort
Heap Implementation
1 | import java.util.NoSuchElementException; |
Extensions:
- How to make it capacity unbounded?
- when size++ && is Full, need to allocate a new larger array.
- How to pass in a Comparator and use the Comparator to compare and determine the priority of different elements.
- whenever you need to compare two elements
Tree
Tree nodes
1 | class TreeNode { |
Usually, we use the root to represent the binary tree.
General Tree, each node can have arbitrary number of children.
1 | class TreeNode { |
Tree is special kind of Graph:
1 | class GraphNode { |
Other representation: Adjacency list Vertices are stored as records or objects, and every vertex stores a list of adjacent vertices. This data structure allows the storage of additional data on the vertices. Additional data can be stored if edges are also stored as objects, in which case each vertex stores its incident edges and each edge stores its incident vertices.
When coding, you can use integers to represent vertex. In this way, a graph can be represented as:
List<List< Integer >> graph = ...
graph.get(1) -> Node 1's neighbors graph.get(2) -> Node 2's neighbors .... graph.get( i ) -> Node i's neighbors for (Integer i : graph.get(1)) { // i is a neighbor of node 1 }
LinkedList is a special kind of Tree (if there is only one child for each of the node in the binary tree, it can be reduced to a linked list)
Binary Search Tree
is a binary tree maintains the following property: for any of the nodes in the binary tree, all the nodes in its right subtree is larger than itself, all the nodes in its left subtree is smaller than itself (compared by the key stored in each of the nodes).
Basic Operations of Binary Search Tree
search() - worst case O(n), average O(logn) insert() - worst case O(n), average O(logn) remove() - worst case O(n), average O(logn)
Balanced Binary Search Tree
search(), insert(), remove() operations are all guaranteed to be O(logn). Eg. AVL Tree, Red-Black Tree, etc.
Red-Black tree
- in Java: TreeMap/TreeSet
- in C++: map/set.
We are not going to implement balanced binary search tree, instead we are more interested in basic implementation.
Traversals of Binary Tree
preorder - root, left sub, right sub inorder - left sub, root, right sub postorder - left sub, right sub, root
Like LinkedList, General ways of solving problems about binary (search) tree: Recursion divide and conquer is nature of binary tree - binary tree can be easily divided into three parts
- solve problem for left/right subtree
- solve the problem for root.
Iterative
- could be complicated for binary tree but we need to know how.
Level order traversal: Try to build one for your LaiCode test cases.
Search in BST
1 | // recursive |
Tail Recursion The recursive call is always the last execution statement. We can easily transfer the tail recursive to iterative solution.
Not a tail recursion:
... return isBst(root.left, ...) && isBst(root.right, ...)
1 | // iterative |
Insert in BST
Recursion I A very common way when the tree structure could be changed, in this case, we return the new root after the change (remind about the same case of Linked List? ).
1 | public TreeNode insert(TreeNode root, int key) { |
Recursion II (Remove redundant operation)
1 | public TreeNode insert(TreeNode root, int target) { |
(target == root.key) return; // terminate
target < root.key:
- root.left == null; root.left = new TreeNode(target); // terminate
- else: helper(root.left, target);
- target > root.key:
- root.right == null; root.right = new TreeNode(target); //terminate
- else: helper(root.right, target);
第二种只做了一次赋值,第一种做了多次赋值,但第一种对以后的问题有启发。
Iterative I
1 | public TreeNode insert(TreeNode root, int target) { |
Iterative II
1 | public TreeNode insert(TreeNode root, int target) { |
Delete in BST
delete(root, 8) step one: find the node to be deleted - trivial? step two: delete it ... - not trivial? how many possible situations?
1 | // return: the node that replace target |
Binary Tree Traversals (iteratively)
Too easy to use recursion to solve the traversal problem.
1 | public void inorder(TreeNode root) { |
Why do we need the Iterative solution? since we already have the simple, elegant recursive solution.
Recall here for Java memory areas:
- STACK: storing the local variables and other information for each of the method calls
- HEAP: allocating spaces for dynamically created objects
How recursion works
1 | public void print(int x) { |
The unfinished methods must be retained in STACK before it can return: when we call print(10), what will happen?
STACK: tail -> { } <- head
Does recursion consume extra spaces? - YES, the method calls use spaces on STACK. Think about calling print(1000000) now, what will happen?
The space consumed on STACK: O(# of method call levels).
In Java, STACK is usually a size limited memory area. By default, a several thousands levels recursion call would easily eat up all the space and throw StackOverFlowException - this is something you have to keep in mind(especially when the solution is in recursive way).
Iterative way is good because it needs much less space of STACK(There is probably only O(1) method call levels, so O(1) space on STACK).
It is not easy to convert all recursion solutions to a "while loop" solution without any other auxiliary data structures' help.
1 | public void inorder(TreeNode root) { |
There are two paths of recursion need to follow instead of one, and we need to finish the first branch, then the second one.
In this case, we will need something else to help us: The recursion is internally done by using STACK to maintain the method call levels and directions, we can simulate this ourselves, so a stack will be needed.
The difference is, we use our own stack and the space used by our own stack is on HEAP. The space consumed on STACK is trivial. - We do not change the total space consumption, but we move out the space consumption of STACK to HEAP.
Once the root is traversed, we can print it directly and we do not need to store it in the stack any more.
- We always print root first, then root can be eliminated from stack.
- We traverse left sub first, so the right sub should be retained in the stack before the left sub is done.
1 | public void preOrder(TreeNode) { |
The problem is, we can not throw away the root in the stack before we traversed all the nodes in left subtree. How can we know we have already traversed all the nodes in left sub?
The root is the top element in the stack, use a helper node to store the next "visiting" node and subtree.
- when helper node is not null, we should traverse the subtree, so we push helper and we go left
- when helper is null. means the left subtree of the root is finished, the root is the top element in the stack. We can print the top and let helper = top.right. (traverse the left subtree first, then top, then right subtree)
- do 1 and 2 until helper is null and there is no nodes left in the stack.
1 | public void inOrder(TreeNode root) { |
1 | public void postOrder(TreeNode root) { |
What is the drawback of Method 1 if we print the nodes on the fly? Need to store everything in memory before we can get the whole post order traversal sequence.
Method 2:
The problem is, we need to traverse both left and right subtrees first, then we can eliminate the root from the stack. We need a mechanism to know, when we finished visiting all subtrees' nodes.
What we need to know?
- The Direction! we are visiting down? or returning from left? or returning from right?
The root is the top element in the stack Maintain a previous Node, to record the previous visiting node on the traversing path, so that we know what the direction we are taking now and what is the direction we are taking next.
- root = stack.top
- if previous is null -> going down (left subtree has priority) (初始情况)
- if previous is current's parent, -> going down(left subtree has priority)
- if previous == current.left, -> left subtree finished, going current.right
- if previous == current.right, -> right subtree finished, pop current, going up
1 | public void postOrder(TreeNode root) { |
This solution is important because this is the closest imitation of how actually the recursion is done in STACK, and it can be used as a general framework to iteratively do pre-order and in-order traversal as well.
If you look closely, the recursion is actually a DFS procedure on the binary tree. pre-order, in-order, post-order are following the same DFS procedure, and the only difference is when you want to print the node.
Map & Set & Hash
Set A collection that can not contain duplicate elements.
The Java platform contains three general-purpose Set implementations: HashSet, TreeSet, and LinkedHashSet.
HashSet: which stores its elements in a hashtable, is the best-performing implementation; however it makes no guarantee concerning the order of iteration
TreeSet: which stores its elements in a red-black tree(balanced binary search tree), orders its elements based on their values
LinkedHashSet: it is a HashSet and also it is a LinkedList, it maintains the order when each of the elements is inserted into the HashSet.
Map
A collection that maps keys to values. A Map cannot contain duplicate keys; each key can map to one value.
- A collection that stores <key, value> pairs, such that no duplicate keys are allowed.
The Java platform contains three general-purpose Map implementations: HashMap, TreeMap, and LinkedHashMap.
HashMap
Common API http://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html
V put(K key, V value) V get(K key) V remove(K key) boolean containsKey(K key) Set<Map.Entry<K, V>> entrySet() - get the set view of all the entries in the hashmap Set< K > keySet() - get the set view of all the keys in the hashmap Collection< V > values() - get the collection view of all the values in the hashmap boolean containsValue(V value) - O(n) void clear() int size() boolean isEmpty()
1 | Map<String, Integer> employerNumbers = new HashMap<>(); |
HashMap vs. Hashtable Similar to Vector vs. ArrayList
Vector, Hashtable exist from Java 1.0 ArrayList, HashMap introduced from Java 5
Hashtable not allow null key, but HashMap allow one null key.
Vector, Hashtable operations are synchronized (thread safe), introduce a lot of performance penalty
We do not use Hashtable anymore.
A Hashtable implementation.
- A table of buckets(an array of buckets), using the array index to denote each bucket.
- For each <key, value>, it goes to one of the buckets, the bucket index is determined by a hash function applied on key and the size of array.
- Collision Control
- Collision - two keys mapped to the same bucket
- Separate Chaining(Close Addressing) - the element of each of the buckets is actually a single linked list.
- If different Keys are determined to use the same bucket, they will be chained in the list.
What are the elements in the buckets? Singly Linked List, each node contains one <key, value> pair
1
2
3
4
5class Entry<K, V> {
K key;
V value;
Entry<K, V> next;
}https://docs.oracle.com/javase/8/docs/api/java/util/Map.Entry.html
HashMap operations
put(key, value)
int hash = hash(key) -- hash
int index = hash % table_size -- index
search for the list resides in the bucket to see if the key already exists
- if not find the same key already existed, add a new Entry node to the list
- if find an Entry with the same key, update the value of that Entry
- get(key)
- int hash = hash(key) -- hash
- int index = hash % table_size -- index
- search for the list resides in the bucket to see if the key already exists
HashMap can only has one null key, it is always mapped to bucket 0.
Get into a little details about HashMap Implementation:
- array of entries
- each entry is actually a singly linked list (handle collision)
- contains the <key, value> pair
Define the class for each entry:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21class Entry<K, V> {
final K key;
V value;
Entry<K, V> next;
Entry(K key, V value) {
this.key = key;
this.value = value;
}
public K getKey() {
return key;
}
public V getValue() {
return value;
}
public void setValue(V value) {
this.value = value;
}
}maintain an array of entries
Entry<K, V>[] array
- To get(K key), put(K key, V value), remove(K key), we follow the steps
- hash(key) to hash#
- hash# to entry index in the array
- from the corresponding singly linked list, iterate all of the nodes to find if the same key exists.
- To get(K key), put(K key, V value), remove(K key), we follow the steps
hash(key) to get the hash#
1
2
3
4
5
6
7
8private int hash(K key) {
// return the hash# of the key
if (key == null) {
return 0;
}
int hashNumber = key.hashCode();
// postprocessing to make the hashNumber non-negative.
}from the hash#, mapped to the entry index.
1
2
3
4int getIndex(int hashNumber) {
// return the corrsponding index of array.
return hashNumber % array.length;
}When iterate the corresponding entry for the given key, which is actually a singly linked list, we need compare each of the entry in the list, if the key is the same as the key we want, then it is the entry we want to find.
1
2
3
4
5
6
7Entry<K, V> cur = array[index];
while (cur != null) {
K curKey = cur.getKey();
if (curKey is the same as given key) {
...
}
}
Two Questions Here:
- "HashMap get(key), put(key, value), remove(key) is O(1)", is it always true? What is the major factor affecting the time complexity? hash(key) -> key.hashCode()
- When we need to know if a key is already in the HashMap, we need to find its bucket, then traverse the linked list in that bucket to see if there is a same key already existed. how do we define "the same key"? key.equals(anotherKey)
Very Important: HashMap use
- key.hashCode() to determine the entry index for the key
- key.equals() to determine whether two keys are the same keys.
==, equals(), hashCode()
- ==
- determine if two primitive type has the same value.
- determine if two reference are pointed to the same object.
equals(), hashCode()
defined in Object class, Object is the root class for any Java class
any Java class implicitly extends Object class, so if in the subclass these two methods are not overridden, it inherits what it defines in Object class.
the default implementation of equals() is to check if the two references are pointed to the same object "==".
the default implementation of hashCode() returns a "unique hash value" for the object based on its memory address.
1
2
3
4
5
6
7
8
9
10
11
12class Object {
public boolean equals(Object obj) {
return this == obj;
}
public int hashCode() {
// here it returns the object's hash value(based on its memory address)
// it is unique for each object.
}
public String toString() {
}
}
sometimes we do not want to compare the reference ... we want to compare the object's content. Example: (from an existing class)
1 | // s1 and s2 are different String objects |
Why s1.equals(s2) == true ?
in String class, the equals() method is overridden:
1 | class String { |
Example(our own class):
class has to override equals(), hashCode() if you want to have a different meaning than comparing the references.
1 | // representing a (x, y) coordinate on the grid. |
1 | public class Coordinate { |
What about hashCode() ?
Remember, in Hashmap, key.hashCode() method determines which entry the key will go to, we need to guarantee the same keys always go to the same entry.
("yahoo"(1), 2) -- index = 1 put("yahoo"(2), 5), -> "yahoo"(2).hashCode() -- index,
In Java, There is a contract between equals() and hashCode(), the developers need to maintain:
- If one.equals(two), it is a must that one.hashCode() == two.hashCode()
- If one.hashCode() == two.hashCode(), it is not necessary one.equals(two)
When you want to override equals(), please definitely override hashCode() as well Only by doing this, you can make sure the HashMap will still work for the key type.
Example:
not override equals(), hashCode()
1 | public class Coordinate { |
override equals(), override hashCode()
1 | public class Coordinate { |
hashCode() is VERY important!
- The performance of HashMap solely depends on how good the hashCode() is.
- bad hashCode() can lead to linear time complexity in extreme case, for example, if all the keys has the same hash number returned by .hashCode().
- easy, fast, efficient
- minimize collision, as evenly distributed as possible
Common hashCode() implementation
1 | class Combo { |
一般直接采取自动生成的方式
Rehashing, load factor
- Rehashing
- is needed when there are too many <key, value> pairs maintained in the hashmap, -> too many collisions.
- expand the array to a larger one and move all the <key, value> pairs to the new array (in HashMap by default, the array size is doubled each time.)
- Rehashing is global wise - meaning all the <key, value> pairs need to participate.
- load factor
- to control when rehashing is needed, maintain the ratio of number of <key, value> pairs / number of buckets
- default 0.75
- when the the ratio exceeds the limit, rehashing will happen.
HashSet
It is backed up by a HashMap instance. Only care about the Key here.
Common API https://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html boolean add(E e) boolean remove(Object o) boolean contains(Object o) void clear() int size() boolean isEmpty()
1 | class HashSet<K> { |
HashMap Implementation, Arrays, Collections
Review About HashMap:
- where are equals() and hashCode() from?
- the difference between "==" and equals()
- The contract between equals() and hashCode()
- If o1.equals(o2) = true, then you have to guarantee o1.hashCode() == o2.hashCode()
- If o1.hashCode() == o2.hashCode(), it is not necessary o1.equals(o2)
The performance of HashMap is controlled by hashCode()
If hashCode() return the same int value for all the objects, what will happen? It will be transferred to a single linked list.
HashMap Implementation
1 | import java.util.Arrays; |
What is a "view"?
entrySet() -> view
public Set<Map.Entry<K,V>> entrySet()
Returns a Set view of the mappings contained in this map. The set is backed by the map, so changes to the map are reflected in the set, and vice-versa. If the map is modified while an iteration over the set is in progress (except through the iterator's own remove operation, or through the setValue operation on a map entry returned by the iterator) the results of the iteration are undefined. The set supports element removal, which removes the corresponding mapping from the map, via the Iterator.remove, Set.remove, removeAll, retainAll and clear operations. It does not support the add or addAll operations.
Specified by: entrySet in interface Map<K,V>
Specified by: entrySet in class AbstractMap<K,V> R
Returns: a set view of the mappings contained in this map
Efficient code accessing the HashMap
Which is better?
1 | // 1 |
2方法更好,1方法访问了3次,2方法访问了2次。
Which is better?
1 | // 1 |
2更好,set.add(cur) 同时可以返回信息
How to iterate each of the key, value pairs in a HashMap?
1 | HashMap<String, Interger> map = new HashMap<String, Interger>(); |
Which is better? A : 2
1 | // 1 |
Remove when traverse
1 | HashMap<String, Interger> map = new HashMap<String, Interger>(); |
Which is correct? A : 2
1 | // 1 |
LinkedList -> Tree in HashMap
Optimization -> from Java 8 Search() operation
Balanced Binary Search Tree -> O(logn)
- if the LinkedList is very short, we do not do this optimization.
- require order of the keys, but how?
- If the key does not implement Comparable, use a system hash function to decide the order.
Data Structures vs. Java Implementations
Data Structures
- array new int[5];
- array(sorted) -- search O(n) -> O(logn)
- O(1) random access(get by index)
- limitation of array, fixed size or resizable.
- linked list
- singly
- doubly
- O(n) random access(get by index)
- O(1) append at head/tail
- queue
- O(1) offer, poll, peek
- FIFO
- Circular Array
- stack
- O(1) push, pop, top
- LIFO
- deque
- O(1) insert/remove/peek at both end
- heap
- elements are ordered by their values
- O(1) peek smallest/largest
- O(logn) offer, poll
- hashtable
- elements are not ordered
- average O(1) search, insert, remove
- binary tree
- O(n) search
- balanced binary search tree
- elements are ordered
- O(logn) search, insert, remove
- O(logn) smallest/largest
- O(logn) nextLarger(), nextSmaller()
Java Implementation
- array (int[] array, int[m] [n] matrix, ArrayList)
- linked list
- singly
- doubly (LinkedList)
- queue (LinkedList)
- stack (LinkedList)
- deque (LinkedList, ArrayDeque)
- heap (PriorityQueue)
- hashtable (HashMap, HashSet)
- Balanced Binary Search Tree (TreeMap, TreeSet)
- ......
Extra Reading
Java Collection Framework
- The most most frequently used classes and APls during practices/interviews.
- The representation classes of data structures(called collections/containers).
- Collections Framework is available from Java 5 (a set of the classes and interfaces conducting a hierarchical structure, following the strong OO paradigm, containing all the interfaces and implementations)
- Official tutorial: https://docs.oracle.com/javase/tutorial/collections
Interfaces/Classes Hierarchy
The arrows means "implements"
Collection - Defines what operations should be provided for any classes that is a "Collection". Any class that is a kind of collection should implement the defined method. http://docs.oracle.com/javase/7/docs/api/java/util/Collection.html int size() boolean isEmpty() boolean add(E element) boolean contains(E element) boolean remove(E element) void clear()
List - Defines a group of data structures, that can contain duplicate elements, maintain the order of insertion of the elements and provide the capability of random access by using index. http://docs.oracle.com/javase/7/docs/api/java/util/List.html E get(int index) E remove(int index) E set(int index, E element)
ArrayList - A concrete implementation of List by resizable-array. http://docs.oracle.com/javase/7/docs/api/java/util/ArrayList.html It is a List It is a Collection It implements all the methods specified in List and Collection. List
Queue - Defines a group of data structures, that is designated to 1). hold and buffer elements before processing 2). provide a way of choosing which element buffered is the next one to be processed 3). there are usually two ends of the Queue, and offer always at tail, poll() always at head. http://docs.oracle.com/javase/7/docs/api/java/util/Queue.html boolean add(E element); E element(); E remove(); // group two operations boolean offer(E element); E peek(); E poll();
Deque - Defines a subtype of queue, where it is double ended. (FIFO & LIFO are both provided) http://docs.oracle.com/javase/7/docs/api/java/uti/Deque.html boolean offerFirst(E element); boolean offerLast(E element); E pollFirst(); E pollLast(); E peekFirst(); E peekLast();
LinkedList - A List implementation that backed by a doubly linked list. It also can be used as queue/deque/stack. http://docs.oracle.com/javase/7/docs/api/java/util/LinkedList.html It is a List lt is a Queue It is a Deque It is a Collection It implements all the methods specified in Collection, List, Queue, Deque.
PriorityQueue - Defines a special implementation of queue, where it is using priority to determine which element is the next to process(using order). http://docs.oracle.com/javase/7/docs/api/java/util/PriorityQueue.html It implements all the methods specified in Collection, Queue.
Set - Defines a group of data structures, that contains no duplicate elements. http://docs.oracle.com/javase/7/docs/api/java/util/Set.html
HashSet - A concrete implementation of Set using hashtable. http://docs.oracle.com/javase/7/docs/api/java/uti/HashSet.html It is a Set It is a Collection It implements all the methods specified in Set and Collection.
SortedSet - Define a subtype of Set, that contains no duplicate elements, and the elements are sorted. It is a subtype of Set. http://docs.oracle.com/javase/7/docs/api/java/util/SortedSet.html E first(); // smallest one E last(); // largest one SortedSet
TreeSet - A concrete implementation of SortedSet using red-black tree. http://docs.oracle.com/javase/7/docs/api/java/util/TreeSet.html It is a SortedSet It is a Set : It is a Collection It implements all the methods specified in SortedSet, Set and Collection.
Similar to Map, SortedMap, HashMap and TreeMap: http://docs.oracle.com/javase/7/docs/api/java/util/Map.html http://docs.oracle.com/javase/7/docs/api/java/util/SortedMap.html http://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html http://docs.oracle.com/javase/7/docs/api/java/util/TreeMap.html
About Arrays and Collections
They are the placeholders of a set of utility methods for manipulating arrays and collections objects.
- They are NOT the class for array and Collection
- All the methods in Arrays/Collections are static methods, so there is no need to create Arrays/Collections object to utilize them
Arrays http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html
Arrays.sort(int[] array) int[] array = new int[] {3, 2, 1}; Arrays.sort(array); // array = {1, 2, 3}
Arrays.sort(T[], Comparator
comparator) use the order defined in the comparator to sort the array. 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18class Cell {
int x;
int y;
int value;
}
class CellComparator implements Comparator<Cell> {
public int compare(Cell c1, Cell c2) {
if (c1.value == c2.value) {
return 0;
}
return c1.value < c2.value ? -1 : 1;
}
}
Cell[] cells = ...;
Arrays.sort(cells, new CellComparator());
Arrays.sort(int[] array); -> quicksort Arrays.sort(Integer[] array); -> optimized mergesort
quicksort - average fastest comparison based sorting algorithm average time O(nlogn), space O(logn) worst time O(n^2) space O(n) not stable
mergesort - guaranteed O(nlogn) time, O(n) space stable
heapsort - O(nlogn) time, O(1) space {1, 4, 3, 2, 5} Stable sort: the order of entries with equal keys will be preserved. Example: Sort student on score first, then gender. Two male student with score 90 and 80 will preserve their order after the sort on gender.
Arrays.asList(T... a) - convert an array to a List Returns a fixed-size list backed by the specified array. (Changes to the returned list "write through" to the array.)
1
2
3
4
5
6
7
8List<Integer> list = Arays.asList(1, 2, 3);
//list is a List of [1, 2, 3] (Integer[])
list.set(0, 4);
//list is [4, 2, 3]
list.add(5);
// throw UnsupportedOperationException - fixed-sizedArrays.copyOf(original, int newLength)
1
2
3
4
5int[] array = new int[] {1, 2, 3};
int[] copy = Arrays.copyOf(array, 1);
// copy = {1}
copy = Arrays.copyOf(array, 5);
// copy= {1, 2, 3, 0, 0}, padding default values at the endArrays.fill(original, value)
1
2
3int[] array = new int[] {1, 2, 3};
Arrays.fill(array, 1);
// array = {1, 1, 1}Arrays.toString(int[] array), Arrays.deepToString(int[] [] matrix)
1
2
3
4
5
6
7int[] array = new int[] {1, 2, 3};
String s = Arrays.toString(array);
// s = "[1, 2, 3]";
int[][] matrix = new int[] {{1, 2}, {1, 2}};
String t = Arrays.deepToString(matrix);
// t = "[[1, 2], [1, 2]]"Arrays.binarySearch(original, value)
1
2
3
4
5int[] array = new int[] {3, 2, 1};
Arrays.sort(array);
// the array must be guaranteed to be sorted before using binarySearch
int i = Arrays.binarySearch(array, 2);
// i = 1
Collections
http://docs.oracle.com/javase/7/docs/api/java/util/Collections.html
- Collections.sort(List
list), Collections.sort(List list, Comparator comparator) - Collections.binarySearch(List
list) - Collections.swap(List
list, int i, int j) - Collections.fill(List
list, T value) - Collections.reverse(List list)
- Collections.reverseOrder() : reverse the natural order defined by a class implementing Comparable interface
// You can use Collections.reverseOrder() this way. PriorityQueue
Convert List to array(or, Collection to array)
1 | interface Collection<T> { |
in Java - The elements in the generic Collection can not be primitive type. There is no List
List
There is no existing utility / method in Java plain API, in another way, you have to write your own method to do this.
Bit Representation
1 | public class Bit { |
1 | public class Main { |
">>>" vs. ">>"
signed shift ">>" - arithmetical
The leftmost position after signed shift depends on sign extension(the most significant bit)
unsigned shift ">>>" - logical
The signed right shift operator ">>>" shifts a zero into the leftmost position ![image-20200127204304752](SDE实践/image-20200127204304752.png)
Note: There is no "unsigned int" in Java, but there is unsigned right shift.
two view of int:
- 32 bits, each of which is either 0 or 1
- arithmetic number http://stackoverflow.com/questions/1023373/findbugs-warning-integer-shift-by-32-what-does-it-mean
Autoboxing and Unboxing
- Primitive type vs. Wrapper class
- int: Integer
- long: Long
- char: Character
- double: Double
- boolean: Boolean
- ......
A wrapper class is just a wrapper of the corresponding primitive type, the Object representation of primitive type values. Similar to String, all the wrapper class objects are IMMUTABLE - internal values can not be changed after initialization. It can help:
- Generic type can not be primitive type, List
; there can not be List - It can help provide useful functionalities and contracts (json serializer: obj -> json)
- How do you represent a "not existed" int value? "null"
Prefer primitive types to wrapper classes.
1 | // example: |
autoboxing Autoboxing is the automatic conversion that the Java compiler makes between the primitive types and their corresponding object wrapper classes.
For example, converting an int to an Integer, a double to a Double, and so on.
1
2
3
4
5List<Integer> integerList = new ArrayList<Integer>();
for (int i = 0; i < 50; i++) {
integerList.add(i); // ==> integerList.add(Integer.valueOf(i));
}unboxing Unboxing is the reverse operation of autoboxing.
1
2Integer a = 4; // autoboxing => Integer a = Integer.valueOf(4);
a += 4;
What happened when Integer++?
Integer is immutable, the int value of the Integer object can never change
1
2
3
4
5
6
7
8
9// a++:
int temp = a.intValue();
temp++;
a = Integer.valueOf(temp); // a is pointing to another Integer object.
// a += 4;
int temp = a.intValue();
temp += 4;
a = Integer.valueOf(temp); // a is pointing to another Integer object.
More Examples:
+, -, *, /, >, <, >=, <= ... only applied to primitive type.
1 | public class Main { |
- Integer class cache the Integer object with value from -128 to 127, so every time an Integer object within this range is needed, it will always return the corresponding object.
- The unboxing is done only when it is necessary, for example, a > b. Be very careful about using "=="
int[] vs. Integer[] They are totally different types, and there is no auto conversion directly between them. Integer[] objArray = new Integer[5]; int[] array = objArray; // compile error. But, Integer[] objArray = new Integer[5]; objArray[0] = 1; // will work. int[] array = new int[5]; Integer a = 5; array[0] = a; // will work.
When you implement your own Comparator or override equals(), keep in mind:
- be very careful if you see "==" on Object type.
Bad:
1 | class MyComparator implements Comparator<Integer> { |
3 alternative ways:
- return i1.compareTo(i2);
- if (i1.equals(i2)) {}
- if (i1.intValue() == i2.intValue()) {}
Bad: It could overflow.
1 | class MyComparator implements Comparator<Integer> { |
Good:
1 | class MyComparator implements Comparator<Integer> { |
String
a sequence of characters "abc"
in Java, they are objects
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17class String {
private final char[] value;
// different String objects can reuse the same char array,
// but can start with different offset index.
private int offset;
// maintains the length of the part in the char array
// represented by this String object.
private int count;
public String() {
value = new char[0];
offset = 0;
count = 0;
}
......
}
char[] array = {'a', 'b', 'c', 'd'};
offset = 1; count = 2; // array(1, 2); "bc";creating Strings
- String literal:
- a series of characters in your code that is enclosed in double quotes "abc"
Whenever it encounters a string literal in your code, the compiler creates a String Object with its value(really?) String s1 = "abc"; String s2 = "abc"; System.out.println(s1 == s2); // true
Pooling for String objects There is no need for maintaining several copies of String objects with the same literal, since String objects are immutable. Usually compiler and JVM will optimize for the number of String objects created, it will maintain an intern area in HEAP, for the same String literal it will only has one String object associated.
String s1 = "abc"; String s2 = new String("abc"); // String(String another)
// how many Strings are created? System.out.println(s1 == s2); // false System.out.println(s1.equals(s2)); // true The String objects created with "new" will not use the intern pool;
Another Example: String sa = "a"; String sb = "b"; String sab = "a" + "b"; // compile time concat, "ab" // <=> String sab = "ab";
System.out.println(sab == "a" + "b"); // true System.out.println(sab == sa + "b"); // false, sa.concat("b"); System.out.println(sab == sa + sb); // false, sa.concat(sb);
The optimization is at compile time, the literals will be concatenated if possible before getting the String object.
Question: how many String objects are created?
1
2
3
4
5
6
7
8for (int i = 0; i < 100; i++) {
String s = new String("abc"); // new String(String another)
}
for (int i = 0; i < 100; i++) {
String temp = "abc";
String s = new String(temp);
}100: if "abc" is in the pool 101: if "abc" is not in the pool
Constructors - http://oracle.com/javase/7/docs/api/java/lang/String.html String() String(String value) String(char[] array) String(char[] array, int offset, int length) String(StringBuilder builder)
- String literal:
Immutable
the value of the String object cannot be changed after creation.
String s1 = "abc"; String s2 = s1; // this is another String object, the previous object's value is not changed. s2 = "def" System.out.println(s1); // "abc" System.out.println(s2); // "def"
String length
- s1.length() vs. int[] length;
Concatenate Strings
- s1.concat(s2) // again, this will create a new String object rather than change the value of s1 or s2.
- s1 = "abc" + "def"; // 等价于s1 = "abcdef", compile time concatenation.
- s1 = s1 + "def"; // <==> s1.concat("def");
- s1 = 1 + "def"; // String.valueOf(1), (1 -> Integer(1) -> Integer(1).toString()) -> "1"
- s1 = "id: " + 2 + "isStudent: " + true + 'y';
Interface, Abstract Class, Access Modifier, Exceptions
Inheritance
Definitions: A class that is derived from another class is called a subclass (also a derived class, extended class, or child class). The class from which the subclass is derived is called a superclass(also a base class or a parent class).
1 | public class Person { |
Override vs. Overload
Override is when you redefine a method that has been defined in a parent class (using the same signature). Resolved at runtime.
Overload is when you define two methods with the same name, in the same class, distinguished by their signatures (different), Resolved at compile time.
1 | class A { |