DATA STRUCTURE

1. Stack 栈
2. Queue 队列
3. Array 数组
4. Linked List 链表
5. Tree
6. Graph
7. Heap 堆
8. Hash Table
9. 邻接矩阵 Adjacency Matrix

1. Stack 栈

A LIFO (Last In First Out — the element placed at last can be accessed at first) structure.

Operations

empty() – Returns whether the stack is empty – Time Complexity: O(1)
size() – Returns the size of the stack – Time Complexity: O(1)
top() / peek() – Returns a reference to the topmost element of the stack – Time Complexity: O(1)
push(a) – Inserts the element ‘a’ at the top of the stack – Time Complexity: O(1)
pop() – Deletes the topmost element of the stack – Time Complexity: O(1)

Implement stack using different approaches: https://www.geeksforgeeks.org/stack-in-python/

2. Queue 队列

A FIFO (First In First Out — the element placed at first can be accessed at first) structure.
Implement queue using different approaches: https://www.geeksforgeeks.org/queue-in-python/

3. Array 数组

A fixed-size data structure that can hold items of the same data type.
An array of arrays - 2-dimensional arrays.
Indexed, meaning that random access is possible.

Operations

Traverse: Go through the elements.
Search: Search for an element by value or index.
Update: Update the value of an existing element at a given index.
Delete and Insert: 由于array是fixed- length, 无法直接实现这些操作，可以通过创建一个新的array with (size+1), 然后复制current array.

Applications

Build blocks for other data structures such as array lists, heaps, hash tables, vectors, and matrices.
Used for different sorting algorithms such as insertion sort, quick sort, bubble sort, and merge sort.

4. Linked List 链表

A sequential structure that consists of a sequence of items in linear order which are linked to each other.
Access data sequentially, and random access is not possible.
A simple and flexible representation of dynamic sets: The number of nodes in a list is not fixed and can grow and shrink on demand.
链表只能依靠指针来访问，最后一个node也是有指针的，指向NULL。

Basic terminology

Elements in a linked list are known as nodes.
Each node contains a key and a pointer to its successor node, known as next.
The attribute named head points to the first element of the linked list.
The last element of the linked list is known as the tail.

Singly linked list — Traversal of items can be done in the forward direction only.
Doubly linked list — Traversal of items can be done in both forward and backward directions. Nodes consist of an additional pointer known as prev, pointing to the previous node.
Circular linked lists — Linked lists where the prev pointer of the head points to the tail and the next pointer of the tail points to the head.

Operations

Search: Find the first element with the key k in the given linked list by a simple linear search and returns a pointer to this element.
Insert: Insert a key to the linked list. An insertion can be done in 3 different ways; insert at the beginning of the list, insert at the end of the list, and insert in the middle of the list.
Delete: Removes an element x from a given linked list. You cannot delete a node in a single step. A deletion can be done in 3 different ways; delete from the beginning of the list, delete from the end of the list, and delete from the middle of the list.

Code

Introduce the existing module in Python - collections.deque
- In Python, there’s a specific object in the collections module that you can use for linked lists called deque (pronounced “deck”), which stands for double-ended queue.
- collections.deque uses an implementation of a linked list in which you can access, insert, or remove elements from the beginning or end of a list with constant O(1) performance.
```
>>> deque(['a','b','c']) # input has to be an iterable object, if not then new deque would be empty.
deque(['a', 'b', 'c'])

>>> deque('abc')
deque(['a', 'b', 'c'])

>>> deque([{'data': 'a'}, {'data': 'b'}])
deque([{'data': 'a'}, {'data': 'b'}])
```
- For more info: https://docs.python.org/3/library/collections.html#collections.deque

Implement your own Linked List - 待完善

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

    def __repr__(self):
        return self.data

class LinkedList:
    def __init__(self):
        self.head = None

    def __repr__(self):
        node = self.head
        nodes = []
        while node is not None:
            nodes.append(node.data)
            node = node.next
        nodes.append("None")
        return " -> ".join(nodes)

>>> llist = LinkedList()
>>> llist
None

>>> first_node = Node("a")
>>> llist.head = first_node
>>> llist
a -> None

>>> second_node = Node("b")
>>> third_node = Node("c")
>>> first_node.next = second_node
>>> second_node.next = third_node
>>> llist
a -> b -> c -> None

5. Tree

A hierarchical structure where data is organized hierarchically and linked together. This structure is different from a linked list whereas, in a linked list, items are linked in a linear order.

e.g. A binary search tree

Every node in a binary search tree comprises the following attributes.
1. key: The value stored in the node.
2. left: The pointer to the left child.
3. right: The pointer to the right child.
4. p: The pointer to the parent node.
binary-search-tree property
Let x be a node in a binary search tree.
- If y is a node in the left subtree of x, then y.key ≤ x.key
- If y is a node in the right subtree of x, then y.key ≥ x.key

Implementation - Binary search tree

# Creating node class
class Node:
    def __init__(self, data):
        self.data = data
        self.leftChild = None
        self.rightChild = None

# Function to insert in BST
    def insert(self, data):
        # if value is lesser than the value of the parent node
        if data < self.data:
            if self.leftChild:
                # if we still need to move towards the left subtree
                self.leftChild.insert(data)
            else:
                self.leftChild = Node(data)
                return
        # if value is greater than the value of the parent node        
        else:
            if self.rightChild:
                # if we still need to move towards the right subtree
                self.rightChild.insert(data)
            else:
                self.rightChild = Node(data)
                return

    # function to print a BST
    def PrintTree(self):
        if self.leftChild:
            self.leftChild.PrintTree()
        print( self.data),
        if self.rightChild:
            self.rightChild.PrintTree()


# Creating root node
root = Node(27)
# Inserting values in BST
root.insert(14)
root.insert(35)
root.insert(31)
root.insert(10)
root.insert(19)

# printing BST
>>> root.PrintTree()

# Output
10 14 19 27 31 35

A complete binary tree
- 一棵深度为k的有n个结点的二叉树，对树中的结点按从上至下、从左到右的顺序进行编号。如果编号为i（1≤i≤n）的结点与满二叉树中编号为i的结点在二叉树中的位置相同，则这棵二叉树称为完全二叉树。
AVL树：Adelson-Velsky and Landis Tree 是计算机科学中最早被发明的自平衡二叉查找树。在AVL树中，任一节点对应的两棵子树的最大高度差为1，因此它也被称为高度平衡树。查找、插入和删除在平均和最坏情况下的时间复杂度都是O(logn)。增加和删除元素的操作则可能需要借由一次或多次树旋转，以实现树的重新平衡。
红黑树：一种自平衡二叉查找树，是在计算机科学中用到的一种数据结构，典型用途是实现关联数组。红黑树主要特征是在每个节点上增加一个属性表示节点颜色，可以红色或黑色。红黑树和 AVL 树类似，都是在进行插入和删除时通过旋转保持自身平衡，从而获得较高的查找性能。红黑树保证从根节点到叶尾的最长路径不超过最短路径的 2 倍，所以最差时间复杂度是 O(logn)。红黑树通过重新着色和左右旋转，更加高效地完成了插入和删除之后的自平衡调整。

6. Graph

A graph consists of a finite set of vertices or nodes and a set of edges connecting these vertices.
The order of a graph is the number of vertices in the graph. The size of a graph is the number of edges in the graph.
Two nodes are said to be adjacent if they are connected to each other by the same edge.
Directed and undirect graph
Implementations
- Using adjacent lists/matrices
  See more at: https://www.educative.io/answers/how-to-implement-a-graph-in-python
- Using dictionary, as well as some basic operations such as find_path
  See more at: https://www.python.org/doc/essays/graphs/

7. Heap 堆

A special case of a binary tree where the parent nodes are compared to their children with their values and are arranged accordingly.
Heaps can be represented using trees as well as arrays.
Property: 给定堆中任意节点P和C，若P是C的母节点，那么P的值小于等于（或大于等于）C的值.
- 若母节点的值恒小于等于子节点的值，此堆称为最小堆: min heap.
- 反之，若母节点的值恒大于等于子节点的值，此堆称为最大堆: max heap.

8. Hash Table

是根据键(Key)而直接访问在内存储存位置的数据结构。也就是说，它通过计算出一个键值的函数，将所需查询的数据映射到表中一个位置来让人访问，这加快了查找速度。
若关键字为，则其值存放在的存储位置上。由此，不需比较便可直接取得所查记录。称这个对应关系为散列函数，按这个思想建立的表为散列表。
Hash table stores key-value pairs but the key is generated through a hashing function.
- A simple example of a hash function: h(k) = k1 % m
  - h(k) is the hash function that accepts a parameter k. The parameter k is the value that we want to calculate the key for.
  - k1 % m is the algorithm for our hash function where k1 is the value we want to store, and m is the size of the list. We use the modulus operator to calculate the key.

Implementation

class HashTable:

	# Create empty bucket list of given size
	def __init__(self, size):
		self.size = size
		self.hash_table = self.create_buckets()

	def create_buckets(self):
		return [[] for _ in range(self.size)]

	# Insert values into hash map
	def set_val(self, key, val):
	
		# Get the index from the key
		# using hash function
		hashed_key = hash(key) % self.size
		
		# Get the bucket corresponding to index
		bucket = self.hash_table[hashed_key]

		found_key = False
		for index, record in enumerate(bucket):
			record_key, record_val = record
			
			# check if the bucket has same key as
			# the key to be inserted
			if record_key == key:
				found_key = True
				break

		# If the bucket has same key as the key to be inserted,
		# Update the key value
		# Otherwise append the new key-value pair to the bucket
		if found_key:
			bucket[index] = (key, val)
		else:
			bucket.append((key, val))

	# Return searched value with specific key
	def get_val(self, key):
	
		# Get the index from the key using
		# hash function
		hashed_key = hash(key) % self.size
		
		# Get the bucket corresponding to index
		bucket = self.hash_table[hashed_key]

		found_key = False
		for index, record in enumerate(bucket):
			record_key, record_val = record
			
			# check if the bucket has same key as
			# the key being searched
			if record_key == key:
				found_key = True
				break

		# If the bucket has same key as the key being searched,
		# Return the value found
		# Otherwise indicate there was no record found
		if found_key:
			return record_val
		else:
			return "No record found"

	# Remove a value with specific key
	def delete_val(self, key):
	
		# Get the index from the key using
		# hash function
		hashed_key = hash(key) % self.size
		
		# Get the bucket corresponding to index
		bucket = self.hash_table[hashed_key]

		found_key = False
		for index, record in enumerate(bucket):
			record_key, record_val = record
			
			# check if the bucket has same key as
			# the key to be deleted
			if record_key == key:
				found_key = True
				break
		if found_key:
			bucket.pop(index)
		return

	# To print the items of hash map
	def __str__(self):
		return "".join(str(item) for item in self.hash_table)


hash_table = HashTable(50)

# insert some values
hash_table.set_val('gfg@example.com', 'some value')
print(hash_table)
print()

hash_table.set_val('portal@example.com', 'some other value')
print(hash_table)
print()

# search/access a record with key
print(hash_table.get_val('portal@example.com'))
print()

# delete or remove a value
hash_table.delete_val('portal@example.com')
print(hash_table)

哈希冲突
- 指的是当key集合很大时，关键字值不同的元素可能胡映像到哈希表的同一个地址。
  $k_1 != k_2, H(k_1)=H(k_2)$ ，这种现象就是哈希冲突。
- 解决哈希冲突
  - 开放定址法：这种方法也称再散列法，其基本思想是：当关键字key的哈希地址p=H(key)出现冲突时，以p为基础，产生另一个哈希地址p1，如果p1仍然冲突，再以p为基础，产生另一个哈希地址p2。直到找出一个不冲突的哈希地址pi ，将相应元素存入其中。
  - 链地址法：这种方法的基本思想是将所有哈希地址为i的元素构成一个称为同义词链的单链表，并将单链表的头指针存在哈希表的第i个单元中，因而查找、插入和删除主要在同义词链中进行。链地址法适用于经常进行插入和删除的情况。

9. 邻接矩阵 Adjacency Matrix