- 1. Stack 栈
- 2. Queue 队列
- 3. Array 数组
- 4. Linked List 链表
- 5. Tree
- 6. Graph
- 7. Heap 堆
- 8. Hash Table
- 9. 邻接矩阵 Adjacency Matrix
1. Stack 栈
- A LIFO (Last In First Out — the element placed at last can be accessed at first) structure.
Operations
- empty() – Returns whether the stack is empty – Time Complexity: O(1)
- size() – Returns the size of the stack – Time Complexity: O(1)
- top() / peek() – Returns a reference to the topmost element of the stack – Time Complexity: O(1)
- push(a) – Inserts the element ‘a’ at the top of the stack – Time Complexity: O(1)
- pop() – Deletes the topmost element of the stack – Time Complexity: O(1)
- Implement stack using different approaches: https://www.geeksforgeeks.org/stack-in-python/
2. Queue 队列
- A FIFO (First In First Out — the element placed at first can be accessed at first) structure.
- Implement queue using different approaches: https://www.geeksforgeeks.org/queue-in-python/
3. Array 数组
- A fixed-size data structure that can hold items of the same data type.
- An array of arrays - 2-dimensional arrays.
- Indexed, meaning that random access is possible.
Operations
- Traverse: Go through the elements.
- Search: Search for an element by value or index.
- Update: Update the value of an existing element at a given index.
- Delete and Insert: 由于array是fixed- length, 无法直接实现这些操作,可以通过创建一个新的array with (size+1), 然后复制current array.
Applications
- Build blocks for other data structures such as array lists, heaps, hash tables, vectors, and matrices.
- Used for different sorting algorithms such as insertion sort, quick sort, bubble sort, and merge sort.
4. Linked List 链表
- A sequential structure that consists of a sequence of items in linear order which are linked to each other.
- Access data sequentially, and random access is not possible.
- A simple and flexible representation of dynamic sets: The number of nodes in a list is not fixed and can grow and shrink on demand.
- 链表只能依靠指针来访问,最后一个node也是有指针的,指向NULL。
Basic terminology
- Elements in a linked list are known as nodes.
- Each node contains a key and a pointer to its successor node, known as next.
- The attribute named head points to the first element of the linked list.
- The last element of the linked list is known as the tail.
- Singly linked list — Traversal of items can be done in the forward direction only.
- Doubly linked list — Traversal of items can be done in both forward and backward directions. Nodes consist of an additional pointer known as prev, pointing to the previous node.
- Circular linked lists — Linked lists where the prev pointer of the head points to the tail and the next pointer of the tail points to the head.
Operations
- Search: Find the first element with the key k in the given linked list by a simple linear search and returns a pointer to this element.
- Insert: Insert a key to the linked list. An insertion can be done in 3 different ways; insert at the beginning of the list, insert at the end of the list, and insert in the middle of the list.
- Delete: Removes an element x from a given linked list. You cannot delete a node in a single step. A deletion can be done in 3 different ways; delete from the beginning of the list, delete from the end of the list, and delete from the middle of the list.
Code
- Introduce the existing module in Python - collections.deque
- In Python, there’s a specific object in the
collections
module that you can use for linked lists called deque (pronounced “deck”), which stands for double-ended queue. collections.deque
uses an implementation of a linked list in which you can access, insert, or remove elements from the beginning or end of a list with constant O(1) performance.>>> deque(['a','b','c']) # input has to be an iterable object, if not then new deque would be empty. deque(['a', 'b', 'c']) >>> deque('abc') deque(['a', 'b', 'c']) >>> deque([{'data': 'a'}, {'data': 'b'}]) deque([{'data': 'a'}, {'data': 'b'}])
- For more info: https://docs.python.org/3/library/collections.html#collections.deque
- In Python, there’s a specific object in the
- Implement your own Linked List - 待完善
class Node:
def __init__(self, data):
self.data = data
self.next = None
def __repr__(self):
return self.data
class LinkedList:
def __init__(self):
self.head = None
def __repr__(self):
node = self.head
nodes = []
while node is not None:
nodes.append(node.data)
node = node.next
nodes.append("None")
return " -> ".join(nodes)
>>> llist = LinkedList()
>>> llist
None
>>> first_node = Node("a")
>>> llist.head = first_node
>>> llist
a -> None
>>> second_node = Node("b")
>>> third_node = Node("c")
>>> first_node.next = second_node
>>> second_node.next = third_node
>>> llist
a -> b -> c -> None
5. Tree
- A hierarchical structure where data is organized hierarchically and linked together. This structure is different from a linked list whereas, in a linked list, items are linked in a linear order.
e.g. A binary search tree
- Every node in a binary search tree comprises the following attributes.
- key: The value stored in the node.
- left: The pointer to the left child.
- right: The pointer to the right child.
- p: The pointer to the parent node.
- binary-search-tree property
Let x be a node in a binary search tree.
- If y is a node in the left subtree of x, then y.key ≤ x.key
- If y is a node in the right subtree of x, then y.key ≥ x.key
Implementation - Binary search tree
# Creating node class class Node: def __init__(self, data): self.data = data self.leftChild = None self.rightChild = None # Function to insert in BST def insert(self, data): # if value is lesser than the value of the parent node if data < self.data: if self.leftChild: # if we still need to move towards the left subtree self.leftChild.insert(data) else: self.leftChild = Node(data) return # if value is greater than the value of the parent node else: if self.rightChild: # if we still need to move towards the right subtree self.rightChild.insert(data) else: self.rightChild = Node(data) return # function to print a BST def PrintTree(self): if self.leftChild: self.leftChild.PrintTree() print( self.data), if self.rightChild: self.rightChild.PrintTree()
# Creating root node root = Node(27) # Inserting values in BST root.insert(14) root.insert(35) root.insert(31) root.insert(10) root.insert(19) # printing BST >>> root.PrintTree() # Output 10 14 19 27 31 35
- Every node in a binary search tree comprises the following attributes.
- A complete binary tree
- 一棵深度为k的有n个结点的二叉树,对树中的结点按从上至下、从左到右的顺序进行编号。如果编号为i(1≤i≤n)的结点与满二叉树中编号为i的结点在二叉树中的位置相同,则这棵二叉树称为完全二叉树。
- AVL树:Adelson-Velsky and Landis Tree 是计算机科学中最早被发明的自平衡二叉查找树。在AVL树中,任一节点对应的两棵子树的最大高度差为1,因此它也被称为高度平衡树。查找、插入和删除在平均和最坏情况下的时间复杂度都是O(logn)。增加和删除元素的操作则可能需要借由一次或多次树旋转,以实现树的重新平衡。
- 红黑树:一种自平衡二叉查找树,是在计算机科学中用到的一种数据结构,典型用途是实现关联数组。红黑树主要特征是在每个节点上增加一个属性表示节点颜色,可以红色或黑色。红黑树和 AVL 树类似, 都是在进行插入和删除时通过旋转保持自身平衡,从而获得较高的查找性能。红黑树保证从根节点到叶 尾的最长路径不超过最短路径的 2 倍,所以最差时间复杂度是 O(logn)。红黑树通过重新着色和左右旋 转,更加高效地完成了插入和删除之后的自平衡调整。
6. Graph
- A graph consists of a finite set of vertices or nodes and a set of edges connecting these vertices.
- The order of a graph is the number of vertices in the graph. The size of a graph is the number of edges in the graph.
- Two nodes are said to be adjacent if they are connected to each other by the same edge.
- Directed and undirect graph
Implementations
- Using adjacent lists/matrices
- Using dictionary, as well as some basic operations such as find_path
See more at: https://www.python.org/doc/essays/graphs/
7. Heap 堆
- A special case of a binary tree where the parent nodes are compared to their children with their values and are arranged accordingly.
- Heaps can be represented using trees as well as arrays.
- Property: 给定堆中任意节点P和C,若P是C的母节点,那么P的值小于等于(或大于等于)C的值.
- 若母节点的值恒小于等于子节点的值,此堆称为最小堆: min heap.
- 反之,若母节点的值恒大于等于子节点的值,此堆称为最大堆: max heap.
8. Hash Table
- 是根据键(Key)而直接访问在内存储存位置的数据结构。也就是说,它通过计算出一个键值的函数,将所需查询的数据映射到表中一个位置来让人访问,这加快了查找速度。
- 若关键字为,则其值存放在的存储位置上。由此,不需比较便可直接取得所查记录。称这个对应关系为散列函数,按这个思想建立的表为散列表。
- Hash table stores key-value pairs but the key is generated through a hashing function.
- A simple example of a hash function:
h(k) = k1 % m
- h(k) is the hash function that accepts a parameter k. The parameter k is the value that we want to calculate the key for.
- k1 % m is the algorithm for our hash function where k1 is the value we want to store, and m is the size of the list. We use the modulus operator to calculate the key.
- A simple example of a hash function:
Implementation
class HashTable:
# Create empty bucket list of given size
def __init__(self, size):
self.size = size
self.hash_table = self.create_buckets()
def create_buckets(self):
return [[] for _ in range(self.size)]
# Insert values into hash map
def set_val(self, key, val):
# Get the index from the key
# using hash function
hashed_key = hash(key) % self.size
# Get the bucket corresponding to index
bucket = self.hash_table[hashed_key]
found_key = False
for index, record in enumerate(bucket):
record_key, record_val = record
# check if the bucket has same key as
# the key to be inserted
if record_key == key:
found_key = True
break
# If the bucket has same key as the key to be inserted,
# Update the key value
# Otherwise append the new key-value pair to the bucket
if found_key:
bucket[index] = (key, val)
else:
bucket.append((key, val))
# Return searched value with specific key
def get_val(self, key):
# Get the index from the key using
# hash function
hashed_key = hash(key) % self.size
# Get the bucket corresponding to index
bucket = self.hash_table[hashed_key]
found_key = False
for index, record in enumerate(bucket):
record_key, record_val = record
# check if the bucket has same key as
# the key being searched
if record_key == key:
found_key = True
break
# If the bucket has same key as the key being searched,
# Return the value found
# Otherwise indicate there was no record found
if found_key:
return record_val
else:
return "No record found"
# Remove a value with specific key
def delete_val(self, key):
# Get the index from the key using
# hash function
hashed_key = hash(key) % self.size
# Get the bucket corresponding to index
bucket = self.hash_table[hashed_key]
found_key = False
for index, record in enumerate(bucket):
record_key, record_val = record
# check if the bucket has same key as
# the key to be deleted
if record_key == key:
found_key = True
break
if found_key:
bucket.pop(index)
return
# To print the items of hash map
def __str__(self):
return "".join(str(item) for item in self.hash_table)
hash_table = HashTable(50)
# insert some values
hash_table.set_val('gfg@example.com', 'some value')
print(hash_table)
print()
hash_table.set_val('portal@example.com', 'some other value')
print(hash_table)
print()
# search/access a record with key
print(hash_table.get_val('portal@example.com'))
print()
# delete or remove a value
hash_table.delete_val('portal@example.com')
print(hash_table)
- 哈希冲突
- 指的是当key集合很大时,关键字值不同的元素可能胡映像到哈希表的同一个地址。
,这种现象就是哈希冲突。
- 解决哈希冲突
- 开放定址法:这种方法也称再散列法,其基本思想是:当关键字key的哈希地址p=H(key)出现冲突时,以p为基础,产生另一个哈希地址p1,如果p1仍然冲突,再以p为基础,产生另一个哈希地址p2。直到找出一个不冲突的哈希地址pi ,将相应元素存入其中。
- 链地址法:这种方法的基本思想是将所有哈希地址为i的元素构成一个称为同义词链的单链表,并将单链表的头指针存在哈希表的第i个单元中,因而查找、插入和删除主要在同义词链中进行。链地址法适用于经常进行插入和删除的情况。
- 指的是当key集合很大时,关键字值不同的元素可能胡映像到哈希表的同一个地址。