Heap

Материал из Поле цифровой дидактики
Описание Как использовать кучу? - Куча (структура данных)
Область знаний Информатика
Область использования (ISTE) Computational Thinker
Возрастная категория 14


Поясняющее видео
Близкие рецепту понятия Куча (структура данных)
Среды и средства для приготовления рецепта: Snap!, Python, Scratch

Generally, a heap is a binary tree data structure which satisfies the Heap Property (explained below).

For simplicity, this page will talk only about balanced, max-heaps that contain numbers.

Purpose

A heap is a way to hold and represent numbers and can be an alternative to a list. Both can hold numbers, but a heap is more efficient at some operations and less efficient for others.

In Scratch, a heap is most useful in two ways: it is good at sorting a list and serving as a priority queue.

A priority queue is like a list, but the biggest item is always at the front. A priority queue is often used in path-finding, but here is an easier example that should show you what it does:

Twenty clones all want to get through a door. How much they want to do this is stored in a list, but only one can fit at a time. The clone that wants to get through the door the most should be allowed through first. After it gets through, the clone that wants it the second most should be allowed through.

If a list were used, figuring out which sprite to let through would be the same as finding the highest value in a list.

define findHighestItem
set [result v] to [1]
set [n v] to [1]
repeat ((length of [list v]) - (1))
change [n v] by (1)
if <(item (n) of [list v]) > (item (result) of [list v])> then
set [result v] to (n)
end
end

This solution is perfectly fine, but the list must be iterated (looped) over every time that the highest element needs to be found. For examples where speed is important (for example, path-finding or sorting large lists), we want to do this faster—and that can be done with a heap.

Abstractly

The idea of a heap is that the container does not need to be iterated over if slightly more work is done to keep it sorted. Below a representation of a heap. The following is basic data structure terminology:

  • Every item (each circle) is called a node
  • Every node that points to another one is called a child
  • Every child has a parent (the node above it)
  • Every parent has two children
  • Every parent is bigger than its children

The last statement is called the heap property. It is the property that must be true for us to call something a heap. If every parent is not bigger than both of its children, we cannot call it a heap.


In Scratch, a list can do a number of things:

  • insert a number
  • delete a number
  • get value at an index

In addition there are a number of things that are really useful to do:

  • find the biggest item
  • find the smallest item
  • sort the list
  • figure out if a number exists in a list

Heaps can efficiently:

  • insert a number
  • transform a list into a heap
  • delete a number
  • find a number
  • find the biggest number
  • find the smallest number


Implementation

Because Scratch does not provide a tree data structure, lists can be used instead to program heaps.

How a List Can Be a Heap

Every item in a heap can be given an x and a y position.

Y position is backwards (zero is on the top). X position is the number of nodes between the left side of the heap and the element. For example, the node with the number "6" in the heap below has an x position of 0 and a y position of 1. The number "7" in the heap below has an x position of 3 and a y position of 2.

The following formulas convert between the index of an element and it's x/y position in a heap. Note that lg(x) is the logarithm in base two (if you do not know what logarithms are, do not worry) and is equal to log(x)/log(2) and that floor(x) is the "round down" function (in Scratch it is "round(x - .5)). Also note that we must find y before we find x when we are calculating heap-position from list-position.

  • Index = 2y + x - 1
  • y = floor(lg(Index))
  • x = Index + 1 - 2^y

Insertion

To add a number into a list one would use the basic Add () to () (block). Adding a number to a heap is a bit trickier, it must satisfy the heap property that a parent is greater than its children. The "bubble up" function makes sure that new elements are inserted into the correct to satisfy this property.

"Bubble Up" uses recursion. Generally "bubble up" does this:

bubble up (element):
if the element is bigger than its parent
a) swap them and
b) bubble up myself again

So, to add a number to a heap, one can just add the number to the bottom of a heap and "bubble" it up to where it actually belongs.

define swap(a)(b)
set[store v] to (item (a) of [heap v])
replace item (a) of [heap v] with (item (b) of [heap v])
replace item (b) of [heap v] with (store)

define bubble up(index)
if<(index) > (1)> then
set [bubble_up_parent v] to ((round(((index) - (1.5)) / (2))) + (1)) // where the parent is
if<(item (bubble_up_parent) of [heap v]) < (item (index) of [heap v])> then
swap(index)(bubble_up_parent)
bubble up(index)
end
end

define insert(number)
add (number) to [heap v]
bubble up(length of [heap v])

Creating the Heap

define Bubble Down
set[bubble_down_value v] to (index)
if<not <((2) * (index)) > (length of [list v])>> then
if<(item((2) * (index)) of [queue v]) < (item (index) of [list v])> then
set[bubble_down_value v] to ((2) * (index))
end
if<not<(((2) * (index)) + (1)) > (length of [list v])>> then
if<(item(((2) * (index)) + (1)) of [list v]) < (item (bubble_down_value) of [list v])> then
set[bubble_down_value v] to (((2) * (index)) + (1))
end
end
if<not<(bubble_down_value) = (index)>> then
Swap (bubble_down_value)(index) :: custom
Bubble Down(bubble_down_value) :: custom

define Make into Heap
set [n v] to (length of [list v])
repeat(length of [list v])
Bubble Down(n) :: custom
change [n v] by (-1)

Finding the Maximum

This is sort of the entire point of a heap! You instantly know which item is the biggest—it's the first one!

define findBiggestValue
set[result v] to (item (1 v) of [heap v])

Deletion

There are two different types of deletion. The first one is the removal of the greatest number. For priority queues and sorting this is the only deletion that is needed, so this is the one that will be shown first. For this one needs the opposite of "bubble up": "bubble down" To delete the biggest number it is first swapped with the last number (which is usually pretty small). Then the biggest number (which is on the end) is erased and the small number moves down the tree until the tree is a heap again.

define swap (a) (b)
set [store v] to (item (a) of [queue v])
replace item (a) of [queue v] with (item (b) of [queue v])
replace item (b) of [queue v] with (store)

define Bubble Down (index)
set [bubble_down_value v] to (index)
if < not <((2) * (index)) > (length of [queue v])>> then
if<(item((2) * (index)) of [queue v]) < (item (index) of [queue v])> then
set[bubble_down_value v] to ((2) * (index))
end
if<not<(((2) * (index)) + (1)) > (length of [queue v])>> then
if<(item(((2) * (index)) + (1)) of [queue v]) < (item (bubble_down_value) of [queue v])> then
set[bubble_down_value v] to (((2) * (index)) + (1))
end
end
if<not<(bubble_down_value) = (index)>> then
Swap (bubble_down_value)(index)
Bubble Down(bubble_down_value)
end

define removeMax
swap(1)(length of [queue v])
delete (1 v) of [queue v]
Bubble Down(1)

To remove any element is very similar, but the smaller number that one replaces a number with can move either up or down:

define removeElement(index)
swap(index)(length of [heap v]) :: custom
delete item (last v) of [heap v] :: list
bubbleUp(index) :: custom
bubbleDown(index) :: custom

Finding Numbers

A heap isn't better or worse than a list at finding if a specific number exists. One must loop through every number (though we can ignore anything bigger than the first element)

define find(value)
set[result v] to (-1) // -1 means the value is not in the array
if<not<(value) > (item (1 v) of [heap v])>> then
set[n v] to (0)
repeat(length of [heap v])
change [n v] by (1)
if<(item (n) of [heap v]) = (value)> then
set [result v] to (n)
stop [this script v]
end
end

Sorting

Even though heaps are just ways of storing numbers, the concept can be applied to the sorting-list problem. For really big lists, the sorting method below is about as fast as any of the methods in Sorting Values. Here is what we are going to do: make a list into a heap, continually remove the biggest element and put it into the sorted list.

define sort // sorts the list 'unsorted'
delete (all v) of [sorted v]
Make into Heap // make "unsorted" into a heap
repeat until<(length of [unsorted v]) = (0)>
add(item (1 v) of [unsorted v]) to [sorted v]
Remove Max
end

Min Heaps

This article focuses on implementing a heap that always has the biggest number at the top. It is possible to construct a heap with the smallest number on the top too. This is useful if you want to sort from lowest to highest, or want a priority queue that goes from smallest to largest values. For the most part, the code all stays the same except all comparisons of element's values. In other words, every

<(item (a) of [heap v]) < (item (b) of [heap v])>

and

<(item (a) of [heap v]) > (item (b) of [heap v])>

switch. The computer then reverses the "heap property" and makes every parent smaller than its children.

See Also