Sunday 10 March 2013

HEAP SORT

 
In this post, I will be explaining Heap Sort.
Sorting, as we all know, is a technique of arranging the elements of a list in a certain order. Sorting can be done either on numerical data or on alphabetical data (lexicographic sorting).
Here I'm concerned with the heap sort applied on the numerical data.

Some prior knowledge that you may require:-

Tree representation of array:-

One important thing to note here is that we have started the indexing of the array from '1' rather than from '0'. This may be a weird thing for some of you but in C the arrays may be indexed as per the wish of programmer. We are totally free to choose the starting index number for any array.
 
A heap is a specialized tree-based data structure that satisfies the heap property, i.e. any parent node in the tree is always greater than each of it's child node.

ALGORITHM:
Heap sort basically includes two steps:-
1) First of all a heap is constructed using the data.
2) In the second step, a sorted array is created by repeatedly swapping the largest element (the first element of array or the root node of the tree) with the last element of the array. The heap is reconstructed after each swap taking appropriate upper limits. Once all objects have been swapped within the heap, we have a sorted array.

For executing the first step of our algorithm, we start building the heap from the last parent node of our tree.
The second step is more explicitly explained as follows:-
Once a heap is constructed, it is sure that the maximum of all the data elements will be at the root node of the tree. Now, we interchange the first and the last element of the array and then the heap is reconstructed by excluding the last element of array. Repeating this process for each node, we get a sorted array.
Some of you may have this question in your mind-
Why the second step is needed, after constructing the heap, is the array not sorted?
The answer to this question is very simple one. After the heap is constructed it is sure that each parent node will be greater than both of it's child node, in fact this is the definition of a heap, but here we are not sure that both the child nodes are in sorted order, i.e. there could exist a pair of sibling that is not in sorted order. 
The following example explains the sorting of 15,19,10,7,17 and 16: 

Source Code in C:-
/******************************************************************************/
#include<stdio.h>
#include<conio.h>

void heapsort(int *,int);
void buildheap(int*,int,int,int);

int main()
{
    int i,n,*arr;

    clrscr();

    printf("enter the maximum number of elements");
    scanf("%d",&n);

    printf("enter the data of array:\n");
    for(i=1;i<=n;i++)
        scanf("%d",&arr[i]);

    heapsort(arr,n);

    printf("\nthe sorted array is as follows:\n");
    for(i=1;i<=n;i++)
        printf("%d\t",arr[i]);


    getch();
    return 1;
}

void heapsort(int*arr,int n)
{
    int tmp,i;
    for(i=n/2;i>0;i--)
        buildheap(arr,arr[i],i,n);

    for(i=n;;i--)
    {
        tmp=arr[i];
        arr[i]=arr[1];
        arr[1]=tmp;
        if(i<=2)
            break;
        buildheap(arr,tmp,1,i-1);
    }
}

//function for building the heap
void buildheap(int*arr,int val,int pos, int limit)
{
    int child;
    child=pos*2;
    if((arr[child]<arr[child+1])&&(child+1)<=limit)
        child++;
    while(child<=limit)
    {
        if(val>arr[child])
            break;
        arr[pos]=arr[child];
        pos=child;
        child=pos*2;
        if((arr[child]<arr[child+1])&&(child+1)<=limit)
            child++;
    }
    arr[pos]=val;
}

/*****************************************************************************/

1 comment:

  1. we can start array from 0 and the child nodes will be at 2n+1 and 2n+2 positions each time so it can be simplified further!

    ReplyDelete