Advertisement

Counting Sort

Started by February 01, 2022 02:48 AM
9 comments, last by SuperVGA 2 years, 10 months ago

I'm having a difficult time understanding where/why my solution fails for a Counting Sort function. It works for some variations of arr, but not all.

I ran through the correct solution and can understand how it works, but I'm still lost as to why mine does not. I've been trying to figure it out for some time now. Any help/insight is very much appreciated!

Note: arr is an array of integers.

My function:

def countingSort(arr):

	fArray = [0] * 100
	
	for i in range(len(arr)):
		fArray[i] = arr.count(i)
		
	return fArray

The correct solution:

def countingSort(arr):

	fArray = [0] * 100

	for i in range(len(arr)):
		fArray[arr[i]] += 1
		
	return fArray

The dream's waiting.

First I don't know this language so take this with a grain of salt. In any case I think neither one is really correct although the second one may work in certain circumstances. Your counting array has to be large enough for the range of numbers not the number of numbers, so this could easily fail for negative numbers and numbers over 100. Assuming that doesn't happen, in the first one you only go through your input array size. So if your input array was size 20 and contained the number 30, it will never be counted. Also even if you solved these problems your method is super inefficient since you apparently go through your input array for each possible number. I imagine ‘count’ does this. The second method is better and is what is generally considered a counting sort. Finally you normally want to regenerate your sorted array and not just return the count array, but I guess that's also a valid option at times.

Advertisement

jc_arioh said:
I ran through the correct solution and can understand how it works, but I'm still lost as to why mine does not.

You picked the wrong solution to work through.

If you want to know why the good solution works, you focus at the good solution. If you want to know why the bad solution does not work, you focus on the bad solution.

jc_arioh said:
I'm having a difficult time understanding where/why my solution fails for a Counting Sort function. It works for some variations of arr, but not all.

The strategy for finding the flaw is 2-3 steps.

1. Find an (preferably small) case where the solution doesn't work.

2. Eliminate as much of the case as you can while it still produces a wrong result. Repeatedly make small simplifications, eg using a smaller array, replace some values by some other numbers, etc. Run the code after each small change, and check you still get a wrong answer. Repeat until you see no more options to reduce the problem further without loosing the wrong answer.

3. You now have a minimally wrong case. Walk through it in deep detail, step by step, and see how it derives the incorrect answer.

For comparison you may want to walk through the same minimal case in the good solution, so you can compare the steps that they do.

There are two obvious flaws with your function. One is that it has terrible performance (O(n²), when a counting sort should run in O(n)). The other is that you are using len(arr) when you should be using len(fArray) or just 100. If arr is longer than 100, you will get an out-of-bounds array access. If arr is shorter than 100 but contains elements that are larger than its size, these elements will never be counted.

a light breeze said:

There are two obvious flaws with your function. One is that it has terrible performance (O(n²), when a counting sort should run in O(n)). The other is that you are using len(arr) when you should be using len(fArray) or just 100. If arr is longer than 100, you will get an out-of-bounds array access. If arr is shorter than 100 but contains elements that are larger than its size, these elements will never be counted.

Where in the linked source is there any indication of running ON²?

I'm not familiar with counting sort, though, so I might have missed something.
The snipped posted by OP though, should just iterate once across one array, ON.

@SuperVGA

SuperVGA said:

Where in the linked source is there any indication of running ON²?

arr.count(i) returns the number of occurrences of element i in arr. The only way to do this is to iterate over the entire array in order to count the occurences, which is an O(n) operation. Together with the outer loop, that makes O(n²).

Advertisement

a light breeze said:

arr.count(i) returns the number of occurrences of element i in arr. The only way to do this is to iterate over the entire array in order to count the occurences, which is an O(n) operation. Together with the outer loop, that makes O(n²).

I think with the way he's doing it there are a couple variables to consider. There is the number of numbers, and the range of numbers. So I'm not sure it's really either of those, even if it was fully working. But it doesn't matter since the first example is not a real counting sort anyway, and there is no reason to do it that way.

Gnollrunner said:

I think with the way he's doing it there are a couple variables to consider. There is the number of numbers, and the range of numbers.

True, the performance of the original code would technically be O(n×m), where n = size of input array and m = range of values in input array = size of output array - if it was coded correctly. But it's bugged to use n iterations instead of m in the outer loop, so the actual performance of the code is O(n²).

@a light breeze Understood, thank you for clarifying that. I definitely did not take into account the possibility of arr having fewer elements than fArray

The dream's waiting.

a light breeze said:

@SuperVGA

SuperVGA said:

Where in the linked source is there any indication of running ON²?

arr.count(i) returns the number of occurrences of element i in arr. The only way to do this is to iterate over the entire array in order to count the occurences, which is an O(n) operation. Together with the outer loop, that makes O(n²).

Ah, right, of course! I was fixated on the lower bit, and missed that.

This topic is closed to new replies.

Advertisement