Algorithm proof

LordG · 2002-10-31T17:22:20

I don''t know if anyone here is good at proving algorithms, but anyways, I''m having trouble solving this problem. We are given n keys x1, x2, ... , xn to be placed in a dictionary, and the corresponding distinct probabilities p1, p2, ... , pn with which we will subsequently search for these keys. You may assume that p1 > p2 > ... > pn, and that p1 + p2 + ... + pn = 1. a) Suppose the keys are to be inserted into a linked list. Prove that the list in which keys appear in order of decreasing probability, i.e., x1, x2, ... , xn, minimises the expected number of nodes examined to find a key. State the sample space, probability distribution and random variable used in your proof. b) Suppose the keys are to be inserted into a binary search tree. Prove or disprove: Every binary search tree that minimises the expected number of nodes examined to find a key has in its root a key of maximum probability. I just don''t know how to start this thing. Any help would be great.

Math and Physics Programming

Started by LordG October 27, 2002 11:02 PM

39 comments, last by LordG 22 years, 3 months ago

grhodes_at_work

1,385

October 30, 2002 10:28 AM

Sorry I didn''t catch this sooner.

LordG, thank you for acknowledging the forum rules in your second post and for promising to follow the rules in the future. Homework questions are not completely forbidden, but you have to be very careful with how you state them. You must not merely ask for an answer, and you must show your own work. It is also pretty much a requirement that the particular homework question overlaps problems in game development to a degree.

Once again, I appreciate the others who are helping to mitigate the problem of homework posts. Its a big help, especially when I''m unable to moderate in real-time.

.

Graham Rhodes
Senior Scientist
Applied Research Associates, Inc.

Graham Rhodes Moderator, Math & Physics forum @ gamedev.net

Zipster

2,420

October 30, 2002 02:36 PM

Let''s examine what information we are given:

1) We are working with a binary search tree.

A binary search tree is a graphical representation of every possible path that a binary search can take on given data. However, there are two things to keep in mind here. 1) A binary search only works with sorted data. If the data isn''t sorted, then we can''t do a binary search, which is implied. I''m not saying that the data is sorted iif it''s a binary tree. 2) In a binary search, the first node checked represents the mid-value of the entire set of data, and each consequetive node represents the mid-value of each subset of data. Thus, the following trees with A < B < C:

A \  B / \    CA \  B / \C

Or any other related trees are not valid representations of a binary search tree, because they incorrectly represent the progession of a binary search. The root has to be the middle value of the set of data, because that''s how a binary search works. In this case, it has to be B. Each sub-tree then has to be in a definite relationship with it''s root. In the end, a binary search tree lends itself to be balanced, simply due to the nature of a binary search.

2) We are trying to minimize the expected number of nodes to be examined.

To minimize the expected number of nodes to reach any value we search for, we put the values of highest probability where they will be examined before those of lesser probability. We proved this with the linked list question. A binary search is the same, only instead of having single elements linked directly to one another, where it takes n examinations to reach the nth element, we have levels of the tree linked to each other, where it takes n examinations to reach the nth level of the tree. This is also what makes a binary search a lot better than a linear search in a lot of cases - each level in a tree contains possibly more than one element, yet it only takes a single examination to traverse a level. Given probabilities A < B < C < D < E < F < G, any one of these trees would work best (discounting integer rules and such):

     D   /   \  B     F / \   / \A   C E   G     D   /   \  B     F / \   / \C   A G   E     D   /   \  F     B / \   / \G   E A   C

Thus each sub-level changes (and takes its children with it), but inter-level juxtapositioning can change fluidy, and we can still assume all data is sorted, because we have no specifics on that.

Anonymous

October 30, 2002 03:07 PM

quote:
Original post by Zipster
2) In a binary search, the first node checked represents the mid-value of the entire set of data, and each consequetive node represents the mid-value of each subset of data. Thus, the following trees with A < B < C:

Not true. A binary search tree doesn''t need to be balanced.

Waverider

169

October 30, 2002 03:27 PM

BSP, which is a form of binary tree, works best when it is balanced to keep the overall search from going any farther than a certain depth.

When you are doing a full tree parse, it doesn't matter. But if you are using the tree to search, then balancing is important.

From a memory standpoint, it also keeps recursive search routines form pushing too much on the stack.

[edited by - Waverider on October 30, 2002 4:39:22 PM]

It's not what you're taught, it's what you learn.

Cedric

158

October 30, 2002 03:42 PM

quote:
Original post by Zipster
A binary search tree is a graphical representation of every possible path that a binary search can take on given data. However, there are two things to keep in mind here. 1) A binary search only works with sorted data.

A binary search is not the same as a binary search tree:

http://www.nist.gov/dads/HTML/binarySearchTree.html
Binary Search: An algorithm to search a sorted array. It begins with an interval covering the whole array. If the search value is less than the item in the middle of the interval, narrow the interval to the lower half. Otherwise narrow it to the upper half. Repeatedly check until the value is found or the interval is empty.
Binary Search Tree: A binary tree where every node's left subtree has values less than the node's value, and every right subtree has values greater. A new node is added as a leaf.

We're talking about a binary search tree, here, not a binary search.

quote:
If the data isn't sorted, then we can't do a binary search, which is implied. I'm not saying that the data is sorted iif it's a binary tree. 2) In a binary search, the first node checked represents the mid-value of the entire set of data, and each consequetive node represents the mid-value of each subset of data. Thus, the following trees with A < B < C:
A \ B / \ C A \ B / C

Or any other related trees are not valid representations of a binary search tree, because they incorrectly represent the progession of a binary search.

The first tree is right, because the subtree of every node is smaller than its root. Check the definition.

quote:
The root has to be the middle value of the set of data, because that's how a binary search works. In this case, it has to be B. Each sub-tree then has to be in a definite relationship with it's root. In the end, a binary search tree lends itself to be balanced, simply due to the nature of a binary search.

Wrong again.

quote:
Given probabilities A < B < C < D < E < F < G, any one of these trees would work best (discounting integer rules and such):
D / \ B F / \ / A C E G D / \ B F / \ / C A G E D / \ F B / \ / G E A C

The first tree is the only one that respects the definition, but you're digging yourself a nice hole there. See, if you have to take the median key as the node of your tree (as you claim), you can't take the key with the highest probability (unless it happens to be the median).

Cédric

[edited by - cedricl on October 30, 2002 4:45:08 PM]

[edited by - cedricl on October 30, 2002 4:47:59 PM]

LilBudyWizer

491

October 30, 2002 03:55 PM

There is a small little problem with that logic. No statement is made about the ordering of x1, x2, ..., xn. So your arguement says the median key then has to be the root, but nothing says the median key has the highest probability. What if it isn''t? So as near as I can tell, by your own arguement, the second part is not true. That is even granting your definition of a binary search tree that I don''t agree with.

Keys to success: Ability, ambition and opportunity.

Anonymous

October 30, 2002 04:27 PM

quote:
Original post by Waverider
BSP, which is a form of binary tree, works best when it is balanced to keep the overall search from going any farther than a certain depth.

But the whole point of the question is that it is NOT certain that a balanced tree always works best. It can be better to keep the tree unbalanced, if that brings the keys with higher probability closer to the root.

Waverider

169

October 31, 2002 09:49 AM

quote:
Original post by Anonymous Poster
But the whole point of the question is that it is NOT certain that a balanced tree always works best. It can be better to keep the tree unbalanced, if that brings the keys with higher probability closer to the root.

Assuming that certain keys have a higher probabilitty of being searched upon.

I presume that is what was meant, then, in this case, by a binary search tree. I thought the context of the discussion was that ANY binary search tree doesn't have to be balanced. BSP trees, as long as I understand them correctly, only provide the results needed at a leaf node (such as determining which subsector a point is in). I guess there are some cases where a certain subsector had a much higher probability for being needed - giving it a shorter parse depth would speed everything up.

Sure, in the case where certain keys have higher probability, fewer parse iterations to acquire those keys would make it all faster.

But, I know I'm just repeating what I has already been said.

[edited by - Waverider on October 31, 2002 10:59:20 AM]

It's not what you're taught, it's what you learn.

Zipster

2,420

October 31, 2002 11:15 AM

If we actually give a mathmatical definition to "expected number of nodes to be examined," then we can actually prove b) false.

My mistake was that I was assuming that the binary search tree was balance (for some reason), and in addition, freely maneuvering probabilities instead of keeping in mind that they are attached to values in a rigid order.

Using the same math, I was able to show that a balanced tree is usually better than a unbalanced one

Cedric

158

October 31, 2002 01:33 PM

Glad you''re back to our sane world

Just one thing to point out:

quote:
Original post by Zipster
Using the same math, I was able to show that a balanced tree is usually better than a un balanced one

This statement is true (it looks like you''re making fun of it). Most of the time, a balanced search tree is better than an unbalanced one. If A had a probability of 90%, then it would make sense to put it at the root of the tree.

Cédric

Algorithm proof

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Algorithm proof

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines