Overview
Everybody has to report status or estimate a work task to their boss. Even "the boss" usually has to report status, if only to shareholders or the owners. This article discusses an approach using some simple formulas and Excel (or your favorite Excel substitute) to give your status a more scientific basis than "off the cuff". It also allows you to answer "what if" questions fairly quickly without a lot of "hand waving".
Giving good estimates will have positive effects for you, your team, and your management (who expect you to make them look good while rewarding you with [FILL IN YOUR DREAMS AND WISHES HERE]). I may be stretching this last part a bit, or living in a positive-moral-ethical-symbiotic-seeming dreamscape that does not really exist. Trust me, though, try it out. Giving good numbers will make you feel better than pulling penguins out of your...um...imagination.
Problem Statement
At some point early in your career as a developer, you are going to be asked two questions that every sane developer has come to dread:
- What is your progress on [FILL IN YOUR TASK HERE]?
- When will it be finished?
Initially, you will probably be asked for these "estimates" (you are not counting photons, after all) after you have already started the work. These will be used for:
- Figuring out if you are "getting stuff done".
- Enabling others to know you need help before you may realize it yourself. Keeping your nose in the code creates its own kind of myopia.
- Coordinating the work of others (your teammates, the test team, marketing, sales...you know...those "other folks" who go into the office as well but do not seem enthralled by what you do...yet for some reason are interested in the outcome...) so you can "join" up at the proper time.
Sooner or later, you will be asked these before the work even starts and these estimates that you create will:
- Form the basis of how your team leader plans the tasks for the others on your team.
- Be used to estimate the overall project cost (and perhaps decide if it happens).
- Probably come back to haunt you.
That is to say, your estimates will be used to plan the project, determine if it is going into the weeds, plan intersection points for the project, allocate budgets, and in general make everybody feel that the chaos is "managed". While it is true you can try to wave off or swag these questions with an off-the-cuff answer, it will usually work out better for everybody involved if you develop a strategy for answering these questions with some credibility.
I Have a Great Tool
If this sounds a lot like something that belongs in the Scrum or DSDM tool you have been using (Jira, Scrumwise, pick your favorite), you are correct. However, before you dash off to read a good book on Game Programming Patterns, you might consider the following:
- LOTS of companies do not have high-end project management tools. You may work for one already. You may work for one in the future. Better to practice it now and be ready for that interview question about how you manage your manager's expectations.
- Your company may work for many different clients and tools are at the project level, so you only get the good ones if you are on those projects.
- If the inputs to the tools have a lot of "bottom up" granularity, you might already have all you need. If they do not, you have the option of spending time putting the items into the tool or coming up with the numbers and putting a "next level up" number in. This is a decision about how much granularity you have in your tool.
- It is really hard to run "what if" scenarios with these tools. They are more geared to predicting based on current state. Dinking with them to flip around assignments, order of execution, weight on tasks, etc., can have unintended and complex outcomes. Sometimes the "UNDO" is REALLY hard to find.
The reality is that this is a skill that is not specific to estimating a software project. It's a skill for estimating ANY project.
Basic Approach
As a computer scientist, the idea of giving an "estimate" may rankle you a little bit. Fortunately, many scientists have gone before us and they seem to have had some success with it, so we can skip right past the "feels icky" concern and get right to "how can we make numbers make sense". It's better than penguins.
Sequential Operations
The first thing you have to realize, and this will probably NOT come as a shock, is that everything you do has a "Start" and an "End". When you start, you are 0% complete. When you end, you are 100% complete. You start, go through the steps sequentially, and reach the end. It is the steps in the middle you have to count. I'm going to use the example of a rather pedestrian task, fixing bugs. Without getting too deep into the process of your company, the rough list of what you need to do to resolve a bug in a production system is as follows:
- Investigate the bug.
- Write or change some code to fix it.
- Perform some kind of desk check or unit test to verify it works. You may have to write the unit test.
- Check it in so that others can see it.
- Integrate your change and verify you have not destroyed the universe.
- Wait for QA to bless it and close/complete it.
You could just as easily have a more exciting example where you have to design a game engine; you still have individual pieces to build and the same basic SDLC steps for each one (design, code, unit test, integrate, lather, rinse, repeat, ...).
The Recipe
We are going to use a spreadsheet (I am using Excel, you can use any one you wish, they all support something like these operations) to keep track of the "state" of completion for each task you have to work. Practically speaking, you can really only do one thing at a time. You may be spread across other projects, but we can handle that a different way. Assume, for now, that you are going to "start", execute a series of steps to carry you along, and then finally "end" when it will be done.
- You assign a percentage complete (0.0-1.0, you will see why later) to each state.
- You assign a state to each task you must complete.
- Map each state to a percentage complete for the step.
- Add up all the "percent completes" and divide by how many there are to get the average completion (how much you are done).
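The recipe above can be sketched outside a spreadsheet as well. This is only a minimal illustration, not the contents of the actual workbook: the state names are assumptions, and while the 0.0, 0.2, 0.5 and 0.9 values match numbers quoted in this article, the 0.7 and 0.8 values are made up to fill the gaps.

```python
# State -> fraction complete (0.0-1.0).
# 0.0, 0.2, 0.5 and 0.9 are quoted in the article; 0.7, 0.8 and the exact
# state names are assumptions for illustration only.
STATE_PERCENT = {
    "not started": 0.0,
    "investigated": 0.2,
    "coded": 0.5,
    "unit tested": 0.7,
    "checked in": 0.8,
    "integrated": 0.9,
    "completed": 1.0,
}


def percent_complete(task_states):
    """Average the per-task completion fractions (every task counts the same)."""
    if not task_states:
        return 0.0
    return sum(STATE_PERCENT[state] for state in task_states) / len(task_states)


# Hypothetical task list: one state per task.
states = ["completed", "completed", "unit tested", "not started", "not started"]
overall = percent_complete(states)
print(f"{overall:.0%} complete")
```

The dictionary plays the role of the state/percentage lookup table, and the average is the single number you report.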
As this is not a course in Excel, I have put all this into a .zip file with Excel Spreadsheets in both .xls and .xlsx format.
You do not have to follow this format explicitly. I am going to enumerate and describe all the elements in this particular incarnation:
- This is just a list of the item numbers. This way I can add charts or whatever at the top by moving the tables down and not have to worry about referring to a specific "Excel" row; I always refer to #XXX.
- This table is for bugs. It could also be "component" or "API Method", etc.
- It always helps to have a reminder of what the numbers refer to. No secret sauce here.
- These are the states you modify. Each is set up as a "Data Validation" with a "List" type (the list items are column G). I STRONGLY ENCOURAGE you to do this. Validation Lists like this stop people from randomly typing junk into things that should have a fixed set of items and breaking your house of cards. You don't let other developers use a cast to set their own values on your enum instances, right?
- This is a bit of the secret sauce. The value in D is looked up in G and the returned index from H is placed here. Just like a std::map lookup.
- This is the list of states. If you want to add a new one, just add a new element in the middle (to both column G and H) and the D/E columns will honor it. Nifty when you want to add/remove states. NOTE: I am using "past tense verbs" for states. As in "this has already been done". Be consistent in your language choice.
- This is the percentage looked up. A bit more secret sauce here. The spacing between the steps is NOT LINEAR, unless you want it to be. It takes longer to code and test than investigate, so the % complete reflects that by going from 0 - 0.2 - 0.5.
- A bright yellow box gives you a perfect eye-draw-point for the review where you will have to show this. It looks a little sad at 0.0% right now, though...
First Pass - Basic Estimates of Completion
Now that we have the basic template down, let's plug in some "actual work done" and update the numbers.
- So we completed two items and one of them has been unit tested. And that moves us to about 34% done. That seems pretty straightforward.
- The drop down on each of these boxes means you cannot fat-finger in the wrong state.
One important point to "point out" is that you need to check that the states all work as expected. Move all of them to "coded" and you should get 50%. Move all of them to "integrated", 90%. And so on. Depending on how the "lookup" method works, it may require a sorted or non-sorted list for G (this one does NOT require a sorted list). Be aware of that if you start to see numbers not lining up. Always do a "unit test" to make sure you are not reporting junk.
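The same "unit test" idea can be written down as code. This is a sketch under assumptions: the state names and the 0.7 and 0.8 values are invented, and the ValueError check is only a rough stand-in for what the Data Validation list does in the spreadsheet.

```python
# State -> fraction complete. 0.0, 0.2, 0.5 and 0.9 are quoted in the
# article; 0.7, 0.8 and the state names are assumptions.
STATE_PERCENT = {
    "not started": 0.0,
    "investigated": 0.2,
    "coded": 0.5,
    "unit tested": 0.7,
    "checked in": 0.8,
    "integrated": 0.9,
    "completed": 1.0,
}


def percent_complete(task_states):
    """Unweighted average; rejects any state that is not in the table."""
    unknown = sorted(set(task_states) - set(STATE_PERCENT))
    if unknown:
        raise ValueError(f"unknown state(s): {unknown}")
    return sum(STATE_PERCENT[state] for state in task_states) / len(task_states)


def check_table(task_count=5):
    """Put every task in the same state and compare with the table value."""
    for state, expected in STATE_PERCENT.items():
        actual = percent_complete([state] * task_count)
        assert abs(actual - expected) < 1e-9, (state, actual, expected)
    return True


table_ok = check_table()
```

If every task is "coded", the result has to be 50%; if the check fails, the lookup is broken and the numbers you report are junk.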
Second Pass - Better Estimates of Completion
You could stop with the first pass and that might be fine. Your boss comes up to you and says, "how long will it take?" You take the number of items, multiply it by an "average" number of hours for each, and now you have the "Total Work Estimate". (1 - % Complete) X Total Work Estimate = Hours Remaining.
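As a worked example of that arithmetic, here is a small sketch. The task count and the average hours per task are made-up numbers; only the 34% figure comes from the first pass above.

```python
def hours_remaining(task_count, average_hours_per_task, fraction_complete):
    """(1 - % Complete) x Total Work Estimate, with the total built from an average."""
    total_work_estimate = task_count * average_hours_per_task
    return (1.0 - fraction_complete) * total_work_estimate


# Hypothetical inputs: 8 tasks at an average of 4 hours each, 34% complete.
remaining = hours_remaining(task_count=8, average_hours_per_task=4.0,
                            fraction_complete=0.34)
print(f"{remaining:.1f} hours remaining")
```

With these assumed inputs the total is 32 hours, of which about 21 remain.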
So, you report this, get back to work, and coding bliss ensues. BUT, if you add one more "knob" to the calculation, you are going to add a dimension that lies at the heart of every savvy developer's very personal contribution to the project: YOUR KNOWLEDGE OF THE DOMAIN SPACE.
If you look at the list in C, a few alarms should be going off. It is reasonable to assume that "Change Background Color" will take a considerably shorter time than "System time randomly sets to future." However, the linear-state-estimator we have gives all these tasks the same "weight" in terms of how much effort they require for that type of "state operation."
What you want to do is "weight" them based on "how hard they are because you know how hard that part is going to be to fix". You can do this a lot of ways. I tend to follow the 1, 2, 4, 9 model. Also known as "trivial", "some work", "some real work", and "send out for pizza".
Truth be told, this started as a square approximation for difficulty, but "2" kept showing up because there was no middle ground between "1" and "4". You can choose your own scale. The key point here is that because you have knowledge of the system, this is where you get to put it to good use. You can defend your estimates because you know "where the bones are buried".
If this is not that kind of task or project, maybe you rely on your hard-earned-experience to come up with some estimates. It is still better than penguins. This is what it looks like:
There are some interesting things here to notice:
- Our percentage completion for the same "states" we had before jumped from 34% to 63%. So completing a couple of "big chunks" up front really improves the number quickly. On the other side, if you are going through the project and you are not "weighting" but you do all the small stuff up front, you could be in for a nasty surprise, time-wise, when you start working on the bigger items. This is the advantage of using your domain knowledge to make these estimates. The "Complete" calculation consists of SUM(G)/SUM(F).
- The points chart is manually entered and fairly straightforward.
- The calculation here is just "%" X Points.
- A calculation for "Counts" is done by counting how many times the state (I) shows up in the states (D). This is handy for a quick look at where you are overall when the number of rows gets large.
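The weighted calculation and the state counts can be sketched like this. The two named bugs are the ones mentioned above, but their states and points, the two "hypothetical" bugs, the state names, and the 0.7 and 0.8 values are all assumptions chosen to show the effect.

```python
from collections import Counter

# State -> fraction complete. 0.0, 0.2, 0.5 and 0.9 are quoted in the
# article; 0.7, 0.8 and the state names are assumptions.
STATE_PERCENT = {
    "not started": 0.0,
    "investigated": 0.2,
    "coded": 0.5,
    "unit tested": 0.7,
    "checked in": 0.8,
    "integrated": 0.9,
    "completed": 1.0,
}

# (description, state, points) using the 1, 2, 4, 9 difficulty scale.
# States and points here are invented for illustration.
tasks = [
    ("Change Background Color", "not started", 1),
    ("System time randomly sets to future", "completed", 9),
    ("Hypothetical bug C", "not started", 2),
    ("Hypothetical bug D", "unit tested", 4),
]


def unweighted_complete(tasks):
    """First-pass number: every task counts the same."""
    return sum(STATE_PERCENT[state] for _, state, _ in tasks) / len(tasks)


def weighted_complete(tasks):
    """Second-pass number: sum of (% x points) divided by sum of points."""
    earned = sum(STATE_PERCENT[state] * points for _, state, points in tasks)
    total = sum(points for _, _, points in tasks)
    return earned / total


# How many tasks sit in each state (the "Counts" column).
state_counts = Counter(state for _, state, _ in tasks)
flat = unweighted_complete(tasks)
weighted = weighted_complete(tasks)
print(f"unweighted {flat:.1%}, weighted {weighted:.1%}")
```

In this made-up list the big 9-point item is already done, so the weighted number is far higher than the flat average. Do the small stuff first instead and the weighted number will lag behind, which is exactly the nasty surprise described above.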
Third Pass - Time
Now that we have a better estimator for how much work is in front of us and how much work has been done, we can add another dimension: Time. After all, your boss is going to still ask "how long will it take?" This is relatively easy to add. Decide how many "points" of work you can reasonably do in an hour, then do some basic unit cancellation and you get the following:
- I'm a big fan of "Meaningful Units". Do they still teach unit cancellation?
- The number of Points Per Hour is up to you, but should be reflective of the values you choose in F and your knowledge of the domain.
- Total Days = SUM(F) / (Points Per Hour X Hours Per Day).
- Days Remaining is just (1 - % Complete) X Total Days. Weeks Remaining is this value divided by Days Per Week.
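The time formulas above translate directly into a few lines of code. This is a sketch with assumed inputs: the 16 total points, the 0.5 Points Per Hour and the 5-day week are hypothetical, while the 6 hours per day matches the value used in the sheets.

```python
def schedule(total_points, fraction_complete, points_per_hour,
             hours_per_day=6.0, days_per_week=5.0):
    """Return (total_days, days_remaining, weeks_remaining)."""
    # points / (points/hour * hours/day) leaves days: unit cancellation.
    total_days = total_points / (points_per_hour * hours_per_day)
    days_remaining = (1.0 - fraction_complete) * total_days
    weeks_remaining = days_remaining / days_per_week
    return total_days, days_remaining, weeks_remaining


# Hypothetical inputs: 16 points of work, 73.75% complete, 0.5 points/hour.
total_days, days_left, weeks_left = schedule(
    total_points=16, fraction_complete=0.7375, points_per_hour=0.5)
print(f"{total_days:.2f} days total, {days_left:.2f} days left, "
      f"{weeks_left:.2f} weeks left")
```

Keeping the units in the variable names makes it easy to see that the answer really comes out in days and weeks.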
Really, It Is Better Than Penguins
Suppose you are really terrible at estimating. If you screw up a number here or there, the odds are good that you are going to go too high on some and too low on others. A funny thing happens with numbers called "The Law of Large Numbers."
What this boils down to is that making and adding up a number of small estimates will average out and yield an estimate that is usually not so bad. Yes, you will royally screw up one here and there, but on average, it should work. Of course, there is ONE slight qualifier to this. You may be an "over" or "under" estimator. That is to say, you think too much (or too little) of your skills and have an internal "multiplier" on your estimates.
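If you want to convince yourself that the errors really do average out, a small simulation helps. This is only a sketch under a deliberately simple, assumed error model: every task truly takes 1 unit of time and each individual estimate is off by a random amount of up to 50% in either direction, with no systematic bias.

```python
import random


def mean_relative_error(task_count, trials=2000, spread=0.5, seed=1):
    """Average |estimated total - true total| / true total over many trials.

    Each task truly takes 1 unit; each estimate is off by a uniformly
    random amount in [-spread, +spread] (an assumed, unbiased error model).
    """
    rng = random.Random(seed)
    total_error = 0.0
    for _ in range(trials):
        estimated = sum(1.0 + rng.uniform(-spread, spread)
                        for _ in range(task_count))
        total_error += abs(estimated - task_count) / task_count
    return total_error / trials


single_task_error = mean_relative_error(1)
fifty_task_error = mean_relative_error(50)
print(f"1 task: {single_task_error:.1%} off, 50 tasks: {fifty_task_error:.1%} off")
```

The total of fifty sloppy estimates lands much closer to the truth, in relative terms, than a single sloppy estimate does. Note that this only holds for errors that go both ways; a consistent over- or under-estimator needs the separate correction factor.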
Personally, I underestimate EVERY TIME how long it will take me to do something (so clearly my ego is healthy). But I (and I feel this is in line with most people) am consistent with my underestimation. My estimates come out at about 1/2 of the real time. So at the end, I always have a "multiply everything by this one factor", and my factor is 2X. DO NOT factor this in while doing your individual estimates...apply it as a constant factor at the end by multiplying the final time estimate by it. Or you can adjust your Points Per Hour (PPH) value accordingly, which in my case means dividing it by 2.
Remember, your goal is to get a reasonable estimate quickly, not generate more work. If you don't want to figure out your factor, don't worry about it. Your boss will. If he is a good boss, he may mention it once or twice, but will probably just factor it in without telling you after that. It will be better if you come to terms with it though and just stick it in there.
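Here is a small sketch of the two equivalent ways to apply a personal correction factor: scale the final time estimate, or scale Points Per Hour. The 16 points, 25% complete and 0.5 points per hour are hypothetical; the factor of 2 is the one from my own case above.

```python
def days_remaining(total_points, fraction_complete, points_per_hour,
                   hours_per_day=6.0):
    """(1 - % Complete) x Total Days, with Total Days from points and pace."""
    total_days = total_points / (points_per_hour * hours_per_day)
    return (1.0 - fraction_complete) * total_days


# Hypothetical inputs and a personal underestimation factor of 2.
factor = 2.0
raw = days_remaining(total_points=16, fraction_complete=0.25, points_per_hour=0.5)

# Option 1: leave the estimates alone and multiply the result at the end.
scaled_at_end = raw * factor

# Option 2: divide Points Per Hour by the same factor instead.
scaled_via_pph = days_remaining(total_points=16, fraction_complete=0.25,
                                points_per_hour=0.5 / factor)

print(raw, scaled_at_end, scaled_via_pph)
```

Both options give the same answer, so pick whichever is easier to explain to your boss. Just do not apply both.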
Playing "What If" Games
Once you have gotten this put together, you can play the "what if" game and answer questions quickly and easily.
- Your boss asks how much time can you save if they decide to abandon a fix for this cycle - Set it to "complete" or add a new state of "abandoned" and give it a % value of 1.0.
- Your boss says he will add a second person to do the work - Modify your Points Per Hour so you work faster (though not 2X, because nine women still cannot make a baby in one month).
- Your boss says "work more hours in the day" - Ok. The reason it says "6" in the sheets is because that is about how much effective actual work time developers have. You can get "more" from "crunch time", but how much more and for how long before diminishing returns make it worthless is a complicated question. That being said, you can increase the Hours Per Day, but my money says your Points Per Hour should probably go down a bit...sleepy eyes make mistakes.
- Your boss decides to time share you with another project. Reduce your Points Per Hour by a factor to account for this. If you are spread between two projects, you should drop your PPH by at least half (though probably more depending on your personal context switching). If your boss wants you on more projects...well...you may consider this article to be interesting. Share it with your boss at your discretion...when you run out of good options, you are left with bad ones.
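The time-related scenarios above boil down to turning a couple of knobs and recomputing. The sketch below shows one way to compare them side by side. All the numbers are assumptions: the 1.6x speed-up for a second person, the slower pace on longer days, and the halved pace when time-shared are placeholders you would replace with your own judgment.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Plan:
    """The knobs from the sheet; the values used below are hypothetical."""
    total_points: float
    fraction_complete: float
    points_per_hour: float
    hours_per_day: float = 6.0

    def days_remaining(self):
        total_days = self.total_points / (self.points_per_hour * self.hours_per_day)
        return (1.0 - self.fraction_complete) * total_days


baseline = Plan(total_points=16, fraction_complete=0.25, points_per_hour=0.5)

# Each scenario is a copy of the baseline with one or two knobs changed.
scenarios = {
    "baseline": baseline,
    "second person (assumed 1.6x, not 2x)":
        replace(baseline, points_per_hour=0.5 * 1.6),
    "shared with another project":
        replace(baseline, points_per_hour=0.5 / 2),
    "longer days, slower pace (assumed)":
        replace(baseline, hours_per_day=8.0, points_per_hour=0.45),
}

report = {name: round(plan.days_remaining(), 2) for name, plan in scenarios.items()}
for name, days in report.items():
    print(f"{name}: {days} days remaining")
```

Because each scenario is a copy, the baseline never changes and there is no "UNDO" to hunt for.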
Conclusion
If you are in the software world and think about Scrum and "burn down charts", it should be obvious that the technique presented here is definitely in the same ballpark. This technique uses a tool that is ubiquitous, and the approach can be easily extended to other domains.
FEATURE REQUESTS (:P): Why not also track the date of state changes to better estimate real time?