For years, software test engineers and test managers have asked how to estimate the testing effort in software projects. They have toyed with many theories, and with even more approaches, for quantifying how much testing is enough to ensure a successful product. Many agree that estimation must be done on a project-by-project basis. However, software development projects share so many characteristics that general guidelines can be established and considered when calculating testing estimates for all projects.
In his article "The Testing Estimation Process", Antonio Cardoso discusses eight rules that can be applied to estimating testing:
Rule 1: Estimation shall be always based on the software requirements
Rule 2: Estimation shall be based on the expert judgment
Rule 3: Estimation shall be based on the previous projects
Rule 4: Estimation shall be based on metrics
Rule 5: Estimation shall never forget the past
Rule 6: Estimation shall be recorded
Rule 7: Estimation shall be supported by tools
Rule 8: Estimation shall always be verified
The following sections explore and expand upon Rule 1, with the understanding that rules two through eight are equally pertinent to the testing estimation process. This document describes how estimators can categorize software requirements, how they can apply a weighted complexity variable, and finally how they can introduce risk factors to provide an accurate estimate.
Software requirements can be broken into the following categories: business logic, functional, performance, and technical.
The following sections describe these requirements from a "time required to accomplish" perspective.
Business logic testing requirements include the following:
Planning the test. The next phase is to plan the test, which includes describing the steps of the test execution that follows as well as preparing the required data. Usually, these test cases are validated (or should be validated) with the client or with a person who knows the client's business flow. Allowance must be made in this area for reviews and meetings, misunderstandings, and the resulting rework.
Preparing the test data. In many cases, production data is unavailable, requiring that test data be prepared from scratch. Because test data preparation usually happens in the first stages of the tester's involvement, the application may not yet be as familiar to the tester as it will be later in the development cycle. As a result, inaccurate test data could be applied, producing skewed results. Because test data preparation is a time-consuming process, allowance for it must be made in the estimate.
Testing. Test execution for business logic scenarios is complex and demands much concentration from the tester. Testers may tire faster, which can lead to fewer test cases executed per day. Business logic bug reporting also takes longer than functional or technical (even performance) bug reporting. A complete chain of steps, a complex description of the before-execution condition, multiple screenshots, and bug descriptions sent back and forth for clarification all consume a significant amount of the test team's time.
Planning a regression strategy. If business test cases were planned carefully, considering possible positive and negative conditions, the estimated bug rate could be set at one for every two or three test cases, especially during initial builds. The questions then are: "What regression strategy will the tester adopt? How many regression tests must be estimated to validate fixes?"
Testing functional requirements (assuming that functional testing is part of the testing scope) does not require extensive understanding of the application rules. However, such validation requires precisely documented functional requirements, which in many cases do not exist. Consequently, the tester must spend time consulting with developers about the features under test.
Test Planning. Functional test planning is much easier than business logic test planning. Sometimes functional testing is conducted using checklists, which are easier to prepare and take less time. Test data preparation usually is not required. However, there are usually many more functional tests than business logic tests.
Test Execution. Test execution is a quicker process than reporting. The ratio of bugs found to tests executed is considerably lower than during business scenario verification. However, because there are typically more tests, the number of bugs could still be high. The same question should be asked as in business logic testing: "How many regression runs must be estimated to validate fixes?"
A single requirement outlining performance can take the
longest time to plan, in terms of preparing for execution, the execution
itself, and analysis of results. Many questions must be asked, such as:
What to test?
When to test?
What tool to use?
Who will test? (In terms of the tester's skills)
Who will prepare the test environment, or will the existing environment be used?
Who will recover everything that could be crashed during the test?
Many other questions should be answered before beginning to estimate hours for performance test preparation and execution. In fact, if performance testing can be treated as a separate estimate, this would improve the overall accuracy of the existing estimate.
Technical requirements specify hardware and software supported, as well as details of the database and user interface.
Planning. Planning for technical requirements takes a relatively short time. Usually a checklist is used to verify whether the required element exists in the database and user interface. For compatibility testing, selected chunks of functional test cases are used, which allows for the shortening of the test-planning period. Possible planning difficulties include identifying the test infrastructure required to execute selected test cases on multiple platforms.
Test execution and reporting. The test execution and bug reporting are relatively simple
due to the following: these test cases are being executed after functional and
business runs, that is, on the known application with known test cases.
Usually, there are few issues resulting from the testing of technical requirements
and normally a single regression test will validate fixes.
A potential exception in the technical category is compatibility requirements that might require a time-consuming execution and environment preparation process. For example, one requirement can list multiple software/hardware configurations that an application should be compatible with. In this case the following should be considered:
Multi-platform testing requires knowledge of these platforms. If you do not have a tester in the group who knows UNIX, for example, allow for a significantly longer learning curve or consider bringing in another resource. The latter should be treated as outsourcing, with all of the issues that can affect the previously estimated time.
Testing of minimum hardware requirements will at least slow test execution, and can sometimes cause multiple system failures.
Unavailability of the required number of PCs to test specified configurations may require software re-installation, usage of swappable hard drives, and other such solutions, resulting in multiple technical issues that may significantly slow down planned testing activities.
Compatibility Testing. The only time-consuming task could be compatibility test execution. Some clients may request an unusual number of hardware/software configurations. Although estimating such tasks may not seem realistic, the "client is always right!" and every effort should be made to accommodate client wishes. However, problems such as the non-availability of test PCs for each required configuration must be considered. In this case, swappable hard drives and repeated installing and uninstalling can become obstacles.
Once requirements have been categorized, it is possible to quantify the time it takes to prepare and to execute each single requirement within the group. This is done by multiplying the number of requirements in a group by a complexity constant (defined below) and summing these figures.
Complexity constants are determined by analyzing tasks and assigning values. These values become constants in calculating estimates, as in the following table.
Table 1: Complexity constant values
| Constant | Description | Value |
| --- | --- | --- |
| Complexity: Low | Relatively simple | 1 |
| Complexity: Medium | Moderately complex | 1.2 |
| Complexity: High | Complex | 1.5 |
An example project identifies 100 requirements consisting of 70 business, 25 functional, 4 technical, and 1 performance. The complexity constants in table 1 are applied. Calculations are shown in table 2.
Business requirements: Of the 70 business requirements, 30 are high complexity, 10 are medium complexity, and 30 are low complexity. Management has agreed to 1.5 hours for the learning curve, test preparation, and data preparation, .5 hours for test execution, and 3 regression runs. Also, it was estimated that every second test case would produce one bug, which means a regression ratio of .5. (For example, the first test run could have 100 test cases while the second could have 50, resulting in a regression ratio between the first and second test run of .5.) The estimate for business requirements is therefore 435 hours (see table 2).
Functional requirements: Of the 25 functional requirements, 10 are low complexity, 5 are medium complexity, and 10 are high complexity. Management has agreed to 30 minutes for test preparation, execution, and issue reporting, at a regression ratio of ¼ (or .25) and 4 regression runs. The estimate for functional requirements is therefore 31 hours (see table 2).
Note: From my personal experience, there are usually many more functional requirements than business requirements.
Technical requirements: Each of the four technical requirements is estimated separately. Two requirements concerned the look of user interface (UI) elements, so 15 minutes were assigned to create a checklist, verify the elements, and report any issues. One requirement listed the operating systems that the program should support; it was decided not to test this requirement separately, but instead to assign these operating systems to the test machines used for regular execution. The final technical requirement specified the minimum hardware configuration that the application should support. It was decided that no test planning would be required; instead, 20 functional test cases would verify this requirement at 15 minutes each, with a 1/10 re-run ratio and one regression run. An estimate of 5.5 hours (rounded to 6 for ease of calculation) was made for technical requirements verification.
Performance requirements: The performance requirement is estimated at 40 hours to prepare, execute, analyze, and recover if needed. Management has agreed to a second regression run, which is estimated at 20 hours. The final estimate for performance is 60 hours.
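Table 2: Example Detailed calculation of Estimated Time to Complete Project (hours)
| # | Requirement Category | Number of Requirements | Complexity Constant | Basic Hours* | Sub-Total | Regression Ratio | Regression Runs | Totals |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Business | 30 | 1.5 | 2 | 90 | .5 | 3 | |
| 2 | Business | 10 | 1.2 | 2 | 24 | .5 | 3 | |
| 3 | Business | 30 | 1 | 2 | 60 | .5 | 3 | |
| | Business total | | | | 174 | | | 435 |
| 4 | Functional | 10 | 1 | .5 | 5 | .25 | 4 | |
| 5 | Functional | 5 | 1.2 | .5 | 3 | .25 | 4 | |
| 6 | Functional | 10 | 1.5 | .5 | 7.5 | .25 | 4 | |
| | Functional total | | | | 15.5 | | | 31 |
| 7 | Technical | 2 | 1 | .25 | | | | |
| 8 | Technical | 1 | | 0 | | | | |
| 9 | Technical | 1 (20)** | 1 | .25 | | .1 | 1 | |
| | Technical total | | | | 5 | | | 5.5 (6) |
| 10 | Performance | 1 | | 40 | | .5 | 1 | |
| | Performance total | | | | 40 | | | 60 |
| | TOTAL | | | | | | | 532 |
* Basic Hours: minimum Test Execution and Test Preparation hours assigned to the Requirement Category.
** 20 test cases from the Functional area will be used.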
The total Estimated Time to Complete (ETC) is therefore 532 hours (435 hours for business requirements, 31 hours for functional requirements, 6 hours for technical requirements, and 60 hours for performance requirements).
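The totals in Table 2 can be reproduced with a short calculation. The sketch below is only one reading of the table, not a formula stated in the article: it assumes each group's sub-total is count x complexity constant x basic hours, and that every regression run costs the sub-total multiplied by the regression ratio. The `Group` class and `with_regression` function are illustrative names.

```python
import math
from dataclasses import dataclass


@dataclass
class Group:
    """One row of the estimate; the field names are illustrative."""
    count: int           # number of requirements (or test cases) in the group
    complexity: float    # complexity constant: 1, 1.2, or 1.5
    basic_hours: float   # preparation plus execution hours per item

    def subtotal(self) -> float:
        return self.count * self.complexity * self.basic_hours


def with_regression(subtotal: float, ratio: float, runs: int) -> float:
    # Assumption: every regression run costs `ratio` times the first pass.
    return subtotal * (1 + ratio * runs)


business = [Group(30, 1.5, 2), Group(10, 1.2, 2), Group(30, 1, 2)]
functional = [Group(10, 1, 0.5), Group(5, 1.2, 0.5), Group(10, 1.5, 0.5)]

business_hours = with_regression(sum(g.subtotal() for g in business), 0.5, 3)
functional_hours = with_regression(sum(g.subtotal() for g in functional), 0.25, 4)
# Row 9 of the table: 20 functional test cases at 15 minutes each.
technical_hours = with_regression(Group(20, 1, 0.25).subtotal(), 0.1, 1)
performance_hours = with_regression(40, 0.5, 1)

# The article rounds the technical figure to 6 hours; math.ceil gives the same value.
total_etc = (business_hours + functional_hours
             + math.ceil(technical_hours) + performance_hours)

print(f"business={business_hours:.1f} functional={functional_hours:.1f}")
print(f"technical={technical_hours:.1f} performance={performance_hours:.1f}")
print(f"total ETC={total_etc:.1f}")
```

Under these assumptions the four category figures come out at roughly 435, 31, 5.5, and 60 hours, which sum to 532 once the technical figure is rounded to 6.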
Requirements are thus identifiable, measurable elements that must be analyzed to be reflected in the testing process. Allowance for learning time, test planning, test data preparation, test execution, and regression runs should be added to the estimate for each requirement, based on the requirement's category and complexity. Cardoso's first rule can then be restated as "Estimation shall be always based on the explored software requirements".
Another aspect of the estimation process cannot be ignored: the need to assess risk factors and plan for contingencies. The following scenarios demonstrate project risk situations.
Scenario 1. An estimated number of person-hours required to complete all testing tasks was agreed upon. Shortly before the start of the project, a new technology was selected to produce the code. To management, the new technology was a step ahead and was supposed to produce better results. To developers, the new technology had adverse effects: development took longer, and lack of knowledge of the development tool generated more rework. So, at the end of the project, developers were rushed to meet a deadline. Unit testing was not done properly and the test team received an unstable product. Consequently, test execution was postponed and myriad problems resulted.
Scenario 2. A well-qualified team was assembled for a project. A well-known technology was selected, and the test team was carefully chosen and involved from the project's early stages. Everything had the appearance of an ideal situation until the client decided that, since a lot of money was being paid out, they should not need to be involved. As a result, the requirements were unclear, nobody really knew what should be produced, and it took one to two weeks to book an appointment with a manager to verify the design. A lot of coding was based on assumptions and not on facts. The end result of these problems was endless rework, much frustration, and unstable builds.
Staff Seniority Level can also be factored in as a source of project risk. The Staff Seniority Level ratio is the sum of the developers' seniority constants divided by the total number of developers. For example, six developers are selected for a project. Two developers are classified as senior, three as intermediate, and one as junior. Constants can be applied according to experience, such as 1 for a senior developer, which means that no increase in hours would be recorded if only senior developers were involved. A constant of 2 would be attached to an intermediate developer, and 3 to a junior. The calculated Staff Seniority Level ratio would be 11 (2 x 1 + 3 x 2 + 1 x 3) divided by 6 (the number of developers), or 1.83. Therefore, for this one risk alone, the ETC of 532 hours would become 532 x 1.83, for a total of 973.6 hours as a final estimate for testing activities.
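The seniority ratio described above can be sketched in a few lines. The constants are the example values from the text; the function and variable names are hypothetical, and the ratio is applied here to the 532-hour ETC from Table 2 purely as an illustration.

```python
# Example constants from the text: 1 = senior, 2 = intermediate, 3 = junior.
SENIORITY_CONSTANT = {"senior": 1, "intermediate": 2, "junior": 3}


def seniority_ratio(team: dict) -> float:
    """Return the weighted seniority sum divided by the head count.

    `team` maps a seniority level to the number of developers at that level.
    """
    weighted_sum = sum(SENIORITY_CONSTANT[level] * n for level, n in team.items())
    return weighted_sum / sum(team.values())


team = {"senior": 2, "intermediate": 3, "junior": 1}
ratio = seniority_ratio(team)            # 11 / 6
etc_hours = 532                          # total ETC from the worked example
adjusted_hours = etc_hours * round(ratio, 2)

print(f"ratio={ratio:.2f} adjusted={adjusted_hours:.1f}")
```

A team made up only of senior developers yields a ratio of 1, so the estimate is left unchanged, which matches the intent described in the text.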
Other risk factors are listed in table 3, with possible risk constants attached.
Table 3: Other project risks and example risk constants
| Risk | Levels of Complexity | Risk Constant |
| --- | --- | --- |
| Requirements Effectiveness | Clear and understandable | 1 |
| | Understandable, but not clear | 1.1 |
| | Ambiguous and difficult to understand | 1.6 |
| Project Organization | All procedures documented, clear, and followed by team | 1 |
| | Procedures present but not being followed | 1.2 |
| | No procedures; management relies on team member enthusiasm | 1.5 |
| Knowledge of the Technology | Familiar to all team members | 1 |
| | Half the team members are new to the adopted technology | 1.2 |
| | New technology known by only a few | 1.6 |
| Management Knowledge and Cooperation | Complete understanding of development cycle by the client, full cooperation, fast reply | 1 |
| | Basic understanding of development, hard to get attention, but possible | 1.2 |
| | No cooperation ("we pay and you do" attitude) | 1.4 |
| Documentation availability | Documents exist, are updated regularly, and are accessible | 1 |
| | Documents exist, but are not updated regularly and are difficult to locate | 1.1 |
| | Documents do not exist | 1.3 |
| Meeting Milestones | Milestones achieved 100 percent of projected estimate | 1 |
| | Some milestones slip but are still achieved 80 percent on time | 1.1 |
| | Nobody cares about milestones | 1.5 |
For example, after calculating the number of hours based on the explored requirements, add together the percentages from all identified risks, then increase the estimated number of hours by the combined percentage. If 5 risk factors are identified, the risk constants could be 1.1, 1.4, 1, 1.6, and 1.2. These constants correspond to the following percentages that should be added: 1.1 (10 percent), 1.4 (40 percent), 1 (zero percent), 1.6 (60 percent), and 1.2 (20 percent), for a total of 130 percent. This percentage (130%) is added to the estimated number of hours. Assuming the ETC is 532 hours, the addition would be 532 x 1.3 = 691.6 hours, for a total of 1223.6 hours.
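The risk adjustment in this example can be checked with a small sketch. It assumes the additive reading used in the worked figures (each constant contributes its excess over 1 as a percentage, and the percentages are summed); the function name is hypothetical.

```python
def combined_risk_percent(risk_constants) -> int:
    """Sum the percentage each risk constant adds (1.4 adds 40, 1 adds 0)."""
    # round() absorbs floating-point noise such as (1.4 - 1) * 100 = 39.99...
    return sum(round((c - 1) * 100) for c in risk_constants)


etc_hours = 532
risk_percent = combined_risk_percent([1.1, 1.4, 1, 1.6, 1.2])  # 10+40+0+60+20
extra_hours = etc_hours * risk_percent / 100
risk_adjusted_hours = etc_hours + extra_hours

print(f"risk percent={risk_percent} extra={extra_hours:.1f} "
      f"adjusted={risk_adjusted_hours:.1f}")
```

With an ETC of 532 hours, the combined 130 percent adds about 691.6 hours, giving a risk-adjusted figure of about 1223.6 hours.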
General Contingency is a new term. The main idea is "the earlier the project stage, the fewer risks are known". What if I am called on to provide my estimates not at the coding stage, where the majority of risks are visible or can be predicted from previous project performance, but at the project initiation stage?
I believe that this is a very valid question. The earlier we are in the project, the more difficult it is to come up with a valid number. In this case I recommend using a General Contingency percentage. What does General Contingency mean? General Contingency is a percentage that should be added to the estimated time based on the project stage. For example, estimates based on a project at the coding stage, where the majority of risks are visible, are different from estimates based on a project at its initiation stage. A General Contingency factor can be applied depending on the maturity of the project and is a flat percentage that should be added to the overall estimate. For example, if the project is at the coding stage, add a contingency percentage such as 15 percent to the estimated figure. Therefore, if the number of hours estimated from the analyzed requirements is the ETC (532) and the identified risks add 130 percent, the resulting total is multiplied by the General Contingency factor (15 percent, or 1.15). Other project stages include project initiation, post-coding, and alpha and beta testing. These would have a proportionate contingency factor applied.
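As a rough illustration of how the pieces combine, the sketch below applies the risk percentage and then the General Contingency factor to the requirements-based ETC. The function name is hypothetical, and only the 15 percent coding-stage figure comes from the text; contingency values for other stages would have to be chosen by the estimator.

```python
def final_estimate(etc_hours: float, risk_percent: float,
                   contingency_percent: float) -> float:
    """Apply the summed risk percentage, then the flat contingency percentage."""
    risk_adjusted = etc_hours * (1 + risk_percent / 100)
    return risk_adjusted * (1 + contingency_percent / 100)


# ETC of 532 hours, 130 percent of identified risk, 15 percent contingency
# for a project at the coding stage.
final_hours = final_estimate(532, 130, 15)

print(f"final estimate={final_hours:.2f} hours")
```

Keeping the contingency as a separate parameter makes it easy to re-run the estimate as the project matures and more of the risks become visible.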
While the process of estimating testing in the software industry may be different for each project, all projects share the common thread of requirements. This article has attempted to introduce guidelines to simplify this process. In addition to Cardoso's rules for testing estimation, it is recommended that requirements be categorized into Business, Functional, Performance, and Technical. Requirements within each category are then divided into complexity groups, with a complexity constant attached to each group. This can be a simple value corresponding to low, medium, or high complexity. Detailed requirements can then be calculated and an initial estimated time to complete (ETC) given. Project risks corresponding to the vagaries of the business environment must then be factored into the initial ETC. Examples of such vagaries are illustrated in table 3. To do this, risks are identified and an appropriate risk constant is applied to produce a final ETC. Then, based on the project stage, a General Contingency should be attached.
Other external factors may also exist, and should also be considered for inclusion in the estimate. Hours spent leading and managing, for example, are particularly relevant if three or more testers form part of the team.
In addition, project risks and complexities can be used as a negotiating tool. For example, if the requirements management process cannot be improved, the testing team will require more time. Or if the documentation strategy is not improved, more resources will be required.
With these estimation principles in place, you are in a good position to present an accurate estimate and at the same time help the client to better understand the needs of the development cycle.
This article was posted by Youri Kroukov on stickyminds.com on Oct 10, 2002, and is posted on this software testing web site with his permission.