Software test sufficiency
By scope, software testing can be categorized as follows: unit testing, component testing, integration testing, and system testing. Correctness is the minimum requirement of software and the essential purpose of testing. Correctness testing needs some type of oracle to tell the right behavior from the wrong one. The tester may or may not know the inside details of the software module under test, e.
Therefore, either a white-box point of view or a black-box point of view can be taken in testing software. We must note that the black-box and white-box ideas are not limited to correctness testing.
The black-box approach is a testing method in which test data are derived from the specified functional requirements without regard to the final program structure.
Because only the functionality of the software module is of concern, black-box testing mainly refers to functional testing -- a testing method that emphasizes executing the functions and examining their input and output data. In testing, various inputs are exercised and the outputs are compared against the specification to validate correctness. All test cases are derived from the specification.
No implementation details of the code are considered. The more of the input space we cover, the more problems we are likely to find, and the more confident we can be about the quality of the software.
Ideally we would be tempted to exhaustively test the input space. But as stated above, exhaustively testing the combinations of valid inputs is impossible for most programs, let alone considering invalid inputs, timing, sequence, and resource variables. Combinatorial explosion is the major roadblock in functional testing. To make things worse, we can never be sure whether the specification is correct or complete.
Due to limitations of the language used in specifications (usually natural language), ambiguity is often inevitable. Even if we use some type of formal or restricted language, we may still fail to write down all the possible cases in the specification. Sometimes, the specification itself becomes an intractable problem: it is not possible to specify precisely every situation that can be encountered using limited words.
And people can seldom specify clearly what they want -- they usually can tell whether a prototype is, or is not, what they want only after it has been finished. Specification problems contribute approximately 30 percent of all bugs in software. The research in black-box testing mainly focuses on how to maximize the effectiveness of testing with minimum cost, usually measured by the number of test cases.
It is not possible to exhaust the input space, but it is possible to exhaustively test a subset of the input space. Partitioning is one of the common techniques. If we have partitioned the input space and assume all the input values in a partition are equivalent, then we only need to test one representative value in each partition to sufficiently cover the whole input space. Domain testing [Beizer95] partitions the input domain into regions, and considers the input values in each domain an equivalence class.
Domains can be exhaustively tested and covered by selecting one or more representative values in each domain. Boundary values are of special interest. Experience shows that test cases that explore boundary conditions have a higher payoff than test cases that do not. Boundary value analysis [Myers79] requires one or more boundary values to be selected as representative test cases. The difficulty with domain testing is that incorrect domain definitions in the specification cannot be efficiently discovered.
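As a minimal sketch of the partitioning and boundary-value ideas above, the following Python example tests a hypothetical function, grade, whose specification is assumed to split integer scores into four domains (below range, fail, pass, above range). The function, its thresholds, and the helper names are illustrative assumptions, not taken from [Beizer95] or [Myers79].

```python
def grade(score: int) -> str:
    """Hypothetical function under test: classify an integer exam score."""
    if score < 0 or score > 100:
        raise ValueError("score out of range")
    return "pass" if score >= 60 else "fail"


# One representative value per partition (equivalence class) of the input domain.
REPRESENTATIVES = [
    (-5, ValueError),   # below the valid range
    (30, "fail"),       # valid, failing scores 0..59
    (80, "pass"),       # valid, passing scores 60..100
    (150, ValueError),  # above the valid range
]

# Values on both sides of every boundary between partitions.
BOUNDARIES = [
    (-1, ValueError), (0, "fail"),
    (59, "fail"), (60, "pass"),
    (100, "pass"), (101, ValueError),
]


def check(value, expected) -> bool:
    """Return True if grade(value) yields the expected result or exception."""
    try:
        actual = grade(value)
    except ValueError:
        return expected is ValueError
    return actual == expected


def run_all() -> bool:
    """Run the representative and boundary cases; True means all passed."""
    return all(check(v, e) for v, e in REPRESENTATIVES + BOUNDARIES)
```

Ten test cases stand in for the whole integer input space here. That is only as good as the assumption that values inside each partition really are treated alike, which is exactly the weakness noted above when the domain definitions themselves are wrong.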
Good partitioning requires knowledge of the software structure. A good testing plan will contain not only black-box testing, but also white-box approaches, and combinations of the two. In contrast to black-box testing, white-box testing views the software as a white box, or glass box, because the structure and flow of the software under test are visible to the tester.
Testing plans are made according to the details of the software implementation, such as programming language, logic, and styles. Test cases are derived from the program structure.
White-box testing is also called glass-box testing, logic-driven testing [Myers79] or design-based testing [Hetzel88]. There are many techniques available in white-box testing, because the problem of intractability is eased by specific knowledge of, and attention to, the structure of the software under test.
The intention of exhausting some aspect of the software is still strong in white-box testing, and some degree of exhaustion can be achieved, such as executing each line of code at least once (statement coverage), traversing every branch statement (branch coverage), or covering all the possible combinations of true and false condition predicates (multiple condition coverage). Control-flow testing, loop testing, and data-flow testing all map the corresponding flow structure of the software into a directed graph.
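To make the difference between branch coverage and multiple condition coverage concrete, here is a small Python sketch. The function can_ship and its three conditions are hypothetical, chosen only to give a decision with a compound predicate.

```python
from itertools import product


def can_ship(paid: bool, in_stock: bool, backorder_ok: bool) -> bool:
    """Hypothetical function under test with one compound decision."""
    if paid and (in_stock or backorder_ok):
        return True
    return False


# Branch coverage: the decision is taken once as true and once as false.
# Note that these two cases never exercise backorder_ok=True.
BRANCH_CASES = [(True, True, False), (False, True, False)]

# Multiple condition coverage: every true/false combination of the
# three conditions, i.e. 2**3 = 8 test cases.
MULTIPLE_CONDITION_CASES = list(product([False, True], repeat=3))


def outcomes(cases):
    """Return the set of results that the given test cases produce."""
    return {can_ship(*case) for case in cases}
```

Two cases are enough to traverse both branches, while the stronger criterion needs eight. The number of combinations doubles with each added condition, which is why some degree of exhaustion is practical only for small predicates.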
Test cases are carefully selected based on the criterion that all the nodes or paths are covered or traversed at least once. By doing so we may discover unnecessary "dead" code -- code that is of no use, or never gets executed at all, which cannot be discovered by functional testing. In mutation testing, the original program code is perturbed and many mutated programs are created, each containing one fault.
Each faulty version of the program is called a mutant. Test data are selected based on their effectiveness in making the mutants fail, which is called killing them. The more mutants a test case can kill, the better the test case is considered.
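The mutant-killing idea can be sketched in a few lines of Python. The mutants below are written by hand for illustration; real mutation tools generate them automatically, and the function names here are assumptions made for this example.

```python
def original_max(a: int, b: int) -> int:
    """Original program: return the larger of two integers."""
    return a if a > b else b


def mutant_flipped(a: int, b: int) -> int:
    """Mutant 1: the comparison operator > is replaced by <."""
    return a if a < b else b


def mutant_always_a(a: int, b: int) -> int:
    """Mutant 2: the else branch returns a instead of b."""
    return a if a > b else a


MUTANTS = [mutant_flipped, mutant_always_a]


def kill_count(test_input) -> int:
    """Count the mutants whose output differs from the original's output."""
    expected = original_max(*test_input)
    return sum(1 for mutant in MUTANTS if mutant(*test_input) != expected)


def rank_test_inputs(inputs):
    """Order candidate test inputs from most to fewest mutants killed."""
    return sorted(inputs, key=kill_count, reverse=True)
```

With the inputs (1, 2), (2, 1) and (3, 3), the first kills both mutants, the second kills only the flipped comparison, and the third kills none, so (3, 3) adds little as a test case for these faults.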
The problem with mutation testing is that it is computationally expensive, because every test case has to be run against many mutants. The boundary between the black-box approach and the white-box approach is not clear-cut.
Many of the testing strategies mentioned above may not be safely classified as black-box testing or white-box testing. The same is true for transaction-flow testing, syntax testing, finite-state testing, and many other testing strategies not discussed in this text. One reason is that all the above techniques need some knowledge of the specification of the software under test. Another reason is that the idea of specification itself is broad -- it may contain any requirement, including the structure, programming language, and programming style, as part of the specification content.
We may be reluctant to consider random testing as a testing technique. The test case selection is simple and straightforward: test cases are randomly chosen.
The study in [Duran84] indicates that random testing is more cost-effective for many programs. Some very subtle errors can be discovered at low cost. It is also not inferior in coverage to other carefully designed testing techniques. One can also obtain a reliability estimate from random testing results based on operational profiles. Effectively combining random testing with other testing techniques may yield more powerful and cost-effective testing strategies.
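A minimal sketch of random testing in Python follows. The function under test, insertion_sort, is a hypothetical example, and Python's built-in sorted is assumed to be a trustworthy oracle. A fixed seed makes the random run repeatable.

```python
import random


def insertion_sort(items):
    """Hypothetical function under test: return a sorted copy of items."""
    result = []
    for x in items:
        i = len(result)
        # Walk left past every element that is larger than x.
        while i > 0 and result[i - 1] > x:
            i -= 1
        result.insert(i, x)
    return result


def random_test(trials: int = 1000, seed: int = 0):
    """Run randomly generated inputs and return the inputs that failed."""
    rng = random.Random(seed)  # seeded so failures can be reproduced
    failures = []
    for _ in range(trials):
        length = rng.randint(0, 20)
        data = [rng.randint(-100, 100) for _ in range(length)]
        if insertion_sort(data) != sorted(data):  # oracle comparison
            failures.append(data)
    return failures
```

The selection of test cases needs no design effort at all. The cost moves to the oracle, which must be able to judge arbitrary inputs rather than a handful of hand-picked ones.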
Not all software systems have explicit performance specifications. But every system has implicit performance requirements. The software should not take infinite time or infinite resources to execute.
Performance has always been a great concern and a driving force of computer evolution. Performance evaluation of a software system usually includes: resource usage, throughput, stimulus-response time and queue lengths detailing the average or maximum number of tasks waiting to be serviced by selected resources.
Typical resources that need to be considered include network bandwidth requirements, CPU cycles, disk space, disk access operations, and memory usage [Smith90]. The goal of performance testing can be performance bottleneck identification, performance comparison and evaluation, etc.
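Two of the measurements listed above, response time and throughput, can be sketched with the Python standard library alone. The workload function is a stand-in chosen for this example. A real performance test would drive the actual system with a representative workload.

```python
import statistics
import time


def workload(n: int = 10_000) -> int:
    """Hypothetical unit of work standing in for a real request."""
    return sum(i * i for i in range(n))


def benchmark(func, runs: int = 50) -> dict:
    """Measure per-call response time and overall throughput of func."""
    latencies = []
    start = time.perf_counter()
    for _ in range(runs):
        t0 = time.perf_counter()
        func()
        latencies.append(time.perf_counter() - t0)
    total = time.perf_counter() - start
    return {
        "mean_s": statistics.mean(latencies),  # average response time
        "max_s": max(latencies),               # worst observed response time
        "throughput_per_s": runs / total,      # completed calls per second
    }
```

The numbers depend on the machine and its current load, so results from a sketch like this are meaningful mainly for comparing two versions under the same conditions.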
The typical method of doing performance testing is to use a benchmark -- a program, workload or trace designed to be representative of typical system usage. Software reliability refers to the probability of failure-free operation of a system. It is related to many aspects of software, including the testing process. Directly estimating software reliability by quantifying its related factors can be difficult.
Testing is an effective sampling method to measure software reliability. Guided by the operational profile, software testing (usually black-box testing) can be used to obtain failure data, and an estimation model can be further used to analyze the data to estimate the present reliability and predict future reliability.
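The sampling idea can be sketched as follows, assuming a very simple estimation model: reliability is taken as the fraction of failure-free runs when operations are drawn according to the operational profile. The profile, the operation names and the flaky_system stub are all hypothetical. Real reliability models, such as reliability growth models, are considerably more involved.

```python
import random

# Hypothetical operational profile: how often each operation occurs in use.
OPERATIONAL_PROFILE = {"login": 0.6, "search": 0.3, "checkout": 0.1}


def flaky_system(operation: str, value: int) -> bool:
    """Hypothetical system under test: True means the run succeeded.

    Assumed fault: checkout fails whenever value is a multiple of 5.
    """
    return not (operation == "checkout" and value % 5 == 0)


def estimate_reliability(run_operation, profile, trials=10_000, seed=0):
    """Estimate the per-run probability of failure-free operation."""
    rng = random.Random(seed)
    operations = list(profile)
    weights = [profile[op] for op in operations]
    failures = 0
    for _ in range(trials):
        # Sample an operation according to the operational profile.
        op = rng.choices(operations, weights=weights)[0]
        if not run_operation(op, rng.randint(0, 99)):
            failures += 1
    return 1 - failures / trials
```

Because checkout is rare in the assumed profile, its fault lowers the estimate only slightly. The same fault in login would weigh about six times as heavily, which is the point of guiding the test by the operational profile.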
Therefore, based on the estimation, the developers can decide whether to release the software, and the users can decide whether to adopt and use the software. Risk of using software can also be assessed based on reliability information. There is agreement on the intuitive meaning of dependable software: it does not fail in unexpected or catastrophic ways. The robustness of a software component is the degree to which it can function correctly in the presence of exceptional inputs or stressful environmental conditions.
Robustness testing only watches for robustness problems such as machine crashes, process hangs or abnormal termination. The oracle is relatively simple, so robustness testing can be made more portable and scalable than correctness testing. This research has drawn more and more interest recently, and most of it uses commercial operating systems as its target, such as the work in [Koopman97] [Kropp98] [Ghosh98] [Devale99] [Koopman99]. Stress testing, or load testing, is often used to test the whole system rather than the software alone.
In such tests the software or system is exercised at or beyond the specified limits. Typical stress includes resource exhaustion, bursts of activity, and sustained high loads.
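The simple oracle used in robustness testing, described earlier, only needs to tell a clean exit from a hang or an abnormal termination. A minimal Python sketch, assuming the code under test can be run as a separate process, might look like this. The classify_run helper and its labels are illustrative names, not part of any cited tool.

```python
import subprocess
import sys


def classify_run(snippet: str, timeout_s: float = 5.0) -> str:
    """Run a Python snippet in a child process and classify the outcome."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            timeout=timeout_s,  # the child is killed if it exceeds this
        )
    except subprocess.TimeoutExpired:
        return "hang"
    # A non-zero exit status is treated as abnormal termination.
    return "ok" if proc.returncode == 0 else "abnormal termination"
```

For example, classify_run("while True: pass", timeout_s=0.5) is reported as a hang and classify_run("import sys; sys.exit(3)") as an abnormal termination. Whether the output was correct is never checked, which is what keeps this kind of oracle portable.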
Software quality, reliability and security are tightly coupled. Flaws in software can be exploited by intruders to open security holes. So most software is not about perfection, but sufficiency.
This topic has been an open talking point at OpenRain. I am also highly concerned with sufficient tests, but prefer an incremental approach and am wary of investing too much effort in automated tests up front for several key reasons. Please let me know what you think!
Nowadays, quality is the driving force behind the popularity and success of a software product, which has drastically increased the need for effective quality assurance measures. Metrics and KPIs play a crucial role here: they help the team gauge the effectiveness of the testing team as well as the quality, efficiency, progress, and health of the software testing effort.
Therefore, to help you measure your testing efforts and the testing process, our team of experts has created a list of some critical software testing metrics and key performance indicators based on their experience and knowledge. Software testing metrics, also known as software test measurements, indicate the extent, amount, dimension, capacity, and growth of various attributes of a software process, and are used to improve its effectiveness and efficiency.
Software testing metrics are the best way of measuring and monitoring the various testing activities performed by the team of testers during the software testing life cycle. Moreover, they help convey the result of a prediction related to a combination of data. Hence, the various software testing metrics used by software engineers around the world are: It helps in understanding any variances in the testing and is extremely helpful in estimating similar projects in the future.
Similar to test efficiency, test effort is also evaluated with the assistance of various metrics: Number of Tests Run per Time Period: Here, the team measures the number of tests executed in a particular time frame. It finds defects and isolates them from the software product and its deliverables. Moreover, the test effectiveness metrics offer the percentage of the difference between the total number of defects found by the software testing and the number of defects found in the software.
A type of performance measurement, Key Performance Indicators or KPIs, are used by organizations as well as testers to get data that can be measured. KPIs are the detailed specifications that are measured and analyzed by the software testing team to ensure the compliance of the process with the objectives of the business. Moreover, they help the team take any necessary steps, in case the performance of the product does not meet the defined objectives.
The various important KPIs for software testers are: Software testing metrics and key performance indicators are improving the process of software testing considerably. From ensuring the accuracy of the numerous tests performed by the testers to validating the quality of the product, they play a crucial role in the software development lifecycle.