A generic model for software size estimation based on component partitioning : a dissertation presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Software Engineering

Massey University
The Author
Software size estimation is a central but under-researched area of software engineering economics. Most current cost estimation models use an estimated end-product size, in lines of code, as one of their most important input parameters. Software size, in a different sense, is also important for comparative productivity studies, which often use a derived size measure such as function points. The research reported in this thesis is an investigation into software size estimation and the calibration of derived software size measures with each other and with product size measures. A critical review of current software size metrics is presented, together with a classification of these metrics into textual metrics, object counts, vector metrics and composite metrics. Within a review of current approaches to software size estimation, which includes a detailed analysis of Function Point Analysis-like approaches, a new classification of software size estimation methods is presented. It is based on the type of structural partitioning of a specification or design that must be completed before the method can be used. This classification clearly reveals a number of fundamental concepts inherent in current size estimation methods. Traditional classifications of size estimation approaches are also discussed in relation to the new classification. A generic decomposition and summation model for software sizing is presented. Systems are classified into different categories and, within each category, into appropriate component type partitions. Each component type has a different size estimation algorithm based on size drivers appropriate to that particular type. Component size estimates are summed to produce partial or total system size estimates, as required. The model can be regarded as a generalization of a number of Function Point Analysis-like methods in current use.
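The decomposition and summation idea described above can be illustrated with a minimal sketch. It is not the thesis's own model instance: the component types (`screen`, `report`), the size drivers (`fields`, `columns`) and all coefficients are hypothetical, chosen only to show how per-type estimators and summation fit together.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A size estimator maps a component's size drivers to an estimated size.
Estimator = Callable[[Dict[str, float]], float]


@dataclass
class Component:
    name: str
    component_type: str        # the partition this component belongs to
    drivers: Dict[str, float]  # size drivers relevant to that type


# Hypothetical per-type estimation algorithms (illustrative coefficients only).
ESTIMATORS: Dict[str, Estimator] = {
    "screen": lambda d: 10.0 + 4.0 * d["fields"],
    "report": lambda d: 20.0 + 3.0 * d["columns"],
}


def estimate_component(component: Component) -> float:
    """Apply the estimator that belongs to the component's type."""
    return ESTIMATORS[component.component_type](component.drivers)


def estimate_system(components: List[Component]) -> float:
    """Sum component estimates into a partial or total size estimate."""
    return sum(estimate_component(c) for c in components)


components = [
    Component("customer_entry", "screen", {"fields": 5}),
    Component("order_entry", "screen", {"fields": 10}),
    Component("stock_report", "report", {"columns": 8}),
]

# Total estimate over every component in the system.
total_size = estimate_system(components)

# Partial estimate over a single component type partition.
screen_size = estimate_system(
    [c for c in components if c.component_type == "screen"]
)
```

Keeping the estimators in a table keyed by component type means that adapting the sketch to a different software technology amounts to supplying a different set of types and algorithms, which mirrors the way the abstract describes model instances being derived from the generic model.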
Provision is made for both comparative productivity studies using derived size measures, such as function points, and for end product size estimates using primitive size measures, such as lines of code. The nature and importance of calibration of derived measures for comparative studies is developed. System adjustment factors are also examined and a model for their analysis and application is presented. The model overcomes most of the recent criticisms that have been levelled at Function Point Analysis-like methods. A model instance derived from the generic sizing model is applied to a major case study of a system of administrative applications, in which a new Function Point Analysis-type metric suited to a particular software development technology is derived, calibrated and compared with Function Point Analysis. The comparison reveals much of the anatomy of Function Point Analysis and its many deficiencies when applied to this case study. The model instance is at least partially validated by application to a sample of components from later incremental developments within the same software development technology. The performance of the model instance for this technology is very good in its own right and also very much better than Function Point Analysis. The model is also applied to three other business software development technologies using the IFIP (International Federation for Information Processing) standard inventory control and purchasing reference system. The purpose of this study is to demonstrate the applicability of the generic model to several quite different software technologies. Again, the three derived model instances show an excellent fit to the available data.
This research shows that a software size estimation model which takes explicit advantage of the particular characteristics of the software technology used can give better size estimates than methods that do not take into account the component partitions that are characteristic of the software technology employed.
Computer software, Development, Estimates