Benchmarking Terms of Reference

This is an extract from an original article by Pam Morris - Total Metrics - published in IFPUG Book on Software Measurement 2011.

Terms of Reference

We recommend that, before engaging a benchmark supplier or funding an in-house benchmarking program, the sponsors work with the benchmarker and stakeholders to establish the ‘Terms of Reference’ for the benchmarking activity. These terms should include the agreed position for each of the following:

1. Strategic Intent of the Benchmark

o How will the results be used?

2. Type of Benchmark

o Internal and/or external?

3. Benchmark Performance Metrics

o What are the processes or products required to be assessed to satisfy the goals of the benchmark and how will they be measured?

4. Standards for Measures

o What are the agreed units of measurement, data accuracy and validation requirements?

5. Scope of the Benchmark

o What are the inclusion and exclusion criteria for projects and applications?

6. Frequency of Benchmark

o When and how often should measures be collected and reported?

7. Benchmark Peers

o What are the criteria by which equivalent sample data will be selected for comparison?

8. Benchmarking Report

o Who will be the audience, and what will be the report’s structure, content and level of detail provided, to support the results?

9. Dispute Resolution Process

o What is the process that will be followed should disagreement arise about the validity of the benchmarking results?

1. Strategic Intent of the Benchmark

Sponsors of the benchmark need to work with IT Management to establish:

• The objectives of the benchmarking activity, i.e. what the results are required to demonstrate, within what period, and for what purpose, together with the criteria by which the benchmark will be judged to be successful. Common reasons for benchmarking include monitoring:

o Process improvement initiatives

o Outsourcing contract performance against targets

o Consistency in performance across organisational units

o Benefits achieved from new investments or decisions compared to benefits claimed

o Performance compared to competitors or industry as a whole

• The stakeholders, i.e. who will be responsible for the benchmark’s design, data collection, analysis, review, approval, sign off and funding.

2. Type of Benchmark

Establish whether the organisation will benchmark:

• Internally to demonstrate improvement trends over time for the organisation’s internal processes, or

• Externally to compare internal results with external independent organisational units, or industry as a whole.

Organizations that are aware of their own limitations will recognise their need to improve without first being compared externally to demonstrate how much improvement is required. As a first step, it is recommended that organizations start by internally benchmarking, and then when their own measurement and benchmarking processes are established, do some external benchmarking to establish their industry competitiveness. However, prior to determining standards for the collection, analysis and reporting of their benchmark metrics, they should first identify their proposed strategy for externally benchmarking. This enables their internal benchmarking framework to be aligned to that of the External Benchmark Data Set, thereby facilitating the next step of External Benchmarking without any rework to realign the data.

3. Benchmark Performance Metrics

Benchmarking AD/M should ideally monitor performance across all four perspectives identified in the Balanced Scorecard approach - Financial, Customer, Business Processes, Learning and Growth. Whilst this is the ideal approach, in our experience IT organizations focus their initial IT benchmarking activities on areas that directly impact their IT costs. They measure the cost effectiveness and quality of their IT processes and products by optimising the following Key Result Areas (KRAs):

Benchmarking is not a ‘one size fits all’ activity. Many ‘Benchmarking Service Providers’ offer turn-key solutions that fail to take into account the individual needs of their clients. By clearly defining the strategic intent of the benchmark before engaging a Benchmark Provider, an organisation ensures that its organisational goals are met and that the solution being offered provides a good ‘fit’. Once this is decided, the organisation can focus on benchmarking Key Performance Indicators (KPIs) that demonstrate achievement of those goals. For example, for many telecommunications and financial sector companies, maintaining competitive advantage is the key to their success, so they need their IT department to constantly deliver new, innovative products to their market. In this case, ‘speed of delivery’ becomes their highest priority to optimise their competitive position. In comparison, recent budget cuts for Government Agencies may focus their improvement needs on maximizing their IT cost-effectiveness. Before starting a benchmarking activity, identify the key organisational goals and their corresponding KRAs, then one or two KPIs within each area that will demonstrate the achievement of the identified goals. When conducting an external benchmark, some compromise may need to be made in the selection of KPIs, as these must align to performance measures for which industry/peer data is available.

4. Standards for Measures

When comparing between projects, business units and/or organisations you need to ensure that the measurement units collected are equivalent. This is not merely a matter of stating that cost will be measured in US dollars, size will be measured in Function Points and effort will be measured in days. Whilst ‘cost’ of software projects is probably the most carefully collected project metric, and the most important for the organization to monitor, it is a very difficult unit of measure to benchmark over time. This is increasingly the case in a world of off-shore multi-country development, where currency conversion rates fluctuate daily and salary rates rise with different rates of inflation across countries and time. Comparing dollars spent per function point this year to previous years requires multiple adjustments, and each adjustment has the potential to introduce errors.

Instead, most organizations choose to measure cost effectiveness using effort input rather than cost input. Whilst it may seem straightforward to measure the Project Productivity Rate as the number of function points delivered per person per day, in order to really compare ‘apples to apples’, the benchmarking analysis needs to ensure that for each of the participating organisational units the following characteristics of the size and effort measures are consistent:

Every organisation has different ways of measuring and recording their metrics. The resulting productivity rate may vary up to 10 fold depending on which of the various combinations of the above choices are made for measuring effort and size. To avoid basing decisions on invalid comparisons, agreed standards need to be established at the beginning of the benchmarking activity for each of the measures supporting the selected KPIs for each contributing organizational unit. Each measure needs to be clearly defined and communicated to all participants involved in the collection, recording and analysis of the data. If some data is inconsistent with the standards then it should be either excluded from the benchmark or transformed to be consistent and appropriate error margins noted and applied to the results.

To simplify this process, and facilitate external industry benchmarking, it is recommended that organisations adopt the de facto data collection standards and definitions for measuring AD/M developed by the International Software Benchmarking Standards Group (ISBSG).

The ISBSG community recognized the need for formal standardisation of AD/M measurement and in 2004 developed the first working draft of a Benchmarking Standard, which became the basis for the new ISO/IEC framework of Benchmarking standards. The first part of a five-part framework for Benchmarking Information Technology was approved in May 2011 to become an ISO International Standard (ISO/IEC 29155-1, Systems and software engineering -- Information technology project performance benchmarking framework -- Part 1: Concepts and definitions). Seventeen countries participated in the review of the interim drafts and the final approval vote for the standard. This international collaborative process ensures the result is robust and the outcome is accepted across the IT industry. The ISBSG is already a recognised industry leader in setting standards for data collection. A number of software metrics tool vendors and Benchmarking Providers have adopted the ISBSG data collection and reporting standards and have integrated the ISBSG data set into their tools.
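As a rough illustration of why agreed standards for effort matter, the sketch below converts effort recorded in different ways to a common person-day before computing a productivity rate. The record fields, the 8-hour standard day and the sample figures are all assumptions made for this example; they are not taken from the ISBSG definitions or from any real project.

```python
from dataclasses import dataclass

# Assumed standard working day agreed in the Terms of Reference (hypothetical).
STANDARD_HOURS_PER_DAY = 8.0


@dataclass
class ProjectRecord:
    name: str
    size_fp: float        # functional size in function points
    effort: float         # effort as recorded by the organisational unit
    effort_unit: str      # "hours" or "days"
    hours_per_day: float  # the unit's own definition of a working day


def effort_in_standard_days(rec: ProjectRecord) -> float:
    """Convert recorded effort to person-days of STANDARD_HOURS_PER_DAY hours."""
    if rec.effort_unit == "hours":
        hours = rec.effort
    elif rec.effort_unit == "days":
        hours = rec.effort * rec.hours_per_day
    else:
        raise ValueError(f"unknown effort unit: {rec.effort_unit!r}")
    return hours / STANDARD_HOURS_PER_DAY


def productivity_fp_per_day(rec: ProjectRecord) -> float:
    """Function points delivered per standard person-day."""
    return rec.size_fp / effort_in_standard_days(rec)


# Illustrative records only: both units spent 1600 hours on 400 function points.
unit_a = ProjectRecord("Unit A", 400, 1600, "hours", 8.0)
unit_b = ProjectRecord("Unit B", 400, 256, "days", 6.25)

for rec in (unit_a, unit_b):
    print(rec.name, productivity_fp_per_day(rec))
```

In this made-up case, dividing Unit B's size by its recorded 256 days would suggest a lower rate than Unit A's, even though both units spent the same number of hours. After conversion to the agreed standard day the two rates are identical.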

5. Scope of the Benchmark

Not all of the software implementation projects or software applications supported are suitable candidates for inclusion in the Benchmarking activity or can be grouped into a homogeneous set for comparison. All candidate projects and applications should be investigated and categorised on the following types of characteristics in order to make a decision about their acceptability into the benchmarking set, or if they need to be grouped and compared separately:

When benchmarking against industry ‘projects’ you need to establish whether you are comparing against a ‘Project/Release’ or a ‘Sub-Project/Work Package’, since the productivity rate of a Project/Release type ‘project’ will be reduced by its overhead effort and cost.

It is recommended that prior to selecting the projects or applications to be benchmarked they are first grouped into like ‘projects’ and then classified using the above categories, to either ensure that each of the benchmarking sets consists of an even mix of all types, or if this is not able to be achieved, that they are grouped into ‘like’ categories for comparison exclusively within those categories.
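The grouping into ‘like’ categories can be sketched as follows. This is a minimal illustration only: the category names, the record layout and the figures are hypothetical, and the median is used here simply as one reasonable summary statistic, not as a prescribed method.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (category, size in function points, effort in person-days).
projects = [
    ("new development", 300, 200),
    ("new development", 500, 250),
    ("enhancement", 120, 100),
    ("enhancement", 80, 40),
    ("enhancement", 150, 100),
]


def median_rate_by_category(records):
    """Group projects by category and return the median FP per person-day for each."""
    rates = defaultdict(list)
    for category, size_fp, effort_days in records:
        rates[category].append(size_fp / effort_days)
    return {category: median(values) for category, values in rates.items()}


for category, rate in median_rate_by_category(projects).items():
    print(f"{category}: {rate:.2f} FP per person-day")
```

Each category is then compared only with peer data from the same category, rather than with a single figure derived from the mixed set.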

6. Frequency of Benchmark

The frequency with which data is collected, analysed and reported will be determined by the goals of the Benchmarking activity. However, when determining how often these activities need to be done the following need to be considered:

7. Benchmark Peers

Previous discussions have highlighted the factors to categorise individual projects and applications to ensure that sample sets of data for internal benchmarking are comparable. However, when an external data benchmarking set is derived from industry, or selected from one or more external organizations, then additional factors need to be considered.

8. Benchmarking Report

Prior to commencing the benchmarking process it is recommended that the sponsors and key stakeholders agree on how the information will be reported. They need to decide on the reports:

9. Dispute Resolution Process

If the Terms of Reference are established prior to the benchmarking activity and agreed by all parties, then any areas of contention should be resolved prior to the results being published. However, as mentioned earlier, in some circumstances there are significant financial risks for an organization that believes that it has been unfairly compared. It is recommended that if benchmarking is incorporated into contractual performance requirements then a formal dispute resolution process also be included as part of the contract.


Whilst the above warnings appear to indicate that comparative benchmarking is difficult to achieve, in our experience this is not the case. In reality, it is surprising how well the results of pooling data into a benchmarking set align with results from external data sets from a similar environment. In our experience the rules of thumb derived from industry data are able to accurately predict the scale of effort or the cost of a project, indicating that the measures from one data set can be used to predict the results for another.

However, as consultants who have worked for over 20 years in the benchmarking industry we are constantly confronted with contracts that require performance targets based on a single number to be derived from a large heterogeneous data set. Such benchmarks are unlikely to deliver useful results and client expectations need to be managed from the outset. The Terms of Reference described above are provided as guidance for consideration when embarking on a benchmarking activity. Only some variables will apply to your unique situation. If they do apply, consider their impact and choose to accommodate or ignore them from an informed position; fail to consider them at your own risk.

