Multi-Model CMMI® Appraisals – Factors to Consider

In the last few months, I have frequently been asked the question, “Should we do our DEV and SVC appraisal as a single multi-model appraisal?” This question is usually posed by large IT organizations in India. These organizations have already been appraised at ML5 of the DEV model (maybe more than once), and they are now on the verge of their first SVC appraisal in 2012. I guess the issue of multi-model appraisals will become more important in another 1-2 years, when the next round of DEV and SVC appraisals is due for many organizations.

Well, the answer is “it depends”:-).

In this note we will try to understand the factors to consider (elaboration of what “it depends” on), so that you can take them into account when you face the situation. This note has been put together with a large dose of inputs from D Sankararaman, Mukul Madan, and V Seshadri. These were validated by Channaveer Patil and Dan He. However, they are not responsible for any errors that may have crept into this note.

Multi-model appraisals are covered in detail in Appendix G of the SCAMPI℠ A v1.3 Method Definition Document (MDD) downloadable from here.

One appraisal team’s experience of a multi-model appraisal at TCS (SCAMPI℠ v1.2, completed in 2010) is shared in a SEPG℠ 2011 presentation by Ron Radice et al., available here.

Disclaimer – this note is not definitive, nor is it an “official” position paper of any organization or lead appraiser. However, it may be considered as one of the inputs while evaluating the option of a multi-model appraisal.

The current queries for multi-model appraisals are typically arising from organizations wanting to do DEV+SVC together, and hence we will use that situation as an example in this note. However, multi-model appraisals could comprise any combination of two or more of DEV, SVC, ACQ and People CMM®, and the factors discussed in this note apply to the other situations as well.

Here are the factors to consider.

Organizational Disruption. If you are a big organization, you could have either two (or more) long organizational disruptions, or one mega-ultra-long disruption. The choice is yours :-) .

Number of ATMs. In a multi-model appraisal, you are likely to need fewer ATMs trained on the models. If the appraisals are done separately, you will presumably try to keep a gap of a few months between them. During that interval, the ATMs may resign, retire, go on leave, be allocated to some other useful work (assuming that they are still capable of doing some other useful work :-) ), or just refuse to be ATMs again (“not another appraisal as ATM!”). So instead of training a bunch of 10-12 people on the models, you may have to train a higher number if you are doing the appraisals separately.

LA/ ATL requirements. For a multi-model appraisal, you will need to engage a lead appraiser appropriately certified as a SCAMPI℠ A LA for all the models (constellations, actually) covered in the appraisal. Therefore, the choice of LAs for multi-model appraisals may be significantly narrower, especially if your appraisal is “high-maturity” (ML 4 or ML 5).

LA Willingness. The calendar time for the on-site activities for a multi-model appraisal is definitely going to be much longer than for a single-model appraisal (this is also discussed as a separate factor later on). LAs may not be willing to stay away from their families, pets and home city for such a long time. Or they may demand a fat sum as hardship allowance :-) .

Sampling of Projects (or Workgroups). This does not change whether you are doing a single multi-model appraisal or two separate appraisals. If there were X projects selected for DEV and Y workgroups selected for SVC, then in the multi-model appraisal, the number of instances would be X+Y. Sampling will be done as if they were different appraisals.

Overall Effort. This is one area where there is a lot of misunderstanding. Note that the sample size remains the same (multi or otherwise). Hence, the effort for artefact collection remains similar, the effort for artefact review by the appraisal team is also similar and so is the effort for interviews and discussions. There could be some (a tiny bit) effort reduction in a multi-model appraisal due to the following:

  • Single batch ATM training (instead of possible two batches). However, one batch of ATM training can have a max of 12 participants, so with backup ATMs you may have to run two batches anyways, even for a multi-model appraisal.
  • Sponsor meeting (assuming the same sponsor for both the appraisals)
  • Opening meeting can be a single one instead of two
  • Some economies of scale (not a lot) on artefact collection, artefact review, interviews and preliminary findings for “Oh” areas – organizational PAs like OPD, OPF, etc. However, organizational PAs will have to be investigated from both (DEV and SVC) the contexts explicitly. So the saving would be more in terms of being familiar with the terminology, document architecture and names/ faces of people running the “Oh” processes, assuming that the people are the same in the DEV and SVC contexts.
  • General effort saved for the LA and ATMs due to familiarity with the layout of the office, the security procedure, the parking lot, the cafeteria food, the washrooms, the office furniture, the room freshener, the air-conditioning, etc. (this factor may be invalid, if the appraisal team has to constantly move across buildings and cities anyway).

The project-level (or work-group level) process areas will have to be investigated for each instance separately (either in the DEV or the SVC context). Since the sample size is going to be determined the same way (whether it is two separate appraisals or a multi-model one), the effort to investigate instance-level data is going to be the same. This includes the effort for the preliminary findings (or equivalent).

With the above micro-savings, the overall appraisal effort savings (LA + ATMs) is likely to be in the range of 15%-20% (i.e., the total effort for a multi-model appraisal is likely to be around 15-20% less than the total effort for separate appraisals).

Calendar Time. With the large one-time effort for the multi-model appraisal, the calendar time for the onsite period is also likely to be higher (than that of a single-model appraisal), because there is a limit to the number of ATMs that an LA can handle. Hence the ATMs will need to be away from their day jobs for a longer period, and this long-drawn absence can be disruptive. One reported multi-model appraisal had an onsite period close to the upper limit of 90 days (see here).

ATM / LA Fatigue. This is where the multi-model appraisal (for high maturity, large organizational scope) becomes untenable. As the on-site period starts crossing three weeks, the fatigue becomes obvious. In organizations that use standard processes, and have done this over many years, one can expect the documents to be similar. The responses during the interview sessions will also be similar.

For the ATMs, after the glamour of being ATMs, the novelty of PIIDs and Process Area Worksheets, and the thrill of FI-LI-PI-NI (and NY, of course) wear off, it is an extremely boring, mind-numbing and dull exercise.

(Digression: This may be one of the reasons that LAs have become good storytellers and general entertainers. I know of an LA who sings to keep the ATMs entertained; another one tells jokes non-stop. Some LAs have started blogging. SEI may have to initiate a study to understand whether LAs have a higher tendency to ….whatever :-) End of Digression).

After around three weeks, the productivity, alertness, and eye for detail fall steeply for the ATMs as well as the LA. The other issue is the stress on the ATMs of maintaining confidentiality. They cannot talk to their friends and colleagues, or smile at passing acquaintances, because they may be asked that dreaded question “and how are we today?” Those who have been ATMs know about this; other readers may ask their friends/ colleagues who have been ATMs to confirm this :-) .

Target Levels/ Results. For the purpose of the results (ML/ CL), the multi-model appraisal will deliver two results, two different sets of ratings. Your target rating could be different for the two models and the appraisal result rating will also be different for the two models. This is the same as doing two separate appraisals.

Novelty / Publicity Value. “Will we be the first to do a multi-model appraisal?”; “No?”; “Okay, can we be the first to do it in this country?”, “How about this city?” and so on….

Well, if your organization is looking to announce itself as the first in something, we can surely work out some combination of conditions that you will be the first in. Not just the first, but maybe the only one. Ever.

===========================================================================

Having said all that, are there any conditions where it may be worth considering a multi-model appraisal? Yes, if you have the following it may definitely be worth considering:

  • Low number of Process Areas (ML2 kind of stuff) in both the models, and
  • Small organization (number of sampled instances is likely to be low)

Under these circumstances, the number of ATMs for each separate appraisal would be low (say 4-5), so one can increase the ATM team size to 8-10 and run the multi-model appraisal in the same number of days as a single separate one. So the organizational disruption time and LA cost can be much lower. Also, ATM training can be done in a single batch (max batch size is 12).

Finally, the issue is a complex one, and let us conclude by saying once again that “it depends” and that you should consult multiple LAs and take their opinions before coming to any firm conclusion.

Thanks a lot to D Sankararaman, Mukul Madan, V Seshadri, Channaveer Patil, and Dan He.

Please feel free to share your views, experiences or queries, using the “comments” feature available at the top of this article/ post.

Notes:

Nothing Official About It! – The views presented above are in no manner reflective of the official views of any organization, community, group, or association.

® CMMI and CMM are registered in the U.S. Patent and Trademark Office by Carnegie Mellon University.
℠ SCAMPI and SEPG are service marks of Carnegie Mellon University.
LA – short form of SCAMPI℠ Lead Appraiser. (It is not a term of endearment like “da”, “pa”, “ma”, or “po” used in different parts of the world :-) ).
ATM – Appraisal Team Member (not an Automated Teller Machine :-) )

Hi,

If you like the posts on this blog and would like to be informed whenever a new entry is made, here is what you can do:

  • Scroll back to the top of the page
  • On the right hand side there is a section called “Follow Blog via Email”
  • In the space provided, type in your email id and click on the “Follow Blog” button (give your personal id, since companies often block wordpress)
  • You will get an email at that email id

CMMI® – Constellations, Representations and some food for thought

In presentations, training or orientation sessions on CMMI®, the topic of constellations and representations does come up for discussion (even if the presenter wants to avoid it :-) ). In the past, I have found that the standard material on these topics has not always helped people understand and remember the concepts. People who I believed had understood the concepts surprised me later with a question or a comment that indicated otherwise. That was until I hit upon an analogy that is easy to understand and easier to remember. And it is related to food.

I will use the example of a restaurant called Celesti-yummiNYTM (the restaurant owner fancies it as a trans-galactic gastronomic delight). It has a menu that features three kitchens – food representing three different regions of the universe. One from the Devphinus constellation, another from the Severus constellation and the third from the Aquirius constellation.

Restaurant Board

Each constellation (kitchen) serves a set of dishes – Devphinus has 22 dishes, Severus serves 24 dishes and Aquirius presents 22 dishes. These are all listed in Celesti-yummiNYTM menu card. Each kitchen-constellation has also created a recipe booklet that is publicly available, for free (however, fancy, bound versions are priced). People and outer-world aliens can use the recipe books to prepare the dishes, as long as they keep acknowledging the intellectual property and trademark ownership of Celesti-yummiNYTM.

(Digression: There may be another category of sapient beings called “earthly aliens”, since immigration counters in some airports have separate queues for such creatures. When I stand in such queues, I hope to quickly complete the formalities before someone like Ellen Ripley notices me. For more information on Ellen Ripley and how she handles Alien species, see the Wikipedia page here. :-) : End of Digression)

Let us now examine the menu card of Celesti-yummiNYTM. As explained before, each kitchen-constellation has a list of dishes (22 to 24 dishes). Each dish comes in three sizes – Small (CL1), Medium (CL2), and Large (CL3). From a kitchen, you may choose any number of the dishes, and specify the size of each dish (small, medium or large). This kind of order, for some reason, is called a “continuous” order by Celesti-yummiNYTM (many restaurants call it an à la carte order), though there is nothing continuous about it.

Dish Sizes

In addition to the 22 to 24 dishes offered in 3 sizes, each kitchen-constellation also offers fixed meals (pre-plated meals or thalis).  There is a mini-meal (ML2), a midi-meal (ML3), a maxi-meal (ML4) and a mega-meal (ML5) that you can order from each kitchen. These fixed meals have a pre-decided set of dishes (a sub-set of the 22 to 24 dishes) at pre-decided sizes, with some very small variations. These fixed meals are called “Staged” meals.

For example, if you order the mini-meal (ML2) from the Devphinus kitchen menu you will get seven dishes, all of the medium (CL2) size. You have no choice in the matter. The only exception is for the dish called “Sammy’s Fav” which you can decline, provided you have a doctor’s certificate that you are allergic to some of the ingredients. You cannot decline any other dish. Nor can you change the size of the dishes if you order a fixed meal. Similarly the midi-meal (ML3) from Devphinus, will contain eighteen dishes, all of the Large (CL3) size. Again, you can decline the dish called Sammy’s Fav, with a doctor’s certificate. The mega-meal (ML5) from Devphinus will see a large platter with all their 22 dishes (and, you can still decline Sammy’s Fav, with appropriate justification).

Thalis

There are similar fixed meals with minor variations in the other two kitchen-constellations.

One warning – there are dishes with the same or similar names offered by the three kitchen-constellations. Some are called “core” dishes and some are called “shared” dishes. Don’t be fooled by the names and the terminology. They taste significantly different (because of the way they are cooked, the raw material, and the interaction with the other dishes), though they are called by the same/ similar names. For example, the dish called “Risque-Salad” (offered by all three kitchens and hence called a core dish) will look and taste significantly different, because of the ingredients, presentation, and seasoning.

Salads

There ends the explanation of constellations, representations (staged/ continuous),  core/shared PAs, ML/CL, etc.

Please feel free to add your variation and flavour to this explanation (use the comment feature)

®-CMMI and CMM are registered in the U.S. Patent and Trademark Office by Carnegie Mellon University.

NYTM- Celesti-yummi is Not Yet Trade Marked ;-)

What is (Project) Success in a High Maturity Organization?

Project success is measured by comparing the actual performance with what was budgeted, planned and committed – typically with respect to parameters of cost, schedule and quality. Projects that meet all parameters are considered completely successful, and those that meet some parameters are considered less successful. Projects that fail in most/ all parameters are labeled as failures. Of course, sophisticated systems may even use the extent to which they missed the objectives (near miss or missed by a mile/ kilometer) as a factor in determining the degree of success or failure.

Is this really how a high maturity (HM) organization (in terms of the CMMI® framework) should evaluate project success? I believe that the refinement in process and project management maturity should be used to fine-tune how we evaluate success.

An HM organization is “aware” that all processes have variations inherent in them. It “knows” that projects (that are composed of the processes) have a probability of achieving success in their objectives, but success is not guaranteed. The role of project management (esp. QPM) is to continually evaluate the probability of success and maximize the conditions to improve that probability.

When a single project goes through its life, those probabilities will play out. Which means that even if the probability of completing the project within its budget was 90%, a single project can overshoot the budget. Of course, if we run similar projects millions of times, only about 10% of the projects will overshoot the budget; but we have only one project here. In such an “aware” organization, is the use of “actual budget compliance” the right way to measure success? If so, how is this organization different from a non-HM organization?
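To put rough numbers on this, here is a minimal sketch in Python. It assumes, purely for illustration, that projects are independent and that each has the same 90% probability of staying within budget; the function name and the portfolio sizes are hypothetical choices, not something taken from any CMMI® material.

```python
def prob_at_least_one_overshoot(p_within_budget: float, n_projects: int) -> float:
    """Probability that at least one of n independent projects overshoots its budget."""
    return 1.0 - p_within_budget ** n_projects


if __name__ == "__main__":
    # Each project is individually 90% likely to stay within budget (assumed).
    for n in (1, 5, 10, 20):
        print(n, "project(s):", round(prob_at_least_one_overshoot(0.9, n), 3))
```

Under these assumptions, an organization running ten such projects should expect at least one overshoot more often than not, even though every project was managed to a 90% probability of success.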

I believe that in an HM organization, project success should not be measured by after-the-fact results, but by the rigor and continual alignment of the project to maximize the probability of success. So, in an HM organization, a project is successful, if and only if:

*    The project, at start-up, consciously makes choices (composes the defined process, aligns plans) that maximize the probability of meeting its multiple objectives

*    The project continually evaluates the probability of meeting the objectives and revises its choices to maximize its probability of success

Now, in such an organization, the “best project” award may be given to a project which in the conventional sense has actually failed :-) – such an organization would be truly acting on the belief- “if we implement the process, the results will eventually follow”.

Your comments?

What comes first – SPC or a stable process?

An interesting topic, which has been discussed very often. In every discussion, people agree on what is right and what needs to be implemented. But in actual implementation the principles are forgotten. Therefore it is good to re-align ourselves to the basics time and again.

What is often seen in actual implementation of SPC (ineffective and incorrect implementation):

1)    A process is documented and used

2)    Data related to the process is collected

3)    When we need to do sub-process control (because we are aiming for High Maturity rating), an SPC chart is prepared.

4)    Data points that are outliers are thrown out (root cause analysis is not possible, because the outlier data belong to a distant past, and the causes are lost in the mists of time)

5)    Control limits are recalculated

6)    Steps 4) and 5) are repeated till all (remaining) points demonstrate process stability

7)    The SPC parameters (center line, UCL/ UNPL, LCL/ LNPL) are declared as baselines and used for sub-process control. The fact that the limits are too wide or that a lot of data points were thrown out (without changing anything in the process) is ignored.

What we have in the above scenario is a maturity level 2/ 3 organization using maturity level 4 tools. Usage of tools alone does not increase maturity. We cannot create a stable process through the use of SPC, we can only confirm the stability of the process through SPC and get signals when the process is out of control or shows changes in trends.
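The trim-and-recalculate loop in steps 4) to 6) can be sketched in a few lines of Python. This is only an illustration of the anti-pattern, assuming an individuals (XmR) chart whose natural process limits are the mean plus or minus 2.66 times the average moving range; the function names and the data are hypothetical.

```python
from statistics import mean


def xmr_limits(values):
    """Return (LNPL, center line, UNPL) for an individuals (XmR) chart."""
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    half_width = 2.66 * mean(moving_ranges)  # standard XmR scaling constant
    return center - half_width, center, center + half_width


def trim_until_stable(values):
    """The anti-pattern: drop points outside the limits and recalculate, repeatedly."""
    kept, dropped = list(values), []
    while len(kept) > 2:
        lower, _, upper = xmr_limits(kept)
        outliers = [v for v in kept if v < lower or v > upper]
        if not outliers:
            break  # every remaining point now "demonstrates stability"
        dropped.extend(outliers)
        kept = [v for v in kept if lower <= v <= upper]
    return kept, dropped


# Hypothetical review-effort data (hours) with one unexplained spike
data = [10, 11, 9, 10, 12, 10, 11, 30, 10, 9, 11, 10]
kept, dropped = trim_until_stable(data)
print("limits before trimming:", xmr_limits(data))
print("limits after trimming: ", xmr_limits(kept))
print("points discarded without any root cause analysis:", dropped)
```

The chart looks stable at the end, but nothing in the process has changed; only the data set has.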

The More Effective Implementation of SPC:

1)    A process is documented and used. As the process is used, variations in the interpretation of the documented process are qualitatively analyzed. Actions are taken to augment the process definition, training and orientation till the interpretation and the qualitative understanding of the process is consistent.

2)    Process compliance audits (PPQA audits) on the implementation of the process identify more actions that need to be implemented to fine-tune the definition, training and orientation related to the process.

3)    Once the audits show consistent compliance, data related to the process performance are collected. Integrity of the data is checked and the data collection process is streamlined and consolidated, till the collected data demonstrate the required credibility.

4)    Now we start looking at the data somewhat quantitatively (without using full SPC) – does the trend chart show stability? Is it showing too much dispersion/ variation? Based on the findings, the definition, training and orientation related to the process is refined further

5)    This is the point at which we start using SPC charts to confirm process stability. Each inflection of instability is analyzed. Corrective and preventive actions are identified to further standardize the process, based on analysis of past instability. Once we are sure that the causes of those inflections are removed, we can remove the points from the analysis.

6)    We are still left with points which show instability, and our CAR analysis tells us that some of the causes are truly extremely rare events. These are then removed from the data pool. Now all the remaining points are a part of the process. If the process still shows instability, then we can do further analysis – are these really part of a single process? Beneath the surface, are there two or more processes, and we need to separate out the data (e.g., the process may behave differently in the “performance appraisal season”? :-) )

Having followed all the above steps, we now have a basis (and hence baseline) for an effective implementation of SPC.

Remember: We cannot create a stable process through the use of SPC, we can only confirm the stability of the process through SPC.

Size Does Matter! (for baselines and sub-process control) -Continued

Let us take the example of examination/ test centers that run an exam throughout the year, every day. Data for the past year shows that 30% of the candidates pass the exam and 70% fail, all over India.

The Bangalore test center handles around 1000 candidates per month, whereas the Mysore center handles around 100 per month. Over the last one year, both centers have shown the same 30 pass: 70 fail ratio.

For the month of June 2010, one center has reported 38% pass and another has reported 29% pass. Which center (Bangalore or Mysore) is more likely (has a higher probability) to have reported 38%?

Well, Mysore is more likely to have the higher deviation from the average (+8%) than Bangalore (-1%), because Mysore, handling fewer candidates, has fewer opportunities to “average out”. An easy way to figure this out is to take the case of a center that handles only 1 candidate. This center can have either a 0% or a 100% pass percentage; a -30% or +70% deviation from the average.
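For those who like to see the numbers, here is a minimal sketch in Python. It assumes that each candidate passes independently with probability 0.30, so that the monthly pass count is binomial; that modeling choice and the function name are my assumptions, not part of the original puzzle.

```python
from math import exp, lgamma, log


def binom_tail(n: int, p: float, k_min: int) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    total = 0.0
    for k in range(k_min, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log(1 - p))
        total += exp(log_pmf)
    return total


PASS_RATE = 0.30
p_mysore = binom_tail(100, PASS_RATE, 38)       # at least 38% of 100 candidates
p_bangalore = binom_tail(1000, PASS_RATE, 380)  # at least 38% of 1000 candidates

print(f"Mysore    (n=100):  {p_mysore:.4f}")
print(f"Bangalore (n=1000): {p_bangalore:.2e}")
```

Under this assumption, a month at 38% or better is unusual but quite plausible for the small center, and practically unheard of for the large one.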

Let us now get back to the process performance baselines that we create and the way we do sub-process control. Here are some things that we need to keep in mind while creating, publishing and using baselines:

1) Baseline (mean and standard deviation) for a sub-process parameter (like coding productivity) will be different depending on whether we consider the coding phase of each project as a data point, or we consider each program coded in each project as a data point. The standard deviation in the first case (large base) is likely to be smaller than in the second case (small base).

2) When we publish performance baseline data, we need to qualify it with the level of detail at which it applies.

3) When we use the baseline data to do sub-process control, it needs to be applied to the same level of detail. So, to do sub-process control on program-level coding productivity, we need to use the baseline that was created using programs as data points (not projects as data points).

4) Baselines need to be created using similar situations of the base data. For example, we cannot combine the coding productivity on large programs with the productivity on small programs. Even if the average/ mean remains the same, the standard deviation will be higher when we take data from a smaller base as against a larger base.

The above points are not just “nits” but have an impact on the usefulness of baselines and sub-process control. Incorrect usage of baselines leads to incorrect indications of process instability or stability.
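A small Python sketch of the first point, using made-up coding productivity figures: the same data gives a very different standard deviation depending on whether each program or each project is treated as a data point. The project names, the values, and the simplification that a project's figure is the plain average of its programs are all assumptions for illustration.

```python
from statistics import mean, stdev

# Hypothetical coding productivity per program, grouped by project
programs_by_project = {
    "Project A": [8, 12, 9, 11],
    "Project B": [14, 6, 10, 10],
    "Project C": [11, 13, 7, 13],
}

# Baseline 1: every program is a data point (small base per point)
program_level = [v for values in programs_by_project.values() for v in values]

# Baseline 2: every project's coding phase is a data point (large base per point)
project_level = [mean(values) for values in programs_by_project.values()]

program_sd = stdev(program_level)
project_sd = stdev(project_level)

print("program-level mean/sd:", round(mean(program_level), 2), round(program_sd, 2))
print("project-level mean/sd:", round(mean(project_level), 2), round(project_sd, 2))
```

The means agree, but control limits derived from the project-level baseline would be far too tight for program-level data, and almost every program would look like a signal.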

Size Does Matter! (for baselines and sub-process control)

Here is a small brain-teaser.

Let us take the example of examination/ test centers that run an exam throughout the year, every day of the year. Analysis of the past one-year data shows that 30% of the candidates pass the exam and 70% fail the exam, all over India.

The Bangalore test center handles around 1000 candidates per month, whereas the Mysore center handles around 100 per month. Over the last one year, both centers have shown the same 30 pass: 70 fail ratio.

For the month of June 2010, one center has reported 38% pass and another has reported 29% pass. Which center (Bangalore or Mysore) is more likely (has a higher probability) to have reported 38%? Why do you think so?

See my post dated August 3, 2010 for the answer and implications.

Why Can’t Metrics be Used for Performance Appraisals?

While discussing collection and usage of metrics, one often hears an emphatic “We should not use metrics for individual performance management!”. The statement is made as if it is an unquestionable tenet of the religion called process management.

“And pray, why not?” Why should the performance management process be deprived of metrics? A process oriented organization would definitely not like to boast that their performance management system is completely subjective.

Here are some reasons why metrics should be used for individual performance management.

*    Individual performance management (including the appraisal part) needs to be SMART – the “M” stands for measurable.

*    Most individual performance parameters are similar to and derived from the project, product and process objectives; they typically relate to cycle time, quality (defects), meeting commitments (schedule) and productivity (cost, effort and usage of resources).

*    A strong metrics system, that provides accurate, precise and valid data can support the project, process and individual performance management requirements.

*    Using the same sources of data, we can create a more aligned organization – the individual objectives are aligned to the project, product and process objectives. In this manner, individuals know that meeting their individual goals helps in meeting the other goals (and vice versa); conflict of interest is minimized.

The situations where we may not want to use process/ project metrics for managing individual performance are:

*    The metrics collection system is not stable, and there are questions on the credibility of the data. In such a case, the use of the data for managing the project/ process is also diluted.

*    Usage of the data for individual performance management may make the individuals sabotage the process and the accuracy of the metrics. In which case, we need to strengthen the process and make it sabotage proof.

In the old SW-CMM® days, most metrics collection systems were unstable, and hence many experts of that time were pretty insistent on the metrics not being used for performance appraisals – some organizations even have policy level statements for the same!

We have now moved on from the SW-CMM® days for process management, so we need to move on in other aspects too.

Your comments?

Generating Lots of Data through Monte Carlo (a misuse?!?)

I have seen the metrics groups of organizations generating “enough” data for creating process performance baselines, from very few available data points, using Monte Carlo simulation.

Here is the method they use: Ten data points are available; using the pattern of the ten data points, they generate a thousand (or maybe a million) data points using Monte Carlo simulation. Now they feel that they have enough data points to generate a baseline.

But in reality the baseline has been generated using 10 data points. The 1000 data points only give a feeling of having lots of data and this is clearly a misuse of Monte Carlo simulation.
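A minimal Python sketch of why this does not work. It assumes the “simulation” simply resamples the ten observed values with replacement (one possible reading of “using the pattern of the ten data points”); the data values and the seed are hypothetical.

```python
import random
from statistics import stdev

# Ten hypothetical observed data points
observed = [12.1, 9.8, 11.4, 10.2, 13.0, 9.5, 10.9, 11.7, 10.4, 12.3]

# "Generate" a thousand points from the pattern of the ten
rng = random.Random(42)  # fixed seed so the run is repeatable
synthetic = rng.choices(observed, k=1000)

# No new information: every synthetic value is one of the original ten
distinct_values = set(synthetic)

# Standard error of the mean: honest (n = 10) versus naive (n = 1000)
honest_se = stdev(observed) / len(observed) ** 0.5
naive_se = stdev(synthetic) / len(synthetic) ** 0.5

print("distinct values in synthetic data:", len(distinct_values))
print("standard error using the 10 real points:", round(honest_se, 3))
print("standard error pretending to have 1000: ", round(naive_se, 3))
```

The thousand points make the baseline look much more precise than it is; the real uncertainty is still that of ten observations.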

Normal Distribution is Actually Rare

Often, when we use statistical analysis tools and techniques, the underlying assumption is that the process/ sub-process displays a “normal” behavior. Even if the limited data that we have shows non-normal behavior, we assume that the reason is the lack of data, and we approximate the distribution to normal.

This assumption and subsequent analysis, conclusions and decisions are therefore inaccurate, especially if we are combining “assumed” normal behavior across multiple processes, viz Process Performance Modeling.

“Normal” behavior is very rare in real life. For example, you travel from your home to office, let us say usually in 1 hour. The least time you have ever done the trip is in 30 mins. If the distribution was normal, the worst time should have been 1 hour 30 mins (symmetrical on both sides). You will find that on some days that you were delayed, the time could have been 2 or even 3 hours!

Another way of saying that real life does not behave in a “normal” way, is “there is a limit on how well you can do, but no limit on how badly you can screw up!”
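The commute example can be checked with a few lines of Python. The travel times below are invented to match the story (usually about an hour, best ever 30 minutes, worst 3 hours); everything else is plain arithmetic.

```python
from statistics import mean, median, pstdev

# Hypothetical home-to-office travel times, in minutes
commute_minutes = [30, 45, 50, 55, 60, 60, 60, 65, 70, 90, 120, 180]

avg = mean(commute_minutes)
mid = median(commute_minutes)

# How far the best and the worst day sit from the typical day
best_case_gap = mid - min(commute_minutes)
worst_case_gap = max(commute_minutes) - mid

# Sample skewness: zero for a symmetric distribution, positive for a long right tail
sd = pstdev(commute_minutes)
skewness = sum((x - avg) ** 3 for x in commute_minutes) / (len(commute_minutes) * sd ** 3)

print("median:", mid, "mean:", avg)
print("best day is", best_case_gap, "min better; worst day is", worst_case_gap, "min worse")
print("skewness:", round(skewness, 2))
```

A normal distribution would put the mean and the median together and make the two gaps roughly equal; here the mean is dragged up by the bad days and the tail is all on one side.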

There is more on this in the books “Fooled by Randomness” and “The Black Swan” by Nassim Nicholas Taleb, both must-reads for anyone involved in high maturity CMMI® implementation.
