Using a Python script to generate Cumulative Flow Diagrams

A Cumulative Flow Diagram (CFD) is a graphical tool used extensively in Lean and Agile projects. It is used to track project progress, visualize work in progress (WIP), and understand problems in the project, including bottlenecks and scope changes.

CFD generated using Spyder 2 IDE with matplotlib and numpy modules

Agile teams use Burn Up and Burn Down charts far more often than Cumulative Flow Diagrams (CFDs) because those charts can easily be drawn on a whiteboard with a marker pen. A CFD is not easy to plot by hand, so many teams, even co-located ones, end up procuring software just to generate the plots.

There is a way to plot CFDs using Excel, described here, but to automate the process we need to script it. This article describes a very simple way to generate CFDs using a Python script.

The input to this script is hard-coded in the form of arrays for each status, but these can be easily replaced by variables input from a web form.

The Kanban flow for this specific case has the stages Requirements, Development, Testing, Verification and Done, represented by the arrays features_req, features_dev, features_tst, features_vfy and features_dne. Each array contains the values from Week 1 (W1) to Week 12 (W12), and together the arrays form the matrix below.

 #Weeks ->     W1,W2,W3,W4,W5,W6,W7,W8,W9,W10,W11, W12
features_req = [1, 1, 4, 3, 1, 1, 0, 0, 0, 0,  0,   0]
features_dev = [0, 1, 2, 2, 3, 2, 2, 1, 1, 0,  0,   0]
features_tst = [0, 0, 0, 1, 1, 2, 2, 2, 1, 2,  0,   0]
features_vfy = [0, 0, 0, 0, 1, 1, 2, 2, 3, 1,  2,   0]
features_dne = [0, 0, 0, 0, 0, 0, 0, 1, 1, 3,  4,   6]

I will describe the data above in detail, giving examples of the Kanban board for individual weeks.

The Board has just one feature request in the first week.
Kanban Board – Week 1

The status of the board for a given week should be read vertically, down that week's column. For example, in Week 1 there is just one feature request, as indicated by the values in the column for W1 (Week 1) -> 1,0,0,0,0.
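The vertical reading can be sketched in a few lines of plain Python. This is only an illustration: the `STAGES` list and the `board_for_week` helper are hypothetical names that do not appear in the original script, while the data is the matrix given above.

```python
# Stage names in board order, left to right (illustrative labels).
STAGES = ["Requirements", "Development", "Testing", "Verification", "Done"]

# Rows of the matrix, one per stage, with values for W1..W12.
features_req = [1, 1, 4, 3, 1, 1, 0, 0, 0, 0, 0, 0]
features_dev = [0, 1, 2, 2, 3, 2, 2, 1, 1, 0, 0, 0]
features_tst = [0, 0, 0, 1, 1, 2, 2, 2, 1, 2, 0, 0]
features_vfy = [0, 0, 0, 0, 1, 1, 2, 2, 3, 1, 2, 0]
features_dne = [0, 0, 0, 0, 0, 0, 0, 1, 1, 3, 4, 6]
matrix = [features_req, features_dev, features_tst, features_vfy, features_dne]


def board_for_week(week):
    """Return {stage: count} for a 1-based week number (hypothetical helper)."""
    # Reading "vertically" means taking the same position from every row.
    column = [row[week - 1] for row in matrix]
    return dict(zip(STAGES, column))


print(board_for_week(1))
print(board_for_week(12))
```

Calling `board_for_week(1)` gives the column 1,0,0,0,0 keyed by stage name, which is the state of the Week 1 board.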

Kanban Board – Week 2

In Week 2, a new feature request has been made and arrives in the first column (Requirements). The first feature request has now been pulled into the Development stage and is being worked on. So the column for W2 (Week 2) has the values -> 1,1,0,0,0.

Kanban Board – Week 3

In the third week (Week 3), four feature requests have come into the first column (Requirements). The request that came in last week has moved to Development, so there are now two features in development. So the column for W3 has the values -> 4,2,0,0,0.

Kanban Board – Week 12

The last column in the matrix has the values for Week 12 (W12), which are 0,0,0,0,6, and the corresponding Kanban board shows all six items in the Done column.

This means that there is a one-to-one correspondence between the matrix column for a week and the columns of the Kanban board for that week. It is straightforward to construct this matrix for your specific use case, or even to build a web form that feeds in the data representing the Kanban board.
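As a rough sketch of how such a matrix might be assembled, suppose each week's board is recorded as a small dictionary of counts per stage, for instance as it could arrive from a form. The `snapshots` structure, the short stage keys and the `snapshots_to_rows` helper are all assumptions made for illustration. The values are those of Weeks 1 to 3 above.

```python
# Short stage keys matching the array name suffixes used in the article.
STAGES = ["req", "dev", "tst", "vfy", "dne"]

# Hypothetical weekly snapshots: number of cards in each board column.
snapshots = [
    {"req": 1, "dev": 0, "tst": 0, "vfy": 0, "dne": 0},  # Week 1
    {"req": 1, "dev": 1, "tst": 0, "vfy": 0, "dne": 0},  # Week 2
    {"req": 4, "dev": 2, "tst": 0, "vfy": 0, "dne": 0},  # Week 3
]


def snapshots_to_rows(weekly_snapshots):
    """Transpose per-week snapshots into one list of counts per stage."""
    return {
        stage: [week.get(stage, 0) for week in weekly_snapshots]
        for stage in STAGES
    }


rows = snapshots_to_rows(snapshots)
print(rows["req"])  # [1, 1, 4]
print(rows["dev"])  # [0, 1, 2]
```

Each list in `rows` corresponds to one of the `features_*` arrays, truncated here to the first three weeks.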

The script uses Python with matplotlib and numpy. The plot was generated using IPython in the Spyder 2 IDE. The script is as below.

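import numpy as np
from matplotlib import pyplot as plt

# Week numbers 1..12 for the x axis
weeks = np.arange(1, 13)
# Weeks ->     W1,W2,W3,W4,W5,W6,W7,W8,W9,W10,W11,W12
features_req = [1, 1, 4, 3, 1, 1, 0, 0, 0, 0, 0, 0]
features_dev = [0, 1, 2, 2, 3, 2, 2, 1, 1, 0, 0, 0]
features_tst = [0, 0, 0, 1, 1, 2, 2, 2, 1, 2, 0, 0]
features_vfy = [0, 0, 0, 0, 1, 1, 2, 2, 3, 1, 2, 0]
features_dne = [0, 0, 0, 0, 0, 0, 0, 1, 1, 3, 4, 6]
# Stack the rows bottom-up: Done at the bottom, Requirements on top.
# np.vstack replaces np.row_stack, which is deprecated in newer numpy releases.
features = np.vstack((features_dne, features_vfy, features_tst,
                      features_dev, features_req))

fig, ax = plt.subplots()
ax.stackplot(weeks, features,
             labels=['Done', 'Verification', 'Testing',
                     'Development', 'Requirements'])

# Add the title, axis labels, legend and axis limits to the plot
ax.set_title('Cumulative Flow Diagram')
ax.set_xlabel('Week')
ax.set_ylabel('Features')
ax.legend(loc='upper left')
ax.set_xlim(1, 12)
ax.set_ylim(0, 6)
plt.show()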
You can download it from here.
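The quantities a CFD makes visible, such as work in progress and scope changes, can also be computed directly from the same arrays. The following is a minimal sketch in plain Python. Treating Development, Testing and Verification together as WIP is an assumption made here, and the names `layers`, `total_scope` and `wip` are illustrative.

```python
features_req = [1, 1, 4, 3, 1, 1, 0, 0, 0, 0, 0, 0]
features_dev = [0, 1, 2, 2, 3, 2, 2, 1, 1, 0, 0, 0]
features_tst = [0, 0, 0, 1, 1, 2, 2, 2, 1, 2, 0, 0]
features_vfy = [0, 0, 0, 0, 1, 1, 2, 2, 3, 1, 2, 0]
features_dne = [0, 0, 0, 0, 0, 0, 0, 1, 1, 3, 4, 6]

# Same bottom-up order as the stacked plot: Done first, Requirements last.
layers = [features_dne, features_vfy, features_tst, features_dev, features_req]

# Total scope per week is the top edge of the stacked plot.
total_scope = [sum(week) for week in zip(*layers)]

# WIP per week: items in Development, Testing or Verification (assumption).
wip = [d + t + v for d, t, v in zip(features_dev, features_tst, features_vfy)]

print(total_scope)  # [1, 2, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
print(wip)          # [0, 1, 2, 3, 5, 5, 6, 5, 5, 3, 2, 0]
```

A step in `total_scope`, such as the jump from 2 to 6 in Week 3, shows up in the diagram as a change in the height of the whole stack, while `wip` corresponds to the combined thickness of the three middle bands.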

Productivity and the fallacy of 100 percent utilization: Ask the right question!

In one of the organizations that I worked with, a newly minted executive wanted to set things right and improve the productivity of his organization. The question he posed to his management team was "How can we increase the productivity of resources (employees)?" After some discussion, it was decided to measure "Employee Utilization". A couple of voices of reason in the room were rapidly silenced. An experienced project manager and a couple of senior technical staff were quickly commissioned to develop software to make this happen.

A few months later, we had the necessary software developed, tested and ready to deploy. After some fanfare, it was rolled out across the entire department to capture the utilization of resources. Managers were told candidly that they had to ensure that their teams fell in line and entered the number of hours worked, or face the consequences.

The outcome, however, was very different from expectations. Everyone in the organization reported a utilization of at least 10 hours each day, and people who did report less than 10 hours were penalized by being assigned to more projects to improve their utilization. Very soon, in a matter of weeks, the reports showed everyone in the department to be 100% utilized, and since everyone was 100% utilized there was no further scope for improvement. The tool failed to achieve what it was intended to and was rapidly decommissioned (without even a whimper).

In the above scenario, two crucial aspects of knowledge work were ignored: thought and collaboration.


Let us delve a little into the process of thinking, which is so important for generating the results of knowledge work. We think in two modes: Focused Thinking and Diffused Thinking. These two modes are mutually exclusive. We cannot be in both the focused and diffused modes at the same time, and we cannot stay in one mode for long.

Focused Thinking is when we concentrate closely on the task at hand. A good example is a programmer recollecting the syntax and semantics of a language while writing code. What does not happen in Focused Thinking is the generation of new thoughts, so new ideas and innovation do not arise in this mode.

Diffused Thinking, on the other hand, generates new thoughts and therefore new and innovative ideas. This is why a solution that evades us even when we think very hard about it sometimes suddenly occurs to us when we are sleeping, relaxing or doing something unrelated to the problem. Innovators like Edison developed ways of switching between these modes at will.

Just as breathing is required to sustain life, it is a basic requirement that the brain switch between the focused mode and the diffused mode. If these switches do not happen, burn-out is imminent. This means that, without the freedom and environment to think and express thoughts, productivity will be curtailed.


Knowledge workers achieve outcomes by collaboration. In the IT industry especially, there is a lot of published material on the value of collaboration in developing better software and on how processes that ensure tighter collaboration result in better software.

It follows that, without an environment of collaboration, productivity, innovation and quality of the outcome will be adversely affected.

Measuring productivity on the basis of utilization ("seat time" or "availability") leads to the following issues and undesirable outcomes.

  • There is pressure to produce with little thought: Most of the work is done with the brain stuck in the focused mode. Since new thoughts do not arise in the focused mode, better ways of reaching the same outcome are not found, and innovation suffers. Quality suffers too. In disciplines like software engineering, this gives rise to extraordinary amounts of technical debt and non-optimal solutions, with a cascading effect on software stability and customer satisfaction.
  • Compliance with an availability criterion becomes the prime requirement, and everything else is subservient to it: This gives some returns initially, but gradually employees suffer burn-out. This leads to sick leave and time off (including planned time off). Over time, the actual drive to deliver value decreases.
  • Pressure to perform at 100% utilization kills collaboration: In cultures where utilization is measured, collaborative activities are considered productivity losses. However, when collaboration decreases, many desirable features otherwise seen in cohesive teams, such as increased morale, respect, self-esteem, interpersonal awareness, group pride, loyalty and goal orientation, all of which result in a much higher level of performance, go down the drain.

So what does employee utilization have to do with productivity (delivering customer value)? Absolutely nothing!

So what shall we do to make things better?

Ask the right question: this is of prime importance. What we do not ask, we do not attempt to measure; what we do not measure, we do not understand; what we do not understand, we cannot improve. From a customer value perspective, the questions may be: What is the outcome that we desire? What does the customer want? How will the customer benefit? How stable is the software? How happy are the customers using our software?

The measurements should follow from these questions. Since the outcome is what is being measured, employees will focus on delivering the desired outcome, and since this is aligned with customer value, the result is better customer alignment, more business and better profits.

So ask the right question!