Project management is problem management

Change is inevitable in every human endeavor. Being a good software project manager is an inherently challenging role: it requires forming reliable expectations about future events that are uncertain, while also tackling unwanted changes and problematic situations that are certain to arise. At a high level of abstraction, software project management is about balancing software economics with strategy; at a low level, it is about solving everyday problems with tactics. The primary causes of problems encountered in software projects are evidently human, rather than purely technical. Since problems occur mostly due to human factors, software processes that follow the agile manifesto can be especially beneficial. The realization that problems are bound to occur in projects is critical to software project management. Experiential knowledge, proactivity, the ability to adapt to change, a formidable combination of strategic and tactical prowess, and objectivity are some of the qualities ideal in project managers in the software industry.


On change propagation from one software entity to another


Predicting Change Propagation in Software Systems

By Ahmed E. Hassan and Richard C. Holt



The paper addresses how a change in one entity of a software system propagates to other entities, and explores the importance of better change propagation for maintaining consistency among interdependent entities. A heuristics-based approach is taken to overcome the challenges associated with change propagation. Source control systems are discussed in this regard: they record changes at the file level but do not track the changes that become necessary in entities depending on the changed one. The authors fill this gap by working at the source code entity level, capturing entity-level changes such as the addition or modification of a function and tracking changes in dependencies between entities. They describe the change propagation task as an iterative process in which a developer consults an expert when unable to detect an entity to change. The performance of the heuristics is measured using two metrics, recall and precision. An example scenario depicts a heuristic in action: a developer working on a new bug fix formulates a change set while interacting with the heuristic and a human expert. Each heuristic is characterized by two aspects: heuristic data sources, which determine the entities predicted to change, and pruning techniques, which reduce the number of predictions to improve a heuristic's precision. To validate the proposed heuristics the researchers studied 5 systems developed for over 10 years and with 40 years of combined historical data. Based on analyzing 4 heuristics, the authors suggest that historical information and code layout are better than code structure for propagating changes. They conclude that historical co-change records can help derive more heuristics, and they encourage further research into sophisticated heuristics for better change propagation.
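The recall and precision measurement described above can be illustrated with a minimal sketch. This is my own simplification, not the authors' implementation; the convention of returning 1 when the heuristic predicts nothing follows the paper as summarized here.

```python
def precision_recall(predicted, actual):
    """Precision and recall of a heuristic's predicted change set.

    predicted: entities the heuristic suggested should change
    actual:    entities that actually had to change together

    Per the paper's convention (as summarized above), both values
    are set to 1.0 when the heuristic predicts no entity at all.
    """
    predicted, actual = set(predicted), set(actual)
    if not predicted:
        return 1.0, 1.0
    hits = predicted & actual                      # correct predictions
    precision = len(hits) / len(predicted)         # how many predictions were right
    recall = len(hits) / len(actual) if actual else 1.0  # how many needed changes were found
    return precision, recall
```

For example, a heuristic that suggests three entities of which one truly needed changing (out of two) scores precision 1/3 and recall 1/2.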


Below are 3 strong points about the research in my opinion-

1. The paper is a good source for researchers interested in the empirical analysis of software change propagation. The question “how does change propagate through software entities” has been approached in a simple way, which is helpful even though the question itself is complicated.

2. The example scenario provided to describe how the heuristics for change propagation are measured is a strong point, as it successfully conveys the authors’ ideas.

3. The authors demonstrate the results of the experiment with a comprehensive set of data, including tables, graphs and definitions. They use 5 software products developed and maintained over a considerable amount of time, which strongly supports their findings.


With due respect to authors, here are several weaknesses that may be considered-

1. The researchers simplify their process based on the assumption that a developer will only query the heuristics about entities that have already been suggested, never an arbitrary entity. This does not match real-world practice, where a developer may consider any entity he or she suspects is a change propagation candidate.

2. When defining the recall and precision values, it is unclear why both are set to 1 if there is no predicted entity. It would be nice if the mathematical reasoning behind this decision were provided.

3. None of the studied systems has a UI, and the results could differ for systems with one. This is a limitation of the experiment, given how common UIs are in software systems today.




Fine-Grained Analysis of Change Couplings

By Beat Fluri, Harald C. Gall, and Martin Pinzger



A change coupling involves two files that are committed at the same time, by the same author, and with the same commit description. In this paper, the authors focus on structural changes between such commits and ask whether the majority of change couplings involve structural change. They use the Eclipse IDE to retrieve useful information from release history data and to filter out the couplings specifically caused by structural changes. The process involves integrating Eclipse with the version control system, retrieving all modification reports, and storing them in a Release History Database (RHDB). A major part involves two concurrently running subprocesses: clustering change coupling groups, and structurally comparing two consecutively committed source files. To cluster changes the authors take a relation analysis approach and store the clusters in matrices whose columns are revision vectors. Special emphasis is put on detecting frequently committed change couples, which is done by inspecting the revision vectors. In parallel, to structurally compare two consecutively committed files, the authors use the Eclipse Compare plugin, which first converts each entity into a synthesized format suitable for a differencing algorithm; the resulting differences are then saved in the RHDB. The researchers merge these two types of information and extract change couples caused by structural change with the help of a change coupling cluster browser. They perform the filtering process on the Eclipse Compare plugin itself as a case study and find that more than half of the change couplings are not caused by structural changes. This finding prompts the authors to plan more case studies; they also intend to work on detecting patterns in structural change sets.
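The coupling definition above (same commit, same author, same message) can be sketched in a few lines. This is a simplified illustration under my own assumptions, not the authors' RHDB pipeline; real CVS histories also need timestamp windows to reconstruct commits from per-file revisions.

```python
from collections import Counter
from itertools import combinations

def change_couplings(commits):
    """Count how often pairs of files change together.

    commits: iterable of (author, message, files) tuples, where each
    tuple stands for one logical commit. Files committed together by
    the same author with the same message form a change coupling, per
    the definition summarized above.
    """
    pair_counts = Counter()
    for author, message, files in commits:
        # count every unordered file pair in this commit once
        for a, b in combinations(sorted(set(files)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts
```

Frequently committed change couples are then simply the pairs whose count exceeds some threshold.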

Visualizing software programs with CodeCrawler and CodeCity


CodeCrawler – An Information Visualization Tool for Program Comprehension

By Michele Lanza, Stephane Ducasse, Harald Gall, Martin Pinzger


The paper introduces an information visualization tool called CodeCrawler that can be used for program comprehension and problem detection. CodeCrawler takes a lightweight approach and is built in Smalltalk on top of Moose, a language-independent reengineering environment. Moose implements FAMIX, a metamodel suitable for modeling programs, especially those written in object-oriented languages. CodeCrawler uses HotDraw to generate 2D visualizations, and it has a three-layered architecture: the core, the metamodel, and a visualization engine. The authors emphasize the polymetric view, which represents entities as nodes and relationships as edges to describe the semantics of a program. Metrics such as the number of methods or attributes are assigned to each node and mapped to the height, width, color and coordinate position of that node. The authors applied CodeCrawler to itself to obtain a system complexity view, using node width for the number of attributes, height for the number of methods, and color for LOC. They also analyze three categories of views separately: the coarse-grained view, to inspect system hotspots, suitable for large systems; the fine-grained view, to inspect aspects such as class blueprints and relationships between entities; and the evolutionary view, to inspect the evolution of the software. Additionally, they analyze a coupling view to see the difference in the software between releases; since that difference is not easily graspable in CodeCrawler directly, a difference graph is constructed.
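The metric-to-visual-attribute mapping of the system complexity view can be sketched as follows. The metric choices (attributes to width, methods to height, LOC to color) follow the view described above; the scaling constants are my own illustrative assumptions, not CodeCrawler's.

```python
def polymetric_node(cls, max_loc=1000):
    """Map class metrics onto the visual attributes of one node in a
    polymetric system complexity view.

    cls: dict with "num_attributes", "num_methods", and "loc" keys.
    max_loc: illustrative normalization constant (assumption).
    """
    width = cls["num_attributes"]           # wider box = more attributes
    height = cls["num_methods"]             # taller box = more methods
    # darker shade for larger classes: 0 = white, 255 = black
    shade = min(255, int(255 * cls["loc"] / max_loc))
    return {"width": width, "height": height, "gray": shade}
```

A large, dark node then immediately reads as a potential hotspot: many attributes, many methods, many lines of code.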

Program Comprehension through Software Habitability

By Richard Wettel and Michele Lanza



Visualization can help developers comprehend large software programs through pattern detection. 2D visualizations are widespread in this regard; they offer easy navigation, interaction and construction, but lack a sense of physical space. 3D representations, on the other hand, make navigation and interaction harder to achieve. The authors argue for using 3D to induce a sense of software-enhanced reality that lets developers walk through a program as if it were a physical entity akin to the real world, like being at home. They stretch the notion by emphasizing the underestimated concept of habitability, applying locality to a virtual space using the city metaphor. CodeCity is a program that visualizes a program as a city with a suitable level of interaction; it is applied to two software products, ArgoUML and Azureus. Classes and interfaces are depicted as buildings whose height maps to the number of methods, width to the number of attributes, and so on. The conventional linear view uses raw metric values to lay out the city, but extreme values can produce an unrealistic view, reducing habitability. Box-plot-based and threshold-based mappings are therefore used to depict a more realistic view, and selecting a building in the model reveals its real metric values. The authors conclude that box-plot and threshold-based model generation is more suitable for inducing a sense of habitability. Some related tools are analyzed as well, and the authors consider implementing a landscape view in the future in addition to the city view.
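The threshold-based mapping described above can be illustrated with a small sketch: instead of using raw metric values, which lets one outlier class dwarf the whole city, the metric is bucketed into a few discrete building heights. The specific thresholds and heights here are my own illustrative assumptions, not the values used in CodeCity.

```python
import bisect

def threshold_height(num_methods,
                     thresholds=(10, 25, 50, 100),
                     heights=(4, 8, 16, 32, 64)):
    """Threshold-based metric mapping for a building's height.

    Buckets num_methods into len(thresholds) + 1 bands and returns a
    fixed height per band, so extreme classes cannot produce
    skyscrapers that make the rest of the city unreadable.
    """
    # bisect_right finds which band the metric value falls into
    return heights[bisect.bisect_right(thresholds, num_methods)]
```

A class with 5,000 methods thus gets the same maximum building height as one with 150, keeping the city habitable while the real metric value remains available on selection.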


Below are 3 strong points about the research in my opinion-

1. The decision to visualize object-oriented programs is a strong point because it conforms to the current trend in the software industry. It is also good that Moose supports Java, C++ and Python.

2. Representing a software program with real world metaphor, aided with the concept of Habitability and Locality is a good initiative to help engineers understand large software systems better.

3. Compared with other work in this area, CodeCity uses classes or interfaces as the counterparts of buildings, both of which can be seen as somewhat atomic structures in their respective worlds. While other tools have used methods as buildings or whole classes as districts, the level of granularity chosen for CodeCity seems well suited.


With due respect to authors, here are several weaknesses that may be considered-

1. The LOC for both ArgoUML and Azureus seems too low compared with present-day expectations for a large software system.

2. It is not clear why CodeCity is a better alternative to UML, a widely supported and widely used standard for visualizing software. While CodeCity is limited to comprehension, UML can be used for visualization at all stages of a software project.

3. As the authors themselves mention, a lower-level representation is needed to improve CodeCity.

How much time do software developers spend understanding program code?

Quantifying Program Comprehension with Interaction Data
By Roberto Minelli, Andrea Mocci, Michele Lanza and Takashi Kobayashi

The paper aims to answer an often-underestimated question: does program comprehension occupy a large part of the software development process? The approach is to quantify comprehension time by analyzing data obtained from sessions during which developers interact with the interface of an IDE. Four types of interaction events, inspecting, editing, navigating, and understanding, are recorded in sessions attended by 15 Java developers and 7 Smalltalk developers. The events are captured by the DFLOW and PLOG plugins for the Pharo and Eclipse IDEs respectively. An estimation model is used to quantify the percentage of time spent on each interaction event. The results suggest that the percentage of comprehension time for Smalltalk programs ranges from 54% to 88%, while for Java the range is 56% to 94%; the difference between the two is 2% at the lower limit and 6% at the upper. Based on the interaction history, characteristics of individual developers are deduced as well. One developer may be cautious, spending more time understanding before editing, while another may take a more aggressive approach or exhibit a different interaction pattern. However, it is still uncertain how experience directly affects comprehension. The authors believe that more research, an improved estimation model, and better tools that can accurately capture micro-level activities would strengthen their hypotheses. Nonetheless, it is evident that program comprehension occupies more of the software development process than previously believed.
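The idea of estimating comprehension time from a stream of interaction events can be sketched crudely. This is my own simplified attribution rule, not the paper's estimation model: the interval leading up to each event is attributed to that event's activity, and inspecting and understanding count as comprehension.

```python
def comprehension_percentage(events):
    """Estimate the percentage of session time spent on comprehension.

    events: list of (timestamp_seconds, kind) tuples in time order,
    where kind is one of "inspect", "edit", "navigate", "understand".
    Assumption: the time between two consecutive events belongs to
    the activity of the later event.
    """
    comprehension = {"inspect", "understand"}
    total = spent = 0.0
    for (t0, _), (t1, kind) in zip(events, events[1:]):
        dt = t1 - t0
        total += dt
        if kind in comprehension:
            spent += dt
    return 100.0 * spent / total if total else 0.0
```

Even this naive rule shows how an interaction log alone, without asking the developer anything, yields a per-session comprehension estimate.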

An Exploratory Study of How Developers Seek, Relate, and Collect Relevant Information during Software Maintenance Tasks
By Andrew J. Ko, Brad A. Myers, Michael J. Coblenz, and Htet Htet Aung


The researchers behind this publication aimed to gather qualitative and quantitative evidence that could uncover patterns in how developers seek, relate and collect information when performing software maintenance tasks. This was done by recording 70-minute development sessions for 10 Java developers, who were invited to solve 5 problems in a small paint application: two bug-fixing problems and three enhancements. To simulate a real-life work environment, the sessions featured interruptions: every three minutes the developers had to solve simple mathematical problems as a distraction from their main tasks. The sessions were transcribed and analyzed with regard to factors such as time spent on tasks, success rate, and the sequence of actions performed by the developers. The results led the researchers to a new model of program understanding featuring three main tasks, searching, relating, and collecting, where relating involves a cascading effect on information mining that links one conclusion to another and eventually leads to a solution. The researchers propose a few measures that could help developers generate more relevant actions during understanding, and suggest some UI enhancements for Eclipse. Tools such as Hipikat and FEAT can also improve navigation and help find relevant dependencies between program components.

Below are 3 strong points about the research in my opinion-

1. Thorough human inspection enables the researchers to analyze situations more effectively, given that program understanding is a cognitive process.

2. Attempt to improve the process of seeking relevant information is one of the strongest aspects of this paper. It brings some great insights based on human inspection.

3. The paper is most probably a great reference as an extension of Information Foraging to the area of Software Engineering.

With due respect to authors, here are several weaknesses that may be considered-

1. The assumption that developers are distracted every 3 minutes may be an overestimation. Workplaces today put a lot of effort into maximizing developer productivity, so in my opinion developers are not distracted that frequently.

2. The natural workflow during the sessions was most probably hampered because the developers were told that all of their activities would be recorded, which may have created some hesitation or uneasiness. Moreover, fining participants for wrong answers seems an unsuitable mimic of the real-life penalty for misinformation.

3. The absence of a concrete mathematical model behind the experiment is a downside. A 5% error rate may be negligible for a team of 10 developers, but it could cause significant misinformation if the experiment were run on, say, 200 developers. Perhaps higher-accuracy automated tools would be better.

Critique: Models for Comprehension of a Software Program



Program Comprehension during Software Maintenance and Evolution

Anneliese von Mayrhauser, A. Marie Vans
Colorado State University

The paper aims to provide an in-depth analysis of the most notable program cognition models as well as a detailed comparison among them. The authors also introduce the Integrated Metamodel, intended to eventually improve the process of software maintenance and evolution. The paper identifies 5 fundamental tasks associated with software maintenance and evolution: adaptive, perfective and corrective maintenance; code reuse; and code leverage. The activities common to these tasks, such as acquiring knowledge and formulating a mental model, are examined for a better understanding of comprehension. The static elements of the mental models formed by programmers, including plans, chunks, and text structure, are presented, along with dynamic elements such as cross-referencing. The most commonly known cognition models, developed to understand the cognitive processes behind the 5 fundamental tasks, are discussed thoroughly: the Letovsky, Shneiderman and Mayer, Brooks, and Soloway-Adelson-Ehrlich models. The Integrated Metamodel, which seems ideal in that it incorporates both top-down and bottom-up comprehension, is based on cognitive elements that also appear in the models proposed by Pennington and by Soloway, Adelson, and Ehrlich. The authors call for more research in related areas to aid comprehension of larger-scale code and software systems. Understanding how programmers understand code is critical for building better tools, processes and guidelines to support the cognitive process of program comprehension.


Cognitive Design Elements to Support the Construction of a Mental Model during Software Exploration

M.-A.D. Storey, F.D. Fracchia, H.A. Muller


The paper analyzes in detail the most commonly known cognitive models for program comprehension and suggests important cognitive design elements that should be considered when building software comprehension tools. It introduces techniques such as reverse engineering, which uses information exploration as the key means of gradually understanding a higher-level abstraction of the program structure. It stresses coherence in visualizing programs and discusses the use of program visualization tools, classifying them by factors such as program scope, interaction type, and the content types that can be visualized. A framework based on cognitive design elements is used by the authors to evaluate a tool called SHriMP. Top-down and bottom-up comprehension, knowledge-based understanding, and the systematic and as-needed models are discussed in detail. However, some factors influence comprehension independently of those models, such as program size, complexity and, very importantly, experience. To inform the building of better exploration tools, a set of cognitive design elements for software exploration is analyzed across several tools. Some challenges for these tools are establishing semantic and syntactic relations between software objects and providing abstraction mechanisms. The ideal interface of such a tool is examined by conducting user studies on SHriMP, and the results of the experiment call for further research.


Below are 3 strong points about the research in my opinion-

1. Support for a combination of top-down and bottom-up comprehension seems the most suitable approach, since there are different kinds of programmers and, in my opinion, no one actually follows just one of them, but a combination of both.
2. The tree structure of the cognitive design elements is very helpful, as it enables visualization in a hierarchical structure and provides the big picture.
3. The analysis of various program comprehension tools in use today on the basis of seven cognitive design elements is very good; it can serve as a guide to anyone who wants to try such tools.


With due respect to authors, here are several weaknesses that may be considered-

1. There is less emphasis on experience, which in my opinion is a major factor. Experience should be regarded as one of the factors, and it would be great to read about experiments with SHriMP users of varying programming experience.
2. Although the different cognitive models are discussed in detail, some concrete mathematical data supporting the theories would help classify them according to correctness.
3. What needs to be done to establish semantic and syntactic relations is mentioned, but not how to achieve it; the authors' opinion on the "how" would be helpful.