Thursday, December 24, 2009

Data Visualization and Web 2.0

I love data visualization techniques. From my early days as an operations data analyst through all of my software development career, finding patterns in data and finding an easy way to convey those patterns through a graph or other visualization has always been fun. Working on custom application development projects that provided a picture of how the business was doing, where customers were spending, and so on was fun. Now working with our Business Information Services clients to help create innovative approaches to information discovery and data analysis is fun. It really is true that often "a picture is worth 1,000 words."

I vividly recall a stubborn memory leak my team had been trying to track down for several weeks. This was a long time ago, in the days of VB6 COM DLLs running inside ASP web pages, and we were pretty sure our code was not leaking. The team had found memory leaks before and had tracked every single one of them down to circular references in our COM object model that kept the automatic release from ever occurring. Historically, it had been easy to find a leak by running a simple load script, executing each page thousands of times in isolation, and watching to see which page showed the memory leak. But not this time. We had run the load tests several times and never found the leak. We scanned the code thoroughly. We added as many "set obj = nothing" safety lines as we could. But still the production web servers kept leaking memory, and we were forced to move the automatic restarts of the servers from weekly to daily and hope our band-aid would hold.

One day, I had some down time and decided to see if I could find a correlation between the memory used and what pages were invoked on the production system. An hour or two later, I had pulled all the IIS log files, gotten dumps of memory traces from the systems team, and started my analysis. After a bit of awk, grep, Access, and other quick-and-dirty processing to pull out the data I wanted, adjust for time zones, aggregate hits into cumulative 15-minute buckets, and otherwise line up the datasets, I was ready to plot the data.
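That munging was throwaway work, but the core bucketing step is simple enough to sketch. Something like the following would do it today (a rough sketch in C#, purely for illustration - the original used awk, grep, and Access, and the "timestamp url" input layout here is an assumption):

```csharp
// Rough sketch of the bucketing step. Assumed input: a pre-filtered log file
// where each line is "<ISO timestamp> <url>", already adjusted to one time zone.
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

class HitBucketer
{
    static void Main(string[] args)
    {
        var bucketSize = TimeSpan.FromMinutes(15);

        // Hits per URL, grouped into 15-minute buckets keyed by bucket start time.
        var buckets = new SortedDictionary<DateTime, Dictionary<string, int>>();

        foreach (var line in File.ReadLines(args[0]))
        {
            var parts = line.Split(' ');
            var when = DateTime.Parse(parts[0], CultureInfo.InvariantCulture);
            var url = parts[1];

            // Round the timestamp down to the start of its 15-minute bucket.
            var bucket = new DateTime(when.Ticks - (when.Ticks % bucketSize.Ticks));

            if (!buckets.TryGetValue(bucket, out var hits))
                buckets[bucket] = hits = new Dictionary<string, int>();
            hits[url] = hits.TryGetValue(url, out var n) ? n + 1 : 1;
        }

        // Emit running (cumulative) totals per URL, ready to line up and plot
        // against the memory trace for the same time window.
        var totals = new Dictionary<string, int>();
        foreach (var bucketEntry in buckets)
            foreach (var hit in bucketEntry.Value)
            {
                totals[hit.Key] = hit.Value + (totals.TryGetValue(hit.Key, out var t) ? t : 0);
                Console.WriteLine($"{bucketEntry.Key:yyyy-MM-dd HH:mm}\t{hit.Key}\t{totals[hit.Key]}");
            }
    }
}
```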

Instantly, the location of the leak was obvious. The two lines, cumulative hits to a particular URL and memory in use, were nearly on top of each other. The correlation jumped out, completely overwhelming the noise of the other URLs and pages. This is the power of a good visualization. (Of course, it turned out that the leak was coming from a web services API proxy URL, not a page in the website that everyone had focused on! Since the proxy was not 'in' the website, it had been ignored for weeks as the team hunted for the answer.)

Recently, some colleagues and I were discussing the areas in which Alliance Global Services provides solutions to clients. This is a pretty broad topic, and we talked about the types of industries we serve (including our focus on Business Information Services), the geographies we serve (mostly the Northeast US, from about Virginia to Boston), and the types of services we provide (Custom Software Development, Application Architecture Analysis). And we talked about the easiest way to visualize our coverage areas.

Well, today I had a little downtime before the holidays. So I took a list of our client locations, used some simple geocoding tools, and put together two quick samples of mapping in the Web 2.0 world - one using a Yahoo! map through batchgeocode.com and the other using the Google visualization API.

Batchgeocode.com made it very easy to process the first set of data and create a map, but that was as far as you could go. Google was a different story: getting the map running required coding, but then I had full control. To see the first map, visit this blog on Alliance Global Services.
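For flavor, here is roughly what the geocoding step looks like in code. This is a hedged sketch, not what I actually ran: the endpoint URL, its query parameter, and the "lat,lng" plain-text response are placeholders standing in for whichever real geocoding service (Yahoo!, Google, etc.) you plug in.

```csharp
// Hypothetical sketch: geocode a list of client addresses via an HTTP
// geocoding service. The endpoint and response format are placeholders;
// substitute the real service and its API key.
using System;
using System.Net;

class Geocoder
{
    static void Main()
    {
        // Hypothetical client locations.
        string[] addresses =
        {
            "Philadelphia, PA",
            "Boston, MA",
            "Richmond, VA"
        };

        using (var web = new WebClient())
        {
            foreach (var address in addresses)
            {
                // Assumed: a GET endpoint that returns "lat,lng" as plain text.
                var url = "http://geocoder.example.com/geocode?q="
                          + Uri.EscapeDataString(address);
                var latLng = web.DownloadString(url).Trim();
                Console.WriteLine("{0} => {1}", address, latLng);
            }
        }
    }
}
```

Once you have the coordinates, plotting them on a map is the easy part - and the fun part.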

Obviously it's not perfect, but lots of fun for a quick afternoon's work!

Saturday, November 28, 2009

10 Steps To Software Development Process Improvement

I spend a lot of time working with customers and talking with colleagues about how to build better software, faster and cheaper. Yes, "better, faster, cheaper" is hard, maybe impossible, but boy is it an enticing goal in custom application development. Our discussions often look at various technologies that promise improvements - Silver Bullets. Every few months someone launches a new technology with promises of huge efficiency gains, and a lot of these do offer productivity gains after an appropriate learning curve. Of course, first productivity actually drops, as the team learns the new technology, spends time on infrastructure setup, and so on. The better, smaller options have very easy learning curves and quickly add to overall productivity, but there is always some initial drop before the benefits begin to accrue.

We are technologists, and we are always looking for interesting new frameworks, tools, and ideas to improve the technology we use to deliver our projects. Whether it is a new automated static analysis tool like CAST Software to help with application architecture analysis, a better IDE for Java application development (like IntelliJ), or new testing tools and frameworks to make our software testing outsourcing services even better, we often get excited about a "shiny new toy" that we can use to be more productive and deliver better software every day.

Of course, these Silver Bullets usually turn out the same way - a good tool, with some definite benefit, but definitely not a game-changing advance that will provide an order-of-magnitude improvement in the quality of software developed. The most effective way to improve the overall software delivered - making it better, faster, and cheaper - is to improve the Software Development Process. When we sit down to work with clients, we often talk about a list such as the following as a starting point for ensuring success -- measured in terms of Business Value in the delivered software -- for our projects.

1. Focus on the top 20% of features

2. Break things up into smaller projects

3. No need for minuscule details

4. Let the system evolve

5. Obtain user feedback

6. Empower your users

7. Mistakes are a way of learning

8. Fewer people in meetings

9. Smaller teams

10. Something has to give in the Iron Pyramid (Quality, Time, Cost, Features)

It's no surprise that this list closely embodies many of the principles that the Agile development movement espouses. Agile development concepts are rooted in the basic idea of trying to produce more value for an organization. In software development terms this means better software for the same cost (or faster, or cheaper, or all three!). By having the business customer prioritize the features in the backlog, showing incremental progress every week or two, and focusing on measuring Running, Tested Features, the investment in software development is always directed at providing the best value to the business customers for the given spend.

In future posts I'll expand each of these topics further.


Saturday, November 14, 2009

IntelliJ Open Source -- Good for Java Application Development

I was excited to see the announcement that JetBrains is making the core IntelliJ platform available under an Apache 2.0 Open Source license. IntelliJ is a great IDE, and one of the main ones we use for Java application development. Having a community edition that makes the platform more widely available is great and hopefully will enable more developers to take advantage of the quality-checking and productivity tools IntelliJ offers.

During a recent discussion with a client, we brainstormed ways to customize or add on to the open source IntelliJ platform to improve the software development process and the quality of the code produced. We quickly came up with a number of ideas for custom facets (templates for common types of classes), custom analysis rules, and specific refactorings for best practices in this particular code base. All good ideas that could make the web application development faster and the results higher quality. A good win!

Application Risk Assessment, Code Quality, and C#

I have the opportunity to review and analyze a lot of different application code bases, across a number of different technology stacks. Some of these are custom software applications that Alliance is building or maintaining for our clients. Some are open source packages we are using in our work. Others are analyzed for our clients as part of our Application Assessment and software testing outsourcing solution.

One of the things that continues to surprise me is the wide variance in code quality. After seeing so many different applications, created by so many different development teams of differing skill levels, I know I shouldn't be surprised at some of the things that I see. Yet every now and then a particular issue jumps out as so obvious that I wonder: why didn't the original developers write the code better?

One recent example involves an Application Architecture Analysis we did for a government client using the CAST Application Intelligence tool. The application was a medium-sized .NET web application connecting to an Oracle database. It was a port of an existing PowerBuilder desktop app, and exhibited a lot of the classic problems of simplistic porting. Each screen in the desktop application was directly mapped to a web screen, without regard for whether the type of navigation and state management - not to mention browser round trips - made sense in a web application.

At the start of the engagement, the client identified that they had concerns around the correct handling of database connections. CAST is a great tool for finding problems like this in custom .NET applications, as the .NET analyzer is able to identify the specific methods in which a database connection is opened but not closed. It's certainly much faster and more accurate to browse an easy-to-use dashboard pointing to the exact 18 locations in the 400,000 lines of code (400 kLOC) that should be checked than to manually search ASP.NET pages and code-behind files, dozens of VS.NET projects, and hundreds of C# classes to find the needle in this haystack.

Once the specific problems were identified and fixed, the database connection leaks were gone, and the application was able to proceed through user testing. A perfect quick win for Application Risk Assessment & Code Quality analysis.

One of the surprising (yeah, I know, I shouldn't be surprised!) things in this case was that in many parts of the code base the developers had taken advantage of the "using" keyword in C# to automatically perform resource management of the database connection - yet in the leaking locations they had not. The "using" statement is a much simpler approach than trying to enforce correct usage of finally{} blocks for resource cleanup, and it has been available for several language versions and many years. Yet many developers are not aware of it and do not use it consistently! This is a simple best practice that enables higher quality code with less effort and fewer defects.
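To make the contrast concrete, here is the pattern in miniature. This is just a sketch: SqlConnection and the CountCustomers example stand in for the Oracle provider and the real data access code of the actual application.

```csharp
// Sketch of the resource-management pattern discussed above. SqlConnection is
// used for illustration; the actual application used an Oracle provider.
using System.Data.SqlClient;

class CustomerRepository
{
    private readonly string _connectionString;

    public CustomerRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    public int CountCustomers()
    {
        // The using statement guarantees Dispose() (and thus Close()) is called
        // on the connection even if ExecuteScalar throws - with none of the
        // ceremony of an explicit try/finally block.
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand("SELECT COUNT(*) FROM Customers", connection))
        {
            connection.Open();
            return (int)command.ExecuteScalar();
        }
    }
}
```

The compiler expands each using statement into the equivalent try/finally, so the cleanup can't be forgotten or mis-typed.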

And that means higher productivity. Sounds like an obvious win to me!

These are the types of best practices we teach to our teams as part of our RightWare Software Development Process.

Friday, November 6, 2009

Automation in Software Development

There are many discussions about productivity and ways to increase quality in software development. There is no single magic bullet, but by far the most important overall technique I've ever seen is to aggressively automate the software development process and overall lifecycle.

This is a broad topic that covers many specific areas, such as automated unit testing, which can also serve as the foundation for automated regression testing. Automation includes the build & deployment process. Automation includes functional testing and acceptance testing. It includes monitoring and error alerting. It includes code quality analysis and compliance checks. It includes providing self-test harnesses to prove an environment is configured correctly. It can also include automatically generating code or portions of an application as part of the software development process.
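To ground the first of those areas, here is what the smallest unit of that investment looks like - a minimal sketch using NUnit, where the PriceCalculator class is hypothetical, standing in for whatever business logic a real project contains:

```csharp
// Minimal NUnit sketch. PriceCalculator is a hypothetical class standing in
// for real business logic.
using NUnit.Framework;

public class PriceCalculator
{
    public decimal Total(decimal unitPrice, int quantity, decimal discountRate)
    {
        return unitPrice * quantity * (1m - discountRate);
    }
}

[TestFixture]
public class PriceCalculatorTests
{
    [Test]
    public void Total_AppliesDiscountToExtendedPrice()
    {
        var calc = new PriceCalculator();

        // 10 units at $5.00 with a 10% discount => $45.00
        Assert.AreEqual(45.00m, calc.Total(5.00m, 10, 0.10m));
    }
}
```

Run on every build, the same test quietly doubles as a regression check: break the discount logic and the build goes red long before QA ever sees the bug.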

Without fail, when we look across our custom application development projects, the ones that have implemented a significant amount of automation are able to deliver more value (in terms of running, tested features) per unit of time or cost than those that have skimped on automation. This holds across technologies, types of systems, and team sizes.

Why does automation play such a big role in improving Application Quality and Productivity? It's fairly simple and relates to key ideas in Lean Development and other "manufacturing" optimization strategies. Automation enables the whole software development team (including the development engineers, testers, operations, and business sponsors) to focus on adding specific value and avoid waste by:

  • Eliminating repetitive, low-value tasks, freeing time for knowledge-creating work
  • Lowering QA and bug-fixing effort by finding errors sooner through more frequent test suite execution
  • Enabling more complete test coverage by executing test suites automatically, not through brute force
  • Enabling faster, more agile development by providing a robust safety net that catches problems sooner
  • Eliminating wasted time chasing configuration problems by reliably producing builds and deploying to all environments consistently
  • Enabling new team members to contribute value sooner by speeding the creation of new environments and providing a framework to show how the system works
  • Producing higher quality and lower maintenance systems by automating redundant code or module development
  • Converting time spent on routine monitoring into value-add investigation of problems
  • Quickly alerting operations and development support teams to problems with running systems
Teams that focus on keeping the unit tests Green, look at their code coverage metrics, and know the current production system performance characteristics can spot problems before customers do. Teams that continually look for ways to streamline their tasks and ensure that any common process can be done through one command or one click have fewer problems and spend much more of their time building new features rather than solving the same silly problems over and over. Teams that ensure the basic smoke tests can be run by every engineer every day know that far fewer bugs will creep into the testing environments and less time will be wasted on back-and-forth rework. These are the traits of teams that provide more value to the business and are truly successful!

Expert Knowledge Model and Custom Application Development

I recently read Andy Hunt's book "Pragmatic Thinking and Learning." It's a very interesting book, combining ideas about cognitive science, management, software development, and personal development. (It's also well written and fun to read.)

The second chapter discusses the Dreyfus skill model and the "Journey from Novice to Expert" along five stages of skill development. This is relevant for any skill, but especially for technical skills requiring years of practice to achieve a suitable level of mastery. Certainly different people are capable of learning at different rates, and some people actively learn and apply new concepts while others seem to repeat the exercise without learning the concept (the classic interviewing criterion of actually having ten years of experience, rather than one year of experience ten times over). But overall, when looking at a community, organization, or team, recognizing the different skill levels and the capabilities associated with them is critical.

For instance, many agile teams have individual developers designing the implementation for a specific user story (or, more accurately, a task within a user story). By recognizing that one developer may be a Novice and another team member may be Proficient, the team can change the dynamic around the design effort when these two are paired together. The Proficient team member can act as a mentor for the Novice, carefully reviewing and enhancing any designs the Novice developer creates and fully explaining any designs the Proficient developer creates as a teaching exercise. This enables the peer review process to work as both a technical review and a learning process. Similarly, the team may recognize that it does not make sense for a Novice developer to peer review an architecture created by the Expert member of the team. It would be more appropriate for someone who is Competent or Proficient to review it, as they will more easily understand the design and be able to offer suggestions to improve it, rather than feeling overwhelmed or lost.

Additionally, when establishing the engineering practices for a custom application development effort, a team of all Novices or even mostly Advanced Beginners should not be expected to realize the best approach to continuous integration, automated unit testing, and test coverage. A more knowledgeable mentor should help the team implement the best practices, teaching as they go, but insisting on the appropriate integration and automation techniques to ensure high quality.

And of course, the skill model applies equally to the technical and business domain aspects. It is important for individual developers, managers, testers, and architects to realize that although they might be Expert at their technical craft, they may be Novices in the business domain of their current application. Once this realization occurs, the team members should seek out additional information on the business domain - through web sites, books, the Product Owner or business sponsor, or other more knowledgeable members of their organization. This type of self-realization that you "know what you don't know" is a key step towards ensuring you are always delivering high quality, useful, valuable software.

Saturday, October 10, 2009

Custom Application Development Best Practices

A colleague of mine read my recent post about code quality, and it reminded him of the first time he read my musings on code quality and work habits. He pulled up a team Best Practices document I had written back in 2001 for one of our first .NET application development projects, and it's great to see how much of it is still 100% relevant today.

Work Habits

  • Do it now!
If you see a change that you need to make, it's best to do it right now! It is very unlikely you'll get a chance to come back to it later. Plus, other people will begin to code with your bad code/names/etc. and the change that is needed tomorrow will be bigger than the change needed today. If you cannot do it right now, add it to your task list (in Outlook) and set a reminder for tomorrow morning. That way you will have another chance at it.
  • Finish one task before starting another!
In general, it will be much better to be 100% complete with one piece of code/functionality than 50% complete with 2 or 3. You are focused, and able to actually deliver a fully working, tested piece of functionality to the client, rather than having 2 or 3 half-broken, half-completed, half-baked ideas floating around. Note, this is a bit in conflict with the previous point, and that is a little intentional.

  • Understand an entire function/process before editing it!
When you are about to make a change, enhancement, or bug fix to a routine, read it first. Understand what the code is supposed to do (hopefully explained in the comments at the top) and then how it is (or isn't) actually executing. Making changes with only ½ understanding is guaranteed to cause additional bugs that have to be fixed later.

  • Keep it simple!
Keep your code as simple as possible. Keep it maintainable, readable, and easy to debug. Most of the time on our project(s) is spent in QA and debugging. Most of the project is fairly basic - read some data from a database, modify it, and write it back out. If we can do this 80% of the project simply and correctly, we can spend more time on the interesting, fun parts. If we make it complicated and buggy, we will spend 120% of our time fixing this easy section!!

  • Optimize at the end, after it works
Write clean and simple code. Understand what your piece of code needs to do, how many users will use it, and how quickly it needs to execute. Plan for this, and code for it. Make all your code work. Then find the slow spots and, if they need it, optimize them (see the sketch below).
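On that last point, the cheapest version of "find the slow spots" is simply to time the suspect code before changing it. A minimal sketch, assuming a hypothetical ProcessOrders routine - a real profiler gives far more detail, but a Stopwatch answers "is this even slow?" almost for free:

```csharp
// Minimal sketch of "measure before you optimize" using Stopwatch.
// ProcessOrders is a hypothetical routine standing in for any suspect code path.
using System;
using System.Diagnostics;

class Profiling
{
    static void Main()
    {
        var timer = Stopwatch.StartNew();

        ProcessOrders();   // the code path you suspect is slow

        timer.Stop();
        Console.WriteLine($"ProcessOrders took {timer.ElapsedMilliseconds} ms");
    }

    static void ProcessOrders()
    {
        // Placeholder for real work.
        System.Threading.Thread.Sleep(250);
    }
}
```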

Eight years later I might choose to add bullets about automation, or unit testing, or investing time & effort in the software rather than the design doc, but the fundamentals are still the same!

Code Quality and Software Metrics

All good developers have a sense for Good Quality Code. They may call it "clean code" or talk about how easy it is to maintain. When code is not good, they talk about "code smells" or "ugly code" or say that it is simply "unreadable". Good developers have this sense even when "good" is not strictly defined and is not measurable. Good developers go out of their way to keep the code that they work on clean, maintainable, and easy to read, because they know they (or one of their colleagues) will be reading that code sometime in the future, trying to figure out what it does and why that darn bug has slipped into it.

About ten years ago the book The Pragmatic Programmer recounted studies about the effect of visible defects (a broken window) on how much people cared about their surroundings. The findings apply directly to software quality, as the developers on a project typically look to the existing code base to figure out the style of code to write. In the worst case, a developer will start a new web page, batch job, stored procedure, or other module by simply copying an existing one and hacking up the code to implement the new feature. In a typical case, a developer will look for references in the existing modules to see "how is that done on this project?" So leaving "broken windows" in your code base quickly leads to more broken windows as the ugly code is copied or used as a reference: more windows are broken, more graffiti is sprayed on the walls. Once the code base is littered with this low-quality code, it is hard for a developer without a very strong internal sense of what is good to tell what is considered good on this project.

By paying attention to the little details, by setting coding standards and making sure people follow them, and by requiring intelligent, useful comments in each module describing its purpose, the tech lead (or the team as a whole) sends the message that they want to work in a clean, safe, easy-to-move-around-in environment. They don't want to work in a littered, trashy, graffiti-covered neighborhood with broken glass everywhere. They care about the quality of the code.

One of the things I love about using automated tools to check the entire code base every build (or at least every week) is that they make it possible to check all the little details, every build. In most business applications it is rare for every line of every module to be code reviewed. With an automated tool, it's easy - in fact, it's automated! Each build produces clear, objective metrics about the quality of the code base.

The benefits of paying attention to little details, of sending the message that you care about the quality of the application code ripple through the entire application maintenance cycle. Developers are more productive, because the code is easy to read and understand. New developers are able to learn the code base more quickly because the code is clean and commented. Fewer errors are made because methods are short and implement a single function in an easily understood way. Global state, side effects, and tricks are not present in the code to cause trouble when their use is not fully understood by the next developer to modify them. Bugs that do slip through or odd corner cases that seem to only occur on one server are easy to track down because the code base has useful logging, has defensive checks, has error handling with relevant error messages, and only uses data values near to where they are populated.

A powerful psychological benefit for good developers when using an automated tool is that the entire team can see the quality metrics - ideally published on a regular basis and taped up on the wall of the team room - and take pride in watching the quality level go up as new features are added and smelly code is cleaned up. The team can know that not only are they making the application work today - they are making something of good quality that will keep working in the future.

Importance of Business Domain Understanding

As a "dweeb", I love learning about new technologies. New programming languages, new frameworks, new Open Source tools. I love talking with colleagues about how we can use these great new technologies to solve interesting problems Faster, Better, Cheaper in order to make our clients happy. When working with one of our Outsourced Product Development clients or speaking with an Business Information Services prospect about how Hadoop can be used to streamline content processing it sometimes seems like that technology is important in and of itself.

But, sorry to say, the technology is not important by itself. And being an expert in a given technology is not enough. The true value that a "developer" or technology consultant brings is the ability to solve business problems, often by using technology in innovative ways. In order to solve those business problems, you have to understand the business domain.

Having an understanding of the business, the associated terminology, how the business model works, how the business makes a profit, and how the proposed solution will add to that profit is critical in making all the little tradeoffs that must be made each day while designing and implementing a solution. In our RIGHTLine(TM) software development process we identify the need for Knowledge Acquisition as part of the software development lifecycle. It's not enough to document requirements in a Business Requirements Document; the people writing and reading the requirements also have to have a shared understanding of the context in which the solution and the requirements apply.

When you try to apply a more agile development methodology, with more decisions made by the developers during implementation, this knowledge of the business becomes even more critical. A quote from one of our clients summarizes this well: he wants to have "ad server developers, not Java developers."