Learn How to Ask the Right Questions!

One thing that sets high-performing individuals and teams apart from lower-performing ones is experience. Experience is a shortcut. It saves time. It helps you skip the effort required to research a particular problem. Experience lets you cut directly to the solution.

Experience allows you to skip the questions.

Unfortunately, experience can also give you a false sense of security. You feel secure that you know all of the answers to the relevant questions, that you know the complete context of the problem. You assume that nothing has changed since you last evaluated the problem space.

That assumption can be wrong, and detrimental to your work.

A Short Story

So, once upon a time, about a month or so ago, I was running a hackathon with a team of developers. They were a great team, coming together for the first time in the organization's history to solve a problem agilely – that is to say, all hands on deck: developers, testers, operations engineers, and the product owner sitting together, working towards a shared goal, to the exclusion of all other unrelated tasks.

At some point we realized that we had a dilemma regarding the form of our feature's output. The default way – the way the primary customer had established with the team a long time ago, and the way they had been working ever since – was to provide a full monthly "baseline" dataset, in the form of a CSV file, and then provide daily "modification files", i.e. files that add new records or remove old ones.

The problem was that we now had to support not just the primary customer but many other customers, some so small that they had little in the way of an IT department capable of writing a process to take monthly baselines and apply daily changes automatically. The solution that some of the team came up with was to send them full daily datasets: each day, the new CSV would be the sole source of truth. These small customers could consume the full daily files with Excel, with no additional IT work needed.
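
To make the two options concrete, here is a minimal sketch of the processing each one demands of the customer – the file names and the id/action column layout are invented for illustration:

```python
import csv

def load_full_dataset(path):
    """Load a complete CSV dataset, keyed by record ID."""
    with open(path, newline="") as f:
        return {row["id"]: row for row in csv.DictReader(f)}

def apply_daily_changes(records, path):
    """Apply a daily modification file: 'add' inserts or updates a record,
    'remove' deletes it. This is the merge logic the small customers
    would have had to build themselves."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["action"] == "add":
                records[row["id"]] = row
            elif row["action"] == "remove":
                records.pop(row["id"], None)
    return records

# The "old way": a monthly baseline plus one change file per day.
records = load_full_dataset("baseline_2018_06.csv")
records = apply_daily_changes(records, "changes_2018_06_15.csv")

# The "new way": each daily file is the whole truth -- no merge logic at all.
records = load_full_dataset("full_2018_06_15.csv")
```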

Of course, a lively discussion ensued. Half of the team said that we could not change how things worked, because our primary customer relied on it. The other half argued that the many smaller new customers simply could not deal with the "old way".

After letting them discuss it for a while, I proposed we call the customer representative. The team leader did. After a short introduction referring to the new feature (which they had previously discussed), the team leader – a proponent of the old solution – asked: "Do you want us to send you the files in the same way we did before with the other features, that is, a monthly baseline with daily changes?" The customer representative said that yes, that would be fine.

I thought about it for a moment. Our many new, smaller customers needed a different solution. We needed to reframe the question. I asked the team leader to relay the following question to the customer: "Would it be acceptable if, instead of using the old method we had before, for this feature we sent full daily datasets?"

The customer representative’s response was, and I believe I am quoting her verbatim, as follows:

“Oh my god, yes! That would be so much simpler for us to implement on our end!”

High-fives and fist bumps were had across the entire room. Team spirits went up dramatically.

The Moral

The team leader was very experienced with the system we were modifying, and very familiar with the customer. They talked with one another regularly. This blinded him to the simple truth that while the current solution was acceptable, a better solution would be greatly appreciated.

The rule I drew from this experience is this – if you want to change something, never immediately assume that current conditions are immutable. If you've never discussed the possibility of change, you cannot know whether others – your partners, customers, teammates, or managers – would resist it, or the reverse: enthusiastically support it. Perhaps they feel the same pain that you do, the pain driving your desire for change, and merely lack the initiative to stand up and do something about it.

In short, don't hesitate to ask. Phrase the question in a way that benefits you and highlights the change you wish to make. It cannot hurt. You might be pleasantly surprised with the results!


4 Ways to Integrate Development and Operations Efforts

So you've heard about DevOps. That is great. You've decided that your organization could really benefit from a "DevOps transformation". Even better. You've even gone above and beyond, and memorized the following definition for DevOps:

DevOps is the union of people, process, and technology to enable continuous delivery of value to customers.

Awesome.

But what should you do next? How do you deal with the fact that the developers in your organization cannot or will not own the process of delivering into production? How do you deal with the fact that your operations engineers cannot or do not fully understand the software of which they have been given stewardship?

How do you get your “Dev” and “Ops” to work together? How do you become a “DevOps” organization?!?

Method #1: You Build It – You Operate It!

If your developers are already, among other things, for whichever reasons, tasked with deploying software, monitoring it in production, and dealing with live site incidents (LSIs), then congratulations, you are already halfway there. More than half, really. Restructuring the organization – tearing down the walls between "Dev" and "Ops" – is as difficult as it was about a decade ago, when the Agile movement introduced the notion of tearing down the walls between testers and programmers (a.k.a. "Dev" and "QA"), as well as those between developers and business (i.e. Scrum teams consisting of both developers and business people). It is even more difficult in some organizations, where developers report to the VP of R&D while operations engineers report to the VP of Operations, and the first common manager who connects both departments is the CEO – not the person who should be burdened with getting these two realms to cooperate.

Other organizations are not so lucky. But do not despair; the following methods describe how to enable an organization’s DevOps efforts through collaboration of separate “Development” and “Operations” departments/teams.

Method #2: Embedded Ops

In an organization where development and operations exist in two separate hierarchies (a.k.a. "silos"), if you have enough operations engineers, you should assign – embed – one in each development team. This engineer would be responsible for fielding all of their development team's questions and requests, and for helping the development team deliver their software into production.

This setup has the benefit of creating self-managing teams who ultimately own the responsibility of deploying software to production and of making their product "ops-friendly" (e.g. automatically testable, deployable, including monitoring hooks, etc.), while maintaining a governed center of excellence (COE) that defines how software should be built, deployed, tested, and monitored.

If you have more operations engineers than development teams (yes, I know how unlikely that is…), the remaining engineers can focus on cross-cutting concerns such as monitoring overall systemwide health, and creating tools to be used by the teams for their operational efforts.

The embedded Ops engineers can, instead of merely handling the operational workload for their respective teams, help cross-train the developers so that they become more independent, allowing such teams to move towards the first method. This can free the embedded engineer to rejoin the main Ops team, to move on to help a newly formed team, or to assist an existing team that requires extra attention.

Method #3: Liaison Ops

Many organizations, like those described in the previous section, have separate "Dev" and "Ops" departments. Unlike the aforementioned group, however, these organizations do not have enough operations engineers to dedicate one to each development team. In this case, we want to put each Ops engineer in charge of serving several teams. This engineer will have to help each team, balancing their workload against the needs of all of the teams with whom they are liaising. The liaising operations engineer has the same responsibilities and roles as the embedded engineers; the only difference is the number of teams they support.

Sooner or later, liaising Ops engineers become bottlenecks, or at least find scheduling to be difficult. For this reason, while embedded Ops engineers have the option of training their teams towards operational independence, liaising Ops engineers must make training their developers for independence a high priority.

A question that invariably comes up is “how many teams can a liaising operations engineer support?” The answer is, of course, “it depends”. It depends on both the engineer and the teams.

Method #4: The Operational Service Team

Sooner or later, organizations with embedded or liaising operations engineers want to – or must – gravitate towards the servicing model. In this model, the operations team does not support individual development teams directly, except in extreme cases. If an engineer from this operations team has to help a development team deploy their code or debug it in production, then there was a procedural failure at some point.

The role of the service team is to create tools, skills, and knowledge that enable development teams to deploy and own their code in production. This may include creating scripts and templates that automate the build and release processes, setting up automated security and quality checks as gates for continuous delivery, setting up dashboards, system health checks, etc. If it is unique to a team’s product, the team owns it. If it is a cross-cutting concern, the ops team owns it. The ops team provisions environments, or enables teams to provision environments within guidelines and constraints; the developers consume the services.

The Operational Service Team may also be responsible for creating DevOps-related knowledge for the teams to consume. This may come in the form of anything from wiki pages to formal or ad-hoc training.

Note that this method is very similar to the first. In both cases, the development teams are responsible for their code all the way into production. The primary difference is that in the first method there is no operations team, while in the fourth (this) method, the operations team serves as a center of excellence.

Summary

Hopefully, this post will help you determine where your organization is in its DevOps journey, and help you figure out where you want to be.

Good luck, and safe journeys,

Assaf


DevOps and Your Definition of Done

Regardless of the agile methodology you are using to drive your software development efforts, you should have an explicit definition of done. I say explicit, because you will always have one – whether you define it or not. Regardless of your process, even if you are following a Waterfall-based process, the level of quality you demand (or allow) your software to reach before you ship it is your definition of done.

Explicitly defining what Done means to your organization helps communication and collaboration, and allows you to set the bar for quality, as well as drive your process improvement efforts.

In this post I will provide some guidance on how to use your Definition of Done to drive collaboration efforts between your developers and the operations engineers, in an organization that is trying to adopt a DevOps mindset.

DevOps

Microsoft's very own Donovan Brown gave us what I view as the best definition of what DevOps is. Even if you've heard it before, I believe it bears repeating, in order to drive home the point of this post:

DevOps is the union of people, process, and products to enable continuous delivery of value to our end users.

An organization trying to adopt a DevOps mentality should have every single member’s job be defined by that statement.

The Definition of Done

There are many definitions for the Definition of Done (ironic, I know). I'm rather partial to Scrum.org's definition, in the official Scrum Guide:

When a Product Backlog item or an Increment is described as "Done", everyone must understand what "Done" means. Although this may vary significantly per Scrum Team, members must have a shared understanding of what it means for work to be complete, to ensure transparency.

The emphasis is my own.

What most development teams end up doing is coming up with a laundry list of quality demands like the following:

  • Code complete
  • Unit test code coverage is <insert some number here>% or higher
  • Automated build
  • QA Tested

Yours may have a few more items, may be missing some, or may have minor variations on the same theme. The result is often disjointed, where some items may be nice and lofty aspirations, but not achievable with the resources and knowledge the organization currently has.

Done for DevOps

DevOps, among other things, is about collaboration, and a shared responsibility for delivering features into production. The product's Definition of Done should reflect this.

What I prefer to do is to use directed questions to drive the teams’ Definition of Done. These questions are, of course, asked with the aforementioned DevOps definition in mind.

The following are some examples of shared questions that drive this point and (hopefully) start driving the change in mentality required for success as a DevOps organization.

Start a conversation with the delivery team (developers, testers, ops, business unit, i.e. anyone responsible for the delivery of the product) during your next retrospective, post mortem, or whenever you discuss process improvement options, and ask one or more of these questions.

What must we do to ensure continuous delivery of these features?

The answers to this question are as much the responsibility of the developers as they are of the testers and the operations engineers. Deploying changes in small batches, architecting the solution in a way that makes (automated) deployment easier, using development practices such as feature flags, test automation, and infrastructure as code, and using deployment patterns such as blue/green deployment all increase the ease and likelihood of successful deployment.

Note that some of the aforementioned practices are in the hands of the developers, some in the hands of testers, and others in the hands of operations.

Collaboration is key.
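
Feature flags, mentioned above, are perhaps the cheapest of these practices to illustrate. A minimal sketch – the flag and checkout functions are invented, and a real system would read flags from configuration or a flag-management service rather than a hard-coded dictionary. The flag lets the new code ship to production "dark" and be turned on later, without a redeployment:

```python
FLAGS = {"new_checkout_flow": False}   # hypothetical flag, shipped off

def is_enabled(flag: str) -> bool:
    return FLAGS.get(flag, False)

def legacy_checkout(cart: dict) -> float:
    return sum(cart.values())          # the safe, existing path

def new_checkout(cart: dict) -> float:
    return sum(cart.values()) * 0.9    # placeholder for the new logic

def checkout(cart: dict) -> float:
    # Both code paths ship together; the flag decides which one runs.
    if is_enabled("new_checkout_flow"):
        return new_checkout(cart)
    return legacy_checkout(cart)

print(checkout({"book": 12.0, "pen": 3.0}))  # 15.0 while the flag is off
```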

What must we do to ensure that features delivered are valuable?

Let’s face it – ask 10 development teams why they are building their current feature, and at least 9 of them will answer “because it’s in the backlog”, “because my manager said so”, or “huh?”

The fact that our stakeholders, sponsors, product owner, project manager, team leader, or CEO requested or even demanded a feature may be a good enough reason to do something, but they are not all-knowing. We don't know that the features will in fact be valuable to our business.

For an internal app, we will probably want to know if and how the new change affects time to complete a transaction or error rates. What measures could be put in place to prove that?

For commercial apps, we will probably want to know if and how the new change affects conversions, sales, consumption, etc.

For public sector apps, we might want to know how the new change affects consumption, speed of use, efficiency, backend costs, etc.

The business unit should provide these metrics. Developers must build the features with hooks that enable measuring these things. Testers must verify that the metrics provide the expected data, and operations must monitor them.
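
What might those hooks look like? A hedged sketch – the emit helper and the event name are invented for illustration; in practice you would use whatever telemetry library your stack already provides:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("telemetry")

def emit(event: str, **fields) -> None:
    """Hypothetical hook: write a structured business event that
    operations can aggregate into dashboards and alerts."""
    logger.info(json.dumps({"event": event, "ts": time.time(), **fields}))

# Developers instrument the feature with the agreed-upon events...
def complete_transaction(order_id: str, started_at: float) -> None:
    emit("transaction_completed",
         order_id=order_id,
         duration_s=round(time.time() - started_at, 3))

complete_transaction("A-1001", started_at=time.time() - 2.5)
# ...testers verify the events carry the expected data, and operations
# monitor them to answer the time-to-complete and error-rate questions above.
```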

What must we do to ensure that features are working properly?

Yes, of course you need to test your system, ideally with automated tests, topped off by some manual exploratory tests by your testers. Unfortunately, that is not enough.

You want to be able to know that your system is working in production. You want to be able to identify problems before they affect your users.

Put the right measures and performance counters in place in production, monitoring not only your disk space, memory consumption, and other hardware concerns, but also how your business scenarios are performing.

Make sure your definition of done asks questions such as “What do I need to measure in order to identify problems before our users are affected?”

Defects will escape your delivery teams’ quality processes. That’s a fact. Make sure they do not repeat themselves by asking yourself something like “What can I monitor to guarantee that this problem doesn’t show up again?”

Be sure to not only ask these questions; make sure you also answer them, and implement the hooks that will allow you to monitor the answers.

Summary

These are but a few conversation starters. Asking them is important. Answering them is crucial. Following up with an implementation will guarantee that your definition of done is in line with your business goals and your organization’s endeavor to adopt a DevOps mindset.

Keep Calm and DevOps!


How to Maximize the Value of Your Planning Session

By now it is commonly accepted that the old way of developing software – in silos, with a big up-front plan and design, and with only a single true delivery to customers at the end of the project, also known as the infamous Waterfall approach – is not the best way of doing things. Many development teams have fully embraced the agile approach, while others have not (yet) fully done so.

Partial, or early, agile transformation attempts can often be characterized by embracing some (but not all) of the ideas or practices commonly associated with agile development frameworks, such as Scrum, without fully embracing their principles and philosophies; these teams often get the "what" right, but have not yet embraced the "why" or the "how".

In this blog post, I will focus on how to improve the agile planning session, also known in the Scrum framework as the Sprint planning session.

A Quick Overview of Sprint Planning

A sprint is a time-box of one month or less (commonly 2-4 weeks) during which the delivery team will develop, test, and build a potentially releasable product increment.

Sprint Planning, quite simply, is the act of figuring out the work that needs to be done in order to achieve the sprint's goal, i.e. create the releasable product increment, adding what the Scrum team has decided is the next most valuable functionality to deliver.

Sprint Planning, like all other Scrum practices, is a collaborative effort, which is to say that while the product owner is responsible for maximizing the value created by the delivery team, the team is responsible for figuring out how to deliver the work. Both are expected to work together to define each sprint goal, rather than having the product owner tell the team what to do.

Change is Difficult

For many teams, especially those transitioning from a command-and-control style of organization to an agile, self-managing style, embracing the change in management style is much more difficult than embracing a different cadence and/or set of practices. This often manifests in planning sessions where the product owner describes what needs to be done, and the team just listens. It really should be more of a discussion.

The delivery team needs to collaborate – to discuss the goal and the plan with the product owner – to ensure that they understand what they are being asked to deliver.

The SPIDR Technique

Writing user stories, while not strictly required in Scrum, is one of the most commonly known Scrum practices. The idea is to describe increments of value from the perspective of one or more users of the product, thus ensuring that the product increment is, in fact, valuable to the users.

In order to ensure that user stories are well enough understood, and that they are small enough to fit in a single sprint, the SPIDR technique provides a way to decompose stories into multiple parts, each delivering some observable incremental value to the users.

SPIDR is an acronym defining the most commonly used methods of decomposing a story: Spikes, Paths, Interfaces, Data, Rules. The SPIDR technique was created by Mike Cohn, and is fully described in his blog.

New Practice: SPIDR Planning

In my work with delivery teams transitioning from a waterfall-based world to a more agile, Scrum-based paradigm, I have found that encouraging a rote checklist of questions, inspired by the SPIDR decomposition technique, helps teams get into the right mindset for discussing a plan, rather than expecting a plan to be handed down from on high.

These are (some of) the questions a team may ask, to ensure their understanding of the value they are asked to deliver in the sprint:

Spikes

A spike describes a short burst of activity, often necessary to experiment with potential solutions for a complex problem. This will often be the last set of questions that the team will ask, after the requirement itself is understood.

Ask your teammates: Do we know how to do this? Are there any APIs or components that we have to use that have never been used before? Is there anything missing in our skillset, toolset, or our definition of done, that needs to be in place in order to deliver this user story?

Paths

A path describes a specific flow, or a scenario of using the product. Ask and discuss with your product owner, stakeholders, or subject matter experts: What are the conditions under which the users will be interacting with the system? Are there other ways that this functionality may be used? What if something goes wrong? What do we show our users – any and all users? What can go wrong? Each if statement, switch/case, loop, or other control block marks a different path to be considered.

Interfaces

There are often many different ways to interact with a product. Gone are the days where users would only have one option – their desktop or terminal. What are all the various interfaces we need to consider? Web? Desktop? Tablets? Smartphones? Wearables? Command line tools for administrators and automation tools? How should the experience differ from one interface to another? Which interfaces are most valuable, and should thus be considered first?

Data

Different inputs beget different outcomes. What data will be provided? What are the data sources? Have we interacted with that data before (do we need a spike to figure out how to get that data?) Consider environmental data – does location affect the outcome? the time or date? Random elements? What data is required vs. optional? What if data is missing? What defaults to use in place of missing data?

Rules

A rule, either business or a technical standard, describes certain operational constraints. What are the business rules? Special cases? Are there any standards? Quotas? Compliance rules? Security standards?

Summary

Asking these questions during sprint planning is a great way to start a conversation that will both achieve the principles of collaboration and engagement, and ensure a greater understanding of the value that the product is expected to deliver to its users.

This list of questions is by no means exhaustive; it is intended to demonstrate the ways in which a delivery team might engage with the product owner, to negotiate with her or him.

What do you think? Are there any important questions that I have missed? Please let me know in the comments!


The Importance of Limiting Your WIP

In the previous post we discussed what WIP (Work In Process/Progress) is, and how to track it. In this post I want to discuss why WIP limits are so important and how they contribute to improving the team's effectiveness and throughput.

So There’s This Little Law…

Little's Law describes the relationship between the average throughput in a system, the average process time (a.k.a. Cycle Time) for an item in the system, and the number of items being processed at the same time (your WIP). For the mathematically inclined reader, this relationship is described in the following formula: L = λW, or in other words, the long-term average WIP (L) is equal to the long-term average throughput (λ) multiplied by the average wait/process/cycle time (W).

This theorem is both simple and profound; while the formula is intuitively easy to grasp, it also means that the relationship is unaffected by other factors, such as the arrival process distribution, the service distribution, the order of processing, or – for all intents and purposes – anything else!

This law therefore holds true for both simple and complex systems of any nature.

Okay. So this is interesting (humor me here – let’s assume that if you’re reading this post you are finding this interesting) but what does this mean?

Applying Little’s Law to Product Development

If we play around with the formula, we arrive at a way to determine the process time – or, in product development terms, the amount of time to develop a feature – by dividing the WIP by the system throughput: W = L/λ. This means that we can reduce the amount of time it takes to implement features by limiting the number of items we develop concurrently!
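
A quick worked example, with invented numbers:

```python
# Little's Law, rearranged: W = L / lambda
throughput = 2.0                 # long-term average: items completed per day

for wip in (20, 10, 5):          # average number of items in process
    cycle_time = wip / throughput
    print(f"WIP = {wip:2d} -> average cycle time = {cycle_time:4.1f} days")

# WIP = 20 -> 10.0 days; WIP = 10 -> 5.0 days; WIP = 5 -> 2.5 days.
# Same team, same throughput -- halving the WIP halves the wait.
```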

This too should be intuitive. Even if we completely ignore all of the disadvantages of multitasking, context switching, resource allocation, deadlocking, and conflicting priorities, it is simple to see that if we could, for example, complete only 80% of the planned work, it would be more valuable to have 80% of our items 100% done and delivered than to have started work on 100% of our items, but with each only 80% complete and in no shape to ship. Yes, in the described scenario, as a manager, you could find a way to blame your developers for not completing 100% of the planned work, but this was preventable! That's on you!

However, there is another, subtler application implied by the modified theorem: WIP predicts process time!

Predicting Process Time based on WIP

Predicting process time is not new. We just like to call it estimating. Estimation is defined as making an extremely hopeful guess about the amount of time it will take to develop and/or deliver something, based on personal or shared past experiences (or, more often, an arbitrary demand dictated by someone in a higher tax bracket than the people doing the work). Okay, so perhaps it is not defined quite like this, but in all honesty it really should be.

Regardless of what technique (if any) you use to come up with your estimates, all of the data that feeds them is based on the past. In other words, estimates are trailing indicators. The numbers come after the fact – they can be used to explain the previous items' process times, and we statistically assume (read: hope) that they will hold true for our upcoming work.

By contrast, your WIP is a leading indicator! This means that the WIP predicts (affects) the process time for items entering the system, i.e. items not yet developed.

Let’s look at the following Cumulative Flow Diagram from a VSTS project:

[Cumulative flow diagram from a VSTS project]

Let's try to reduce the work in progress. We will do this by focusing our efforts on closing items that are already active instead of beginning work on new items. On the diagram, this is represented by taking items out of the Active state and putting them back in the New state, and in return, moving items from the Resolved state into the Closed state. Doing so results in the following modified cumulative flow diagram:

[The modified cumulative flow diagram, with reduced WIP]

Note that as our WIP is reduced, so is our process time! We have reduced the time to deliver an item – any item – by doing nothing more than limiting the amount of work the team processes concurrently! And this doesn't even account for the extra benefits that knowledge work such as software development derives from focusing on a single small unit of work!

Conclusion

In this blog post we've discussed Little's Law and its implications for product management. We demonstrated the importance of WIP as a leading indicator of process time, and how reducing the amount of work we have in process concurrently reduces the amount of time it takes to move items through the system. Armed with this knowledge, I hope you will put it to use – all you need to do is not take on a new job before you are done with (most of) the ones you've already begun.

Stay lean because you can(ban),

Assaf


WIP Your Product into Shape

What is WIP?

WIP simply means Work In Process (also sometimes, Work In Progress). This metric measures how many items (features, stories, backlog items, tasks) your team has started to develop but has yet to complete – in other words, how many items are currently being developed. This simple metric is extremely important, and a useful number to track and control. In this post we will discuss the reasons for limiting your WIP, how to do so, and how to track your work in process using VSTS.

You’re Doing Too Much

Imagine the following scenario: you are walking down Main Street, carrying a box. Not a problem. The box is small enough that you can easily pick it up and carry it from wherever you got it to wherever you are going. All is fine.

Now imagine, that you are carrying two boxes. Still not a problem. Granted, carrying one box would be easier, but you believe that the discomfort of carrying two boxes is preferable to the discomfort of having to make the trip twice. You can do it.

Now imagine that you are trying to pick up and carry three boxes. Not easy at all, and the weight, strain, and bulkiness of the trio make you reconsider the wisdom of trying to take so much with you at once. Staggering along carefully might end up taking longer than a second trip would…

Now imagine that you are trying to pick up and carry four boxes. Blind, because you cannot see around all of the boxes, you bump into someone else carrying a box, and you both fall, the contents of your boxes spilling, some shattering.

You really should have limited how many boxes you carry at once…

Setting WIP Limits

What is a WIP Limit?

WIP limits are exactly what they sound like – you limit the number of items that you will work on concurrently, taking on no new work until the number of items you are developing is less than the limit. In the story above, the poor courier should have set his limit to three, possibly even two, for his work to flow optimally and for him to be effective. The courier could easily carry one item, and could increase his throughput when he carried two, but he slowed down when carrying three, and literally crashed when trying to manage four. Note that this is purely individual – another courier could possibly manage three or four boxes.

I hope the metaphor is obvious. The courier is you and your development team. Boxes are your backlog items – whatever you’re tracking (stories, features, tasks, etc.). Some teams can work on one item in collaboration, others no more than two. A third team might prefer to work on twice as many tasks as they have members. It is individual, and it depends on the team, the individuals, and of course, the nature of their work.

But how do you know what limit to set?

Picking a WIP Limit

Here's the simple truth: there are no silver bullets. No one can tell your team what the ideal WIP limit is. You should start with whatever makes sense to your team. Change that number to whatever makes sense to you. Change it when circumstances change. Change it to experiment.

If your team tends to collaborate frequently, start with a low number – three, because why not three… If your team rarely collaborates, start with a higher number, tied to the number of developers you have on your team, e.g. 1 or 2 per team member, plus/minus 1.

Next, start tracking metrics that are important to your team. A few examples are:

  • Throughput – the number of items processed (developed) in a given period of time (day, week, sprint – whichever, just be consistent). You want this number to be high
  • Defect Rate – the number of bugs/defects found (in QA, UAT, production, etc.) in a given period of time. You want this number to be low
  • Deployment Frequency, and deployment time. You want these numbers to be high and low, respectively
  • Lead Time – the amount of time that passes between requesting a change to the system and its delivery to the system’s users. You want this number to be low

At a regular cadence, for example, every sprint retrospective, evaluate the numbers. Experiment with lowering your WIP limit. After a period of time, look at your metrics. Did they improve? If so, great! Keep on doing what you’re doing. If not, either adjust (increase) your WIP limit or adjust your practices so that you perform better at the lower limit. Measure again. Rinse and repeat.
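
If you want to take the guesswork out of "did they improve?", the arithmetic behind the first and last of those metrics is simple enough to script. A minimal sketch, with invented dates:

```python
from datetime import date

# Hypothetical start/close dates for the items finished in one sprint.
items = [
    (date(2018, 6, 4), date(2018, 6, 8)),
    (date(2018, 6, 5), date(2018, 6, 12)),
    (date(2018, 6, 6), date(2018, 6, 13)),
]
sprint_days = 10

throughput = len(items) / sprint_days                            # items per day
avg_time = sum((done - start).days for start, done in items) / len(items)

print(f"throughput: {throughput:.2f} items/day, "
      f"average time in process: {avg_time:.1f} days")
# Compare these numbers sprint over sprint: if they improve after you
# lower the WIP limit, keep the lower limit; otherwise adjust and re-measure.
```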

Tracking WIP Limits in VSTS

Tracking Your WIP Limit in Your Kanban Board

Limiting your WIP is a practice that should be mostly autonomous, that is to say, at the team's discretion. As such, you need to set the limit and visualize it where the team visualizes their work. If you go to VSTS's boards, you will notice two numbers beside the name of each column (except for the 'New' and 'Done' columns). The first number represents the number of items in the column; the second represents the WIP limit. In the following board, the 'Approved' column has a WIP limit of 5, meaning that the team must approve (or refine) a work item before they can consider a sixth item. This team's 'Committed' WIP limit is set to 6, meaning that they can work on no more than 6 concurrent work items, as a whole team.

[Screenshot: a Kanban board showing the item count and WIP limit beside each column name]

If the team takes on a 6th item to be approved, VSTS won’t block them, but the number will turn red to note that the team is doing something wrong:

[Screenshot: the same board, with the over-limit column's count shown in red]

Tracking Your WIP Limit in the VSTS Team Dashboard

The VSTS boards put this information in the team members' faces – enough to be noticed, but not so much as to get in the way of progress. This may be enough for the team. Some teams may want more. Some teams may want to have this displayed on the team's dashboard, where it can be visible to anybody, at all times. Perhaps the Scrum Master or the team leader wants to know what is going on. Perhaps they want to show the dashboard to middle management in order to prove that reducing WIP is important.

One thing you might do is add a Work Item Query that tracks the work items in a given column (add a where clause like Board Column = Committed), and add a Query Tile to your dashboard that counts the number of items returned by the query.

[Screenshot: a work item query filtered by Board Column]
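
As an aside, if you would rather pull the same number programmatically – say, for a wallboard outside VSTS – the same filter can be run as a WIQL query over the REST API. A hedged sketch, with an invented account, project, and personal access token:

```python
import requests

# Hypothetical account, project, and PAT -- substitute your own.
account, project, pat = "fabrikam", "MyProduct", "<personal-access-token>"
url = f"https://{account}.visualstudio.com/{project}/_apis/wit/wiql?api-version=4.1"

wiql = {
    "query": "SELECT [System.Id] FROM WorkItems "
             "WHERE [System.TeamProject] = @project "
             "AND [System.BoardColumn] = 'Committed'"
}

# VSTS accepts an empty user name with a PAT for basic authentication.
response = requests.post(url, json=wiql, auth=("", pat))
response.raise_for_status()
count = len(response.json()["workItems"])
print(f"{count} work items in the Committed column")
```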

You can even set the tile with conditional formatting that would, for example, color the tile green if the count is less than the WIP limit, yellow if it’s at the limit, or red if it is above:

[Screenshot: the query tile's conditional formatting rules]

The end result is a query tile that really sticks out and lets whoever needs to know see how well you are limiting your WIP:

[Screenshot: the color-coded query tile on the dashboard]

Conclusion

In this blog post, we've discussed what a WIP limit is, how to set it, and how to track it in VSTS. In the next post, we will discuss why reducing the WIP limit is so important, and how to incorporate WIP limit management into your real, day-to-day, corporate life.

Stay lean,

Assaf


5 Ways to Reduce the Impact of Failure

In the previous post we discussed risk management: how many of us manage risk today by attempting to control the likelihood of failure, and why we should instead focus on reducing the impact of failures as a way to manage the risks involved with software development.

Below are 5 techniques that software development teams can use to reduce the impact of failures in production:

1. Reduce Batch Sizes

Smaller workloads take less time to complete – in each phase, and altogether. Smaller workloads include less functionality, and thus fewer potential flaws, each likely to have a smaller impact on the system as a whole. Small batches are easier to deploy, and thus easier to revert – or rather, it is easier to design a successful rollback plan for them – meaning that the recovery time will be shorter.

Less functionality in a deployment unit results in a smaller area of impact. Faster deployments with faster rollback options result in a reduced time to recover, further reducing the impact on the system.

Reducing the batch size dramatically reduces the impact of failure on the system.

2. Deploy as Early and Often as Possible

Developers depend on feedback in order to know whether or not they created the right thing, as well as whether or not they created the thing right. Building the system every time changes are made is a good start, and running automated unit tests is even better, but some defects may only be detected in production or a production-like environment!

By shortening the development cycle, reducing the amount of time that a developer must wait before finding out if his or her changes were successful, the developer will:

  1. have an easier time correcting any mistakes
  2. have an easier time identifying cause and effect between changes made and defects in production
  3. have an easier time learning from these mistakes and findings

A reduced feedback cycle therefore results in faster fixes, reducing the impact of failures, while the increased learning results in a happy side effect: a reduced likelihood of failure.

3. Shift-Left and Automate Audits and Controls

Risk aversion, or fear of failure, is the primary reason that we have so many audits, checks, controls, and sign-offs for every single change that we introduce into the system. As mentioned in the previous post, these measures are expensive (time-wise) and ineffective at reducing risk; they are introduced too late in the development lifecycle to serve as an efficient method of reducing risk. Worse, most of these controls delay the deployment of changes into production by days, even weeks, thus lengthening the feedback cycle and effectively increasing both the impact and the likelihood of failure!

See the irony here? The very measures taken to mitigate risk actually make it worse!

By replacing post-factum audits and governance boards with automated checks that analyze the code, and test it in a production-like environment, and by discussing operational concerns earlier in the planning process, we can introduce these audits earlier in the lifecycle, even as early as while the developer is coding! Automation also helps reduce the cost and the length of the feedback cycle.

With earlier and faster warnings, fewer quality issues will reach production. Those that do get through these measures are likely to be the smaller and less significant ones. This means that both impact and likelihood will be reduced.

4. Decouple Sub-Systems

Huge monoliths might be easy to develop, and often offer the greatest performance. Unfortunately, they come with inherent disadvantages:

  1. It is difficult or impossible to deliver changes to part of a monolith; monolith deployments are usually an all-or-nothing endeavor.
  2. Tightly coupled, monoliths are often designed in a way that if one part of a workflow fails, the entire workflow crashes.

By decoupling the architecture – separating steps into individual components that communicate with each other asynchronously, using message queues or event-based communication systems – we can (see the sketch after this list):

  1. Deploy each component separately, ensuring that defects introduced into a system component are isolated from other components, reducing the impact of failures to one localized component.
  2. Rather than fail the workflow if something goes wrong, we can flag defective messages for service teams to handle, notify the user that completion is delayed, and eventually complete each flow when services are restored, thus reducing the impact of failure even further.
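
A minimal sketch of the pattern, using Python's standard-library queue to stand in for a real message broker (the message format and the validation rule are invented):

```python
import queue
import threading

work = queue.Queue()     # stands in for a real message broker
dead_letters = []        # defective messages, flagged for a service team

def process(msg: dict) -> None:
    """One decoupled component's single step in the workflow."""
    if msg.get("amount", 0) < 0:
        raise ValueError("negative amount")
    print("processed", msg["id"])

def worker() -> None:
    while True:
        msg = work.get()
        try:
            process(msg)
        except Exception:
            dead_letters.append(msg)   # flag it -- don't crash the workflow
        finally:
            work.task_done()

threading.Thread(target=worker, daemon=True).start()
work.put({"id": 1, "amount": 10})
work.put({"id": 2, "amount": -5})      # gets dead-lettered; flow continues
work.join()
print("flagged for follow-up:", dead_letters)
```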

As a bonus, decoupled systems are much easier to scale as demand for services increases. A whole class of problems can be completely avoided by architecting loosely coupled systems.

As for performance concerns, make sure that you are developing for good enough performance, rather than the best possible. Remember that Good Enough is, by definition, good enough.

5. Continuously Improve the Definition of Done

Whether your development and operations team(s) use Scrum, Kanban, or any other agile methodology or framework to drive the product, the key to successful risk management is to uphold and improve the quality level you demand for anything that you develop and deploy to production.

Following any and all of the aforementioned techniques will greatly reduce the risk to your production pipeline, but never totally eliminate risk.

The most important technique is to make sure that the same issue does not cause a failure twice. Any failure that does get through whatever quality measures you already have in place must be analyzed, and you must figure out how to make sure that this class of problems never goes uncaught again.

By rigorously applying this technique, you will be able to continuously improve your quality controls, ensuring that failures consistently grow smaller until they are no more than a nuisance.

Conclusion

Nobody and nothing is perfect. However, some are closer to perfection than others. If any or all of these ideas are new to you, I would highly recommend that you start with your definition of done: look at the most harmful failures you have had recently, and identify the measures that are most valuable for you to introduce into your production line!

And then move on to the next one.


Risk: You’re Managing it Wrong!

No offense, but despite your best intentions, you might not be handling risk properly. In this day and age, everything is software-dependent; even if you do not consider yourself a "software firm" per se, even if you are just running a small development team that develops in-house software, your business still depends on said software to run smoothly, and any outage costs money. The bigger the problem, the greater the cost. If you, like many other modern software-based organizations, try to reduce risk by taking every precaution to avoid the occurrence of failures, then I am talking to you. If you are (still) following the waterfall methodology (why would you do that???), then I am definitely talking to you.

In this blog post I will explain what is fundamentally wrong with the waterfall way of addressing risk, why you should resist the temptation to avoid failure, and what you should be doing instead, in order to truly reduce risk that is inherent to delivering software.

What is Wrong with Waterfall?

When following waterfall-based methodologies, software projects get developed in phases – first you gather all of the requirements (system, software), then you analyze the requirements and come up with a program design to satisfy them. Once the design is done, you implement, or develop, the system. Once the development is done, you (hopefully) test the system thoroughly, and finally, you hand it over to operations to deploy and maintain.

So, what is so fundamentally wrong with that, you might ask – this is a very simple and straightforward process. The problem is, of course, that, as even Winston Royce, the author of the now-infamous 1970 paper titled "Managing the Development of Large Software Systems", said, this can only work for the most simple and straightforward projects.

It boils down to this: software development is complicated. In waterfall, we proceed from one phase to the next when the former completes successfully. Unfortunately, success is by no means guaranteed, or even likely. Worse, we tend to detect most problems only after we have completed the development phase – during the testing phase, the deployment phase, or, worst of all, only after we have already released the flaws into production. What really makes this difficult is that some of the problems we uncover may have been introduced prior to development (in the design, the analysis, or even the requirements gathering), and as everyone knows, the cost of fixing a problem grows exponentially over time.

So what do we do? How do we mitigate the risk that we might introduce a costly flaw into the system? Intuitively, we attempt to get everything right the first time. We try to think of everything that the system or software might require, create comprehensive design documentation that proves that we thought really hard about the problem, and create lengthy, highly regulated processes and checks that prove that we crossed every ‘T’ and dotted every ‘I’ (and a few lowercase J’s for extra measure).

In other words, we attempt to reduce risk by reducing the likelihood of a problem/incident.

And here our intuition fails us.

Risk Management in Modern Software Projects

Reducing Likelihood of Problems is the Wrong Approach

There are many different ways that a project may fail – too many to count them all. Missing a requirement, getting a requirement wrong, designing the wrong architecture, designing a system too rigid to change, developing the wrong capabilities, developing a capability incorrectly, deploying incorrectly, not designing for the right scale, insecure code, etc. The list goes on…

So we set up policies, we come up with plans, we hold audits, we enforce waiting periods, we require sign-offs, and because releasing new software is so complicated and scary, we do so rarely – often no more than four times per year.

But here’s the problem – we aren’t eliminating risk. We are – at best – reducing the likelihood of something getting through our safety gates. This means that things will get through eventually, because given enough time, anything that can happen, eventually will.

And when failure does happen, the flaw in the system expresses itself in its full glory. In other words, there might be a 1% chance of a bug reaching production, but when it does, it’ll be there 100%!

All of our audits, sign-offs, and controls fail to stop us from making mistakes. At best, they catch some of the mistakes we've made at a point when it is expensive to fix them; often the mistakes get caught too late – the discovered flaws have become too hard or too expensive to fix. These defects get shipped anyway, hopefully to be fixed in a service update. Worse, and all too often, these audits, controls, and sign-offs do nothing to help identify problems, and are instead in place to identify whom to blame for the failures – a useless endeavor, in my opinion.

Worst of all, our lengthy processes delay the feedback that analysts, architects, and developers need, making it impossible to learn from mistakes! A bug found six months after it was introduced will do nothing to teach the responsible party how to avoid making the same mistake again. Cause and effect become all but lost at this point.

Finally, due to the infrequency of releases, we are not used to dealing with deployment-related issues, and therefore we are surprised and scared every single time we have them.

Manage Risk by Reducing the Impact of a Problem!

What if, rather than attempting to minimize the chance that something goes wrong, we instead tried to reduce how badly the problems affect us? Ask yourself this: given the choice between a 1% chance of suffering a heart attack, and a 100% chance of suffering something that is 1% the strength of a heart attack – perhaps a flutter, or a skipped beat – which would you pick? I'd definitely go with the latter. In software development, not only is a production failure more than 1% likely to occur, it occurs every quarter, or however frequently you release changes.

Agile Risk Management

Agile project management methods, whether Scrum, Kanban, or any other methodology or framework, are designed around the following notions:

  1. Software is complicated
  2. Complicated things risk failure
  3. Complexity is directly proportional to risk and to impact of failure
  4. Complexity increases with the size of the workload
  5. Therefore, we design processes that reduce complexity, and thus – impact

We should follow methodologies that allow us to reduce the size of our workload. In Kanban, we focus on single-item flow. In Scrum, we iterate through our entire release process in one month or less. High-performing teams deploy to production small increments of functionality even more frequently, often multiple times per day!

In the next post, I will cover the steps that an organization can take, to reduce the impact of the risks involved with developing software.


How to Track Impediments in VSTS

Note: While this post discusses impediments in VSTS, everything mentioned can be applied to TFS as well.

Visual Studio Team Services (or VSTS) has great tools to support Scrum teams. The product owner can use the backlog and board to track the progress of individual teams or the entire product, at the Product Backlog Item (PBI) level, at the Feature level or even the Epic level, throughout the entire lifetime of the product. The developers can track the progress that they are making within the sprint, and see how their work (tasks) fit into the larger picture, by associating them with the PBIs that make up the product.

But what about the Scrum Masters? What tools do they have in VSTS to help them track their work and their progress?

What are Impediments?

According to Scrum.org’s official Scrum guide, one of the services that a Scrum Master provides to the development team, is the removal of impediments to the team’s progress. An impediment is anything that causes a developer to be unable to make progress towards completing the sprint’s goal. Whenever a developer has a problem that cannot be solved within the scope of the team, it is the Scrum Master’s responsibility to remove it.

Impediments in VSTS

Visual Studio Team Services has a work item type dedicated to tracking impediments and the progress of their removal. For projects using the Scrum template, this work item type is called an Impediment. For projects using the Agile or CMMI templates, it is called an Issue. Regardless of template, both serve the same purpose: they describe a problem, and their state machine tracks the progress of its removal.

Unfortunately, impediments and issues do not show up in VSTS's backlogs or boards. Those are designed for tracking progress on the delivery of the product, and the Impediment work item type is not included. That being the case, how should a Scrum Master and the Scrum team track these impediments, especially in large distributed projects, where face-to-face communication and jotting a note on a pad are not viable solutions?

Step 1 – Gathering Impediments

The first step to being able to track impediments in VSTS is, obviously, to enter the impediments into VSTS. The best way to guarantee that impediments do, in fact, get logged in VSTS is to make doing so quick and easy. I suggest using widgets in the Work Dashboard. These should be located prominently in whichever dashboard all the Scrum team members view regularly.

The default dashboard in VSTS (named “Overview”) has a widget titled New Work Item. This widget has a text box for setting the title of the work item, and a drop-down list to select the work item type. You can rearrange the dashboard, if you wish, to make sure that the widget is conveniently located, but otherwise, all you have to do, is select the Impediment (or Issue) type, enter the title (e.g. “I need an MSDN Enterprise license”), and click on the Create button:

[Screenshot: the New Work Item widget]

This will open a new work item form, where you can add a description, or any other detail that you might want, as you would with any other work item:

clip_image004

Finally, just click Save & Close, or press Ctrl+Enter to save and exit the work item.
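
If entering items through the UI ever becomes a bottleneck – say you want to log impediments from a chat tool or a script – the same work item can be created through the REST API. A hedged sketch, with an invented account, project, and personal access token (use $Issue instead of $Impediment on the Agile or CMMI templates):

```python
import requests

# Hypothetical account, project, and PAT -- substitute your own.
account, project, pat = "fabrikam", "MyProduct", "<personal-access-token>"
url = (f"https://{account}.visualstudio.com/{project}"
       "/_apis/wit/workitems/$Impediment?api-version=4.1")

# Work item creation takes a JSON-patch document that sets the fields.
patch = [{"op": "add", "path": "/fields/System.Title",
          "value": "I need an MSDN Enterprise license"}]

response = requests.post(
    url, json=patch,
    headers={"Content-Type": "application/json-patch+json"},
    auth=("", pat))
response.raise_for_status()
print("created impediment #", response.json()["id"])
```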

Step 2 – Create a Query for Tracking Impediments

As a Scrum Master, I want to keep track of all the impediments in the team. I want to note new impediments that are not assigned to anyone for removal, and I want to keep track of those assigned to me (there may be more than one Scrum Master in the project, as can happen in large projects).

In order to set up your impediments query, you will need to go to the Work | Queries submenu.

[Screenshot: the Work | Queries submenu]

At this point you have two options: you can customize the existing Open Impediments query (or Open Issues, in the other templates), which you will find in the Shared Queries section under the Current Iteration folder, or you can create a new query. If you choose to create a new one, just be sure to save it as a shared query, so that you may use it later with the widgets.

Regardless of whether you're updating the existing query or creating a new one, make sure you set the following:

· Work Item Type should be equal to Impediment (or Issue if in an Agile or CMMI project). This is already set in the existing query

Alternatively, you may make the following variations on the query. You can save these as separate queries, or apply them to the one you are working on:

· You may optionally track only impediments that have not yet been resolved. To do so, you should set State to be equal to Open. This is already set in the Open Impediments query

· You may optionally set a filter to track only the impediments in the current iteration. To do this, set Iteration Path to be equal to @CurrentIteration. This is already set in the existing query, but you will want to modify the value from its current setting which defaults to ‘Your-Project\Sprint 1’

· You may optionally set a filter to track only unassigned impediments. To do so, set “Assigned To” to be empty (the operator should be set to ‘=’ and the value should be empty)

[Screenshot: the impediments query editor with the clauses above]

Step 3 – Visualizing Impediments in VSTS

Having a query that shows the list of open impediments is important, but the information must not only exist and be available – it also needs to be accessible. In Scrum parlance, we call this Information Radiation. In this step, we will make sure that this information is kept in front of the Scrum Master.

The Dashboard

Depending on how cluttered your Overview (the default) dashboard is, you may want to create a separate dashboard, just for the Scrum Master. Doing so is extremely easy. Just go to the Dashboards section of the project, and click on the + New button on the far right. Give the dashboard a name, e.g. ‘Scrum Master’, and then click on the OK button:

[Screenshot: creating a new dashboard named 'Scrum Master']

You will now have an empty dashboard that you may fill with widgets. We will use the space to add widgets that help us track impediments. The 'Add Widget' sidebar will open (you can also click on the edit button, and then the + button at the bottom right, to open it). You may now add widgets.

Adding Widgets

There’s probably no end to the widgets you may add at this point, but I would like to point out the following, as I find them to be of most value:

Query Tile

This widget simply displays the total number of results for a query. Add this widget, and click on the wrench to configure it, as follows:

· Title: Call it ‘Open Impediments’ or something similar. The query name will be the default title

· Default background color: I suggest setting it to blue, or another color that denotes a calm or good state (blue is better than green for accessibility reasons)

· Conditional formatting: Click on the ‘+Add Rule’, and select the red color, with a condition of the number of items being greater than 0

Query Results

This widget simply displays the results of a query as a list. Add this widget, and click on the wrench to configure it, as follows:

· Title: Open Impediments, for example

· Size: You can leave it at the default of 3×2, or whatever size you like, experiment with it

· Query: Choose the ‘Open Impediments’ query

· Display Columns: Choose the columns that you wish to display. I would make sure to have the Title, State, and ‘Assigned To’ columns.

Chart for Work Items

This widget visualizes work items with a chart, such as a pie chart. You could use it to track impediments over time. It is especially useful with a query that is not filtered by state: you can create a pie chart comparing the number of open and closed impediments, and so on. Add this widget and click on the wrench to configure it:

· Title: Open Impediments

· Size: 2×2 is the default, you may change it or leave it, as desired

· Query: Select the open impediments (or a new query for all impediments)

· Chart type: Pie, for example

· Group by: select something to group by, like State, or Iteration Path

· Aggregation: usually Count of Work Items

· Series: Select the color for each group (e.g. red for open, blue for closed)

If you’ve followed my examples, your dashboard may look something like this:

[Screenshot: the finished Scrum Master dashboard with the impediment widgets]

If you can set this up on a monitor that is always on display in a team’s room, this can be a very powerful tool for Scrum Masters.

What other queries and widgets would you suggest for the Scrum Master’s dashboard? Let me know in the comments’ section.

Thanks,
Assaf


How to Set Up Multi-Level Hierarchies in VSTS

Background

Both VSTS (Team Services) and TFS (Team Foundation Server) have a set organizational hierarchy. An individual developer belongs to a team. A project (also called a Team Project, or TP) has multiple teams, and a VSTS account – as well as a TFS collection – contains multiple projects.

While it is possible to run queries in Team Services and TFS that return data from multiple projects – and, in TFS, reports that bring in data from multiple data sources (i.e. multiple collections) – the agile governance tools that VSTS and TFS offer do not aggregate beyond the project level. This means that the hierarchy VSTS offers has three levels: Project, Team, and Developer.

In this post, I will show you how to set up VSTS so that you can create a larger reporting hierarchy, with as many levels as you want.

Teams, Teams, and more Teams!

First, a disclaimer – the following technique, while giving us almost everything we could need out of a multi-level hierarchy, cannot create new types of containers or entities. The highest level is still the project, and individual members still belong to teams.

What we can do is set up teams within teams – or at least create the illusion of having done so.

A team in VSTS has the following attributes – it has members (the individual developers), it has its own set of dashboards, its backlogs and boards, and it is assigned an Area under the project.

The way we specify that a work item is assigned to a certain team is by setting the item's Area Path to be in or under an area assigned to said team.

The trick that we will use to accomplish our goal is to create teams whose areas are under the area of another team. Each "level" will be a different hierarchical level. We will usually assign the product's highest governance (or steering) team to the project root.

For example, if we want our project to have the following hierarchy:

  • Division
  • Group
  • Team

We will create “Division” teams under the project root, “Group” teams under the divisions, and “Team” teams under the groups, as in the following hierarchy:

[Screenshot: the resulting list of teams and their areas]

Again, the teams themselves are "flat" – there is no team hierarchy. The illusion is created by assigning some teams a default area that is a "parent node" of another team's area.

In this example, Alpha Group's area is MyEnterpriseProduct\Blue Division\Alpha Group, and the Apollo team's area is MyEnterpriseProduct\Blue Division\Alpha Group\Apollo, which is beneath it, but neither team has any other attribute that marks one as higher than the other – hence the "illusion".

But Will it Blend?

So we have successfully created a list of teams, some assigned to areas above others. How do we make sure that the illusion is kept when dealing with boards, backlogs and dashboards?

The trick is to set all but the "leaf" teams (the teams lowest in the hierarchy) to include sub areas, i.e. each team owns its own area and those beneath it:

[Screenshot: a team's settings with 'include sub areas' selected]

Setting the teams up like this gives groups a supervisory view of their teams, gives divisions a supervisory view of their groups, and lets the "steering committee" oversee all of the work being done in the project.

This means that the steering committee’s boards, backlogs, and dashboards will track all of the work being done in the project, while Alpha Group will oversee only the work done by its teams. Each of the “leaf” teams will see the work that has been assigned to them:

[Screenshot: the Steering Committee's backlog]

[Screenshot: Alpha Group's backlog]

[Screenshot: Ares Team's backlog]

This filtering is preserved for the Kanban and Scrum boards as well, and each division, group, and team can have their own set of dashboards to highlight whatever they want to see and use to drive their decision making!

Summary

By creating an Area tree that matches the organizational hierarchy, and assigning teams to their proper nodes, VSTS teams can be arranged into a hierarchy as deep as the organization needs it to be!

I hope you find this useful. If there are any questions, please feel free to ping me in the comments!

– Assaf
