How to measure success? Is it even possible?
Just to set the scene, I haven’t figured this out yet. I can tell you my story, in all its ugliness. It’s based on a real story, but it’s changed significantly from reality to protect the innocent. I hope you still find my account useful.
I worked in a context where the optimizing goal was to free people’s capacity to do “better things” and to free up some cash so that the organization wouldn’t have to go to the money markets to invest in innovation. It was non-tech. The “better things” included leaving the planet better than we found it.
I was challenged to define organizational agility that would leave all people involved in no doubt about expectations. Therefore it could not be pithy, as it would be too open to interpretation.
Here is my definition of organizational agility, and it keeps evolving:
The ability to drive disruption in society, the industry & the marketplace, an adaptive way of being/learning/sensemaking, through ↑effectiveness, ↑frequency-of-impact, ↑quality, ↑learning, ↓impediments, ↑flow, ↑efficiency, and ↑sustainability.
It looks like small cognitively-diverse (Syed, 2019) cross-skilled cross-functional teams or teams-of-teams using Agile, Lean, Lean/Agile methods, or the Agile Manifesto principles.
It feels like ↑sincerity, ↑empiricism, ↑caring-about-planet-earth, ↑shared-purpose-that-is-not-all-about-customers-and-money, ↑psychological-safety, ↑engagement (society, employees, customers, partners, and other stakeholders), ↑respect, ↑leaders-embracing-uncertainty, ↑leaders-serving-teams, ↑leaders-as-coaches, ↑excellence, ↑transparency, ↑inspection, ↑adaptation, ↑team-based-commitment, ↑openness, ↑inclusion, ↑courage, ↑passion, ↑focus, ↑energy, and ↑fun.
It is underpinned by the Agile Manifesto at www.agilemanifesto.org.
It’s a helpful way of being in this Volatile, Uncertain, Complex, and Ambiguous and threatened world. It is a required skill set for the 21st century, regardless of specialty or function.
...where ↑ means more and ↓ means less. Feel free to sort the words differently. For my context, it was a bit different from what’s above.
My intent is not to guilt-trip people in organizations that damage society. Follow the kaizen spirit: what is the smallest thing you can change with your product or the way you work that would damage the planet less, or even repair it? Can you #passonplastic or #eatinseason or #flyless or at least #flyintherightdirection perhaps? You get the idea, I hope...
I have another version that hasn’t evolved as much...
The ability to drive disruption in the industry & the marketplace, an adaptive way of being/learning/sensemaking, through ↑effectiveness, ↑frequency-of-impact, ↑quality, ↑learning, ↓impediments, ↑flow, ↑efficiency, and ↑sustainability.
It looks like small cross-skilled cross-functional teams or teams-of-teams using Agile, Lean, Lean/Agile methods, or the Agile Manifesto principles.
It feels like ↑psychological-safety, ↑engagement (employees, customers, partners, and other stakeholders), ↑respect, ↑leaders-embracing-uncertainty, ↑leaders-serving-teams, ↑leaders-as-coaches, ↑excellence, ↑transparency, ↑inspection, ↑adaptation, ↑team-based-commitment, ↑openness, ↑inclusion, ↑courage, ↑passion, ↑focus, ↑energy, and ↑fun.
It is underpinned by the Agile Manifesto at www.agilemanifesto.org.
It’s a helpful way of being in this Volatile, Uncertain, Complex, and Ambiguous world. It is a required skill set for the 21st century, regardless of specialty or function.
Pick your poison:).
Why is it important to know what agility is and what the optimizing goal is? Well, if we’re measuring, maybe it’s useful to know where we’re going, or at least where we’d like to go. Ok, so we knew where we were going.
So what to measure?
The quality of a measure increases as we progress from measuring activity through outputs through outcomes to impact. The difficulty is that many impact measures are lagging; that is, we don’t find out until later whether we made a positive impact as intended. Proxy measures (things we measure as near-substitutes because we can’t measure the real thing) are dangerous. Nevertheless, leading measures are useful, and the closer we get to measuring the impact, the better.
So, I debated with my peers, and I asked what’s the devil in the measure. In other words, how could the measure get mis-used or abused? What are the unintended consequences? In my cynical mind, they are what will happen. I’ll try to turn on my happy face later in this post :).
So look at what we came up with, and we had to shortlist measures from this bunch of measures “to keep things simple”.
So naturally, I argued against activity measurements, and for least bad measures (that didn’t lag so much or maybe gave an early indication of how we were doing). See the shortlist for my context, please don’t blindly copy. Besides, there were issues with these measures which you’ll see later.
What does 85th percentile mean for cycle time? Oversimplifying here as is my style, it means only 15% of cycle times are worse in your history. You can see that I dismissed the measurement of practices as (to my mind) low-life activity measures.
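To make the 85th percentile concrete, here is a minimal sketch in Python. The cycle times are made-up numbers for illustration, not data from my context, and the nearest-rank method used here is only one of several ways to compute a percentile; your tooling may interpolate and give a slightly different answer.

```python
import math

def percentile_nearest_rank(values, pct):
    """Return the smallest value with at least pct% of values at or below it."""
    if not values:
        raise ValueError("need at least one cycle time")
    ordered = sorted(values)
    # Nearest-rank method: rank = ceil(pct/100 * n), counted from 1.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical cycle times in days for 20 finished items (illustration only).
cycle_times_days = [2, 3, 3, 4, 4, 5, 5, 5, 6, 6,
                    7, 7, 8, 9, 10, 12, 14, 15, 21, 30]

p85 = percentile_nearest_rank(cycle_times_days, 85)
slower = sum(1 for days in cycle_times_days if days > p85)

print(f"85th percentile: {p85} days")
print(f"Items slower than that: {slower} of {len(cycle_times_days)}")
```

With these invented numbers, 17 of the 20 items finished in 14 days or fewer, and only 3 (15%) took longer. A useful property of the percentile is that the one item that took 30 days does not drag it around the way it would drag an average.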
But we were still missing a trick. According to Scrum.org’s Evidence Based Management, we only measured current value, time to market, and what I refer to as the inability to innovate. You can see my positivity shining through, I hope:). We were leaving money on the table with un-realized value, which is a pity. That’s where the context was, no appetite for innovation other than innovation for problem-solving...
I did one thing right, though: I did not compare teams. Comparing teams is the road to hell, and it leads to all sorts of gaming. I’m talking about the devil a lot, so you can see I have issues:). I only cared if each initiative or team was getting better or worse, and if worse, I’d ask if help was needed, first showing the data and trends.
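As a rough sketch of what "better or worse against yourself" can look like, the Python below compares each team's latest reading only with that same team's earlier readings. The team names, the numbers, and the choice of the median as the baseline are all assumptions for illustration; it is not the exact calculation I used.

```python
from statistics import median

def own_trend(history, lower_is_better=True):
    """Compare a team's latest value with the median of its own earlier values."""
    if len(history) < 2:
        return "not enough data"
    baseline = median(history[:-1])
    latest = history[-1]
    if latest == baseline:
        return "flat"
    improved = latest < baseline if lower_is_better else latest > baseline
    return "better" if improved else "worse"

# Hypothetical 85th percentile cycle times (days) per period, oldest first.
p85_by_period = {
    "team_a": [20, 18, 17, 15],
    "team_b": [6, 6, 7, 9],
}

# Each team is judged only against its own past, never against another team.
trends = {team: own_trend(values) for team, values in p85_by_period.items()}
print(trends)
```

In this invented example, team_b has much shorter cycle times than team_a in absolute terms, yet team_b is the one trending worse against its own history, so that is the team I would offer help to. A ranking of the two teams would have hidden that.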
I did another thing right: I had a bs detector. No matter what one measures, the measure loses its quality as soon as it gets attention, and we unleash creativity in making the numbers. I checked for bs Product Backlog Items (PBIs), for example. No one was going to lunch and calling that effort a PBI on my watch:). The bs detector was a series of questions to validate/invalidate the numbers we counted. I tried sending out surveys; no one cared. So I iterated through some experiments. I turned up every two months, joining the daily meeting and then taking over the end of the daily meeting to start a new meeting. People would, right in front of me, complete anonymous surveys. I made sure they were anonymous (I couldn’t even tell the IP address). I was only there to answer questions on the questions, to clarify basically.
- Interview teams and capture trends in performance & how it feels to work here (they already had separate Gallup12 surveys, so I had the liberty to zone in on agility for attaining the optimizing goal)
- Interview teams and capture how leaders are showing up
- Star rate the journey towards the optimizing goal, and if less than 5 stars give specific comments to help improve without providing any hints of who is making the comments
And there was a trend. Outside of my sphere of influence, the trends flatlined, therefore no improvement.
I had all sorts of telemetry, and I found four key trends:
- it felt faster than this time last year, but not as fast as last time I checked three months earlier
- people were less convinced quality was getting better than three months earlier
- 70+% of people were convinced executive leaders understood the issues for delivery
- 20+% of people were convinced the leaders were doing anything about the issues for delivery
Getting back to the teams and initiatives, I was like “holy cow, what to do now!” I needed some indication that while trends of “least bad measures” were not improving, something might happen in the future. You’re seeing a sparkle of my search for hope now:).
I did a full 180-degree turn. I decided to measure standard practices across the agility spectrum. The hypothesis was that if people are doing practices commonly associated with success, maybe that’s a leading indication for impact eventually improving. I couldn’t just do that for the “bad teams”, as that would be a comparison of teams through the back door. So I measured every team and initiative on practices. This is what I found, eeek!
All of these measured initiatives & teams had coaches. It looked to me like they were using methods/frameworks that were a bit too far removed from their belief systems.
I used similar measures in several contexts, with similar results. From my other telemetry, I noticed a correlation between a solid team launch (with just-in-time review of options, understanding the direction of travel & the work, learning) and team performance. This ties in with the well-researched work of Richard Hackman in his book Leading Teams (Hackman, 2006).
How were the leaders showing up?
To learn more, look at my scrum hints blog post where I have useful survey links on page 2 of the two-pager.
Well, by understanding how leaders are showing up, I (think I) know what content I should discuss with them in order to not lose the audience. Losing the audience can happen by talking about utopian dreamy agility while (for example, sometimes) the audience doesn’t understand agile or complexity. A series of workshops can get leaders out of this jam. But some won’t make it, and while they stay unimproved from a 21st-century point of view, progress towards the optimizing goal can get impeded.
A wise man told me earlier this year: measure the number of inspired people, even if they leave because those people are so inspired. I scoffed; how could I be expected to tell that story to whoever was paying my invoices? I now believe that the wise man was hitting the bullseye. Silly me! The more I know, the more I realize I don’t know anything.
It ended up being a case study of a round trip to helping people, not how to measure success.
Context is king. I’m curious what you tried. Please write to me. Thank you.
Hackman, J. (2006). Leading teams. Boston, Mass: Harvard Business School Press.