Thursday, July 03, 2014

Uncovering Inherent Structures in Organizations

Vladimir Levenshtein
This post should have a subtitle: "Using Team Analysis and Levenshtein Distance to Reveal said Structure." It's the first part of that subtitle that is the secret, though: being able to correctly analyze and classify individual teams. Without that, using something clever like Levenshtein distance isn't going to be very much help.

But that's coming in towards the end of the story. Let's start at the beginning.

What You're Going to See

This post is a bit long. Here are the sections I've divided it into:

  • What You're Going to See
  • Premise
  • Introducing ACME
  • Categorizing Teams
  • Category Example
  • Calculating the Levenshtein Distance of Teams
  • Sorting and Interpretation
  • Conclusion

However, you don't need to read the whole thing to obtain the main benefits. You can get the Cliff Notes version by reading the Premise, Categorizing Teams, Interpretation, and the Conclusion.

Premise

Companies grow. Teams expand. If you're well-placed in your industry and providing in-demand services or products, this is happening to you. Individuals and small teams tend to deal with this sort of change pretty well. At an organizational level, however, this sort of change tends to have an impact that can bring a group down, or rocket it up to the next level.

Of the many issues faced by growing companies (or rapidly growing orgs within large companies), the structuring one can be most problematic: "Our old structures, though comfortable, won't scale well with all these new teams and all the new hires joining our existing teams. How do we reorganize? Where do we put folks? Are there natural lines along which we can provide better management (and vision!) structure?"

The answer, of course, is "yes" -- but! It requires careful analysis and a deep understanding every team in your org.

The remainder of this post will set up a scenario and then figure out how to do a re-org. I use a software engineering org as an example, but that's just because I have a long and intimate knowledge of them and understand the ways in which one can classify such teams. These same methods could be applied a Sales group, Marketing groups, etc., as long as you know the characteristics that define the teams of which these orgs are comprised.



Introducing ACME

ACME Corporation is the leading producer of some of the most innovative products of the 20th century. The CTO had previously tasked you, the VP of Software Development to bring this product line into the digital age -- and you did! Your great ideas for the updated suite are the new hottness that everyone is clamouring for. Subsequently, the growth of your teams has been fast, and dare we say, exponential.

More details on the scenario: your Software Development Group has several teams of engineers, all working on different products or services, each of which supports ACME Corporation in different ways. In the past 2 years, you've built up your org by an order of magnitude in size. You've started promoting and hiring more managers and directors to help organize these teams into sensible encapsulating structures. These larger groups, once identified, would comprise the whole Development Group.

Ideally, the new groups would represent some aspect of the company, software development, engineering, and product vision -- in other words, some sensible clustering of teams doing related work. How would you group the teams in the most natural way?

Simply dividing along language or platform lines may seem like the obvious answer, but is it the best choice? There are some questions that can help guide you in figuring this out:
  • How do these teams interact with other parts of the company? 
  • Who are the stakeholders in feature development? 
  • Which sorts of customers does each team primarily serve?
There are many more questions you could ask (some are implicit in the analysis data linked below), but this should give a taste.

ACME Software Development has grown the following teams, some of which focus on products, some on infrastructure, some on services, etc.:
  • Digital Anvil Product Team
  • Giant Rubber Band App Team
  • Digital Iron Carrot Team
  • Jet Propelled Unicycle Service Team
  • Jet Propelled Pogo Stick Service Team
  • Ultimatum Dispatcher API Team
  • Virtual Rocket Powered Roller Skates Team
  • Operations (release management, deployments, production maintenance)
  • QA (testing infrastructure, CI/CD)
  • Community Team (documentation, examples, community engagement, meetups, etc.)

Early SW Dev team hacking the ENIAC

Categorizing Teams

Each of those teams started with 2-4 devs hacking on small skunkworks projects. They've now blossomed to the extent that each team has significant sub-teams working on new features and prototyping for the product they support. These large teams now need to be characterized using a method that will allow them to be easily compared. We need the ability to see how closely related one team is to another, across many different variables. (In the scheme outlined below, we end up examining 50 bits of information for each team.)

Keep in mind that each category should be chosen such that it would make sense for teams categorized similarly to be grouped together. A counter example might be "Team Size"; you don't necessarily want all large teams together in one group, and all small teams in a different group. As such, "Team Size" is probably a poor category choice.

Here are the categories which we will use for the ACME Software Development Group:
  • Language
  • Syntax
  • Platform
  • Implementation Focus
  • Supported OS
  • Deployment Type
  • Product?
  • Service?
  • License Type
  • Industry Segment
  • Stakeholders
  • Customer Type
  • Corporate Priority
Each category may be either single-valued or multi-valued. For instance, the categories ending in question marks will be booleans. In contrast, multiple languages might be used by the same team, so the "Language" category will sometimes have several entries.

Category Example

(Things are going to get a bit more technical at this point; for those who care more about the outcomes than the methods used, feel free to skip to the section at the end: Sorting and Interpretation.)

In all cases, we will encode these values as binary digits -- this allows us to very easily compare teams using Levenshtein distance, since the total of all characteristics we are filtering on can be represented as a string value. An example should illustrate this well.

(The Levenshtein distance between two words is the minimum number of single-character edits -- such as insertions, deletions or substitutions -- required to change one word into the other. It is named after Vladimir Levenshtein, who defined this "distance" in 1965 when exploring the possibility of correcting deletions, insertions, and reversals in binary codes.)

Let's say the Software Development Group supports the following languages, with each one assigned a binary value:
  • LFE - #b0000000001
  • Erlang - #b0000000010
  • Elixir - #b0000000100
  • Ruby - #b0000001000
  • Python - #b0000010000
  • Hy - #b0000100000
  • Clojure - #b0001000000
  • Java - #b0010000000
  • JavaScript - #b0100000000
  • CoffeeScript - #b1000000000
A team that used LFE, Hy, and Clojure would obtain its "Language" category value by XOR'ing the three supported languages, and would thus be #b0001100001. In LFE, that could be done by entering the following code the REPL:


We could then compare this to a team that used just Hy and Clojure (#b0001100000), which has a Levenshtein distance of 1 with the previous language category value. A team that used Ruby and Elixir (#b0000001100) would have a Levenshtein distance of 5 with the LFE/Hy/Clojure team (which makes sense: a total of 5 languages between the two teams with no languages shared in common). 


Calculating the Levenshtein Distance of Teams

As a VP who is keen on deeply understanding your teams, you have put together a spreadsheet with a break-down of not only languages used in each team, but lots of other categories, too. For easy reference, you've put a "legend" for the individual category binary values is at the bottom of the linked spreadsheet.

In the third table on that sheet, all of the values for each column are combined into a single binary string. This (or a slight modification of this) is what will be the input to your calculations. Needless to say, as a complete fan of LFE, you will be writing some Lisp code :-)

Partial view of the spreadsheet's first page.
(If you would like to try the code out yourself while reading, and you have lfetool installed, simply create a new project and start up the REPL: $ lfetool new library ld; cd ld && make-shell
That will download and compile the dependencies for you. In particular, you will then have access to the lfe-utils project -- which contains the Levenshtein distance functions we'll be using. You should be able to copy-and-paste functions, vars, etc., into the REPL from the Github gists.)

Let's create a couple of data structures that will allow us to more easily work with the data you collected about your teams in the spreadsheet:


We can use a quick copy and paste into the LFE REPL for two of those numbers to do a sanity check on the distance between the Community Team and the Digital Iron Carrot Team:


That result doesn't seem unreasonable, given that at a quick glance we can see both of these strings have many differences in their respective character positions.

It looks like we're on solid ground, then, so let's define some utility functions to more easily work with our data structures:


Now we're ready to roll; let's try sorting the data based on a comparison with a one of the teams:


It may not be obvious at first glance, but what the levenshtein-sort function did for us is compare our "control" string to every other string in our data set, providing both the distance and the string that the control was compared to. The first entry in the results is the our control string, and we see what we would expect: the Levenshtein distance with itself is 0 :-)

The result above is not very easily read by most humans ... so let's define a custom sorter that will take human-readable text and then output the same, after doing a sort on the binary strings:


(If any of that doesn't make sense, please stop in and say "hello" on the LFE mail list -- ask us your questions! We're a friendly group that loves to chat about LFE and how to translate from Erlang, Common Lisp, or Clojure to LFE :-) )



Sorting and Interpretation

Before we try out our new function, we should ponder which team will be compared to all the others -- the sort results will change based on this choice. Looking at the spreadsheet, we see that the "Digital Iron Carrot Team" (DICT) has some interesting properties that make it a compelling choice:
  • it has stakeholders in Sales, Engineering, and Senior Leadership;
  • it has a "Corporate Priority" of "Business critical"; and
  • it has both internal and external customers.
Of all the products and services, it seems to be the biggest star. Let's try a sort now, using our new custom function -- inputting something that's human-readable: 


Here we're making the request "Show me the sorted results of each team's binary string compared to the binary string of the DICT." Here are the human-readable results:


For a better visual on this, take a look at the second tab of the shared spreadsheet. The results have been applied to the collected data there, and then colored by major groupings. The first group shares these things in common:
  • Lisp- and Python-heavy
  • Middleware running on BSD boxen
  • Mostly proprietary
  • Externally facing
  • Focus on apps and APIs
It would make sense to group these three together.

A sort (and thus grouping) by comparison to critical team.
Next on the list is Operations and QA -- often a natural pairing, and this process bears out such conventional wisdom. These two are good candidates for a second group.

Things get a little trickier at the end of the list. Depending upon the number of developers in the Java-heavy Giant Rubber Band App Team, they might make up their own group. However, both that one and the next team on the list have frontend components written in Angular.js. They both are used internally and have Engineering as a stakeholder in common, so let's go ahead and group them.

The next two are cloud-deployed Finance APIs running on the Erlang VM. These make a very natural pairing.

Which leaves us with the oddball: the Community Team. The Levenshtein distance for this team is the greatest for all the teams ... but don't be mislead. Because it has something in common with all teams (the Community Team supports every product with docs, example code, Sales and TAM support, evangelism for open source projects, etc.), it will have many differing bits with each team. This really should be in a group all its own so that structure represents reality: all teams depend upon the Community Team. A good case could also probably be made for having the manager of this team report directly up to you. 

The other groups should probably have directors that the team managers report to (keeping in mind that the teams have grown to anywhere from 20 to 40 per team). The director will be able to guide these teams according to your vision for the Software Group and the shared traits/common vision you have uncovered in the course of this analysis.

Let's go back to the Community Team. Perhaps in working with them, you have uncovered a hidden fact: the community interactions your devs have are seriously driving market adoption through some impressive and passionate service and open source docs+evangelism. You are curious how your teams might be grouped if sorted from the perspective of the Community Team.

Let's find out!


As one might expect, most of the teams remain grouped in the same way ... the notable exception being the split-up of the Anvil and Rubber Band teams. Mostly no surprises, though -- the same groupings persist in this model.

A sort (and thus grouping) by comparison to highly-connected team.
To be fair, if this is something you'd want to fully explore, you should bump the "Corporate Priority" for the Community Team much higher, recalculate it's overall bits, regenerate your data structures, and then resort. It may not change too much in this case, but you'd be applying consistent methods, and that's definitely the right thing to do :-) You might even see the Anvil and Rubber Band teams get back together (left as an exercise for the reader).

As a last example, let's throw caution and good sense to the wind and get crazy. You know, like the many times you've seen bizarre, anti-intuitive re-orgs done: let's do a sort that compares a team of middling importance and a relatively low corporate impact with the rest of the teams. What do we see then?


This ruins everything. Well, almost everything: the only group that doesn't get split up is the middleware product line (Jet Propelled and Iron Carrot). Everything else suffers from a bad re-org.

A sort (and thus grouping) by comparison to a non-critical team.

If you were to do this because a genuine change in priority had occurred, where the Giant Rubber Band App Team was now the corporate leader/darling, then you'd need to recompute the bit values and do re-sorts. Failing that, you'd just be falling into a trap that has beguiled many before you.


Conclusion

If there's one thing that this exercise should show you, it's this: applying tools and analyses from one field to fresh data in another -- completely unrelated -- field can provide pretty amazing results that turn mystery and guesswork into science and planning.

If we can get two things from this, the other might be: knowing the parts of the system may not necessarily reveal the whole (c.f. Complex Systems), but it may provide you with the data that lets you better predict emergent behaviours and identify patterns and structure where you didn't see them (or even think to look!) before.


No comments: