Decentralized IT, real life experiences

In an average working week I meet (or rather Zoom, Teams, …) with 10 to 12 different customer ops teams, ranging from large enterprises to tiny companies. Here is something I see happening more and more across the whole spectrum of companies that develop their own software, regardless of their size and industry. This entry of the 9:30 is about automation, culture, organization, process and practices of decentralized IT organizations.

Every day I hear people talking about microservices, micro frontends, serverless architectures, modern apps … and, at the end of the day, I have the strong feeling that they are really talking about independence. All of them are on the journey from a centralized IT organization to a decentralized IT organization.

Centralized IT

In a centralized IT organization, infrastructure and operations are managed by a team that takes care of provisioning, configuration, backup, security, monitoring, capacity management … When developers (i.e. line-of-business teams) need new resources or changes, they have to open a ticket (in some bad cases they need to write an email) and wait for someone to fulfill their request. Most of the time this requires a lot of back and forth on details, in an attempt to accommodate two different perspectives and goals: the central infrastructure team tries to comply with internal standards, procedures and technologies, while developers try to obtain just what they need as quickly as possible. We all know this story, and we all know this model proved to be inadequate and led to Shadow IT. Do you remember? 8 to 10 years ago we were all talking about Shadow IT. In 100% of the cases I was exposed to, Shadow IT was in the public cloud, and organizations came up with some compromise that allowed Shadow IT to emerge from obscurity through a public cloud adoption initiative that brought procedures, practices and visibility for operations, security, cost control, etc.

This is the organizational context where microservices, micro frontends, serverless, etc. are being applied with different levels of maturity. If we look at these architectures from an organizational point of view, they definitely allow more scalable organizations with decoupled, autonomous dev teams, so that many teams can work simultaneously on a large and complex system. This relies a lot on APIs; in this context I like to think of an API as a contract among teams. My understanding is that developers like independence and they like APIs, so when they turn their attention to infrastructure they would like to deal with it as they deal with other teams, that is, through APIs. There is no room for tickets. In this new context centralized IT cannot keep up.

Decentralized IT

In a decentralized IT organization, teams (Devs, App Owners, SREs …) directly access the configuration of infrastructure resources (through APIs, CLIs, IaC and sometimes even a UI) in order to get exactly what they want as fast as they can. You can call it self-service IT infrastructure. So what happens to central IT people? What happens to standards? Who is responsible for the infrastructure in this context? Here is where different organizations do different things. Below I’ll briefly touch on the different implementations I have directly experienced, and I’ll try to summarize the pros and cons as they were presented to me.

You Build It, You Own It

In my personal experience, I know a few customers that let IT decentralization happen in the wild by extending the motto “You build it, you own it” to the infrastructure. In this case every team has its own public cloud of choice, IaC, configuration management, monitoring, log management, security, etc. Dev teams beef up, while the central IT team shrinks as a consequence of its narrowed scope (it does not disappear, as there is always some legacy infra to maintain): some of its people moved into dev teams and some simply left the company. From the voices of those directly experiencing this approach, the pros are: freedom of choice, agility and the possibility to experiment. The cons are: duplicated effort, reduced economies of scale, and huge frustration in teams with a company-wide scope. I also heard some funny/scary things like: “we forgot backup!” and “the person in charge of security left 2 weeks ago, so now we are doing shift left security”. One thing that struck me is that nobody mentions speed; I would expect a lot of speed in this context. I asked a young guy who quickly moved up the ranks in one of these orgs about this, and he replied that speed is great but dangerous: he prefers agility, which in his view is the ability to recover very quickly from the mistakes he makes.

Teams Innovate, Central IT Operates

A couple of very large customers told me they believe central IT and decentralized IT models need to coexist: they see an organizational model where innovation happens in a decentralized form while mainstream/traditional/legacy technology is managed by central IT. A few relatively small teams are responsible for introducing new platform technologies into the organization (e.g. on-premises Kubernetes, on-premises Cloud Foundry, public cloud), taking care of the needs of development, operations and security. Once a platform is mature enough, it is transitioned to central IT. These companies have experience with enterprise management software vendors, with all the bad stories that come with them: product end of life, clunky integrations, little evolution, difficult upgrades, etc. In their view, in the 2020s no product or vendor on the market provides the capabilities they need, so they decided to build their own tools. Basically they rely on open source projects (Terraform, Ansible, Prometheus, ELK, Argo CD) that they stitch together with some proprietary tools by means of custom code they build and maintain. They are trying to build a platform where ultimately the central IT team and Dev/SRE teams can coexist with different roles and responsibilities. These are quite ambitious initiatives, but I think the idea is not new. At the time of writing, they are in the process of building their own custom tools while pivoting to a decentralized IT model, so there are a lot of moving parts and I do not have real pros and cons from them. I can feel a lot of excitement for the new thing and the possibility to experiment, but they are suffering from a painful coexistence of legacy tools with the new thing they are building, and in other parts of the organization someone is already arguing with some of the choices and replacing some pieces with something else.

Self-Service with Guardrails

Some other companies are trying to make central IT management coexist with decentralized consumption. The idea is that the central IT team is responsible for making cloud resources (public, hybrid and private) available to dev teams as a set of homogeneous APIs, so that they can help themselves to what they need. Behind the scenes the central team enforces policies (aka guardrails) related to resource placement, security, cost control, tagging, monitoring, log collection, audit, etc.
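
To make the guardrail idea a bit more concrete, here is a minimal Python sketch of what such a self-service API could do behind the scenes. It is only an illustration under my own assumptions: the names (ResourceRequest, check_guardrails, provision), the allowed regions, the required tags and the cost limit are all hypothetical and do not come from any specific customer or product.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical policy values; a real central IT team would define its own.
ALLOWED_REGIONS = {"eu-west-1", "eu-central-1"}   # resource placement
REQUIRED_TAGS = {"owner", "cost-center"}          # tagging
MAX_MONTHLY_COST = 500.0                          # cost control


@dataclass
class ResourceRequest:
    """What a dev team sends to the (hypothetical) self-service API."""
    team: str
    kind: str
    region: str
    estimated_monthly_cost: float
    tags: dict = field(default_factory=dict)


def check_guardrails(request: ResourceRequest) -> list[str]:
    """Return the policy violations; an empty list means the request passes."""
    violations = []
    if request.region not in ALLOWED_REGIONS:
        violations.append(f"region {request.region!r} is not allowed")
    missing = REQUIRED_TAGS - set(request.tags)
    if missing:
        violations.append(f"missing tags: {', '.join(sorted(missing))}")
    if request.estimated_monthly_cost > MAX_MONTHLY_COST:
        violations.append("estimated cost exceeds the allowed limit")
    return violations


def provision(request: ResourceRequest) -> dict:
    """Accept or reject a request with no ticket and no human in the loop."""
    violations = check_guardrails(request)
    if violations:
        return {"status": "rejected", "violations": violations}
    # The central team attaches its own defaults on every accepted request.
    return {"status": "accepted", "monitoring": True, "log_collection": True}
```

The point of the sketch is that the dev team gets an immediate answer from an API, while the central team keeps its standards in one place instead of negotiating them ticket by ticket.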

This API layer has another important role, as it decouples dev tooling from the infrastructure. This means that on one side of the API each dev team has freedom of choice for languages, IDEs, repositories, pipeline tools, IaC tools, etc., and is responsible for these tools. On the other side of the API the central IT team has freedom of choice on the cloud infrastructure and technologies. One thing to make very clear is that they are NOT trying to build this API layer as a sort of lingua franca for the APIs of the different clouds and technologies they support. Instead, I have seen companies implementing this approach successfully by allowing teams to apply their existing IaC templates (Terraform, CloudFormation, VMware Cloud Template …) through the API layer provided by the central IT team.
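
A rough way to picture the "no lingua franca" point is an API that accepts a team's template as it is and simply hands it to the matching backend, without translating it into a common model. The Python sketch below is only an assumption of how that could look: TemplateSubmission, submit and the backend functions are hypothetical names, and the backends are stubs that do not call any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateSubmission:
    """A team's existing IaC template, wrapped but not rewritten."""
    team: str
    template_format: str   # e.g. "terraform" or "cloudformation"
    template_body: str     # passed through unchanged


def apply_terraform(body: str) -> dict:
    # Stub: a real backend would hand the template to the Terraform tooling.
    return {"backend": "terraform", "template": body}


def apply_cloudformation(body: str) -> dict:
    # Stub: a real backend would hand the template to CloudFormation.
    return {"backend": "cloudformation", "template": body}


# Hypothetical registry owned by the central IT team.
BACKENDS = {
    "terraform": apply_terraform,
    "cloudformation": apply_cloudformation,
}


def submit(submission: TemplateSubmission) -> dict:
    """Dispatch on the template format; never translate the template itself."""
    backend = BACKENDS.get(submission.template_format)
    if backend is None:
        raise ValueError(f"unsupported format: {submission.template_format!r}")
    return backend(submission.template_body)
```

In this shape the central team can add or replace a backend on its side of the API, and the dev teams keep writing the templates they already know.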

To me this third approach, which I named “Self-Service with Guardrails”, is the most promising, as it allows both parties to get what they need: central IT can guarantee standards and procedures and gain efficiency, while dev teams have all the freedom they need to build their stuff.