Turning Shadow IT Into Sunshine IT

Max van Rest
Published in Picnic Engineering
May 3, 2022


Software developers are not the only ones who build software: many employees in other roles also help their companies move forward by developing systems of their own. The tools for building these systems, from spreadsheets to programming languages, are becoming more user-friendly and more widely available, and people are getting better and better at using them to their advantage. Ideally, it is easy for people to experiment, prototype, test and deploy new ideas and systems quickly. However, a growing number of systems brings a growing number of risks. This becomes especially problematic if there is no transparency with regard to these systems, their risks, and their dependencies. Hidden systems, often referred to as shadow IT, are widely regarded as a threat to a successful business.

At Picnic, we see great value in those systems. We have up to a thousand systems in production scattered over numerous business teams. All of these systems combined take care of a significant part of our operations. In order to remove all the risks that come with hidden systems, we built an environment with all the possibilities that these systems deserve. In this blog post, we are shining a light on this development environment used by analysts, which we proudly call our Edge Systems Environment. And yes . . . we do not like the name shadow IT ;)

In this blog, we will cover three parts:

  • Part I: The risks and origins of hidden IT systems
  • Part II: The Edge Systems Environment
  • Part III: The results

Part I: The risks and origins of hidden IT systems

Let’s first have a look at the challenges of hidden systems for companies, and explore their origin and the reasons why building them can be attractive. This should not only clarify the need for a solution but should also clarify the problems of developers and companies that the solution should address.

In fast-growing companies like Picnic, the number of analysts and engineers increases rapidly, and the number of applications being developed and maintained grows even faster. Picnic’s customer base is also growing rapidly, meaning that applications have an impact on a larger number of customers every day.

By default, any system comes with certain risks in areas such as reliability and vulnerability, and those risks in and of themselves are not necessarily bad. It is the actual impact related to the risk when an incident occurs that is problematic. Different mitigation measures can be taken to lower the impact, but this is only possible when the risks are clearly identified. That is exactly the challenge when it comes to hidden systems: you don’t always know that the risks exist.

Why do hidden systems exist?

It is not without reason that hidden systems are usually present in large numbers in an organization: there are several advantages to creating a system that is not immediately deployed in a transparent environment. It is vital to understand and address these advantages, as they directly reflect the needs of the system's developer. A solution to hidden systems that does not address those needs will be an unattractive prospect to the developer.

At Picnic, we see that hidden systems exist for the following reasons:

  1. Rapid experimentation: Especially in the first stages of a product (or system), you would ideally keep the development and deployment effort as low as possible, so that you can test and learn as quickly as possible. Spreadsheets or scripts are great tools for quickly developing a prototype.
  2. Limited software expertise required: Colleagues might not have the expertise to develop high-quality software, but could still solve their problems using spreadsheets and simple scripts. In this way, people with less software expertise are able to contribute valuable systems, which benefits the company as well.
  3. Unawareness of the risks of hidden systems: This goes more or less hand in hand with the previous point. Developers who don’t have a high level of software expertise are often unaware of the risks that come with (low quality) software systems. Therefore, the urge to maintain a decent level of quality and transparency is not always felt.
  4. Perceived lower maintenance: Last but not least, it takes more time and effort to deploy and maintain a system in a transparent and safe environment. This is another reason why hidden systems could become the preferable option for a developer.

Measures can be taken to avoid system failures. The negative impact of a failure depends on the users and other systems that rely on the failing system. The more users or other systems depend on it, the bigger the impact of a failure, and the more measures you want to take to avoid failures and to keep your systems transparent.

Since it costs time and effort to ensure systems are not hidden, you also want to avoid spending too much time and effort on systems with smaller risks (usually with fewer users or dependencies on other systems).

Companies often divide their systems into two categories:

  • Systems with low impact and low maintenance costs (hidden systems are often these types of systems)
  • Systems with high impact and high maintenance costs (usually dedicated product teams)

However, there is a whole range between those two extremes. There are a lot of scripts and spreadsheets that are fundamental for certain operational processes, but putting a whole software team around them would be overkill.
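To make this range concrete, here is a minimal Python sketch of how a system might be placed on it based on its users and dependencies. It is purely illustrative: the `SystemProfile` fields, the weighting, the thresholds and the tier names are all assumptions made up for this example, not criteria used at Picnic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    """Hypothetical description of one system; the field names are illustrative."""
    name: str
    users: int              # people who rely on the system
    dependent_systems: int  # other systems that consume its output


def suggested_tier(profile: SystemProfile) -> str:
    """Map the impact of a failure to a level of investment.

    The weighting and thresholds are arbitrary placeholders.
    """
    # Impact grows with both the number of users and downstream dependencies.
    impact = profile.users + 10 * profile.dependent_systems
    if impact < 10:
        return "prototype"       # low impact: a spreadsheet or script is fine
    if impact < 200:
        return "managed"         # middle of the range: transparent, light process
    return "dedicated team"      # high impact: full product team


if __name__ == "__main__":
    for profile in (
        SystemProfile("one-off analysis", users=2, dependent_systems=0),
        SystemProfile("daily stock report", users=30, dependent_systems=3),
        SystemProfile("order routing", users=500, dependent_systems=20),
    ):
        print(profile.name, "->", suggested_tier(profile))
```

The middle tier is the interesting one: it covers systems that matter too much to stay hidden but too little to justify a dedicated team.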

Usually, applications evolve over time, and most often they grow in functionality, users, stakeholders and dependencies on other systems. At a certain point, the risks of a hidden system become so significant that shining a light on it becomes worthwhile.

The other way around could occur as well, though not very often: a system with a high quality and transparency level shrinks in users and dependencies and therefore risks. At a certain point it does not pay off anymore to keep the same quality level, and it could be turned into or replaced by a low maintenance system, possibly in a hidden environment.

Part II: The Edge Systems Environment

Our mission was clear: we needed to come up with a solution that would reduce the risks of hidden systems, but would not limit the developer's ability to experiment and prototype quickly. We wanted to create a solution that would become the preferred option to develop systems for those Picnic employees who would normally develop hidden systems.

We noticed the biggest need for such a solution coming from business analysts, who make many systems that are crucial for our operations but don’t have the skills and experience to build these systems in a reliable software environment. Many of these systems end up in hidden environments such as local drives. Since we could have a large impact on analysts and their systems, we decided to focus on this group first.

Objectives

So far, we have seen several challenges of hidden systems and several reasons why people create hidden systems. To come up with a successful solution, we set the following objectives that target both the challenges as well as the user issues and wishes:

  1. It should solve the issues for the company, for example by:

  • Reducing risk for the company and its customers
  • Ensuring that the company can continue growing fast even though users have limited software expertise

  2. It should solve issues for the developer, such as:

  • Not having the right knowledge to develop a transparent system
  • Not having the time to develop and/or maintain a transparent system

  3. For developers, it should be the preferred alternative to other (hidden) solutions, by:

  • Creating awareness of the risks of hidden systems
  • Offering more benefits than the tools and platforms used for hidden systems

Objective 2 ensures that developers want to use the solution. Objective 3 ensures that developers are going to use the solution. Finally, objectives 2 and 3 together enable objective 1, which makes the solution valuable to the company.

The Edge Systems Environment

Our Edge Systems Environment (ESE) comprises 3 components that we believe are critical to its success:

  • The onboarding
  • The technical framework
  • The community

The onboarding

We want our analysts to get off to a flying start, so we make onboarding into the Edge Systems Environment a breeze. In the first few weeks, an experienced senior Edge System developer (we call them Edge Explorers) gives an in-person presentation to each new analyst, shows some examples of successful Edge Systems, introduces them to the technical framework, and helps scope potential projects that the analyst would like to create in the ESE. This way, we ensure that newcomers have a dedicated contact, know which plug-and-play modules already exist for common challenges, and can link the new possibilities of the ESE directly to a relevant use case.

The analysts are then given an internal handbook, which serves as a reference for the information presented so far and also introduces them to a tutorial in which they will make their first Edge System. The tutorial covers a range of topics, from implementing an ETL cron job to having a reproducible environment and a structured review process. The review process, for instance, introduces the analysts to the concept of pull requests and how to structure these effectively.

The technical framework

The ESE offers a technical framework to analysts to reduce the effort of building and maintaining their Edge System, but also to ensure that the minimum quality standards are met. The goal is to have analysts focus as much as possible on the business logic of their system only. The framework consists of a standard set of DevOps tools and Python libraries, each with a specific purpose, such as facilitating interaction with Picnic’s core systems like Picnic’s Data Warehouse, facilitating CI/CD, and improving code quality with linters and formatters.

All Edge Systems are easily set up using Picnic-specific cookiecutter templates. We noticed that our analysts often have a similar use case: a scheduled (repetitive) job to gather, process, and analyze data and load the insights into a Slack channel or a spreadsheet. Therefore, the first template that we offer to analysts is a Cronjob template. Currently, we’re working on offering a Web Service template as well.
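As a rough illustration of the use case behind the Cronjob template, the sketch below shows the gather, process, analyze and load steps as plain Python functions. It is not the actual template: the function names, the sample data and the `publish` callback (standing in for a Slack channel or spreadsheet) are assumptions made up for this example, and it uses only the standard library so that it runs anywhere.

```python
import csv
import io
from collections import defaultdict
from typing import Callable, Iterable

Row = dict[str, str]

# Made-up stand-in for a query result; a real job would read from a database.
SAMPLE_CSV = """hub,late_deliveries
AMS,4
UTR,0
AMS,3
"""


def extract(source: str = SAMPLE_CSV) -> list[Row]:
    """Gather: read raw rows from the (hypothetical) source."""
    return list(csv.DictReader(io.StringIO(source)))


def transform(rows: Iterable[Row]) -> dict[str, int]:
    """Process: total the late deliveries per hub."""
    totals: dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row["hub"]] += int(row["late_deliveries"])
    return dict(totals)


def format_message(totals: dict[str, int]) -> str:
    """Analyze: turn the totals into a short, human-readable summary."""
    lines = [f"{hub}: {count} late" for hub, count in sorted(totals.items())]
    return "Late deliveries per hub\n" + "\n".join(lines)


def run_job(publish: Callable[[str], None] = print) -> None:
    """Load: hand the summary to a sink such as a chat channel or spreadsheet."""
    publish(format_message(transform(extract())))


if __name__ == "__main__":
    run_job()  # an external scheduler would invoke this on a fixed interval
```

Passing the sink in as a callback keeps the business logic testable without touching any external system, which is one way to let an analyst focus on the logic itself.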

The result of this framework is that the analyst can focus on the business logic without spending much time on everything else that is needed for a reliable system.

Good to know: If you’re interested in the technical details of the Edge Systems Environment, we are planning another, in-depth article about the technical framework. Stay tuned!

The community

Besides providing support on the individual level, we set up a network of Edge Explorers and Edge Analysts and are building an active community around the ESE across Slack and GitHub.

In each team of Edge Analysts there is one Edge Explorer, who functions as a coordinator and first point of contact and support for Edge Analysts. Edge Explorers help analysts during onboarding, review their code, and coordinate the ownership and priorities of the Edge Systems in the team. Edge Explorers play an important role in keeping the ESE scalable.

In Slack, new Edge Systems, common challenges, and operationally critical support requests are openly discussed. On GitHub, teammates and Edge Explorers review each other’s code before adding new features or altering existing code. Reviewing each other’s code on GitHub enables a basic level of governance, since more than a single person is aware of, has access to, and can contribute to the latest state of any ESE project.

This self-supporting community enabled us to scale the ESE up to over 50 analysts and 45 Edge Systems in less than 2 years, supported by only 3 engineers!

Figure: The Edge Systems network, consisting of Edge Analysts, Edge Explorers and the Python Platform team

Quality assurance

Creating functional, performant and maintainable code is important when building highly impactful systems. As mentioned earlier, all Edge Systems code is hosted on GitHub, which enables various (automatic) quality assurance processes. First, we maintain and improve code standards by using a collection of linters and formatters, all wrapped up in an easy-to-use, internally developed command line tool, which we call picnic-guide. This tool automatically runs on each pull request and prevents newly detected code mistakes from making it into an Edge System. The most common warnings and errors are proactively documented, and the ESE community is always there to help with more complicated situations. We presented the tool at a Picnic Meetup and the response was so positive that we will investigate the possibility of open sourcing this tool in the future.

Second, the Renovate bot automatically suggests dependency updates. We opted to still require human confirmation at this step, since updated packages can introduce incompatibilities with the existing code, or can introduce new bugs.
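The general idea of wrapping several checks in one command that gates a pull request can be sketched as follows. This is not how picnic-guide is implemented, which the article does not describe; `CHECKS`, `run_checks` and `main` are hypothetical names, and the two placeholder checks are Python one-liners so that the sketch runs without any linter installed.

```python
import subprocess
import sys
from typing import Sequence

# Placeholder checks: each is a Python one-liner with a fixed exit code.
# In practice these would be the command lines of real linters and formatters.
CHECKS: dict[str, Sequence[str]] = {
    "always-passes": [sys.executable, "-c", "raise SystemExit(0)"],
    "always-fails": [sys.executable, "-c", "raise SystemExit(1)"],
}


def run_checks(checks: dict[str, Sequence[str]]) -> list[str]:
    """Run every check and return the names of those that failed."""
    failed = []
    for name, command in checks.items():
        # Run all checks rather than stopping at the first failure,
        # so the author sees every problem in one pass.
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


def main(checks: dict[str, Sequence[str]] = CHECKS) -> int:
    """Report failures and return an exit code for the CI step."""
    failures = run_checks(checks)
    for name in failures:
        print(f"FAILED: {name}")
    # A non-zero exit code is what lets a CI step block the pull request.
    return 1 if failures else 0
```

A CI step would end with `sys.exit(main())`, so that any failing check marks the pull request as failing.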

In the final quality step, each PR undergoes a light review by a software engineer, who uses a standard checklist to assess whether the ESE framework is used correctly, whether the new approaches are maintainable, and whether additional access to internal or external systems is required.

Part III: The results

In the coming weeks, the Edge Systems Environment will have been operational for two years, which is a great reason to present some of the benefits and results we have observed so far. This part will give an overview of the achievements and the challenge that we are still facing.

Achievements

This section lists the achievements, linked to each of the objectives stated in Part II.

Objective 1

It should solve the issues for the company, for example by:

  • Reducing risk for the company and its customers
  • Ensuring that the company can continue growing fast even though users have limited software expertise

Results

Over 45 critical hidden systems have been changed into reliable and well-maintained Edge Systems. This number keeps growing quickly due to:

  • A growing community of Edge Analysts
  • A growing list of critical hidden systems that the owner wants to turn into an Edge System
  • Growing awareness of the risks of hidden systems
  • Growing reach of the ESE and its benefits

Over time, the community is not only growing in number but also in the software skill level of Edge Analysts, enabling them to build more and safer Edge Systems. Almost 50 analysts have been onboarded as Edge Analysts, which is almost half the number of analysts we currently have in our business teams.

Apart from that, several systems that originated in tech teams were turned into Edge Systems enabling analysts to maintain and improve them, while freeing up software engineers from the tech teams.

Objective 2

It should solve issues for the developer, such as:

  • Not having the right knowledge to develop a transparent system
  • Not having the time to develop and/or maintain a transparent system

Results

Again, almost 50 colleagues completed the Edge System tutorial and are now able to turn hidden systems into systems that satisfy all the quality requirements of a critical system. At the start of the tutorial, most of them only had basic Python experience. On average the tutorial takes them around 3 days to complete, usually spread out over a month or two. The existing Edge Systems are owned and maintained by the analysts, with minimal support from the Edge Explorers and the Python Platform team.

Objective 3

For developers, it should be the preferred alternative to hidden systems, by:

  • Creating awareness of the risks of hidden systems
  • Offering more benefits than the tools and platforms used for hidden systems

Results

As more and more colleagues build, use, or become aware of the ESE, the overall awareness of the risks of hidden systems is significantly increasing. We notice this from the increasing interest in the ESE from analysts and the type of questions they have about the ESE (‘How?’ instead of ‘Why?’, and even ‘Why not for this too?’ and ‘Why don’t we do this for all the systems in our team?’).

Even though it takes a considerable amount of time to go through onboarding and development, analysts and teams are aware of the benefits of Edge Systems which makes the ESE attractive to them. Benefits include:

  • It is by far the quickest way for analysts to set up a reliable system
  • It reduces risk for the company
  • It enables better maintainability
  • It allows easy integration with other Picnic systems
  • It is a rewarding and effective way for an employee to improve their software skills

Ongoing challenges

We currently face one main challenge: the growing need for support from the Python Platform team. The team consists of 3 engineers, and it feels like a great achievement to be able to support and continuously improve over 45 critical systems. However, the required support is increasing rapidly due to the growing number of Edge Systems, Edge Analysts, deployment markets and required resources. We are addressing this by lowering the support time required per Edge System and per Edge Analyst, through:

  • Automation of resource provisioning (currently a large part of the required support)
  • Better and clearer onboarding documentation and processes
  • Python courses and Python Guild
  • More support responsibilities for Edge Explorers, who can play an important ‘first line of support’ role for analysts

Apart from this, we introduced a specified timeframe for the support we deliver. Together with the Edge Analysts and Explorers, we prioritize which Edge activities the Python Platform team is going to support each quarter. We ask Edge Analysts and Explorers to give an indication of the added value of their planned work to the strategic goals of Picnic. Together with the estimation of the required support from our side, we prioritize the different projects. Thanks to this process we immediately have a clear insight into the value that the Edge Systems have to Picnic.
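How the value indication and the support estimate are weighed against each other is a judgment call. One simple possibility, shown here only as an assumption-laden Python sketch, is to rank proposals by value per day of support and fill the quarter until the team's capacity is used up. The `Proposal` fields, the scoring scale and the greedy selection are all made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """Hypothetical quarterly request; names and units are illustrative."""
    name: str
    value: int         # added value to strategic goals, as scored by the requester
    support_days: int  # platform team's estimate of support needed (must be > 0)


def plan_quarter(proposals: list[Proposal], capacity_days: int) -> list[str]:
    """Pick proposals by value per support day until capacity runs out."""
    ranked = sorted(proposals, key=lambda p: p.value / p.support_days, reverse=True)
    selected: list[str] = []
    remaining = capacity_days
    for proposal in ranked:
        # Skip anything that no longer fits; smaller items may still fit later.
        if proposal.support_days <= remaining:
            selected.append(proposal.name)
            remaining -= proposal.support_days
    return selected


if __name__ == "__main__":
    requests = [
        Proposal("A", value=8, support_days=4),
        Proposal("B", value=9, support_days=9),
        Proposal("C", value=3, support_days=1),
    ]
    print(plan_quarter(requests, capacity_days=10))
```

A greedy ranking like this is easy to explain to requesters, at the cost of not always finding the best possible combination of projects.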

A final note to the reader

Hopefully, this article gave you clear insights into how we, at Picnic, turn shadow IT into sunshine IT with our Edge Systems Environment. Feel free to reach out if you have any questions, tips or suggestions. We are very curious to hear stories and approaches from other teams and companies on overcoming hidden IT systems. Of course, we would appreciate it a lot if you’d like to think along with us and help make the Edge Systems Environment an even bigger success!

Next, we are planning to write another article that will go deeper into the technical aspects of the Edge Systems Environment. Be sure to keep an eye out for it if you’re interested!

Would you like to join the Python Platform team and help bring the Edge Systems Environment to a higher level? Take a look at this role!

Interested in joining one of the other amazing teams at Picnic? Check out our open positions and apply today!
