2023 in a nutshell: ride along!

Paulo Geusau
Published in Picnic Engineering
9 min read · Dec 19, 2023

With operations in full swing to pull us through the busiest time of the year, the code slush we apply in some of our teams allows us to take a step back and reflect on another exciting year in the crazy little groceries roller coaster we call Picnic. In this blog post, we’d like to give you a glimpse into some of the major developments in Picnic Tech in 2023. Join us and have a read!

January: Year of OpsEx and DevEx 🧑‍💻

The end of 2022 marked the beginning of our journey in enhancing Developer Effectiveness, a key initiative for 2023. Drawing inspiration from Google’s well-established approach, we focus on three core dimensions: feedback loops, cognitive load, and flow. We aim to streamline the development process, enabling every engineer to maximize their impact. Refining our feedback loops ensures that our developers receive timely, relevant, and constructive feedback, directly impacting their productivity and code quality. We also address cognitive load, simplifying developers’ daily complexity and allowing them to focus more on creative problem-solving. Lastly, by fostering an environment that supports ‘flow,’ we aim to maximize the time developers spend in a highly productive and creatively fulfilling state. One initiative that we rolled out is auto-merge, which has increased our overall development velocity.

In tandem with DevEx, in 2023 we also started our Operational Excellence (OpsEx) initiatives. These initiatives focus on incident handling, observability, and Service Level Objectives (SLOs). Towards the end of the year, the focus shifted towards running stable operations to minimize incident costs. Our OpsEx strategy revolves around developing robust processes for quick and effective incident resolution, minimizing downtime, and maintaining high service reliability. By setting clear, achievable SLOs, we establish a standard of performance that guides our operational efforts and improves alignment and mutual understanding with the business and operational teams. This approach not only helps in maintaining system stability but also in predicting potential issues, enabling proactive measures. Efficient incident handling, resilience by design, and strict adherence to SLOs are pivotal in ensuring our services remain reliable, stable, and user-centric.
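To make the idea of an SLO as a performance standard concrete, here is a minimal sketch of the usual error-budget arithmetic. The 99.9% target, the 30-day window and the function names are assumptions chosen for illustration, not Picnic’s actual objectives or tooling.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of unavailability an availability SLO tolerates in the window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative once the SLO is breached)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


# Hypothetical service with a 99.9% availability target over 30 days.
budget = error_budget_minutes(0.999)        # roughly 43 minutes
remaining = budget_remaining(0.999, 21.6)   # about half of the budget left
```

A number like this gives engineering and business teams a shared way to decide when to prioritise stability work over new features.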

February: Elevating our Continuous Integration Setup! 🛠️

At Picnic, we are constantly looking to refine our development procedures. One bottleneck we identified was our CI setup. Faced with slow build times and long wait queues, we sought a robust new solution that fits our expanding scale and ambitions. After evaluating more than 10 alternatives, we selected TeamCity for its rich feature set, flexibility, and seamless cloud integration.

In February we completed the migration, which unfolded in multiple stages involving hands-on testing, evaluation periods and careful fine-tuning. The result? A setup that is working great for us! Our TeamCity configuration, linked with GitHub via webhooks, provisions AWS agents swiftly, executes build steps reliably, and can be configured flexibly by our engineers. The impact? Build times dropped from 45 to 10 minutes, queue times shortened to a mere 2–3 minutes, and maintenance was reduced to a few days per quarter. The migration was a success and opened new opportunities for improvement, allowing us to continue our mission to find a development process that works optimally for Picnic!

For more information on our migration, see our blog post “Bye Travis CI, hi TeamCity!”.

March: Simulating our Fulfilment Center using Python 🪞

How do you test software for an automated fulfilment centre without halting operations? You simulate it all! Early spring of 2023 marked the release of our Python-based warehouse simulation, as described in our previous blog post “Replicating a multi-million Euro automated warehouse in Python”. Simulating our fulfilment centre allows Warehouse Control System (WCS) developers to rigorously test their strategies before deploying to real-world operations. Although discrete event simulation is great at simulating theoretical machines, a WCS in the real world deals with a lot of unknowns: object tracking is not continuous, there is latency in communications and there are even real people interacting with the system. We coupled a real-time system to a virtual timeline in our discrete event simulation, and learned how to write stable tests. Test scenarios range from happy flows, such as complete picking of items into a tote, to complex error flows, such as a tote getting lost in the system. The result is a robust testing environment, underscoring our commitment to technological innovation and developer effectiveness. We now use simulation-driven automated testing as a stage of our CI/CD pipeline, and it has covered over 900 pull requests since the release!
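For readers unfamiliar with discrete event simulation, the core idea of a virtual timeline can be sketched in a few lines of Python. This is a toy illustration only: the `Simulation` class, the tote scenario and all durations are hypothetical, and it does not show the coupling to a real-time WCS that the paragraph above describes.

```python
import heapq
from typing import Callable, List, Tuple


class Simulation:
    """Minimal discrete event loop: the clock jumps from one event to the next."""

    def __init__(self) -> None:
        self.now = 0.0  # virtual time in seconds
        self._queue: List[Tuple[float, int, Callable[[], None]]] = []
        self._counter = 0  # tie-breaker so simultaneous events keep insertion order

    def schedule(self, delay: float, action: Callable[[], None]) -> None:
        heapq.heappush(self._queue, (self.now + delay, self._counter, action))
        self._counter += 1

    def run(self) -> None:
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action()


def run_happy_flow() -> List[Tuple[float, str]]:
    """Hypothetical happy flow: a tote arrives, one item is picked, the tote leaves."""
    sim = Simulation()
    log: List[Tuple[float, str]] = []

    def tote_arrives() -> None:
        log.append((sim.now, "tote arrives at pick station"))
        sim.schedule(4.0, item_picked)  # assumed pick duration

    def item_picked() -> None:
        log.append((sim.now, "item picked into tote"))
        sim.schedule(0.5, tote_leaves)  # assumed hand-over time

    def tote_leaves() -> None:
        log.append((sim.now, "tote leaves pick station"))

    sim.schedule(10.0, tote_arrives)
    sim.run()
    return log
```

Because the clock only advances when an event is processed, a scenario that spans minutes of warehouse time runs in milliseconds and always produces the same event order, which is what makes such tests practical inside a CI pipeline.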

April: Organizing the first Incident Management Training 🧑‍🎓

As part of our commitment to Operational Excellence, the Tech Academy at Picnic (see “Meet the Trainers of Picnic’s Tech Academy”) launched a comprehensive in-person training called ‘Incident Response Management’, available to everyone within the Tech team. This training aims to give participants a deeper understanding of Picnic’s incident response process. Approximately 75 attendees participated in the training in 2023, gaining valuable insights into efficiently resolving incidents. We cover several important topics, beginning with understanding the incident process at Picnic. Participants then engage in a practical exercise, solving an incident in a controlled environment that simulates real-world scenarios without the associated risks. Communication is critical in incident management, and our training emphasizes best practices, such as using incident roles, to ensure clarity and effectiveness in high-pressure situations. A focus on problem analysis and root cause analysis equips our developers with the skills not only to address the symptoms of incidents but also to identify and mitigate their underlying causes. The training also includes guidance on writing clear, comprehensive incident reports and conducting effective post-mortem meetings. These meetings are crucial for learning from incidents and preventing recurrences.

May: Redesigning our Picnic App Back-end 🌀

In May we introduced a completely new approach to personalising content in the Picnic app. In 2022 we started presenting multiple Picnic app layouts to different customers instead of our previous one-size-fits-all approach, so customers would be shown a more relevant app based on factors like their region and customer profile. These layouts were configured weekly by developers based on input from business teams. To improve flexibility and decrease the workload for developers, we launched a complete Store back-end redesign in May, called the Picnic Page Platform (PPP). With PPP we can easily place and configure templates to enable a targeted, customised app layout for customers. Behind the scenes, an integration with Apache Calcite converts business teams’ SQL queries into data used in templates, considerably speeding up the development of novel app features. Calcite also plays an important role in enabling access to information that we couldn’t query before because of the nature of the data or the data source, such as warehouse stock levels or customers’ basket contents. This allows business teams to personalise Picnic’s store as far as the imagination reaches!
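The pattern of feeding SQL query results into templates can be illustrated with a small sketch. Note the assumptions: Python’s built-in `sqlite3` stands in for Apache Calcite (which is a Java framework), and the `stock` table, the query and the `render_tiles` helper are invented for this example rather than taken from PPP.

```python
import sqlite3
from string import Template
from typing import List


def render_tiles(conn: sqlite3.Connection, query: str, template: Template) -> List[str]:
    """Run a SQL query and fill one template instance per result row."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    rows = conn.execute(query).fetchall()
    return [template.substitute(dict(row)) for row in rows]


def demo() -> List[str]:
    # Hypothetical in-memory data source standing in for warehouse stock levels.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (product TEXT, units INTEGER)")
    conn.executemany(
        "INSERT INTO stock VALUES (?, ?)",
        [("Oliebollen", 120), ("Gluhwein", 0), ("Kerststol", 35)],
    )
    # The kind of query a business team might write: only show what is in stock.
    query = "SELECT product, units FROM stock WHERE units > 0 ORDER BY units DESC"
    return render_tiles(conn, query, Template("$product ($units in stock)"))
```

The appeal of this split is that the query and the template can change independently: a new data source or a new layout does not require touching the other side.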

June: State-of-the-art AI Models 🤖

At Picnic, the use of transformer models in demand forecasting (see “Running demand forecasting machine learning models at scale”) represents a significant technological leap. These advanced AI models enable our Data Science team to accurately predict customer demand for products, minimizing waste and avoiding product unavailability. This enhanced forecasting is critical for managing the supply chain, especially with perishable goods like fruit, where the balance between surplus and shortage is delicate. By accurately predicting daily demand, Picnic ensures customer satisfaction by having the right products available, reduces waste, and strengthens sustainability efforts. This approach shows Picnic’s commitment to leveraging cutting-edge technology for efficient, customer-focused, and environmentally conscious operations.

July: Introduction of a new Transport Planning System 🚚

A lot has changed since Picnic started out in 2015 with a single fulfilment center and a single hub, the place where we load our iconic electric Picnic vehicles. Nowadays, we have around 10 fulfilment centers and 65 hubs to serve all our customers in the Netherlands alone! Not only have we expanded within the Netherlands, but we’ve also ventured into Germany and France, where we are creating fulfilment centers and hubs in a lot of new regions.

To get all of the groceries to the right place at the right time, we have trucks driving between various sites pretty much around the clock. At the beginning of the year, the scheduling of these truck movements still included a lot of manual steps. In July, we introduced a new Transport Planning System, which saves a lot of manual work and makes the process more efficient. The new planning system sets us up for scaling our operations in Germany and France and uses an advanced search algorithm to optimize the efficiency and sustainability of our transportation.

August: Security Awareness Week 🛡️

In August, we hosted the Security Awareness Week (see “Creating an Engaging Security Awareness Program”), an initiative guided by our Security Team in collaboration with the Digital Workspace Team. The primary objective of this annual event, now in its third year, is to educate Picnic employees about the critical importance of cybersecurity and to engage them with it. We increase their awareness of common security threats, build a culture of cybersecurity vigilance, and promote best practices to safeguard personal and company information. Key benefits of this initiative include fostering a proactive security culture, where open communication about security issues and reporting suspicious activities are encouraged.

This proactive stance empowers our employees to take ownership of their security responsibilities and actively contribute to Picnic’s robust cybersecurity framework. Moreover, demonstrating our commitment to cybersecurity helps enhance customer trust. As we handle large amounts of customer data, we must ensure our employees are well-equipped to protect customer information.

September: Observability using Datadog 🐶

In our journey towards developer and operational excellence, a key element that we haven’t addressed so far in this blog post is observability. September marked the migration of our tracing and application performance monitoring to Datadog, as described in our blog post “Picnic’s migration to Datadog”. Our focus for vendor selection was observability-as-code, as it enhances ownership, decreases operational workload and helps accelerate adoption of observability practices across all teams. We especially like that Datadog centralizes many different types of information, such as database performance, service level objectives and Kubernetes network monitoring. We are not done yet, though: we plan to expand monitoring across the board in the near future, including frontend monitoring and event-driven application tracking.

October: Continuous Deployment for everyone ⏭️

Step by step, observability tools like Datadog bring us closer to operational excellence. When it comes to DevOps practices such as Continuous Deployment (CD), operational excellence is a big enabler for developer excellence too. The first Picnic teams embraced CD some time ago, and in October more teams followed. Although we see the benefits of CD for developer flow and for reducing the number of manual steps required during development, it’s important to realise that it is not a panacea for every development cycle. As each team is unique, our goal is to help individual teams at every step of their way to developer excellence. For teams equipped with the prerequisites for continuous deployment, we support their adoption of this practice, empowering them to elevate their development strategies.

November: Migrating our Automated FC to the Cloud ☁️

In 2022 we opened our first automated fulfilment centre in Utrecht. Our custom warehouse management software controls the movement of totes filled with groceries all around the warehouse. This process involves many rapid decisions, for which minimal latency between servers and operational hardware is crucial. For this reason, when opening this fulfilment centre we decided to host all the applications on-premise. As a cloud-native company, this was a significant departure from our usual practices. Managing the on-premise hardware proved more challenging than anticipated, and when experiments with introducing additional latency demonstrated that it was feasible to migrate all the applications to the cloud, we decided to go for it! During two nights in November, we successfully migrated all our applications to the cloud. This migration required extensive preparation by many teams, and we are proud to announce that we are fully cloud-native again!

December: Preparing for Christmas 🎄

We made it to the last month of this eventful Picnic year! December is one of our favourite months: it’s a time when festivities are in full swing, and at Picnic we take this opportunity to spread the holiday cheer for both our customers and colleagues, while ensuring our systems are ready to deliver excellent service.

Our whole app is transformed into a winter wonderland: groceries are delivered by a sleigh, our mascot Peter is wearing his favourite Christmas sweater, and our online store is filled with Christmas recipes and products. But it’s not only the customer-facing app that gets a Christmas skin! The apps that our shoppers and runners use to first fill baskets full of nice Christmas products and then bring them to our customers’ doorsteps are also decorated!

Written by Pepijn van Aken, Berry Vermin and Paulo Geusau. Many thanks to Sander Mak, marc.marschner and Maxim Oei for reviewing the draft of this post.
