How to empower teams to better support software systems?

Creating new systems from scratch is so much fun. I love it when you can dream up a project. I have a candy shop full of technologies to choose from. It is fun creating all those shapes and connecting the lines when laying out the architecture of the system. The highlight for me is when the development starts. Setting up the CI/CD pipelines is not so much fun but necessary, and then comes that magical moment when you promote the application to production! I have the best job in the world!

Well, not for all engineers. What about the engineers or teams that have to maintain the system? These engineers don’t have the in-depth understanding of the system that I had because I was there from day 1. During the planning phases, did we think of them? Probably not.

I want to put the focus back on two often neglected functions, so that support and maintainability are taken into account during the initial stages rather than handled reactively later. This will make supporting any system a more pleasant and productive experience for the next engineer or team.

Handover to another team for support

Once a new system moves into the production environment, that is when the real “fun” starts. It is seldom the case that Team A develops the system or the feature and maintains it until its end-of-life. Even if we assume for a moment that it is the case, at some point people leave, and the team with the same name now looks different, while the system has not changed apart from enhancements and new features.

Team A, the original developer of the system, hands it over to Team B, the team that will support the new feature. Team A, a high-performing, fast-executing team with a specific skill set, moves on to a new project. Energy flows where the attention goes. Team A neglected to create sufficient documentation on the troubles the system experienced during the development and initial production phases.

When Team B takes over, it more often than not requires a handover meeting with Team A. Team A now needs to spend energy getting all the documentation up to date and adding documentation for the gaps Team B identified. The timing sucks because Team A has to context switch and create more documentation in a hurry, while it has other priorities to deal with and its focus has already shifted. The quality of the documentation, as well as the communication during the handover, suffers.

Look at the following scenarios:

Team A: Creates complete documentation
Team B: Reads all the documentation. Is self-sufficient and productive
Table A

Team A: Creates incomplete documentation
Team B: Reads incomplete documentation. Identifies the gaps
Team A: Meets with Team B
Team B: Meets with Team A
Team A: Updates outstanding documentation
Table B

I’ll admit, Table A looks like a pipe dream. Nonetheless, let’s marvel at the beauty of it. Extremely efficient!

During the development phase, many issues are identified and fixed, which provides a good opportunity to document these problems for future reference. Not every problem has to be documented, because many are resolved by code fixes, but some process and pipeline problems will recur.

Production is the best place to identify the one-percenters. Systems behave differently in different environments. The production database is much larger than those in the non-production environments, which hold obfuscated data and much less of it. During the initial production phase, there will be enough new problems to document because of the sheer volume and combinations of data to be served.

It is impossible to account for all these scenarios and exceptions during the development and testing phases. ETL processes will fail. Pipelines break during deployments to production, and there will be other data issues. It is a perfect opportunity for Team A to document all these issues.

In addition to troubleshooting documentation, there must be installation and configuration guides as well. Think of EVERYTHING that will make life easy for newcomers to the system.

Documentation should be written as if you are explaining to a non-technical person. Often we try to help another person by assuming they already have a certain context of the system. That is where we create problems for ourselves and waste our own time. People will come back again and again because we neglected to explain the entire context before providing the solution.

Administering and supporting the system

I fail to recall any time in my career in software when, during the design and development phase, the architects and engineers envisioned how the system might function and what the potential problems or one-percenter scenarios would be. It may have been part of the process initially, but it eventually falls behind due to time and money pressures. It is a habit to only think of the happy path.

We use the best coding techniques and practices. We use all the patterns that make the system robust. We have done everything to make the system perfect.

Then we run into problems…

Most large systems depend on a wide variety of data from different sources. In my experience, data is most often fed in by large ETL batch processes, by a high-volume transactional system, or both! It becomes complicated to apply large-scale fixes after failures, data removal requests, or large-scale data integrity problems. Flexibility is gold!

These are the standard questions to ask to ascertain, from a technical point of view, whether the system can be remedied in the simplest way.

  • Is there a job/process that can be run to fix data that was processed or inserted incorrectly?
  • Can these processes be run during business hours?
  • Does everyone on the team have sufficient permissions or access to the servers?
  • Do we have the ability to update records in batches?
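
To make the last question concrete, here is a minimal sketch of a batched fix in Python. Everything in it is an assumption for illustration: the `orders` table, the `status` values, and the helper names are hypothetical, and SQLite stands in for whatever database you actually run. The idea is simply to fix a limited number of rows per transaction so that locks stay short.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical schema for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    rows = [("BAD",)] * 250 + [("OK",)] * 50
    conn.executemany("INSERT INTO orders (status) VALUES (?)", rows)
    conn.commit()
    return conn


def fix_in_batches(conn, batch_size=500):
    """Correct flagged rows a batch at a time and return how many were fixed."""
    total = 0
    while True:
        # Pick the next batch of rows that still need fixing.
        cursor = conn.execute(
            "SELECT id FROM orders WHERE status = 'BAD' LIMIT ?", (batch_size,)
        )
        ids = [row[0] for row in cursor]
        if not ids:
            return total
        placeholders = ",".join("?" * len(ids))
        conn.execute(
            f"UPDATE orders SET status = 'FIXED' WHERE id IN ({placeholders})", ids
        )
        conn.commit()  # one short transaction per batch
        total += len(ids)
```

Calling `fix_in_batches(make_demo_db(), batch_size=100)` works through the flagged rows in three passes. A real job would also want logging and a way to pause between batches during business hours.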

If the answer is yes to all of the above, it is still not the ideal situation. Some engineers lack confidence, and if these services are restarted and fail, or the restart is done at the wrong time, it can affect clients and SLAs. Not everyone has elevated permissions and access to database servers, for example. We often lack the flexibility, and by that I mean the tooling, to remedy data processing problems at a large scale. Let’s take a look at two areas that can optimize support here.

Automation

In our line of work, we have the luxury of the skills to automate critical business processes and/or the mundane tasks we have to execute daily. We are good at automating releases and testing. Those are fundamentals of our systems and are necessary! We have to!

I don’t think we are any good at automation for self-healing or self-correction. Netflix’s infrastructure is too large for humans to monitor, and out of necessity they implemented intelligent systems to monitor it and apply corrections. Does this kind of intelligence always have to be born out of necessity? Why can’t it simply be baked into our good-practices philosophy? Are we afraid of losing control and losing our jobs? Or are we just mentally lazy?

The fact is that this type of automation will set you apart from the rest. Remember: energy flows where the attention goes. Having these processes in place frees engineers to focus on innovation and revenue opportunities, or to pay back technical debt. You don’t have to be a large-scale company to achieve this.

Let third-party tools do the night shift instead of you checking your email or Teams/Slack messages every 2 hours. If these are properly configured, they can do a lot more than you can, and best of all, they never get tired! We have Lambdas, Azure Functions, Python scripts, PowerShell, Apache NiFi, or any other robust task automation tool. There are multiple options and no excuses.
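
As a rough illustration of the self-correction idea, here is a small Python sketch: check a job, apply a remedy if it is unhealthy, and only escalate to a human when the remedy keeps failing. The `heal` function and the `FakeJob` class are hypothetical names invented for this example. In practice the check and remedy would be your own health probe and restart command.

```python
from typing import Callable


def heal(check: Callable[[], bool], remedy: Callable[[], None],
         max_attempts: int = 3) -> str:
    """Return 'healthy', 'healed', or 'escalate' after trying the remedy."""
    if check():
        return "healthy"
    for _ in range(max_attempts):
        remedy()
        if check():
            return "healed"
    # The remedy did not help: this is the point to page a human.
    return "escalate"


class FakeJob:
    """Stand-in for a real job; recovers after a set number of restarts."""

    def __init__(self, restarts_needed: int):
        self.restarts_needed = restarts_needed
        self.restarts = 0

    def is_healthy(self) -> bool:
        return self.restarts >= self.restarts_needed

    def restart(self) -> None:
        self.restarts += 1
```

With `job = FakeJob(2)`, calling `heal(job.is_healthy, job.restart)` restarts the job twice and reports that it healed. A job that needs more restarts than `max_attempts` ends in an escalation instead, so a human is only woken up for the cases automation could not fix.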

Administration interfaces

What if we need to invalidate client records and remove them from the system on request? What if we need to spot-fix a couple of records with incorrect information, and it is too expensive to re-run the processes or to fix them at the source? Typically, a team would run a script directly against the database. Is there a review process? Is the script optimized enough if it needs to delete thousands of records from the database?

Will this script contend for resources on the database server during business hours or after hours? Is this process secure? With some planning upfront and thinking about these scenarios, engineering teams can create APIs and administration UIs to handle these one-percenters. APIs are effective if UIs aren’t available, and with proper authentication and authorization, non-technical people can use them too.

These problems are one-percenters, but they tend to take a lot of time and effort to mitigate and remedy. It would be prudent to build these mitigation steps during the development phase so that such difficult requests can be administered. Every microservice or system must have some administration component for the data it produces.

These administration functions, whether UI or API endpoints, can be used by non-engineers and will typically be safe to use, saving engineers the expensive time and effort of executing these processes or scripts. This creates flexibility and confidence in the system.
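
To show the shape of such an administration function, here is a minimal Python sketch. It is not a real framework: `AdminService`, its fields, and the `client_id` key are all hypothetical, and a plain dictionary stands in for the database. The sketch illustrates three things an admin endpoint could do that an ad hoc script usually doesn't: check who is calling, default to a dry run, and keep an audit trail.

```python
from dataclasses import dataclass, field


@dataclass
class AdminService:
    records: dict   # record id -> record data (stand-in for a database)
    admins: set     # users allowed to run admin operations
    audit_log: list = field(default_factory=list)

    def invalidate_client(self, user, client_id, dry_run=True):
        """Remove all records for a client and return how many matched."""
        if user not in self.admins:
            raise PermissionError(f"{user} may not invalidate records")
        matches = [rid for rid, rec in self.records.items()
                   if rec["client_id"] == client_id]
        if not dry_run:
            for rid in matches:
                del self.records[rid]
        # Record who asked for what, including dry runs.
        self.audit_log.append((user, client_id, len(matches), dry_run))
        return len(matches)
```

A caller would first run `invalidate_client("ops", "c1")` to see how many records would be affected, then repeat the call with `dry_run=False` once the number looks right. The same function could sit behind an API endpoint or a button in an admin UI.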

Conclusion

It is important to include this documentation, administration, and automation during the design and development stages. Create awareness and be diligent about it from the start. The price you pay upfront will be small relative to the huge price you will pay later.

It is not only about making life easier for you or your team. Be different and make it easier for everyone supporting the system down the line. The lifespan of a good system can easily be 5-10 years. There will be many people and engineers responsible for it over that period.

I would love to hear your thoughts…
