<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The Educated Software Engineer]]></title><description><![CDATA[Being a modern software engineer is complex. But it doesn't need to be.]]></description><link>https://newsletter.oliverjumpertz.com</link><image><url>https://substackcdn.com/image/fetch/$s_!9g4o!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828ee572-ec04-43f9-b0e0-79cc8e7dff8b_256x256.png</url><title>The Educated Software Engineer</title><link>https://newsletter.oliverjumpertz.com</link></image><generator>Substack</generator><lastBuildDate>Wed, 08 Apr 2026 05:35:08 GMT</lastBuildDate><atom:link href="https://newsletter.oliverjumpertz.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Oliver Jumpertz]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[theeducatedsoftwareengineer@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[theeducatedsoftwareengineer@substack.com]]></itunes:email><itunes:name><![CDATA[Oliver Jumpertz]]></itunes:name></itunes:owner><itunes:author><![CDATA[Oliver Jumpertz]]></itunes:author><googleplay:owner><![CDATA[theeducatedsoftwareengineer@substack.com]]></googleplay:owner><googleplay:email><![CDATA[theeducatedsoftwareengineer@substack.com]]></googleplay:email><googleplay:author><![CDATA[Oliver Jumpertz]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[5 Most Important Lessons That Helped Me Grow As A Software Engineer]]></title><description><![CDATA[The most difficult lessons I had to learn to unlock my full potential and grow beyond myself to become a great software 
engineer]]></description><link>https://newsletter.oliverjumpertz.com/p/5-most-important-lessons-that-helped</link><guid isPermaLink="false">https://newsletter.oliverjumpertz.com/p/5-most-important-lessons-that-helped</guid><dc:creator><![CDATA[Oliver Jumpertz]]></dc:creator><pubDate>Sat, 09 Dec 2023 11:30:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!iP3m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iP3m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iP3m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iP3m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iP3m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!iP3m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!iP3m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg" width="1200" height="686" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:686,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:162493,&quot;alt&quot;:&quot;5 Lessons That Helped Me Grow As A Developer&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="5 Lessons That Helped Me Grow As A Developer" title="5 Lessons That Helped Me Grow As A Developer" srcset="https://substackcdn.com/image/fetch/$s_!iP3m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iP3m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iP3m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!iP3m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf7e36d4-3969-474b-ab66-2bee7db05066_1200x686.jpeg 1456w" sizes="100vw" 
fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Depending on what you try to achieve in life, you might want to advance in your career or simply in the profession of a software engineer. Perhaps you want to become really good before you start your own business, or you want to climb the corporate ladder. But no matter what your motivation is, growing as an engineer does not only have a lot to do with your hard skills. 
It's also about soft skills and your own mental model.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.oliverjumpertz.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Educated Software Engineer! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Here are the five lessons that, as far as I can recall, helped me grow the most as a software engineer. I am sharing them in the hope that they help you as much as they helped me.</p><div><hr></div><h2>Learn to listen</h2><p>There is a German saying: &#8220;Reden ist Silber, Schweigen ist Gold&#8221;, which roughly translates to: &#8220;Talking is silver, being silent is gold.&#8221;</p><p>What it basically means is: Sometimes, it&#8217;s better to not say anything unnecessary. More broadly, it can be understood as: Listening to others can often be more valuable than talking yourself.</p><p>Especially in software engineering, just listening to your peers can be very valuable. You don&#8217;t always need to comment on everything, especially if a comment wouldn&#8217;t add a lot of value. The amount of knowledge you can gain by simply letting your colleagues talk about their experience with certain problems, or approaches they have taken in the past, is invaluable.</p><p>You also don&#8217;t need to interrupt your peers if you have questions. Let them finish first. 
If you are afraid of forgetting the question, write it down and ask it later.</p><div><hr></div><h2>Accept you don't know everything</h2><p>Software engineering is too broad and too deep for any of us to ever really know everything. You need to accept and understand that. The more senior you become, the more you realize what you don&#8217;t know. But that&#8217;s not a bad thing. It is just what it is.</p><p>There are many different fields in software engineering, and many of them are so deep that you can never learn everything about them. Even if you try, your older knowledge will be outdated the moment you reach a quarter of what there is to learn in your field of interest.</p><p>Instead, focus on what interests you most and let projects drive your curiosity. On-demand learning can often be way more powerful than learning things only for the sake of learning them. Especially if you have built up a solid foundation of knowledge, it becomes way easier to absorb new concepts and technologies on the fly.</p><div><hr></div><h2>You will never catch all edge cases</h2><p>Bugs are everywhere, and they always will be. It&#8217;s not optimal, but once again, it is what it is. That is what quality assurance measures like tests are for: to catch as many of them as possible. But at some point, it&#8217;s not worth putting more thought into catching all possible cases. No matter how much energy you put into it, there will always be cases you can&#8217;t even think of. It&#8217;s better to save this energy for more important tasks.</p><p>A better way to deal with this is to try to catch the most common cases you can imagine and then guard against the unforeseen ones as well as possible. In the end, you need to make sure that bugs you couldn&#8217;t catch don&#8217;t bring everything down. Sometimes, a circuit breaker can lessen the blow of even the craziest cases of upstream errors, and sometimes a generic exception handler makes your day. 
And sometimes, you can even implement software in a way that leaves no room for unforeseen cases. That is also something you learn with more experience.</p><div><hr></div><h2>Software engineering is a team game</h2><p>There is not much space for lone wolves in software engineering. We work in teams for a reason. Those teams are usually assembled to make working on difficult problems easier by providing a pool of talented people who can learn from and help each other. Whether it&#8217;s just talking about an issue with your peers, a system design session with the whole team, or just the reviews your colleagues do for the code you produced, team members are a part of the process.</p><p>Your peers are (next to hopefully being awesome humans) valuable resources. Not making use of them is a waste of said resources. It&#8217;s even counter-productive.</p><p>You should also always take care to preserve a good atmosphere within your team. It makes no sense to ruin that atmosphere only because of a small dispute between you and one of your colleagues. Make sure to keep everything as inclusive as possible, including reviews. Healthy and happy teams tend to produce better results, and a good atmosphere, paired with great results, helps to keep your own morale up. You could call it a win-win situation.</p><div><hr></div><h2>Automated testing is not optional</h2><p>Look anywhere online, and you will basically identify two camps of developers and engineers. The first one advocates for writing tests because they&#8217;re helpful. The other one argues that tests are nice to have but should never block you from launching something new.</p><p>The reality is, however: Automated tests are nothing other than an automation (drumroll, please) of all the manual tests you do over and over again. Especially if you launch a product on your own, your users won&#8217;t have infinite patience with you. 
If you repeatedly break something, users will drop your product in favor of another, more stable one.</p><p>Look at it from the perspective of a lazy human being: Why bother to manually click through a frontend over and over again if you can just let your computer do the work? The only thing you need to learn is a few more APIs, and you will probably need a little creativity to really test what you want to test.</p><p>Testing can also have a pretty positive impact on your career. Always having to ship hotfixes after each release will sooner or later catch your manager&#8217;s attention, but not in a positive way.</p><div><hr></div><h2>Summary</h2><p>Before we end this article, let's do a quick summary of the five lessons presented above:</p><p>- Learn to listen</p><p>- Accept you don't know everything</p><p>- You will never catch all edge cases</p><p>- Software engineering is a team game</p><p>- Automated testing is not optional</p><p>Those are the most important lessons that helped me grow. They are really easy to implement and won't cost you a lot. So, what are you waiting for? Go and try them out. They might help you grow, too!</p><div><hr></div><p>You have (finally?) come to the end of this issue, so let me tell you something:</p><p><strong>Thank you for reading this issue!</strong></p><p>And now? Enjoy your peace of mind. Take a break. Go on a walk. And if you feel like it, work on a few projects.</p><p>Do whatever makes you happy. In the end, that&#8217;s all that counts.</p><p>See you next week!</p><p>- Oliver</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.oliverjumpertz.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Educated Software Engineer! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[What you need to know about resiliency]]></title><description><![CDATA[How you can really improve the resiliency of your software and what you need to do]]></description><link>https://newsletter.oliverjumpertz.com/p/what-you-need-to-know-about-resiliency</link><guid isPermaLink="false">https://newsletter.oliverjumpertz.com/p/what-you-need-to-know-about-resiliency</guid><dc:creator><![CDATA[Oliver Jumpertz]]></dc:creator><pubDate>Sun, 03 Dec 2023 19:50:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Hcm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-Hcm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-Hcm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png 424w, 
https://substackcdn.com/image/fetch/$s_!-Hcm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!-Hcm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!-Hcm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-Hcm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:341387,&quot;alt&quot;:&quot;What you need to know about resiliency&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="What you need to know about resiliency" title="What you need to know about resiliency" srcset="https://substackcdn.com/image/fetch/$s_!-Hcm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png 424w, 
https://substackcdn.com/image/fetch/$s_!-Hcm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!-Hcm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!-Hcm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56cb0d0-d7b6-4af2-b80f-40acf9ccdccf_1200x630.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Developing software is easy. Just implement a few features, add a few tests if you feel like it, and that&#8217;s it. Or is it? Well&#8230;it depends.</p><p><strong>Let me tell you a story:</strong></p><p>There once was a team that built software like many other teams. They implemented features, they tested thoroughly, they even had regular load tests, they added all kinds of alerts, and they monitored their software closely. Their software ran on Kubernetes and was thus deployed as a container on a pretty large cluster. Their deployment strategy was simple: Just a rollover of the deployment. Release a new version, use Helm to trigger the rollover, and let all the magic within Kubernetes do its thing.</p><p>The software did not have many dependencies. It just needed a Redis to store some data inside, and that was it. Get a request, make a Redis lookup, do some more processing, and then return a response. Not much that could actually fail.</p><p>Everything ran fine for one and a half years, until one Saturday evening, they got a call from a Vice President. 
Everything was down because their service didn&#8217;t respond anymore. Something was odd. All requests took thirty or more seconds. All clients only showed the loading placeholders of doom. Customers were unhappy. Management was unhappy. Nothing worked.</p><p><strong>What had happened?</strong></p><p>As it turned out, a massive spike in traffic had overwhelmed the network interface of the ElastiCache cluster (in the end, just a Redis with an interface in front of it) that the team used to store their data. Additionally, the cluster held so much data that the machine began to swap memory. This meant that all Redis commands took forever to finish. And this, in turn, meant that everything took forever.</p><p><strong>But how could this happen?</strong></p><p>It turned out that the Redis connection was just a plain Redis client, and the logic worked as follows:</p><ol><li><p>A request comes in</p></li><li><p>Take a connection from the pool</p></li><li><p>Send the Redis commands</p></li><li><p>Get the responses</p></li><li><p>Return the connection to the pool</p></li><li><p>Process the responses</p></li><li><p>Send a response to the client</p></li></ol><p>Just plain logic. No fallbacks. No safety measures.</p><p>Spot the issue? Yes. A feature that had worked for one and a half years didn&#8217;t work in one particular situation. A feature that had been load tested thoroughly and had never broken finally gave in. It broke in a situation with so much entropy that no one could ever have come up with a suitable load test to reproduce such a scenario reliably. Nevertheless, it happened.</p><div><hr></div><h2>A very important lesson about resiliency</h2><p>The short story above is just an example, but scenarios like this one probably happen every day around the world.</p><p>They teach us one crucial thing, however:</p><p><strong>We actually don&#8217;t build software for 99.99% of cases. We build it exactly for the 0.01% of cases where something really breaks. 
Or better: We should.</strong></p><p>But resiliency is difficult. Our job as software engineers is not done when we think it&#8217;s done. It&#8217;s done when we&#8217;ve put in our <strong>best efforts</strong> to protect our software against nearly unimaginable failure scenarios. And we can be sure of one thing: <em>We will never get it right</em>. This is why I said &#8220;best efforts,&#8221; because failure is inevitable.</p><p>Even with a lot of experience, you will still overlook a possible edge case from time to time. What really counts in these circumstances is that the measures you have taken at least soften the blow as much as possible. And even these take a lot of experience and imagination to come up with.</p><p>Beyond that, <strong>resiliency &#8800; resiliency</strong>. It&#8217;s a situational thing. Some applications need more resiliency than others. A static website that you deploy to Cloudflare Pages, Netlify, or Vercel needs way less resiliency work than a huge streaming portal with many moving parts and hundreds of micro-services, deployed to Kubernetes on AWS. The static website already profits from what the serverless hosters offer, while the streaming portal is so complex that it needs a lot of custom-tailored solutions to increase its resiliency.</p><p>Lastly, some measures to increase your resiliency cost way more than others. At some point, the decision about whether to increase your resiliency becomes a matter of your available budget. And more often than not, you will probably have to decide not to make your application more resilient because it&#8217;s not economical anymore.</p><p>Actually, there is a pretty simple rule about resiliency and uptime, which goes as follows:</p><p><strong>Achieving 99% uptime is easy. 99.9% uptime is already exponentially harder and more expensive. 
Every additional 9 you want to add to the decimal places of your uptime will cost you exponentially more than the previous one.</strong></p><p>And finally, you will never be able to make any system so resilient that it will forever stay at a 100% uptime. It&#8217;s impossible. It will never happen. There is always a single point of failure that you can&#8217;t make more resilient. If you try to, you create another single point of failure. If you try the same procedure again, you end up where you started. You&#8217;ve basically reached the end of it. At some point, everything else is out of your control.</p><p>There exist a few memes for a reason. One example is one about AWS&#8217; us-east-1. Even a company like Amazon can&#8217;t eliminate all its single points of failure. If us-east-1 is down, many other regions and services go down.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!t0NP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!t0NP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif 424w, https://substackcdn.com/image/fetch/$s_!t0NP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif 848w, https://substackcdn.com/image/fetch/$s_!t0NP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif 1272w, 
https://substackcdn.com/image/fetch/$s_!t0NP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!t0NP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif" width="1170" height="1427" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1427,&quot;width&quot;:1170,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30937,&quot;alt&quot;:&quot;AWS us-east-1 holding a lot of the internet&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AWS us-east-1 holding a lot of the internet" title="AWS us-east-1 holding a lot of the internet" srcset="https://substackcdn.com/image/fetch/$s_!t0NP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif 424w, https://substackcdn.com/image/fetch/$s_!t0NP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif 848w, https://substackcdn.com/image/fetch/$s_!t0NP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif 1272w, 
https://substackcdn.com/image/fetch/$s_!t0NP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e5f00eb-a74e-4262-a4f8-eb40d3b3a1f9.avif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>us-east-1 is a very important AWS region</em></figcaption></figure></div><p>This proves one point:</p><p><strong>Resiliency is difficult, and sometimes, you just can&#8217;t get it right.</strong></p><div><hr></div><h2>A few examples of what you can do to improve resiliency</h2><p>Although you can never get to 100% uptime and completely resilient 
systems, it&#8217;s still worth investing in the parts of your system where it makes sense. Sometimes, it doesn&#8217;t take much to improve the resiliency of a single service so much that any further investments make no more sense. The rest is just the risk you need to live with.</p><p>But let&#8217;s take a look at a few examples of how you can improve the resiliency of your software, even on a smaller scale.</p><h3>Circuit Breaking</h3><p>Circuit Breakers are one of the most important resiliency patterns out there, and yet they are not used often enough. They would also have been a pretty great way to prevent the issue of our small example from the beginning.</p><p><strong>In electronics, a circuit breaker is an intended breaking point</strong>. In case of an overloaded electric circuit, a ground fault, or a short circuit, the circuit breaker trips intentionally and prevents further damage to devices and the system itself. The circuit breaker itself is designed in a way that it can break without becoming broken. It can trip over and over again (at some point you have to replace it, but that usually takes forever).</p><h4>Circuit Breakers in Software Engineering</h4><p>In software engineering, a circuit breaker works similarly. A method or remote call is protected by a circuit breaker that constantly measures the behavior of what it protects. If one of the metrics the circuit breaker observes exceeds a certain threshold, the circuit breaker trips temporarily to prevent further damage to the system. After a while, the circuit breaker slowly opens the flow again for a percentage of all calls, while observing its metrics. If the metrics stay below the threshold, the circuit breaker opens up the whole flow again. 
Otherwise, it interrupts the flow again.</p><p>Conceptually, a circuit breaker in software engineering has three states:</p><ol><li><p>Closed</p></li><li><p>Half-Open</p></li><li><p>Open</p></li></ol><h5>Closed State</h5><p>In this state, everything works normally. &#8220;Closed&#8221; is meant in the same sense as for a circuit breaker in electronics: as long as the circuit is closed, electricity (or requests in software engineering) flows.</p><h5>Half-Open State</h5><p>In this state, the circuit breaker has already tripped. It was previously open and now closes the circuit only for a certain percentage of requests to see whether everything has gone back to normal. If so, it goes into the closed state again to allow calls to flow through. If not, it opens again and rejects all requests.</p><h5>Open State</h5><p>In this state, the circuit breaker has tripped. One of the metrics the circuit breaker observes has gone over a certain threshold and all further requests are rejected temporarily.</p><h5>Additional traits</h5><p>Unlike a circuit breaker in electronics, the one we use in software engineering has a few additional traits. Instead of only the electric current, it can observe multiple metrics. Usually, a circuit breaker at least supports measuring and handling timeouts, exceptions, and other errors. Additionally, most circuit breakers allow for a fallback. This fallback is used whenever the circuit breaker trips.</p><h4>What a circuit breaker can be used for</h4><p>Circuit breakers are usually used to protect remote calls. <strong>In distributed systems, other services and components are usually way more likely to break than internal components within a service</strong>.</p><p>A circuit breaker can protect calls to upstream services, but also to databases. Sometimes, it also makes sense to protect calls that are otherwise internal but make use of system resources, especially in containerized systems. 
If a service uses an SQLite database that is mounted through a volume inside a Kubernetes cluster, it can indeed also be worthy of being protected by a circuit breaker.</p><p>Usually, a circuit breaker wraps a function call, which itself performs a remote call (or accesses the file system, like an SQLite database).</p><h4>Common misconceptions about circuit breakers</h4><p>A circuit breaker doesn&#8217;t magically make everything better. It also doesn&#8217;t prevent issues from occurring. In fact, it actually makes things fail faster and it also creates another category of errors. But in the end, these are intended measures to soften the blows of failure.</p><p>Many engineers are afraid of intentionally letting things fail. They work under the assumption that sending a response somehow (even if it means blocking resources for tens of seconds) is better than showing users an error. <strong>But this assumption is wrong</strong>.</p><p>The reality is that protecting the system for at least a certain percentage of users is better than letting it fail for everyone. If a circuit breaker in your service gives another service time to breathe and recover, that&#8217;s better than continuing to put that other service under pressure until it completely fails.</p><p>More often than not, you can implement a reasonable fallback. Sometimes, serving static fallback content can save the day. It might not be the most accurate content and may not be personalized, but once again: <strong>Something is better than nothing</strong>. And even if you cannot reliably serve any fallback content, you can still ensure that your application only fails for a certain percentage of users.</p><h4>Circuit Breakers as Infrastructure</h4><p>If you work with Kubernetes and deploy containers to clusters, you have even more choices. 
A good Service Mesh can provide circuit breaking capabilities out-of-the-box, for example.</p><p><strong>Service Meshes are designed and implemented to deal with the chaotic nature of distributed systems</strong>. They control how data flows and help with exactly that. Most of them provide sidecar proxies that additionally come with built-in circuit breakers. And usually, you can fine-tune these with resource definitions and tailor them to each individual upstream service. The sidecar proxy intercepts your outgoing traffic, proxies it, and handles things like circuit breaking for you, while returning the response as if nothing had ever happened.</p><p>Deploying a service mesh and actively putting your services inside that mesh often frees you from having to manually implement circuit breaking for upstream APIs. It may not free you from implementing circuit breakers for your database calls, though. But still, if you configure the mesh correctly, you don&#8217;t have to write code yourself, which alone is already worth a lot.</p><p><em><strong><a href="https://istio.io/latest/docs/tasks/traffic-management/circuit-breaking/">Istio</a></strong></em>, for example, comes with an extensively configurable circuit breaker. You can add a so-called <em><strong><a href="https://istio.io/latest/docs/reference/config/networking/destination-rule/">DestinationRule</a></strong></em> for each upstream, which additionally frees you from having to rebuild your software only because you want to reconfigure one of your circuit breakers. Changing the configuration only involves updating a resource definition within the cluster, which is usually way faster than rebuilding your whole software and deploying a new container.</p><h3>Dealing with Backpressure</h3><p>Just try to imagine that someone pushes you a few times and at some point you fall. But when you try to stand up again, that someone continues to push you. Standing up becomes nearly impossible at this point. 
This is too much backpressure.</p><p><strong>Every service or component in a system has a certain load it can take before it breaks</strong>. Even if scaled horizontally, there are situations in which single instances of a component cannot reliably take any more. In this situation, they are just overwhelmed by the requests they get and give in at some point. When this happens, recovery is often difficult, especially when the requests don&#8217;t stop.</p><p>Interestingly, many software engineers still ignore handling backpressure and instead focus on other things. They do nothing to protect their services and components. If their services get overwhelmed, they have a hard time trying to recover the system.</p><h4>Handling Backpressure</h4><p>Although circuit breakers exist, <strong>you should never rely on everyone playing fair</strong>. Maybe some of your colleagues didn&#8217;t have the time (or sometimes the will) to implement a circuit breaker with reasonable timeouts and a back-off strategy for calls to your component. Sometimes, clients are not under your control. On the internet, everyone can call a public API, and even if it&#8217;s authenticated, authentication doesn&#8217;t protect you from receiving requests in the first place.</p><p>It&#8217;s your job to find out at which point failure is imminent, for example by load testing thoroughly. You need to find important metrics your service depends on. Perhaps a component is CPU- or memory-bound, or it actually depends on the number of file descriptors available. These are metrics you can actively observe, even within a component itself.</p><p>You need to discover thresholds at which a component needs to say: &#8220;No more.&#8221; Depending on where you deploy your software, you might even need to find multiple. 
On Kubernetes, for example, you can define one set of thresholds at which Kubernetes should direct no more traffic toward a pod (that&#8217;s the readiness probe), and one set of thresholds that puts a service into emergency mode, stopping most of the work and actively rejecting requests. While the first set&#8217;s values should be below the values of the latter, both should still be a little away from the point of complete failure.</p><h5>Handling Backpressure in HTTP Services</h5><p>HTTP has a strategy to tell clients to &#8220;back off&#8221;: HTTP status codes and HTTP headers. Although clients are not required to honor it, it&#8217;s still a good start for you because at least a few HTTP client libraries respect it.</p><p>There are two status codes that you can use for this purpose:</p><ul><li><p>503 Service Unavailable</p><ul><li><p>This indicates that the service is not ready to accept any traffic at the moment (which means that no one can currently be served)</p></li></ul></li><li><p>429 Too Many Requests</p><ul><li><p>This usually indicates that a specific user has sent too many requests and is probably being rate limited (which means that other users may still be served)</p></li></ul></li></ul><p>Additionally, there is the Retry-After header that you can use to give your service some room to breathe (at least from clients that respect it).</p><p><strong>If an instance of your service detects that it&#8217;s close to collapsing, it should actively start to reject requests</strong>. In this case, you can return either of the two status codes mentioned above. 
In most cases, if you don&#8217;t selectively rate limit, a 503 is the better choice, though (at least semantically).</p><p>Additionally, you can set the Retry-After header to one of two possible values:</p><ul><li><p>A date</p></li><li><p>An integer value in seconds</p></li></ul><p>If you set a date, some clients will respect this value and only try to issue a new request after the date specified. In case of an integer value, clients are asked to retry the request after the number of seconds specified within the header has passed. If your service only receives requests from components inside a closed system, you can require everyone else to honor this status code and header combination (or even make it a standard for everyone in your company).</p><p><strong>The important task for you, however, is to implement a backpressure strategy at all and as early as possible.</strong> Rejecting a request if your service can&#8217;t take any more should happen before it spends significant resources trying to process a request. This usually means implementing this kind of logic in a middleware layer.</p><p>If you place these protective measures as early as possible in the call chain of any service, it becomes less important whether clients honor an HTTP status code and a Retry-After header. Clients respecting it are just a nice added bonus on top. Rejecting most of the work as early as possible still gives your service more room to slowly recover. If your service is deployed on Kubernetes, it can additionally signal that it&#8217;s not ready to receive traffic by returning a non-successful status code at its endpoint for the readiness probe. Kubernetes will then shift traffic to other pods until the readiness probe signals a recovery by returning successful status codes again.</p><h5>Handling Backpressure in Other Components</h5><p>For all other components that are not HTTP services or similar, there are usually other strategies. 
Sometimes, they are directly baked into the component itself, sometimes S/P/IaaS providers provide them for you, and sometimes, you need to handle it yourself from the outside.</p><p><strong>For these components, it&#8217;s important to do some research</strong>. You need to find out whether the components you want to use have a backpressure strategy in place or whether there are additional components (like proxies) that you can deploy to do it for you.</p><p>If there is nothing available, you probably need to implement a strategy in components that are under your control. For a database, that means circuit breaking all requests actively and trying to allow that database to recover if it&#8217;s overwhelmed. This especially includes configuring timeouts because long-running queries can be a good first indicator of an overwhelmed database.</p><p>In the end, you need to get creative when dealing with such components because there are so many different kinds of them. It can also take quite some time to come up with a suitable strategy. Nevertheless, you should invest time and resources into it because when things really get bad, you&#8217;ll profit from that investment massively.</p><h3>Caching</h3><p>Caching usually has different applications, but it&#8217;s also a good way to increase the resiliency of software components. <strong>Caches (whether in-memory or remote) are usually faster than calling the sources of their data</strong>. Calling a Redis to get some data is, for example, often faster than performing the SQL query that is the source of that Redis&#8217; cached data. Having an in-memory cache in front of your Redis is, in turn, faster than issuing a call to your Redis at all. Having an application cache in front of all your logic can save your service from performing too much work overall.</p><p>Which types of caching you should implement does, of course, depend on your specific use case. 
Sometimes, caching your database data is enough, sometimes, an in-memory cache is enough, and sometimes you need it all. Nevertheless, all strategies have their own use cases.</p><h4>Caching Upstream Responses</h4><p>Caching upstream responses is pretty common. In fact, the HTTP spec even encourages it by providing a Cache-Control header. Any upstream service can send such a header with its response and tell you how you are allowed to cache the response. Browsers also honor the header and cache responses accordingly.</p><p><em><strong><a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control">The spec is quite extensive</a></strong></em> and the header itself has many fields, but all of them have a right to exist. There are also quite a few HTTP clients that already have their own caching mechanisms that work exactly by processing this header. But even if your client has none, you can at least partially implement a caching mechanism yourself that processes the Cache-Control header.</p><p><strong>Caching directly impacts your system&#8217;s resiliency by decreasing the number of requests your upstreams have to handle</strong>. If they correctly set their Cache-Control header, they allow you to fire fewer requests against their API because you can reuse cached responses for as long as they are valid.</p><p>Another mechanism that even improves your whole system&#8217;s resiliency is the stale-if-error field of the Cache-Control header. By setting this field to a positive integer, an upstream service allows you to reuse a cached response, even if it&#8217;s stale (which means it&#8217;s already older than its max-age field allowed), in case a call to it returns a 5XX HTTP status code (in other words, an error). 
This means that even if an upstream service you need to function properly is down or experiences other issues, you can still continue to serve at least some requests.</p><h4>Caching Database Results</h4><p>Depending on which database you use and how you store your data, you might deal with more or less scalable components. <strong>But even a NoSQL database can become pretty slow if you issue many complex queries</strong> (given that you can&#8217;t optimize them further).</p><p>Caching database results is usually a little more complex than caching upstream responses. Databases don&#8217;t send a Cache-Control header that tells you how long you can safely cache their responses. Even worse, you either have to make an educated guess and decide for yourself how long you think you can safely cache a query result (But keep in mind, data tends to change. Sometimes even rapidly), or you need to implement some form of eventing and leverage database triggers to notify all clients to invalidate specific cached entries (which is obviously not trivial and adds a lot of complexity).</p><p>Caching query results can greatly reduce the load on a database and give it some room to breathe. It naturally reduces the load a database has to handle because queries are not always issued but only when really necessary. Additionally, if implemented carefully, cached results can also be used to still serve requests even if the database is down. This usually won&#8217;t make all requests succeed, but a few are still more than none at all.</p><p><strong>Nevertheless, database caching isn&#8217;t simple</strong>. That&#8217;s sure, and there are also other alternatives you should try first (you will learn more about them in the next strategy, further down below), but you should not forget that you can and should (if necessary) also consider caching database query results.</p><h4>Distributed Caching</h4><p>Most services you deploy aren&#8217;t isolated, lone components. 
Nowadays, we usually scale our software horizontally. Either by leveraging serverless offerings, or by deploying to Kubernetes. Horizontal scaling is just easier than thoroughly optimizing our code or scaling vertically by adding more and more hardware (although I have heard that these multi-million Oracle Database servers with Exabytes of RAM and hundreds of CPUs still exist).</p><p><strong>Scaling horizontally comes at a cost, though. Instances don&#8217;t share their memory</strong>. An in-memory cache thus doesn&#8217;t work reliably. In this case, you can either accept that you have to do certain work n times (n being the number of instances of your service), or you can reduce the load on upstreams and databases even more by providing cached results to all instances of your service.</p><p>A common pattern is to deploy a Redis or a Memcached, which acts as a distributed cache. When one instance of a service makes an upstream request or sends a database query, it puts the result into one of these distributed caches and thus makes it available to all other instances. This can greatly reduce the number of requests all instances make overall, which in return improves resiliency once again. And in case of any errors of upstream components, all service instances can reuse the cached data to respond to requests for as long as they are allowed to.</p><h4>In-Memory Caching</h4><p>In-memory caching is the easiest (but sometimes also hardest) form of caching. Cached responses can simply be stored in-memory and reused from there whenever necessary (or possible). <strong>Memory-based caches are also the fastest caches available because there is nearly no overhead involved</strong>. 
Accessing memory is just faster than anything that involves network calls (which also includes leveraging distributed caching mechanisms), and it&#8217;s still faster than accessing the file system (which you can theoretically share among pods in a Kubernetes cluster, but I&#8217;d really advise against it unless you really know what you are doing).</p><p>A common pattern is to put an in-memory cache in front of your distributed cache, which is in turn put in front of any cacheable remote call. The call chain then usually works as follows:</p><ol><li><p>Try to find a cached result within the in-memory cache</p></li><li><p>If found, return that result</p></li><li><p>If not found, try to fetch a cached result from the distributed cache</p></li><li><p>If found, return that result</p></li><li><p>If not found, try to fetch a fresh result from the upstream component</p></li><li><p>If found, return that result and put the result into all cache layers</p></li><li><p>If not found, use a fallback strategy or return an error</p></li></ol><p>Besides improving the overall response times of your service, in-memory caching, once again, adds to the resiliency of your component. Even distributed caches can experience issues and outages. Having at least a few cached responses available in memory allows you to serve a few requests instead of none at all.</p><h4>Application-Level Caching</h4><p>Application-level caches sit in front of all business logic inside your service. <strong>They can be leveraged before you even have to put in any compute resources to calculate or fetch a response for your caller</strong>. Right at the beginning of processing a request, you can perform a lookup in all your different tiers of caches and see whether you already have a pre-calculated result. If so, you can return that. 
If not, you can still put in effort to compute a new result (which can then, in turn, be stored inside your application-level cache, so that you save computing power when the next request asking for the same result comes in).</p><p>As you can probably imagine, this reduces the load on all upstream components even further. You won&#8217;t usually end up with a 100% hit rate, but every percent counts. And once again, if some components are down, your application-level cache might still make someone&#8217;s day because they get a result instead of an error.</p><h3>Deduplication</h3><p>Deduplication is a strategy that tries to prevent the same work from being done multiple times concurrently. To better understand the problem deduplication solves, imagine the following scenario:</p><p>Three clients make a request to a service, and all three requests hit the service at roughly the same time. They all request the same data. All of them are processed at the same time. What now usually happens is that the service still does the same work thrice (unless there are locking mechanisms preventing it, but that&#8217;s usually not the case). If instead of three clients, you have to deal with a few thousand, the problem quickly becomes clear: Although you are probably already caching results and doing everything else, you still have to deal with a few thousand non-synchronized requests.</p><h4>In-Memory Deduplication</h4><p><strong>Deduplication tries to solve the issue of dealing with non-synchronized requests</strong>. In its simplest form, deduplication creates an in-memory queue handling requests to your service. All requests that ask for the same data get queued up. The first request that hits your service in that chain creates a new task within that queue, while synchronization mechanisms prevent other requests from advancing further in the processing chain. After the task is created, the first request gets subscribed to the result of that task. 
Then all other requests that ask for the same data can advance further, but instead of starting a new task, they also subscribe to the task that the first request created. Then, all work to complete the task is done exactly once (including looking up a potentially cached result in all caching layers), the result is cached (if it had to be computed anew), and it is then returned to all subscribed requests. This means that all clients get the same result, which had to be computed only once.</p><p>Deduplicating requests indirectly increases your service&#8217;s (and thus your system&#8217;s) resiliency. It directly reduces the computing power you need to serve requests and it also saves upstream components from having to put in unnecessary duplicate work. The less work a system needs to do, the less likely it becomes for it to break at all.</p><h4>Distributed Deduplication</h4><p>In a more complex form, deduplication can even be performed in a distributed manner. <em><strong><a href="https://discord.com/blog/how-discord-stores-trillions-of-messages">Discord, for example, decouples requests to its ScyllaDB cluster</a></strong></em>, which stores all messages for all Discord channels, by leveraging proxies and so-called data services.</p><p>First, a proxy distributes requests for channel messages by a routing key and then routes all requests with the same routing key to a specific instance of a so-called data service. These data services implement an in-memory queue (as described above), which ensures that requests for the same data only get processed exactly once at the same time. 
Within these data services, the corresponding queries are implemented to fetch certain types of data, which, in this case, only have to be sent to the database cluster once instead of multiple times.</p><p>This is only a small addition to the in-memory variant of deduplication (as described above), but the additional routing proxy ensures that the same work is really only done exactly once, and not by multiple instances of a service at the same time.</p><p>Admittedly, distributed deduplication is probably something you only need to do at a very large scale. But it&#8217;s still nice to know that there is a strategy available, should you ever find yourself working at a scale that justifies even measures like this one. You can view it as the premier class of deduplication with the most benefits for resiliency.</p><h3>Failover</h3><p>Even the most sophisticated methods of increasing resiliency we have already taken a look at can&#8217;t save you from all kinds of failure. Sometimes, components just break, no matter how much work you save them from. Infrastructure components or databases can just die out of the blue, and sometimes a whole data center loses power (like Cloudflare just experienced, which in turn brought all of npm down).</p><p>Failover is a strategy that employs a backup in case of an outage or emergency. A failing database, for example, can, in such a case, be replaced by a failover cluster that has been synced in the background all the time. If the main database fails, the failover automatically (hopefully) takes over.</p><p>There are multiple strategies for failover, but the most prominent one nowadays is probably the Multi-AZ (short for Availability Zone) deployment.</p><h4>Multi-AZ Deployments</h4><p>All major cloud providers provide so-called availability zones (sometimes just called differently because &#8230; marketing) within their regions. 
These can be separate data centers (located geographically close to the main data center), or just other rooms within the same data center. No matter how a cloud provider implements them, they all usually have their own networks, power supplies, etc. This improves the chances that even if something fails in one availability zone, others can take over because they are all separated from each other.</p><p>On AWS, for example, you can deploy everything in multiple availability zones. Even the nodes of the same Kubernetes cluster (usually just EC2 instances) can be hosted in different availability zones. Pods can then be spread among the nodes of different availability zones evenly (or oddly, or however you like; it&#8217;s fully configurable) by making use of topology spread constraints and similar mechanisms. The managed databases you can deploy on AWS usually also have an option for a Multi-AZ setup. In case of failure, they offer an option for an automatic failover strategy (which you still have to explicitly activate, though).</p><p>Multi-AZ deployments throw money at a problem that doesn&#8217;t often occur. But if it occurs, you&#8217;ll be happy to realize that your investment has paid for itself. The issues covered by this strategy are usually something completely out of your control.</p><h3>Canary Deployments</h3><p>Even if you implement all of the above strategies, there is still a pretty high chance of you just introducing good old bugs. Mistakes just happen. But thankfully, you can protect yourself to some extent.</p><p>While there are also other deployment methods that have the same goal in mind (reducing the impact of deploying broken software), canary deployments bring the most automation to the table, so we will take a look at them as a representative example.</p><p>Conceptually, a canary deployment is a form of an automated rollover strategy with progressive traffic shifting. 
Instead of just rolling over a deployment (replacing the old software with the new one step by step), a canary deployment rolls out progressively, automatically shifting and dividing traffic between a stable and a canary deployment.</p><h4>How a Canary Deployment Works</h4><p>A canary deployment works as follows:</p><ul><li><p>A new version of the software is deployed</p></li><li><p>The old, stable deployment still remains active</p></li><li><p>Automatically, a small percentage of traffic is shifted over to the new deployment</p><ul><li><p>The shift can either be sticky (session-based, for example) or happen randomly. That&#8217;s usually configurable.</p></li></ul></li><li><p>While traffic is shifted over to the new deployment, a controller closely monitors key metrics like error rates</p></li><li><p>If all observed metrics stay below a configured threshold, more and more traffic is gradually shifted over to the new deployment</p></li><li><p>If, during this process, certain thresholds are exceeded, the traffic is automatically and completely shifted back to the old deployment</p></li><li><p>If no thresholds are exceeded, traffic is slowly shifted over until it hits 100%, which makes the canary deployment the new stable deployment</p></li></ul><h4>How a Canary Deployment Improves Resiliency</h4><p>Even if a new deployment has bugs, it is unlikely to bring the whole system down. Due to the (usually) automated nature of the process (given that everything is configured correctly), issues are spotted early before they affect too many users. The whole system can usually never collapse due to one broken deployment.</p><p>Additionally, a canary deployment allows for rapid development. New features can quickly be tested in a production environment, which usually yields more useful data than synthetic tests. This indirectly improves the stability of the software because it&#8217;s tested against real traffic more often. 
Importantly, it also allows smaller chunks to be deployed more frequently. The smaller a release, the less likely major issues become. More features that have never seen the reality of production just increase the chance of bigger outages due to multiple bugs affecting each other.</p><div><hr></div><p>You have (finally?) come to the end of this issue, so let me tell you something:</p><p><strong>Thank you for reading this issue!</strong></p><p>And now? Enjoy your peace of mind. Take a break. Go on a walk. And if you feel like it, work on a few projects.</p><p>Do whatever makes you happy. In the end, that&#8217;s everything that counts.</p><p>See you next week!</p><p>- Oliver</p>]]></content:encoded></item><item><title><![CDATA[The most difficult thing in tech is sticking to your core competencies]]></title><description><![CDATA[What you as an engineer can do to improve the situation]]></description><link>https://newsletter.oliverjumpertz.com/p/the-most-difficult-thing-in-tech</link><guid isPermaLink="false">https://newsletter.oliverjumpertz.com/p/the-most-difficult-thing-in-tech</guid><dc:creator><![CDATA[Oliver Jumpertz]]></dc:creator><pubDate>Sat, 25 Nov 2023 17:36:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UK2j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UK2j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UK2j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!UK2j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg 848w, https://substackcdn.com/image/fetch/$s_!UK2j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!UK2j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UK2j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg" width="1200" height="686" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:686,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:133310,&quot;alt&quot;:&quot;The most difficult thing in tech is sticking to your core competencies&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The most difficult thing in tech is sticking to your core competencies" title="The most difficult thing in tech is sticking to your core competencies" 
srcset="https://substackcdn.com/image/fetch/$s_!UK2j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg 424w, https://substackcdn.com/image/fetch/$s_!UK2j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg 848w, https://substackcdn.com/image/fetch/$s_!UK2j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!UK2j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5986b2d-358d-42c6-a899-a1e2b3e03b9b_1200x686.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>We are funny beings. I mean us software engineers, because we <strong>love</strong> to build stuff. Give us any problem and we will probably get stuck for a few hours and wish we were never born a few times, but we will try to get through it, and we will come up with a solution. More often than not, however, we will build something ourselves, and we will be hellishly proud of our accomplishments.</p><p>Interestingly, most companies <strong>love</strong> exactly this. We solve their problems for them, and we even discover a few more along the way and use our precious time to solve them, too. What more could you ask for?</p><p>Well, did you know that this is one of the biggest problems in tech? 
<strong>We simply don&#8217;t know when to stop.</strong> We get so lost in what we love to do that we stop asking ourselves whether we spend our time wisely at all. Only a few companies and managers have realized this as well. And if they have, <strong>they will try to stop us</strong>.</p><p>If you are a little lost now and ask yourself what I am talking about, wait a second. Let me explain.</p><div><hr></div><h2>Software Engineering in the industry</h2><p>There is a huge difference between software engineering and industrial software engineering. The first is what we do (or better, want to do); the second is software engineering within a business context, or stated differently: <strong>economical software engineering</strong>.</p><p>The reality is that not all companies have unlimited resources. To be even more precise: <strong>No company has unlimited resources</strong>. They all have a certain budget they are willing to allocate and another budget they can spend at most.</p><p>Money is finite. Every bank account of every company holds a specific amount. If spending exceeds that amount, the company either goes bankrupt or needs to borrow money to continue operating. If a company borrows too much money, it definitely risks going bankrupt, and that is something most companies want to avoid, of course.</p><p>What does that have to do with us? Well, we somehow like to get paid for what we do. Additionally, we love to spend money. We spin up a few EC2 instances here, create a few ElastiCache clusters there, add a CloudFront distribution on top of it, and now we do the same on Azure or GCP because multi-cloud definitely increases our resiliency. That usually leaves your employer with a high bill, but at least it&#8217;s not you who has to pay it, right?</p><p>The scenario above is not necessarily bad. If going through your requirements results in a large cloud infrastructure on multiple cloud platforms, then so be it. 
At least Jake from the business unit can grab your cloud engineer Sarah, go through your cloud dashboards, and buy reserved instances for the next three years to save a lot of money and even get a few tax exemptions in some circumstances.</p><p>So, if that is not bad at all, do we even have a problem? Does this issue even make sense? Could you have spent your precious time more wisely than by actually reading all of the above, only to realize that I have fooled you?</p><p><strong>Nope. There is still a problem. Believe me.</strong></p><p>What we haven&#8217;t talked about yet are the countless hours we pour into building our CI/CD pipelines to deploy all of our software to our two cloud systems (by &#8220;we&#8221;, I mean us and the twenty other teams who do the same work). What we also haven&#8217;t talked about yet is all the work we put into setting up our own Kubernetes cluster and the dozens of operators we need. We add a service mesh, and definitely need to host a few databases ourselves, right within the cluster, because neither AWS, Azure, nor GCP has a native managed MongoDB (no, DocumentDB is not MongoDB; it is a separate database that emulates the MongoDB API. Sorry.) or a Redis with more than its default functionality.</p><p>Oh, and after all that, we have decided to build our own authentication system and implemented twelve Lambda functions and five micro-services, all of which have nothing to do with our core business but are still needed for our system to work.</p><p>And here comes the crux: <strong>There is no &#8220;reserved engineering capacity&#8221; at a discounted price.</strong> We get and deserve our salaries. No matter what happens. And this leads to the most difficult thing in tech.</p><div><hr></div><h2>Finding and sticking to our core competencies</h2><p>Core competencies are exactly what differentiates businesses from each other. They are what a business makes money with. And they are what you should usually focus your engineering efforts on. 
Engineers cost money, and you don&#8217;t usually get discounts. <strong>Everything you build and need to maintain thus usually costs more than your cloud resources or SaaS subscriptions</strong> because you always need to pay someone to take care of it for you.</p><p>If you build a fancy new AI app, your core competency probably lies somewhere in the problem space you are working in. If that is creating headshots of people with AI, it makes the most sense to focus your efforts on building a great user experience and creating ML models that produce awesome headshots. Your core competency is neither creating your own authentication system nor implementing your own vector database.</p><p>In such a scenario, it&#8217;s essential to make a few make-or-buy decisions. The leaders of your company need to ask themselves the following questions for everything you need for your product to work:</p><ol><li><p>Is this one of our core competencies?</p></li><li><p>How much does it cost us to build competence in a specific field of expertise and build and maintain the solution for the foreseeable future?</p></li><li><p>How much does it cost to simply buy an existing solution?</p></li></ol><p><strong>This is where many companies and engineers fail.</strong></p><p>Business leaders and budget owners see the cost of a SaaS solution priced at a few hundred thousand dollars a year and immediately begin to worry about their budget. What they don&#8217;t understand is that building something new usually costs far more than paying for the time some engineers need to integrate an existing solution.</p><p>Engineers, on the other hand, would love to create a solution themselves because it&#8217;s an interesting problem space they want to work in or it&#8217;s something that gives them the most opportunity to learn. 
They don&#8217;t care about the business aspect (which is fine because it is not one of their core competencies); they care about the engineering (which is exactly their core competency).</p><p>The harsh truth, however, is that the most efficient way to deal with these issues is to try to make a <strong>perfectly neutral and objective decision</strong>. It&#8217;s exactly as stated above. If it&#8217;s not your core competency, try to buy it. Only if all solutions that fulfill your requirements are more expensive than building something yourself (with all true costs associated with it, like long-term maintenance, etc.) should you ask yourself whether it&#8217;s worth building a new competency from the ground up and paying all costs associated with it (including frequent failures and outages until your solution is production-ready and solid enough), or whether you should simply pay someone else to give you access to a solution they take care of.</p><p>If something is just too expensive and you also cannot build it yourself because that, too, would blow your budget, it&#8217;s probably time to try to come up with an alternative solution. Going back to the Headshot AI app scenario: Perhaps you cannot build a full model yourself because you can&#8217;t afford the GPU time associated with training a model from scratch. In this case, you could come up with the idea to retrain an existing open source model. Maybe you need some time to get good results, maybe you never even reach perfect quality, but perhaps it&#8217;s just enough to get your first few thousand customers and bring in cash you can reinvest into developing your own model from scratch.</p><div><hr></div><h2>We, as engineers, still have to learn</h2><p>Even if some of our business leaders are bad at finding and sticking to a company&#8217;s core competencies, we as engineers can do better. We are the ones whose job it is to come up with solutions, but it&#8217;s also our job to offer expert advice. 
This can and should also include advising against building certain things ourselves only because we want to. Sometimes, we need to argue for spending a few hundred thousand dollars a year on a (let&#8217;s just say) ready-made monitoring solution instead of setting up Prometheus, Grafana, OpenTelemetry, and whatever else ourselves. Sometimes we need to argue that Auth0 (or another auth solution) is a better choice than taking an existing open source solution and customizing it ourselves if auth is just not our core competency (hint: it usually isn&#8217;t).</p><p>It&#8217;s really difficult. I know it myself. More than once I have designed full solutions in my head before realizing that I had already gone too far because the component in question was a clear buy candidate. I&#8217;m an engineer. I love to build difficult and fun stuff. I love to learn a lot along the way, and I love to later talk or write about the awesome things I have created. But it&#8217;s also my job to try to stay calm and assess a situation professionally. This includes sometimes having to speak against adding something to our stack that clearly doesn&#8217;t belong to our core competencies.</p><p>At some point, however, you work at a scale where ready-made solutions simply don&#8217;t work anymore. I have, for example, had some &#8220;fun experiences&#8221; with MongoDB clusters on Atlas costing nearly a million dollars a year because our scale had just become too big. At this point, it was clearly time to expand our competencies, even if that meant having to hire people, train them, and deal with running a MongoDB cluster ourselves. It just became cheaper to do it ourselves than to spend so much on a cloud service.</p><p>I am also currently gaining experience building a fully-fledged GraphQL Edge CDN. We don&#8217;t do this because we love to deal with global availability, low-latency responses, the complexity of dealing with cache entity tagging, or high resiliency. 
We do it because other existing solutions would cost us three times as much per month as we spend on our whole AWS bill per year (it&#8217;s definitely more than seven figures). So, we simply have no other choice.</p><p>The examples above also show one of the most important things we, as engineers, have to learn: We need to learn how to <strong>correctly</strong> make make-or-buy decisions. And we need to learn how to take every nuance into account. Well, technically, not only we but also our managers need to learn this, because we usually don&#8217;t get much insight into business KPIs like cost of development.</p><div><hr></div><h2>How to compare apples to apples</h2><p>If you want to make really informed decisions, you need to take as much as possible into account. You can&#8217;t just compare apples to oranges. <strong>You need to get it right.</strong></p><p>As already stated, you always need to think first about whether you are dealing with something that is a part of your core competencies. If not, you should think thrice about whether you really want to build a solution yourself. If you still come to the conclusion that you should at least find out whether making or buying comes cheaper, you will have to do some math. 
This, however, needs a clear process and is not easy, either.</p><p>To give you a general idea of how to come up with numbers you can compare, here is the process I usually use (but keep in mind that this might vary depending on how or where you work):</p><ol><li><p>Estimate how much time it takes you to build your own solution (I usually estimate hours)</p></li><li><p>Add 20% on top because engineers tend to underestimate complexity and usually can&#8217;t reliably foresee complexities arising during implementation</p></li><li><p>Multiply your final estimate by your calculated cost per developer (per hour)</p></li><li><p>Estimate how much time it takes to maintain the solution over the long term (I tend to let my team estimate how many hours on average they think they will spend per week)</p></li><li><p>Once again, add 20% on top</p></li><li><p>Multiply this estimate by the number of weeks in your timeframe and by your calculated cost per developer, again</p></li><li><p>Don&#8217;t forget the cost of potentially hiring new people or training existing staff</p></li><li><p>Calculate how much any outage due to your solution failing will cost you (this one is the most difficult)</p><ol><li><p>Expect self-built solutions to fail at least once</p></li><li><p>Expect to lose users due to this and calculate how much that churn will cost you</p></li><li><p>Take into account that it also costs you money when you or other engineers need to solve the problem while not working on other features (which in turn costs you users and money once again)</p></li></ol></li><li><p>Sum everything up</p></li><li><p>Take what a ready-made solution will cost you in the same timeframe that you need to build your own solution (if it takes you two years, you need that ready-made solution for at least two years)</p></li><li><p>Add 1-5% on top because even a bought solution will fail at some point and probably cost you a few users (it&#8217;s just not under your control to fix it but at least you pay someone to interrupt 
their well-deserved night and work until the issue gets fixed)</p></li><li><p>Sum up the previous two points</p></li><li><p>Compare the two sums</p></li></ol><p>Don&#8217;t worry. If you are just an individual contributor, someone from the business side will have to help you. In this case, you can&#8217;t know how much money one day of an engineer&#8217;s time costs the company. You also can&#8217;t know how much churning users cost the business. If you are an entrepreneur, though, you will need these numbers at hand. If you don&#8217;t have them, try to get them as soon as possible.</p><p>In the end, you end up with a pretty neutral number. It tells you either that building a solution yourself is cheaper or that buying something is. <strong>The only thing left is actually being strong enough to make a decision and stick to it for the foreseeable future.</strong></p><p>But that&#8217;s it for you. If you have come this far in the process, you have made a very professional decision (or at least contributed to it). It might not bring you joy because you are missing out on an incredible opportunity to do something new and/or crazy, but it will help you keep your sanity because everything you build fails at some point. The more complex the solution that fails, the more &#8220;fun&#8221; it is for engineers to fix it ASAP. Stated differently: The less you do yourself, the less code you need to maintain. The less code you need to maintain, the less debt you have. The less debt you have, the more peace of mind you gain.</p><p>See? It&#8217;s not <em>that</em> bad!</p><div><hr></div><p>You have (finally?) come to the end of this issue, so let me tell you something:</p><p><strong>Thank you for reading this issue!</strong></p><p>And now? Enjoy your peace of mind. Take a break. Go on a walk. And if you feel like it, work on a few projects.</p><p>Do whatever makes you happy. 
In the end, that&#8217;s everything that counts.</p><p>See you next week!</p><p>- Oliver</p>]]></content:encoded></item><item><title><![CDATA[My 7-Step Process to get through any System Design Interview]]></title><description><![CDATA[The best way to design a system, and to impress your interviewer(s)]]></description><link>https://newsletter.oliverjumpertz.com/p/my-7-step-process-to-get-through</link><guid isPermaLink="false">https://newsletter.oliverjumpertz.com/p/my-7-step-process-to-get-through</guid><dc:creator><![CDATA[Oliver Jumpertz]]></dc:creator><pubDate>Sat, 18 Nov 2023 11:30:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Let&#8217;s talk about system design, or better: <strong>System Design Interviews</strong>. I know that many tech interviews still consist of rounds and rounds of technical discussions, trying to figure out what you&#8217;re capable of. 
I also know that these processes usually suck because there are better ways than confronting you with two to three employees of a company who try to assess your skills.</p><p>Still, these interviews exist, and being somewhat prepared for them doesn&#8217;t hurt. Sometimes, it&#8217;s the company of your dreams that still uses such processes. In these cases, just applying somewhere else is not an option.</p><p>To make your life as easy as possible, and also give you a process that you can use outside of interviews (to design systems yourself), I want to share my personal 7-step process to design any system with you. It&#8217;s the process I still use when confronted with such interviews, and it&#8217;s the same process I use when I have to interview applicants at the senior staff level and upwards (at some levels, you want to make sure an applicant knows their craft), and when I design systems myself (of course).</p><p>Ready? 
Let&#8217;s get into it.</p><div><hr></div><h2>Step 1 - Clarify The Requirements</h2><p><strong>Ask questions, and a lot of them.</strong></p><ul><li><p>Try to find out what exactly the interviewer expects from you.</p></li><li><p>Try to narrow it down as far as possible to the exact scope of the problem. Even if the question is to (re)design an existing system, ask.</p></li></ul><p>The number of questions you could ask is nearly endless, but here are some ideas for you:</p><ul><li><p>Are there multiple frontends (App + Website + more)?</p></li><li><p>Are there regulatory requirements?</p></li><li><p>Do we need our own auth, or can we leverage an existing one (like login with Facebook/Google, etc.)?</p></li><li><p>Are there consumers planned for our APIs other than the known clients?</p></li><li><p>Do we need third-party integrations?</p></li><li><p>Do we want to incorporate a data lake for analysis?</p></li><li><p>Any requirements on data consistency?</p></li></ul><div><hr></div><h2>Step 2 - Define System Interfaces</h2><p><strong>Define all APIs that the system will (probably) need.</strong> Explain what each API is for in as much detail as possible so the interviewer can jump in and tell you if you got a requirement wrong.</p><p>If you got something wrong, no problem, adjust accordingly!</p><p>To give you a general idea, take a platform like Instagram, for example, which needs APIs to:</p><ul><li><p>Upload and view images</p></li><li><p>Retrieve the feed</p></li><li><p>Store and retrieve comments for an image</p></li><li><p>Store and retrieve likes</p></li><li><p>Manage your profile</p></li><li><p>Manage people's followers</p></li><li><p>Store and retrieve stories</p></li><li><p>etc.</p></li></ul><p>As you see, existing systems are already pretty large, so it makes sense to ask your interviewer if you should really lay out everything you could possibly think of, or if you should only focus on a part of the existing system.</p><h3>Illustrating System 
Interfaces</h3><p>You can, for example, start with a use case diagram, which is a great way to identify and showcase possibly needed APIs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XW3P!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87b97e64-11c3-444d-86a6-21a9279ada92.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XW3P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87b97e64-11c3-444d-86a6-21a9279ada92.avif 424w, https://substackcdn.com/image/fetch/$s_!XW3P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87b97e64-11c3-444d-86a6-21a9279ada92.avif 848w, https://substackcdn.com/image/fetch/$s_!XW3P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87b97e64-11c3-444d-86a6-21a9279ada92.avif 1272w, https://substackcdn.com/image/fetch/$s_!XW3P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87b97e64-11c3-444d-86a6-21a9279ada92.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XW3P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87b97e64-11c3-444d-86a6-21a9279ada92.avif" width="527" height="341" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87b97e64-11c3-444d-86a6-21a9279ada92.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:341,&quot;width&quot;:527,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7115,&quot;alt&quot;:&quot;a simple use case diagram for an instagram-like system&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="a simple use case diagram for an instagram-like system" title="a simple use case diagram for an instagram-like system" srcset="https://substackcdn.com/image/fetch/$s_!XW3P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87b97e64-11c3-444d-86a6-21a9279ada92.avif 424w, https://substackcdn.com/image/fetch/$s_!XW3P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87b97e64-11c3-444d-86a6-21a9279ada92.avif 848w, https://substackcdn.com/image/fetch/$s_!XW3P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87b97e64-11c3-444d-86a6-21a9279ada92.avif 1272w, https://substackcdn.com/image/fetch/$s_!XW3P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87b97e64-11c3-444d-86a6-21a9279ada92.avif 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption"><em>A simple use case diagram</em></figcaption></figure></div><div><hr></div><h2>Step 3 - Estimate The Scale Of The System</h2><p><strong>Estimate the scale of the system and communicate your estimates clearly.</strong> Once again, if you got a requirement wrong, the interviewer can tell you, and you can adjust. If not, the interviewer can better understand your design decisions.</p><p>You don't need to give an exact estimate here; simply put forward a number that you find reasonable. In the case of well-known platforms, inform yourself prior to the interview. 
The numbers for Instagram, Facebook, etc., are all out there on the internet.</p><p>The numbers in question are daily or monthly active users, which you convert into an estimate of how many API requests per second (or per minute) your system should be able to handle.</p><p>If you interview at an organization that has such a well-known platform, you can impress your interviewer if you have the numbers at hand. This shows that you are really interested in the company and have done your homework.</p><p>You don't have to memorize those numbers exactly. Being off by 1 or 2 million doesn't matter, but you shouldn't be off by hundreds of millions. Being off by such a large amount could lead to design decisions you would not have made otherwise.</p><div><hr></div><h2>Step 4 - Define The Data Model</h2><p><strong>Define the data model of the system, and define how data flows.</strong> Ask how detailed this step should be, and whether your interviewer requires it at all.</p><p>If it's required, also explain and maybe visualize how data flows between the components of the system.</p><p>Leverage relational models where applicable, but don't forget that there are other types out there as well, like documents, graphs, and many more. Try to use the most appropriate one for the entity at hand. It might, for example, make sense to leverage graphs for friendship-like relationships or for follower relations. If you find such a graph model fitting, use it and model the entity accordingly.</p><p>Don't forget to lay out how the entities move through the system. For example: The feed service fetches a user from the user service. The user entity thus travels from the user service to the feed service, and maybe even further.</p><p>You don't have to come up with the exact solution that the engineers of an existing platform came up with. That's not what an interview is about. 
Simply add the properties you find fitting and that are just enough to fulfill the requirements.</p><p>Your users, for example, will most likely need a first name, a last name, an ID, an email address, and so on.</p><h3>Illustrating Entities And Data Flow</h3><p>You can use a UML class diagram, draw a box with properties in it, or simply list the properties of an entity to describe the data.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CmSR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36cbb0c-91bd-48e3-abf4-33b06f906e15.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CmSR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36cbb0c-91bd-48e3-abf4-33b06f906e15.avif 424w, https://substackcdn.com/image/fetch/$s_!CmSR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36cbb0c-91bd-48e3-abf4-33b06f906e15.avif 848w, https://substackcdn.com/image/fetch/$s_!CmSR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36cbb0c-91bd-48e3-abf4-33b06f906e15.avif 1272w, https://substackcdn.com/image/fetch/$s_!CmSR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36cbb0c-91bd-48e3-abf4-33b06f906e15.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CmSR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36cbb0c-91bd-48e3-abf4-33b06f906e15.avif" width="880" height="440" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f36cbb0c-91bd-48e3-abf4-33b06f906e15.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:440,&quot;width&quot;:880,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6915,&quot;alt&quot;:&quot;three possible ways to model your entities&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="three possible ways to model your entities" title="three possible ways to model your entities" srcset="https://substackcdn.com/image/fetch/$s_!CmSR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36cbb0c-91bd-48e3-abf4-33b06f906e15.avif 424w, https://substackcdn.com/image/fetch/$s_!CmSR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36cbb0c-91bd-48e3-abf4-33b06f906e15.avif 848w, https://substackcdn.com/image/fetch/$s_!CmSR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36cbb0c-91bd-48e3-abf4-33b06f906e15.avif 1272w, https://substackcdn.com/image/fetch/$s_!CmSR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36cbb0c-91bd-48e3-abf4-33b06f906e15.avif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Three possible ways to model an entity</em></figcaption></figure></div><p>To illustrate the data flow, you can use simple illustrations to showcase how the data flows for certain operations.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3GdP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3GdP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif 424w, 
https://substackcdn.com/image/fetch/$s_!3GdP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif 848w, https://substackcdn.com/image/fetch/$s_!3GdP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif 1272w, https://substackcdn.com/image/fetch/$s_!3GdP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3GdP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif" width="880" height="440" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:440,&quot;width&quot;:880,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15779,&quot;alt&quot;:&quot;A simple diagram showcasing how to draw how data flows within a system&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A simple diagram showcasing how to draw how data flows within a system" title="A simple diagram showcasing how to draw how data flows within a system" 
srcset="https://substackcdn.com/image/fetch/$s_!3GdP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif 424w, https://substackcdn.com/image/fetch/$s_!3GdP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif 848w, https://substackcdn.com/image/fetch/$s_!3GdP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif 1272w, https://substackcdn.com/image/fetch/$s_!3GdP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fc4a8cf-f670-4b6e-8c65-924ac747186f.avif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" 
stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>A simple illustration of data flows within a system</em></figcaption></figure></div><div><hr></div><h2>Step 5 - Draw A High-Level Design</h2><p>Most systems only have a few very important components that make up the core of the system. <strong>Try to identify them and draw boxes for the five most important ones, and visualize their interactions.</strong> Especially in modern micro-service environments, services communicate with each other (a lot!).</p><p>If service A needs some data that is managed by service B, you already have two boxes and one arrow to draw to illustrate that communication.</p><p><em>Remember</em>: If you have already modeled the data flow of certain entities, reuse this information here.</p><h3>Illustrating The High-Level Design</h3><p>Once again, a simple illustration with some boxes and some arrows, including descriptions, is enough to showcase the design you have in mind.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uyGP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uyGP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif 424w, 
https://substackcdn.com/image/fetch/$s_!uyGP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif 848w, https://substackcdn.com/image/fetch/$s_!uyGP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif 1272w, https://substackcdn.com/image/fetch/$s_!uyGP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uyGP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif" width="880" height="440" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e29e6bf5-e027-45a8-9962-4844febe422c.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:440,&quot;width&quot;:880,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17937,&quot;alt&quot;:&quot;A high-level system design of an Instagram-like service&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A high-level system design of an Instagram-like service" title="A high-level system design of an Instagram-like service" srcset="https://substackcdn.com/image/fetch/$s_!uyGP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif 424w, 
https://substackcdn.com/image/fetch/$s_!uyGP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif 848w, https://substackcdn.com/image/fetch/$s_!uyGP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif 1272w, https://substackcdn.com/image/fetch/$s_!uyGP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe29e6bf5-e027-45a8-9962-4844febe422c.avif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" 
y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>A high-level design drawing of an Instagram-like system</em></figcaption></figure></div><div><hr></div><h2>Step 6 - Design Components In Detail</h2><p>Explicitly ask if you should focus on specific components for this step. If not, <strong>choose the three most important ones and visualize/explain how they should work and what they should do in detail</strong>. A good interviewer will guide you.</p><p>Take an image service, for example, which is one of the core components of an Instagram-like system. It offers an API to retrieve images, which is handled by an internal implementation. To store images, it needs access to some kind of storage, which is handled by another internal implementation. Lay all that out, draw diagrams to illustrate your ideas, and help the interviewer understand what you are designing.</p><h3>Illustrating Detailed Components</h3><p>You could, for example, sketch something that resembles a UML component diagram to illustrate a detailed view of a service.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uVwn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uVwn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif 424w, https://substackcdn.com/image/fetch/$s_!uVwn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif 848w, 
https://substackcdn.com/image/fetch/$s_!uVwn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif 1272w, https://substackcdn.com/image/fetch/$s_!uVwn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uVwn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif" width="880" height="440" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:440,&quot;width&quot;:880,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:10489,&quot;alt&quot;:&quot;A simple component diagram of an image service in an Instagram-like service&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A simple component diagram of an image service in an Instagram-like service" title="A simple component diagram of an image service in an Instagram-like service" srcset="https://substackcdn.com/image/fetch/$s_!uVwn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif 424w, 
https://substackcdn.com/image/fetch/$s_!uVwn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif 848w, https://substackcdn.com/image/fetch/$s_!uVwn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif 1272w, https://substackcdn.com/image/fetch/$s_!uVwn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dff87a7-ddbd-4be3-bd4f-b23377b3c8e2.avif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" 
y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>A UML component diagram of a possible image service</em></figcaption></figure></div><div><hr></div><h2>Step 7 - Explicitly State Bottlenecks</h2><p><strong>Every system and every design has its limits.</strong> If there are bottlenecks in your system (and there always will be), state them explicitly. Give detailed explanations about what affects them, and how you could mitigate them.</p><p>This step requires the most knowledge about the components you used to design your system. You'll have to know, for example, when relational databases fall noticeably behind NoSQL stores, or in which situations certain message brokers are no longer the best choice.</p><p>Maybe a relational model was good for the anticipated scale of the system, but migration becomes a pain at some point? You can be pretty creative in this step, but once again, you need to know the advantages and disadvantages of the components and also of the design principles you built your system upon. You don't have to be able to recall each and every disadvantage, but you should be able to state the most important ones.</p><p>Sometimes interviewers will also jump in and give you hints or explicitly state a bottleneck they found. That's your chance to discuss it with the interviewer and find a solution together.</p><div><hr></div><p>The 7 steps listed above can guide you through a system design interview. They will help you design a system in a structured way, but they obviously won't help you ace the interview automatically. That's up to your skills, creativity, and imagination.</p><p>From my very own experience, I never expected candidates to come up with a solution that you could implement right away and that would work flawlessly afterward. 
For me, the system design interview has always been more of a creative and structured brainstorming session, in which I get to know the candidate's ability to design systems and recall important traits of certain components better.</p><p>What I usually found pretty helpful for me and all candidates, though, was exactly that structured approach: starting from the very beginning and slowly going deeper and into more detail. This way, candidates can structure their thoughts better and I am able to follow along better, which is a win-win situation.</p><div><hr></div><p>You have (finally?) come to the end of this issue, so let me tell you something:</p><p><strong>Thank you for reading this issue!</strong></p><p>And now? Enjoy your peace of mind. Take a break. Go on a walk. And if you feel like it, work on a few projects.</p><p>Do whatever makes you happy. In the end, that&#8217;s all that counts.</p><p>See you next week!</p><p>- Oliver</p>]]></content:encoded></item><item><title><![CDATA[How Netflix, PayPal, and other big tech companies scale their API development]]></title><description><![CDATA[Learn more about the technology allowing tech companies to scale their APIs while additionally increasing their development speed]]></description><link>https://newsletter.oliverjumpertz.com/p/how-netflix-paypal-and-other-big</link><guid isPermaLink="false">https://newsletter.oliverjumpertz.com/p/how-netflix-paypal-and-other-big</guid><dc:creator><![CDATA[Oliver Jumpertz]]></dc:creator><pubDate>Sat, 11 Nov 2023 13:50:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Nearly all software products need an API at some point.</strong></p><p>You either need your frontend to talk to your backend (that&#8217;s basically most products out there), or you want other people&#8217;s frontends or systems to talk to your product (Stripe is a prime example of such a product).</p>
<p><strong>Creating APIs, however, can turn into both an art form and an organizational nightmare.</strong></p><p>Here is why:</p><ol><li><p>Good APIs need to be well-crafted. You can&#8217;t just create some REST endpoints without much design and call it a day. You need to put some real effort into them to make them usable and scalable.</p></li><li><p>Dedicated API teams usually create a middleware layer between frontends and backends, which basically adds a whole additional iteration to your feature cycle. Instead of two iterations (first the backend, then the frontend), you now need three iterations until a feature of your product can make an impact for customers.</p></li></ol><p>Companies like <em>Netflix</em>, <em>PayPal</em>, and many more, however, seem to have finally found a way to decrease the organizational burden of creating their APIs.</p><p>It&#8217;s the same technology I work with daily, nearly at the same scale as these companies, and in this issue, we are going to take a look at what exactly these companies do, and what we can learn from them. But first, we will take a look at common challenges when creating APIs before we dive into how you can solve most of them.</p><div><hr></div><h2>Setting the foundation</h2><p>Before diving deeper, we should first set a foundational layer of understanding. 
In this case, you need to understand which type of API we are talking about here, and which ones we aren&#8217;t.</p><p>In this particular case, we are talking about <strong>&#8220;THE&#8221; API </strong>between frontends and backends, so basically the entry point to any system.</p><p>The backend itself usually contains many more APIs on a micro-service level, with many different protocols used, and a lot of inter-service communication.</p><p>To get a better idea of what we are talking about, take a look at the following image:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bB61!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1be4b89-3dac-409a-bbcc-b2e47178eed5.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bB61!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1be4b89-3dac-409a-bbcc-b2e47178eed5.avif 424w, https://substackcdn.com/image/fetch/$s_!bB61!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1be4b89-3dac-409a-bbcc-b2e47178eed5.avif 848w, https://substackcdn.com/image/fetch/$s_!bB61!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1be4b89-3dac-409a-bbcc-b2e47178eed5.avif 1272w, https://substackcdn.com/image/fetch/$s_!bB61!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1be4b89-3dac-409a-bbcc-b2e47178eed5.avif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!bB61!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1be4b89-3dac-409a-bbcc-b2e47178eed5.avif" width="711" height="271" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f1be4b89-3dac-409a-bbcc-b2e47178eed5.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:271,&quot;width&quot;:711,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:8468,&quot;alt&quot;:&quot;Overview of a system with a public API&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Overview of a system with a public API" title="Overview of a system with a public API" srcset="https://substackcdn.com/image/fetch/$s_!bB61!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1be4b89-3dac-409a-bbcc-b2e47178eed5.avif 424w, https://substackcdn.com/image/fetch/$s_!bB61!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1be4b89-3dac-409a-bbcc-b2e47178eed5.avif 848w, https://substackcdn.com/image/fetch/$s_!bB61!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1be4b89-3dac-409a-bbcc-b2e47178eed5.avif 1272w, https://substackcdn.com/image/fetch/$s_!bB61!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1be4b89-3dac-409a-bbcc-b2e47178eed5.avif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>A simple architectural view of a public API in front of a system</em></figcaption></figure></div><p>It&#8217;s exactly as stated. 
A more or less public API sits at the front of the system and connects backend and clients, and in this issue, we will focus on exactly this public-facing API.</p><p>Stated as simply as possible: We talk about the <strong>public-facing remote API</strong> of a system.</p><div><hr></div><h2>The issues with APIs</h2><p><strong>Creating APIs comes at a cost.</strong> To be even more precise, there are several issues you face and hurdles you have to clear to get a good API up and running.</p><p>All companies building large-scale systems face these issues at some point, and we will now take a look at what these are.</p><h3>1. Designing APIs is difficult</h3><p>There are so many things you need to think about when you want to create a <strong>really good API</strong> that it&#8217;s mind-blowing. You need to think about</p><ul><li><p>Usability</p></li><li><p>Scalability</p></li><li><p>Security</p></li><li><p>Maintainability</p></li></ul><p>to name only a few. And even when you are done thinking about these, you still need to implement your API, which comes with its own problems once again.</p><p>Stripe is a great example of a company that puts a lot of work into its API (but that is no wonder because <strong>this API is Stripe&#8217;s product</strong>). Their API just &#8220;feels good&#8221;, and it&#8217;s easy to use, which is no accident because <em><strong><a href="https://stripe.com/blog/payment-api-design">Stripe&#8217;s engineers put a lot of work </a></strong></em>into creating it.</p><p>Stripe&#8217;s engineers go through many feedback loops until a new API endpoint finally gets released into the wild. This means that changes aren&#8217;t that easy or fast to make. Instead, doing it slowly and carefully is king in this particular scenario. That&#8217;s not for every company out there. Some of them want to iterate fast (or even faster), most probably also without compromising on API quality.</p><h3>2. 
Creating APIs is even more difficult</h3><p><strong>When creating an API, you usually also face organizational issues.</strong> Who designs the API? Who implements it? Who runs it? Who maintains it? (You get the point)</p><p>In many cases and for many companies, the public-facing API is created by a dedicated team, which creates a new layer of responsibility, and a few more issues.</p><p>The API team gets its requirements either from the business-side (product managers, e.g.) or client teams (who want to implement features in their clients, which they need new or additional data for). It basically looks like shown in the following image:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n_yk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff155623-f14b-494d-b19d-52a2189d80df_266x272.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n_yk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff155623-f14b-494d-b19d-52a2189d80df_266x272.png 424w, https://substackcdn.com/image/fetch/$s_!n_yk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff155623-f14b-494d-b19d-52a2189d80df_266x272.png 848w, https://substackcdn.com/image/fetch/$s_!n_yk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff155623-f14b-494d-b19d-52a2189d80df_266x272.png 1272w, https://substackcdn.com/image/fetch/$s_!n_yk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff155623-f14b-494d-b19d-52a2189d80df_266x272.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!n_yk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff155623-f14b-494d-b19d-52a2189d80df_266x272.png" width="266" height="272" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff155623-f14b-494d-b19d-52a2189d80df_266x272.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:272,&quot;width&quot;:266,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9041,&quot;alt&quot;:&quot;The responsibility chain&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The responsibility chain" title="The responsibility chain" srcset="https://substackcdn.com/image/fetch/$s_!n_yk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff155623-f14b-494d-b19d-52a2189d80df_266x272.png 424w, https://substackcdn.com/image/fetch/$s_!n_yk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff155623-f14b-494d-b19d-52a2189d80df_266x272.png 848w, https://substackcdn.com/image/fetch/$s_!n_yk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff155623-f14b-494d-b19d-52a2189d80df_266x272.png 1272w, https://substackcdn.com/image/fetch/$s_!n_yk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff155623-f14b-494d-b19d-52a2189d80df_266x272.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>The responsibility chain</em></figcaption></figure></div><p>Every new feature you want to introduce needs to run through these three layers, usually in the following order:</p><ol><li><p>The backend needs to add new data or new endpoints</p></li><li><p>The API team needs to incorporate the new data or endpoints into the public-facing API</p></li><li><p>The client teams need to use that data to create their features</p></li></ol><p>You could now argue that all teams can still work on a feature concurrently, but there is still some delay between getting a basic idea of how to implement something and providing mock data and/or mock endpoints for other teams 
to use. Either way, the feature is only finished when all teams are finished.</p><p>On the other hand (and to be fair), having a dedicated API team also has one crucial advantage: <strong>You have exactly one team that needs to know how to create great public-facing APIs.</strong> That&#8217;s a fact you should not underestimate either.</p><h3>3. Increased mobile usage creates even more requirements for APIs</h3><p>Nowadays, <em><strong><a href="https://datareportal.com/global-digital-overview">the world is mobile</a></strong></em>, and mobile usage is only going to increase in the future. Everyone I know has at least some form of smartphone, and they use it regularly. <strong>Chances are thus high that most users of any product are going to use exactly that product from a mobile device sooner or later.</strong></p><p>We don&#8217;t have mobile-first web design for no reason. Love it or hate it, you will most probably have to deal with mobile users if you don&#8217;t want to exclude quite a few potential customers from your product. Most products even have a mobile app for exactly that reason (because people somehow like &#8220;native&#8221; apps more than mobile websites).</p><p>Mobile websites and apps have an issue, though: They often need data from your backend, and mobile coverage isn&#8217;t always the best. Additionally, large mobile data plans aren&#8217;t available at a reasonable price in every country, which puts additional constraints on your design. <strong>You can&#8217;t let your users fetch megabytes of data only to display a small portion of it</strong>. This over-fetching is a common issue with REST APIs.</p><p>Lastly, the times when you could get away with only a web client and one mobile client each for iOS and Android are over. 
Depending on what you build, you could even need <strong>twenty or more different clients.</strong></p><p>Believe it or not, <em>Netflix</em>, for example, really has to deal with twenty or more different clients. As a streaming service, they not only live on the web and on mobile devices, but they also need to provide content on Smart TVs (which are especially fragmented because every manufacturer has its own OS with different constraints), Google Chromecast, Amazon Fire TV Stick, Android TV, and more (I know what I&#8217;m talking about because I also work for a pretty large streaming service provider).</p><p>It&#8217;s difficult to build a single API that serves so many clients, all of which potentially have slightly different requirements for data or ways to get that data. <strong>A general-purpose API quickly becomes too inflexible.</strong></p><div><hr></div><h2>If general-purpose APIs don&#8217;t do the job, what then?</h2><p>To serve so many different clients with slightly differing requirements, you would usually have to build the ultimate general-purpose API. <strong>But that&#8217;s nearly impossible</strong>.</p><p>Your API would probably consist of many different routes/endpoints and often duplicate logic, only changed slightly, so all clients get the data and functionality they need. This quickly gets out of hand.</p><p>That problem isn&#8217;t new, though, because <em><strong><a href="https://samnewman.io/patterns/architectural/bff/">Sam Newman already wrote about the so-called Backend-for-Frontend (BFF) pattern in 2015</a></strong></em>. 
(To be honest, this pattern had already been applied prior to Newman&#8217;s article, but he was the first one to talk about it publicly.)</p><p>The general idea goes as follows:</p><p>Every client with its special needs gets its own API, called Backend-for-Frontend, which basically looks like this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8foz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8foz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif 424w, https://substackcdn.com/image/fetch/$s_!8foz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif 848w, https://substackcdn.com/image/fetch/$s_!8foz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif 1272w, https://substackcdn.com/image/fetch/$s_!8foz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8foz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif" width="711" height="401" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:401,&quot;width&quot;:711,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9214,&quot;alt&quot;:&quot;System overview with multiple backends-for-frontends&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="System overview with multiple backends-for-frontends" title="System overview with multiple backends-for-frontends" srcset="https://substackcdn.com/image/fetch/$s_!8foz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif 424w, https://substackcdn.com/image/fetch/$s_!8foz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif 848w, https://substackcdn.com/image/fetch/$s_!8foz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif 1272w, https://substackcdn.com/image/fetch/$s_!8foz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41a12a06-8f78-4f58-8cf5-c7fe12ba4972.avif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>The BFF Pattern</em></figcaption></figure></div><p>If your iOS app wants to fetch data differently than your web app, you only need to add a special endpoint in the iOS BFF. The requirements of your iOS app don&#8217;t affect your web app because it&#8217;s a totally different API that needs to be adjusted.</p><p>This solves one of the many issues: Delivering exactly the data a client needs because all APIs are tailored to the needs of that specific client.</p><p>Instead of creating different endpoints for specific clients in a general-purpose API, you assign dedicated teams to each BFF (or even let full-stack engineers build both the client and their own API). 
This increases the flexibility of your API(s) and additionally eliminates (or at least lessens the impact of) the bottleneck a single API potentially creates.</p><p>(The pattern works well, as SoundCloud, for example, <em><strong><a href="https://www.thoughtworks.com/insights/blog/bff-soundcloud">has successfully proven in the past</a></strong></em>.)</p><p>Around that time, other companies also had issues with different clients and varying requirements for their APIs. To tackle the problem with traditional REST APIs and mobile apps, Lee Byron, Dan Schafer and Nick Schrock invented GraphQL internally at Facebook in 2012 (they got the idea when working on a redesign of their iOS app). In 2015, Facebook open-sourced GraphQL, and since then, GraphQL has seen massive adoption throughout the industry.</p><p>GraphQL solves an issue Backends-for-Frontends create: Instead of having to provide multiple APIs for different clients, you can shrink your API layer down to one again. At the same time, each client can fetch exactly the data it needs. So you can basically kill two birds with one stone. Not bad, is it? It doesn&#8217;t matter whether your GraphQL entities have 50 or more fields. Clients only fetch what they really need.</p><p>But GraphQL comes with some of the costs of a general-purpose API (or, better said, it reintroduces them): You are once again down to one team (or, less commonly, two or more teams) working on that single API. Chances are high that the resources you allocate to your GraphQL API are still not enough to serve the needs of all stakeholders at the same time. 
<strong>You have reintroduced a bottleneck</strong>.</p><div><hr></div><h2>The architectural pattern allowing Netflix, PayPal and other big tech companies to evolve their APIs into something far better</h2><p>If a general-purpose API is too inflexible and creates a feature bottleneck, BFFs take too much work, and GraphQL itself reintroduces the feature bottleneck, what then?</p><p>Time to finally take a look at how big tech companies evolve their APIs and solve most of the problems that the solutions we have looked at so far either create or cannot eliminate.</p><p>The answer is an architectural model and protocol addition to GraphQL itself, called <strong>GraphQL Federation</strong>.</p><p>GraphQL Federation is also known as <strong>Apollo Federation</strong>, named after <em>Apollo GraphQL</em>, the company that develops the architectural model (partially together with <em>Netflix</em>) and offers commercial and open-source software based on it.</p><p>GraphQL Federation brings something to the table that completely eliminates the need for a middleman: <strong>It allows engineers to directly attach backend services to the API</strong>, automatically, without the need for a team that does the grunt work of connecting clients and backends.</p><p>Instead of creating micro-services that provide RESTful APIs, or leverage gRPC, Thrift, or whatever else seems fitting, teams can directly implement their APIs using GraphQL. All changes a team makes to its own API are reflected in the public API as soon as that team updates its public schema. 
No middleman is required.</p><p>Conceptually, Federation consists of three parts:</p><ol><li><p>Well-designed additions on top of the GraphQL spec in the form of specific directives (directives are like annotations in Java or decorators in TypeScript).</p></li><li><p>A schema registry to which the teams of individual services upload their GraphQL schemas, and from which clients, as well as a central component, can fetch a single aggregated schema.</p></li><li><p>A central component, called Router, that breaks queries apart, creates a query plan, and calls multiple backend services (called subgraphs) under the hood. All individual responses from subgraphs are then reassembled into a single response.</p></li></ol><p>But let&#8217;s take a closer look at what these are and how they work.</p><h3>1. Federation-Specific Directives</h3><p>Conceptually, nothing changes when working with Apollo Federation. A graph always consists of a well-defined set of types, queries, mutations and (recently added) subscriptions.</p><p>A type is like an entity, some form of data your graph offers. It consists of simple fields or subqueries, which are like remote methods that any client can call.</p><p>This is an example of a simple entity that models a blog post:</p><pre><code><code>type BlogPost {
  id: ID!
  title: String!
  description: String
  author: String!
  publishedAt: DateTime!
  lastUpdated: DateTime
}</code></code></pre><p>And this is an example of a corresponding query:</p><pre><code><code>type Query {
  # fetches a single blog post by its id
  blogPost(id: ID!): BlogPost
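  # Illustrative only (not from the original article): a client could call this
  # field and select just the fields it needs. The id value "42" is made up.
  #   query { blogPost(id: "42") { title publishedAt } }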
}</code></code></pre><p>In a federated graph, entities can be accessed between graphs. In the above case, a blog post could be provided by a blog post service. Blog posts usually have comments, and these might be handled by a completely different team, with their own micro-service and their own data storage.</p><p>In this case, the schema for a comment could look like this:</p><pre><code><code>type Comment {
  id: ID!
  author: String!
  text: String!
}</code></code></pre><p>In a &#8220;classical&#8221; GraphQL API, the API service would implement the schema for all types and fetch the corresponding data from different services if certain fields are requested. The comment schema above would move into the original schema and be provided by the one single API service. In a federated API, however, things are a little different, and this is where federation-specific directives come into play.</p><p>Each subgraph (every service attached to the federated graph with its own schema) can mark certain entities as &#8220;accessible by other graphs&#8221;. It does so by adding the key directive to every type it wants to make accessible by others.</p><p>In the scenario from above, the comment schema would look as follows:</p><pre><code><code>type Comment @key(fields: "id") {
  id: ID!
  author: String!
  text: String!
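  # Sketch under assumptions (not from the original article): because of the key
  # above, the Router can hand this subgraph a representation such as
  #   { "__typename": "Comment", "id": "c1" }   (the id value is made up)
  # through the `_entities` field defined by the Apollo Federation subgraph spec,
  # and the subgraph answers with the matching Comment.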
}</code></code></pre><p>The only addition is the key directive next to the name of the type. It just makes one statement: &#8220;You can uniquely identify this type by one field, namely id.&#8221; This makes the Comment type a <strong>federated entity</strong>.</p><p>If the blog post service now wants to include a Comment in its schema, it can do so by simply referencing it as follows:</p><pre><code><code>type BlogPost {
  id: ID!
  title: String!
  description: String
  author: String!
  publishedAt: DateTime!
  lastUpdated: DateTime
  comments: [Comment!]
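  # Illustrative client query (hypothetical, with a made-up id) that spans both
  # services. The Router would typically ask the blog post service for the post
  # and the comment service for the author and text of each comment:
  #   query { blogPost(id: "42") { title comments { author text } } }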
} </code></code></pre><p>That is all there is to it. The BlogPost itself needs no key directive if no other subgraph needs to include it in its types, and it can simply include the Comment type within its own schema. The only thing the providing service of a federated entity needs to do is implement a special method (a reference resolver, in Apollo&#8217;s terminology) that allows the federation infrastructure to fetch one or multiple entities by the fields specified within their key directive. Everything else is handled by the rest of the federated GraphQL infrastructure.</p><p>This also eliminates a primary reason for inter-service communication in a system. The communication is implicitly done by the Router in the front. Services themselves don&#8217;t necessarily need to talk to each other if they are connected through a federated graph. If a client fetches multiple entities that are connected through the schema, the Router in front of the system will issue the corresponding requests to fetch all necessary data from multiple services.</p><p>Beyond that, other subgraphs can also extend existing entities with single or multiple fields, without providing new types themselves, and there are also ways to make certain fields resolvable by multiple subgraphs, or override them. These are lesser-used directives, however, so looking at the key directive specifically should suffice to give you a general idea.</p><h3>2. Schema Registry</h3><p>The schema registry is a central part of a system that leverages a federated graph.</p><p>Every subgraph (remember, a micro-service that offers its own partial GraphQL endpoint) uploads its schema to the registry. The registry then takes all available schemas and wires together two new schemas:</p><ol><li><p>An API schema</p></li><li><p>A supergraph schema</p></li></ol><p>The API schema is the public schema that all clients use to communicate with the GraphQL API itself. 
This schema doesn&#8217;t contain federated directives and only provides what clients really need to generate types and client code. It contains all types, interfaces, queries, mutations, and so on from all subgraphs.</p><p>The supergraph schema is a special schema that has the same contents as the API schema, but it additionally contains federated directives and special control directives, which the Router uses to create query plans. This schema is usually not public.</p><h3>3. Router</h3><p>The Router is the entry point to a federated graph. All clients send their GraphQL requests to this component, which then uses the supergraph schema to create a dedicated query plan.</p><p>First, a large query is split into multiple smaller queries (if necessary). Then the resulting subqueries are sent to the respective subgraphs. After that, all individual responses are stitched back together into a single, unified response, which is then sent back to the client.</p><p>The Router also takes care of handling errors that inevitably happen in distributed systems. 
Sometimes, errors from subqueries can be aggregated into the final response, and sometimes if the schema doesn&#8217;t allow for it (non-null values missing due to errors, for example), the whole request needs to result in an error.</p><p>Architecturally, you can imagine the Router being placed in a system as follows:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KoB5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd93234a4-763d-46c0-bf3e-4e0144efc735.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KoB5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd93234a4-763d-46c0-bf3e-4e0144efc735.avif 424w, https://substackcdn.com/image/fetch/$s_!KoB5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd93234a4-763d-46c0-bf3e-4e0144efc735.avif 848w, https://substackcdn.com/image/fetch/$s_!KoB5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd93234a4-763d-46c0-bf3e-4e0144efc735.avif 1272w, https://substackcdn.com/image/fetch/$s_!KoB5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd93234a4-763d-46c0-bf3e-4e0144efc735.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KoB5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd93234a4-763d-46c0-bf3e-4e0144efc735.avif" width="711" height="321" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d93234a4-763d-46c0-bf3e-4e0144efc735.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:321,&quot;width&quot;:711,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:8435,&quot;alt&quot;:&quot;A system overview of a federated graph&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A system overview of a federated graph" title="A system overview of a federated graph" srcset="https://substackcdn.com/image/fetch/$s_!KoB5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd93234a4-763d-46c0-bf3e-4e0144efc735.avif 424w, https://substackcdn.com/image/fetch/$s_!KoB5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd93234a4-763d-46c0-bf3e-4e0144efc735.avif 848w, https://substackcdn.com/image/fetch/$s_!KoB5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd93234a4-763d-46c0-bf3e-4e0144efc735.avif 1272w, https://substackcdn.com/image/fetch/$s_!KoB5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd93234a4-763d-46c0-bf3e-4e0144efc735.avif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" 
stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>GraphQL Federation</em></figcaption></figure></div><div><hr></div><h2>How Federated GraphQL helps Netflix, PayPal, and many more with creating APIs at scale</h2><p>Now that you have a general idea of how GraphQL/Apollo Federation works, it&#8217;s time to take a look at what exactly it offers that makes it such a valid alternative to other solutions that big tech companies go all-in on it.</p><p>First of all, it&#8217;s still GraphQL. GraphQL&#8217;s huge advantage over REST is that any client can specify and fetch exactly the data it wants and really needs. Backend or API teams don&#8217;t need to deal with any of this. They can simply provide the data they have available and even extend their types on the fly without breaking anything, decreasing performance, or even increasing the network overhead. No more over-fetching.</p><p>Second, it eliminates the bottleneck a centrally maintained API creates. 
Features don&#8217;t need to be worked on by additional API teams. As soon as one or more backend services have implemented a new feature, clients can begin fetching the new data or using the new remote procedures. Changes and fixes reach clients much faster. This is a massive speed increase.</p><p>Third, it&#8217;s still GraphQL (yes, again), and this comes with human-readable schemas that are much faster to scan and understand than OpenAPI specs for RESTful APIs. That&#8217;s a huge advantage of GraphQL schemas. They were designed with humans in mind.</p><p>Fourth, by spreading the API all over the backend, every engineer working on a backend service integrated into the federated API also becomes an API engineer. It may not sound like much, but frontend and backend engineers have always had very different views on what makes an API usable. More often than not, backend engineers simply expose their database schema as their API schema and call it a day. Almost always, this results in an API that is hard to use. Frontend teams often imagine different, more usable APIs, with a better user/developer experience. In a federated GraphQL API, backend engineers can be held responsible and can&#8217;t hide behind the fact that someone still has to build a facade (the public-facing API) and call their API under the hood.</p><p>Now that you know what GraphQL Federation offers, it&#8217;s time to look at some case studies to understand how two selected companies use GraphQL at scale, with hundreds or thousands of services and billions of requests a day.</p><div><hr></div><h2>The story of Netflix and GraphQL Federation</h2><p>As one of the earliest GraphQL Federation users (and contributors), Netflix adopted GraphQL very early. 
But interestingly, Netflix began using GraphQL Federation in a not-so-public system first: its <em><strong><a href="https://netflixtechblog.com/how-netflix-scales-its-api-with-graphql-federation-part-1-ae3557c187e2">Studio API</a></strong></em>.</p><p>Netflix has been producing original content for quite a while, and a large portion of Netflix&#8217;s staff works on these original productions. A large share of Netflix engineers also work on the systems that help Netflix staff manage these productions, from the very first creative pitch to post-production and the final release of a movie or series.</p><p>All these workflows are handled by the Studio API, which allows custom clients to be built on top of it. Thanks to the flexibility of GraphQL, these clients range from mobile clients for creative scouts to web or desktop apps for Netflix Studio employees doing their work in the office.</p><p>The Studio Graph consists of at least three services or &#8220;domain graph services&#8221; (as Netflix calls them), each of which models a distinct but important part of the overall domain.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JhB3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d433e4-cc69-48a9-a283-a2823b0e755f.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JhB3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d433e4-cc69-48a9-a283-a2823b0e755f.avif 424w, https://substackcdn.com/image/fetch/$s_!JhB3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d433e4-cc69-48a9-a283-a2823b0e755f.avif 848w, 
https://substackcdn.com/image/fetch/$s_!JhB3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d433e4-cc69-48a9-a283-a2823b0e755f.avif 1272w, https://substackcdn.com/image/fetch/$s_!JhB3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d433e4-cc69-48a9-a283-a2823b0e755f.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JhB3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d433e4-cc69-48a9-a283-a2823b0e755f.avif" width="1292" height="919" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81d433e4-cc69-48a9-a283-a2823b0e755f.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:919,&quot;width&quot;:1292,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42504,&quot;alt&quot;:&quot;A system overview of the Netflix Studio API&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A system overview of the Netflix Studio API" title="A system overview of the Netflix Studio API" srcset="https://substackcdn.com/image/fetch/$s_!JhB3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d433e4-cc69-48a9-a283-a2823b0e755f.avif 424w, https://substackcdn.com/image/fetch/$s_!JhB3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d433e4-cc69-48a9-a283-a2823b0e755f.avif 848w, 
https://substackcdn.com/image/fetch/$s_!JhB3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d433e4-cc69-48a9-a283-a2823b0e755f.avif 1272w, https://substackcdn.com/image/fetch/$s_!JhB3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d433e4-cc69-48a9-a283-a2823b0e755f.avif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Source: <strong><a 
href="https://netflixtechblog.com/how-netflix-scales-its-api-with-graphql-federation-part-1-ae3557c187e2">https://netflixtechblog.com/how-netflix-scales-its-api-with-graphql-federation-part-1-ae3557c187e2</a></strong></em></figcaption></figure></div><p>Next to its Studio API, Netflix also started migrating its mobile clients from its API framework <em><strong><a href="https://netflix.github.io/falcor/">Falcor</a></strong></em> over to Federated GraphQL in 2022.</p><p>Prior to this, the API was maintained by a single team that used the Java version of Falcor to implement the mobile API. But this created exactly the issues you have already read about. The API team always had to manage communication between both involved parties (frontend and backend), and additionally had to catch up on domain knowledge to understand the needs and desires of its stakeholders.</p><p>After realizing the potential gains of GraphQL over its custom framework Falcor, and the productivity gained by directly connecting backend and frontend engineers, Netflix undertook a <em><strong><a href="https://netflixtechblog.com/migrating-netflix-to-graphql-safely-8e1e4d4f1e72">very interesting migration</a></strong></em> (a story too long to cover here, but well worth reading) with zero downtime and a lot of A/B Testing, Replay Testing, and Sticky Canaries.</p><p>Since June 2023, all of Netflix&#8217;s mobile traffic has been served through a single Federated GraphQL API, with globally distributed instances of Netflix&#8217;s custom implementation of a Router. We don&#8217;t know exactly how many subgraphs the whole graph contains, but it&#8217;s reasonable to assume that it&#8217;s at least a few hundred.</p><p>Next to this, Netflix invests heavily in GraphQL by building its own GraphQL infrastructure. 
Engineers are working on at least a custom Router and a custom Registry, and have also open-sourced their own <em><strong><a href="https://netflix.github.io/dgs/">Spring-based DGS framework</a></strong></em>.</p><div><hr></div><h2>The story of PayPal and GraphQL Federation</h2><p>PayPal&#8217;s journey into GraphQL started in 2017, when they faced a growing problem: Their Checkout was powered by a RESTful API, or rather, multiple RESTful APIs: Backends-for-Frontends (BFFs). Core teams built the system in the backend, creating APIs with different protocols, not caring much about how that data would be used. Multiple other teams would then build BFFs for specific clients or products.</p><p>A crucial issue PayPal&#8217;s engineers noticed at that time was a lot of repeated, low-value boilerplate code in all BFFs. Data had to be fetched, filtered, mapped, and sorted over and over again. This used up valuable time that engineers could have spent on what they were actually meant to do: building (UI) features.</p><p>Initially, PayPal&#8217;s Web Platform team built a monolithic GraphQL API as a proof of concept. Despite a lot of success and many issues solved, the team soon realized that building the API all by themselves would not suffice forever. This is when they stumbled upon GraphQL Federation and started to champion the architectural pattern and technology internally.</p><p>Soon after, PayPal started migrating over to GraphQL Federation and has added more and more services to its Graph throughout the years. 
We have no explicit data on how many subgraphs the PayPal graph contains, but we can reasonably assume that it is in the hundreds by now.</p><p>PayPal&#8217;s engineers have since been <em><strong><a href="https://medium.com/paypal-tech/graphql-at-paypal-an-adoption-story-b7e01175f2b7">quoted praising federation</a></strong></em> and its added benefits, as it increases the development speed of new features and frees capacity to work on more important things than dedicated APIs.</p><p>As far as we know, PayPal has made GraphQL its default choice for all new UI apps. With over 50 apps and products already connected to the graph, this number will only grow. Additionally, some of its core services (Identity, Payments, and Compliance) have fully migrated to GraphQL. PayPal even offers a <em><strong><a href="https://graphql.braintreepayments.com/">public GraphQL API</a></strong></em> to interact with its services.</p><p>From some of PayPal&#8217;s blog posts, we know that PayPal has no dedicated GraphQL infrastructure team. They leverage Apollo&#8217;s platform and use the components offered by its product <em>GraphOS</em>.</p><div><hr></div><p>You have (finally?) come to the end of this issue, so let me tell you something:</p><p><strong>Thank you for reading this issue!</strong></p><p>Although this issue was incredibly long, I still hope you found it useful.</p><p>And now? Enjoy your peace of mind. Take a break. Go on a walk. And if you feel like it, work on a few projects.</p><p>Do whatever makes you happy. 
In the end, that&#8217;s everything that counts.</p><p>See you next week!</p><p>- Oliver</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.oliverjumpertz.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Educated Software Engineer! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[FOMO is killing your career]]></title><description><![CDATA[Why running after new tech is never going to solve your problems]]></description><link>https://newsletter.oliverjumpertz.com/p/fomo-is-killing-your-career</link><guid isPermaLink="false">https://newsletter.oliverjumpertz.com/p/fomo-is-killing-your-career</guid><dc:creator><![CDATA[Oliver Jumpertz]]></dc:creator><pubDate>Tue, 07 Nov 2023 13:42:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!V3ug!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V3ug!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V3ug!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!V3ug!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!V3ug!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!V3ug!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!V3ug!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:223123,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!V3ug!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!V3ug!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!V3ug!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!V3ug!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd176bfad-43b0-4a0a-91ce-ef80b7255d2e_1200x630.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.oliverjumpertz.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.oliverjumpertz.com/subscribe?"><span>Subscribe now</span></a></p><h2>I can still remember myself, a few years younger, searching the internet every day, just to find out what cool new tech there was out in the world.</h2><p>My reasoning behind this was actually pretty simple:</p><blockquote><p>The fresher and crazier the new tech I use, the better my chances must be to get a high-paying job. The better I keep up with the latest tech, the better my overall chances.</p><p>- A younger naive me</p></blockquote><p>But guess what? <strong>It didn&#8217;t work. It never worked this way.</strong></p><p>I know for sure that many aspiring and experienced developers out there still try to chase the latest tech, worrying about their careers or trying to take the next step.</p><p>If you&#8217;re one of them, stay with me. Let me try to show you what I learned over the course of my (already very long) career, and what helped me calm down and focus on the only things that really matter.</p><div><hr></div><h2>The problem with FOMO</h2><h3>The biological problems</h3><p><strong>FOMO, the fear of missing out, puts pressure on you (when experienced for a longer period of time).</strong></p><p>Sadly, most of us humans are pretty bad at handling pressure. 
We are not made to sustain a high level of pressure over a long period of time.</p><p>Biologically, we were made to be relatively calm beings. Our genes are still those of hunters and gatherers. We are made to walk for miles and search for something edible. We are made for short, intense physical activities like hunting or running away from danger.</p><p>This is also why <a href="https://mhanational.org/what-adrenaline">adrenaline exists</a>, the hormone that (in theory) should help us. If you&#8217;re put under pressure, your body releases adrenaline as a way to sharpen your senses and help you over a short period of time. That adrenaline rush then allows you to hunt down your next meal, or run away from a danger that could otherwise kill you.</p><p>Adrenaline rushes are usually triggered by one of the following:</p><ol><li><p>Fear</p></li><li><p>Excitement</p></li><li><p>Anxiety</p></li><li><p>Stress</p></li></ol><p>Three of the four triggers of an adrenaline rush already have to do with FOMO. The <strong>fear</strong> of missing out. Your fear can lead to stress, and that stress can lead to anxiety.</p><p>The result?</p><ul><li><p>Jitters and irritability</p></li><li><p>Trouble falling or staying asleep</p></li><li><p>Frequent headaches</p></li><li><p>Difficulties with concentration and memory</p></li><li><p>Constantly tense or sore muscles</p></li><li><p>Unexplained weight loss</p></li></ul><p>I think we don&#8217;t need to argue about the implications for your health. They are bad, and you probably want to avoid them like hell. You even <strong>should</strong> avoid them like hell.</p><h3>The professional problems</h3><p><strong>Professionally, there are also quite a few issues with FOMO.</strong></p><p>Always running after the latest tech robs you of crucial experience. You usually don&#8217;t need more experience with a specific tech stack if you suffer from FOMO. 
What you actually need is more general experience solving problems.</p><p>Regularly changing &#8220;your&#8221; tech stack early on resets your whole learning experience each time. Instead of continuing to learn new concepts and encounter new problems, you confront yourself with a new syntax, new APIs, and new technology-specific problems.</p><p>It&#8217;s like climbing a staircase, deciding midway to take a different one, and jumping back down to the bottom to do so. You never reach the top this way.</p><p>In the end, you end up with mediocre experience in a few technologies, without the traits that are usually expected from a more experienced developer.</p><p>Next to that, I have already seen quite a few CVs of developers who were definitely suffering from FOMO. Their skills section contained more bullet points and was longer than the section with their previous projects and prior experience.</p><p>More often than not, engineers who suffer from FOMO are only mediocre developers because they have never dived deep enough. They just stopped at a certain level and decided to reset everything because a specific technology was more important to them than all the fundamentals you can learn in (really) any programming language and with any library.</p><p>In the end, FOMO kills an engineer&#8217;s career because the reality is:</p><p><strong>You only advance in your career if you become a better engineer, not a better &#8220;user of technology xyz&#8221;.</strong></p><h2>But what then?</h2><p>By now, you have probably realized that running after the latest technologies is not the way to get further in your career. But what should you do then? What really matters?</p><h3>First of all, calm down. 
There is no reason for FOMO.</h3><p>Yes, there are companies out there that hire solely based on experience in very specific programming languages, technologies, and frameworks, but the ones that really matter hire you for your experience as a software engineer.</p><p>Google, for example, <a href="https://www.google.com/about/careers/applications/jobs/results/?skills=software%20engineer">only hires Software Engineers</a>. You are hired based on your experience and then assigned to a team that uses specific languages and tools.</p><p>The reasoning behind it?</p><p><strong>Software engineers are not defined by the technologies they use. These tools are just that: tools to get a specific job done, with engineering principles.</strong></p><p>This is exactly the way it should be. A company usually has one or a few problems it wants to solve with software. These problems somehow need to be solved. The solutions need to be usable for the users they are meant for.</p><p>All that needs to be done in an economically sustainable way, and the cost must not exceed the budget. Otherwise, the project is either canceled because there is no money left, or an automated solution to a problem becomes more expensive than the prior manual solution.</p><p>And how do you do that?</p><p>By putting a lot of work into designing a solution and deciding on technologies to use.</p><p>The best way, in this case, is deciding what to use based on the key data you have available. What does your budget allow for? What problems need to be solved? Which technologies are capable of solving these issues? 
And so on.</p><p><strong>And what talent do you need if you really want to do it this way?</strong></p><p>Well, you need engineers who have already seen a lot and who are capable of getting into some new technology as fast as possible while still being able to leverage their existing experience.</p><p>That experience usually has more to do with the problems themselves, like:</p><ol><li><p>How do you efficiently store data?</p></li><li><p>How can you decrease the latency as far as possible?</p></li><li><p>How do you get as much UX as possible into your solution (on the tech side of things)?</p></li><li><p>How can you work with petabytes of data efficiently?</p></li><li><p>etc.</p></li></ol><p><strong>This experience is not necessarily gained with different technologies.</strong></p><h3>Second, get your priorities straight</h3><p>You need to ask yourself a few questions first:</p><ol><li><p>Where do I want to end up?</p></li><li><p>What do I really enjoy?</p></li><li><p>What am I ready to do for it?</p></li></ol><p>Number 1 sets your goals straight. What is your end goal? How far do you actually have to go?</p><p>Number 2 ensures you keep your sanity. It makes no sense to learn or work with something you do not enjoy using at all.</p><p>And number 3 ensures that you are aware of what you are ready to sacrifice or put into it.</p><p>If you put all these answers into a single sentence, it will usually look something like this:</p><p><strong>&#8220;I want to become a skilled software engineer who enjoys solving problems in [the (frontend|backend|fullstack|mobile|gaming) space | xyz space] (regex is everywhere&#8230;), and I am ready to allocate n hours a day to get there.&#8221;</strong></p><p>Great, isn&#8217;t it? A single sentence that defines all of your professional life for the foreseeable future.</p><p>But don&#8217;t worry, it&#8217;s only a mantra. 
What matters even more is what follows now.</p><p><strong>Wherever you want to end up, whatever you enjoy, and whatever you are ready to sacrifice, you won&#8217;t get there the way you imagined.</strong></p><p>There is one simple way to gain all the experience you need, and that is the following:</p><ol><li><p>Pick technologies you like and enjoy using</p></li><li><p>Build as many projects as you like (although the more, the better)</p></li><li><p>Ensure you face many different problems</p></li><li><p>Repeat for as long as you enjoy doing it</p></li></ol><p>That. Is. All.</p><p>There is usually a way to do everything with as few technologies as possible, without you having to always switch around because you could potentially miss something.</p><p>You can do everything with JavaScript and Mongo. You can also build fullstack apps with Python (hey, Django) and Java (GWT, anyone?), or build frontends with Go, and you can potentially even do data science and machine learning in any language. You can use vectors with Postgres, or perform ML with it, and even Redis is a suitable main database nowadays.</p><p>This is the crucial thing. <strong>Your priority is becoming a competent software engineer</strong>, not a competent JavaScript/Python/Whatever developer. A loop is still a loop, a network still works the same way no matter which language you use, and the fundamentals of databases remain the same, no matter which one you use (a document DB is a document DB, a relational database a relational one, etc.).</p><p><strong>You won&#8217;t lose much by going down this route.</strong></p><p>Yes, there will be companies that won&#8217;t hire you as a frontend developer because they expect two years of experience with React instead of Vue. 
There will be companies that will reject you because you know Java instead of Go.</p><p>This isn&#8217;t even a bad thing, because a company that is this short-sighted usually also isn&#8217;t a very good company to work for.</p><p>More than enough companies have a hiring process that does not look at specific technologies but instead puts an emphasis on your experience in a specific area. For these companies, experience in Vue is as good as in React, because they both follow very similar principles, and experience in Go is as good as in Java.</p><h3>Third and lastly, do everything to enjoy what you are doing</h3><p>Enjoying the journey of learning software engineering is way more important than grinding through it and bringing yourself closer to burnout, only because you fear not being accepted the way you are (the reality is&#8230; most seniors still suffer from this, so you&#8217;re not alone).</p><p>If you&#8217;re at the beginning, you still have a few years to learn until you can call yourself a senior. If you&#8217;re already further down the line, congratulations, you still have years to come. Learning never stops for a software engineer. <strong>Really. Never.</strong></p><p>Just view this journey as a marathon instead of a sprint, and enjoy some peace of mind. <strong>You deserve it.</strong></p><div><hr></div><p>If you have come this far, let me tell you something:</p><p><strong>Thank you for reading this issue!</strong></p><p>And now? Enjoy your peace of mind. Take a break. Go on a walk. And if you feel like it, work on a few projects.</p><p>Do whatever makes you happy. 
In the end, that&#8217;s everything that counts.</p><p>See you next week!</p><p>- Oliver</p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.oliverjumpertz.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Educated Software Engineer! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item></channel></rss>