Updating conversion, creating readmes

Jonas Zeunert
2024-04-19 23:37:46 +02:00
parent 3619ac710a
commit 08e75b0f0a
635 changed files with 30878 additions and 37344 deletions

@@ -1,9 +1,9 @@
 Awesome Site Reliability Engineering !Awesome (https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg) (https://github.com/sindresorhus/awesome)
 Awesome Site Reliability Engineering !Awesome (https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg) (https://github.com/sindresorhus/awesome)
 (https://dastergon.gr/awesome-sre)
A curated list of awesome Site Reliability (https://www.usenix.org/conference/srecon14/technical-sessions/presentation/keys-sre) and Production 
(https://www.usenix.org/conference/srecon15/program/presentation/canahuati) Engineering resources.
A curated list of awesome Site Reliability (https://www.usenix.org/conference/srecon14/technical-sessions/presentation/keys-sre) and Production (https://www.usenix.org/conference/srecon15/program/presentation/canahuati) Engineering 
resources.
What is Site Reliability Engineering?
▐ "Fundamentally, it's what happens when you ask a software engineer to design an operations function." - Ben Treynor Sloss, VP Google Engineering, founder of Google SRE
@@ -61,8 +61,7 @@
⟡ Engineering Reliability into Web Sites: Google SRE (https://research.google.com/pubs/pub32583.html)
⟡ DEVOPS & SRE AMA - Building High Performance Organizations (https://vimeo.com/179914447)
⟡ John Allspaw's AMA on Incident Analysis and Postmortems (https://community.atlassian.com/t5/Jira-Ops-questions/I-m-John-Allspaw-Ask-Me-Anything-about-incident-analysis-and/qaq-p/957084)
⟡ Site Reliability Engineering with Paul Newson - Part 1 (https://www.gcppodcast.com/post/episode-38-site-reliability-engineering-with-paul-newson/) & Part 2 
(https://gcppodcast.com/post/episode-59-sre-ii-with-paul-newson/)
⟡ Site Reliability Engineering with Paul Newson - Part 1 (https://www.gcppodcast.com/post/episode-38-site-reliability-engineering-with-paul-newson/) & Part 2 (https://gcppodcast.com/post/episode-59-sre-ii-with-paul-newson/)
⟡ How SysAdmins Devalue Themselves (https://queue.acm.org/detail.cfm?id=2891413)
⟡ The Softer Side of DevOps (https://www.youtube.com/watch?v=ry51Llzil1I)
⟡ SRE, noun. See also: confidence, trust. (https://medium.com/@kobolog/sre-noun-see-also-confidence-trust-e7e33e19efc1)
@@ -75,8 +74,7 @@
⟡ Microservices, DevOps and Production Complexity (https://blog.netsil.com/microservices-devops-and-operational-complexity-be98cb01b660)
⟡ Introducing Google Customer Reliability Engineering (https://cloudplatform.googleblog.com/2016/10/introducing-a-new-era-of-customer-support-Google-Customer-Reliability-Engineering.html)
⟡ Evolution or Rebellion? The rise of Site Reliability Engineers (SRE) (https://robhirschfeld.com/2016/12/29/evolution-or-rebellion-the-rise-of-site-reliability-engineers-sre/)
⟡ The difference between Site Reliability Engineering, System Administration, and DevOps
 (https://standalone-sysadmin.com/the-difference-between-site-reliability-engineering-system-administration-and-devops-d05031495499)
⟡ The difference between Site Reliability Engineering, System Administration, and DevOps (https://standalone-sysadmin.com/the-difference-between-site-reliability-engineering-system-administration-and-devops-d05031495499)
⟡ SRE in the Small and in the Large (https://www.usenix.org/conference/lisa16/conference-program/presentation/closing-plenary)
⟡ SBSRE Meetup: Different SRE roles and challenges(Netflix) (https://www.youtube.com/watch?v=zLXf0cKDOv0)
⟡ Panel: Who/What Is SRE? (https://www.usenix.org/conference/srecon16/program/presentation/definition-of-sre-panel)
@@ -111,8 +109,7 @@
⟡ The human scalability of "DevOps" (https://medium.com/@mattklein123/the-human-scalability-of-devops-e36c37d3db6a)
⟡ Podcast: Site Reliability Management with Mike Hiraga (https://softwareengineeringdaily.com/2018/04/09/site-reliability-management-with-mike-hiraga/)
⟡ How a cat inspired system reliability at Knowlarity (https://medium.com/@Knowlarity_Engineering/how-a-cat-inspired-system-reliability-at-knowlarity-ad73c24f29a7)
⟡ Getting Started with Site Reliability Engineering
 (https://github.com/devopsenterprise/2018-London/blob/master/Tuesday/Breakout%20Sessions/Throne%2C%20Stephen%2C%20Getting%20Started%20with%20Site%20Reliability%20Engineering.pdf)
⟡ Getting Started with Site Reliability Engineering (https://github.com/devopsenterprise/2018-London/blob/master/Tuesday/Breakout%20Sessions/Throne%2C%20Stephen%2C%20Getting%20Started%20with%20Site%20Reliability%20Engineering.pdf)
⟡ "Practical Applications of the Dickerson Pyramid" by Nat Welch (https://www.youtube.com/watch?v=xWAfTAu0Mww)
⟡ LinkedIn's Kurt Andersen Uncovers Blindspots in SRE Implementations (https://blameless.com/blog/sre-implementations-blindspots/)
⟡ Interview with Betsy Beyer, Stephen Thorne of Google (https://driftboatdave.com/2018/10/09/interview-with-betsy-beyer-stephen-thorne-of-google/)
@@ -149,8 +146,7 @@
⟡ So you want to be a Site Reliability Engineer? (https://www.loomsystems.com/single-post/2016/03/23/So-you-want-to-be-a-Site-Reliability-Engineer)
⟡ Spiraling Ops Debt & the SRE Coding Imperative (https://www.loomsystems.com/blog/2017/02/06/spiraling-ops-debt-the-sre-coding-imperative)
⟡ So you want to be an SRE? (https://hackernoon.com/so-you-want-to-be-an-sre-34e832357a8c)
⟡ Career Profiles/Site Reliability Engineer
 (https://www.khanacademy.org/college-careers-more/career-content/career-profile-videos/site-reliability-engineer/v/ruth-grace-site-reliability-engineer-what-i-do-and-how-much-i-make)
⟡ Career Profiles/Site Reliability Engineer (https://www.khanacademy.org/college-careers-more/career-content/career-profile-videos/site-reliability-engineer/v/ruth-grace-site-reliability-engineer-what-i-do-and-how-much-i-make)
⟡ What is the role of a Site Reliability Engineer? (https://cloudacademy.com/blog/what-is-the-role-of-a-site-reliability-engineer/)
⟡ Lynda.com: DevOps Foundations: Site Reliability Engineering (https://www.lynda.com/Software-Development-tutorials/DevOps-Foundations-Site-Reliability-Engineering/669542-2.html)
⟡ Incident Management Training: Wheel of Misfortune (https://dastergon.gr/wheel-of-misfortune/)
@@ -190,8 +186,7 @@
⟡ Real-World SRE (https://www.packtpub.com/web-development/real-world-sre)
⟡ Seeking SRE (http://shop.oreilly.com/product/0636920063964.do)
⟡ What is SRE? (https://www.verizondigitalmedia.com/e-book/oreilly-what-is-sre/)
⟡ Engineering Reliable Mobile Applications: Strategies for Developing Resilient Native Mobile Applications
 (https://landing.google.com/sre/resources/practicesandprocesses/engineering-reliable-mobile-applications/)
⟡ Engineering Reliable Mobile Applications: Strategies for Developing Resilient Native Mobile Applications (https://landing.google.com/sre/resources/practicesandprocesses/engineering-reliable-mobile-applications/)
⟡ Building Secure and Reliable Systems (https://landing.google.com/sre/book.html)
⟡ Chaos Engineering: Crash test your applications (https://www.manning.com/books/chaos-engineering/)
⟡ 97 Things Every SRE Should Know (https://www.oreilly.com/library/view/97-things-every/9781492081487/)
@@ -239,12 +234,10 @@
⟡ Available...or not? That is the question - CRE life lessons (https://cloudplatform.googleblog.com/2017/01/available-or-not-that-is-the-question-CRE-life-lessons.html)
⟡ How Google Backs Up The Internet Along With Exabytes Of Other Data (http://highscalability.com/blog/2014/2/3/how-google-backs-up-the-internet-along-with-exabytes-of-othe.html)
⟡ Performance, Scalability, And High Availability: 3 Key Infrastructure Adaptability Requirements (http://highscalability.com/blog/2017/2/2/performance-scalability-and-high-availability-3-key-infrastr.html)
⟡ The Production Environment at Google - Part 1 (https://medium.com/@jerub/the-production-environment-at-google-8a1aaece3767) & Part 2 
(https://medium.com/@jerub/the-production-environment-at-google-part-2-610884268aaa)
⟡ The Production Environment at Google - Part 1 (https://medium.com/@jerub/the-production-environment-at-google-8a1aaece3767) & Part 2 (https://medium.com/@jerub/the-production-environment-at-google-part-2-610884268aaa)
⟡ Reliable releases and rollbacks - CRE life lessons (https://cloudplatform.googleblog.com/2017/03/reliable-releases-and-rollbacks-CRE-life-lessons.html)
⟡ How release canaries can save your bacon - CRE life lessons (https://cloudplatform.googleblog.com/2017/03/how-release-canaries-can-save-your-bacon-CRE-life-lessons.html)
⟡ Things I Learned Managing Site Reliability for Some of the World's Busiest Gambling Sites
 (https://zwischenzugs.wordpress.com/2017/04/04/things-i-learned-managing-site-reliability-for-some-of-the-worlds-busiest-gambling-sites/)
⟡ Things I Learned Managing Site Reliability for Some of the World's Busiest Gambling Sites (https://zwischenzugs.wordpress.com/2017/04/04/things-i-learned-managing-site-reliability-for-some-of-the-worlds-busiest-gambling-sites/)
⟡ Every Day Is Monday in Operations (https://www.linkedin.com/pulse/introduction-every-day-monday-operations-benjamin-purgason)
⟡ Under the Hood: Ensuring Site Reliability (https://engineering.squarespace.com/blog/2017/under-the-hood-ensuring-site-reliability)
⟡ Designing reliable systems with cloud infrastructure (Google Cloud Next '17) (https://www.youtube.com/watch?v=7Hy_6SMn8pY)
@@ -386,8 +379,7 @@
⟡ Embracing Feedback (https://blog.heptio.com/embracing-feedback-2fd703da714f)
⟡ Postmortem Action Items: Plan the Work and Work the Plan (https://www.usenix.org/conference/srecon17americas/program/presentation/lueder)
⟡ Social Issues In Postmortems (https://medium.com/@allspaw/social-issues-in-postmortems-d48dde624d18)
⟡ Google Has an Official Process in Place for Learning From Failure--and It's Absolutely Brilliant
 (https://www.inc.com/justin-bariso/meet-postmortem-googles-brilliant-process-tool-for-learning-from-failure.html)
⟡ Google Has an Official Process in Place for Learning From Failure--and It's Absolutely Brilliant (https://www.inc.com/justin-bariso/meet-postmortem-googles-brilliant-process-tool-for-learning-from-failure.html)
⟡ Postmortem culture: how you can learn from failure (https://rework.withgoogle.com/blog/postmortem-culture-how-you-can-learn-from-failure/)
⟡ re:Work - Postmortem discussion template (https://docs.google.com/document/d/1ob0dfG_gefr_gQ8kbKr0kS4XpaKbc0oVAk4Te9tbDqM/edit)
⟡ Post-mortems to the rescue (https://increment.com/documentation/post-mortems-to-the-rescue/)
@@ -448,8 +440,7 @@
⟡ Service Level Disagreements (https://blog.b3k.us/2009/07/15/service-level-disagreements.html)
⟡ How We Use Sloth to do SLO Monitoring and Alerting with Prometheus (https://mattermost.com/blog/sloth-for-slo-monitoring-and-alerting-with-prometheus/)
⟡ SLI Deep Dive (https://medium.com/site-reliability-engineering-leadership/sli-deep-dive-cae92bd90a79)
⟡ Measuring Reliability in GCP: Step By Step SLO creation guide using Cloud Operation Sandbox
 (https://medium.com/google-cloud/measuring-reliability-in-gcp-step-by-step-slo-creation-guide-using-cloud-operation-sandbox-99043bd0e70f)
⟡ Measuring Reliability in GCP: Step By Step SLO creation guide using Cloud Operation Sandbox (https://medium.com/google-cloud/measuring-reliability-in-gcp-step-by-step-slo-creation-guide-using-cloud-operation-sandbox-99043bd0e70f)
⟡ SLO tracker (https://slotracker.com/)
⟡ SLO Alerting for Mortals (https://ervinbarta.com/2021/10/19/slo-alerting-for-mortals/)
⟡ SRE methods and climate change (https://bpetit.nce.re/2021/03/sre-methods-and-climate-change/)
@@ -495,8 +486,7 @@
⟡ SRE as a Lifestyle Choice (https://medium.com/@bellmar/sre-as-a-lifestyle-choice-de9f5a82d73d)
⟡ SRECon EMEA 2019 Recap (https://speakerdeck.com/dastergon/srecon-emea-2019-recap-sre-muc-meetup)
⟡ Life of an SRE at Google - JC van Winkel (https://www.youtube.com/watch?v=7Oe8mYPBZmw)
⟡ Site Reliability Engineering for Native Mobile Apps - Abhijith Krishnappa
 (https://www.infoq.com/articles/site-reliability-engineering-mobile-apps/) - Case study: Halodoc adaptation of SRE principles for Native Mobile Apps
⟡ Site Reliability Engineering for Native Mobile Apps - Abhijith Krishnappa (https://www.infoq.com/articles/site-reliability-engineering-mobile-apps/) - Case study: Halodoc adaptation of SRE principles for Native Mobile Apps
⟡ SRE Best Practices by InfraCloud (https://www.infracloud.io/blogs/sre-best-practices/)
Real-time Messaging