SRE实战 (Real-World SRE)

Author: Nat Welch
Publisher: Southeast University Press (东南大学出版社)
Publication date: 2019-03-01
Format: 24 cm  Pages: 10, 323
中图网 price: ¥69.1 (28% off)  List price: ¥96.0

SRE实战 Book highlights

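  SRE实战 (reprint edition, in English) is a survival guide for software developers facing catastrophic website failures. As companies strive to maximize uptime, Site Reliability Engineering (SRE) has moved to the front line. When your site goes down and a fix is urgently needed, this book serves as a hands-on operating guide.
  Nat Welch's extensive hands-on experience in reliability engineering comes from major internet companies that are highly sensitive to outages. His methods for monitoring modern web services, setting up alerts, and evaluating incident response have all been tested in practice, and learning them will serve you well.
  The book does more than teach you how to respond to disasters: it also covers the tools and strategies needed to test and release software safely, plan for long-term growth, and anticipate future bottlenecks. With it, you will learn to build your own robust plan of action, so that you can prove your value during a company-wide website crisis.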

SRE实战 Synopsis

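This book is the go-to survival guide for software developers facing catastrophic website failures. As companies strive to maximize uptime, Site Reliability Engineering (SRE) has moved to the front line. When your site goes down and a fix is urgently needed, the book serves as a hands-on operating framework. Nat Welch's extensive hands-on experience in reliability engineering comes from some of the largest companies on the internet, which are highly sensitive to outages. His methods for monitoring modern web services, setting up alerts, and evaluating incident response have all been tested in practice, and learning them will serve you well.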

SRE实战 Table of contents

Preface
Chapter 1: Introduction
A brief history
What is SRE?
What is in the book?
SRE as a framework for new projects
Summary
References

Chapter 2: Monitoring
Why monitoring?
Instrumenting an application
What should we measure?
A short introduction to SLIs, SLOs, and error budgets
Service levels
Error budgets
Collecting and saving monitoring data
Polling applications
Nagios
Prometheus
Cacti
Sensu
Push applications
StatsD
Telegraf
ELK
Displaying monitoring information
Arbitrary queries
Graphs
Dashboards
Chatbots
Managing and maintaining monitoring data
Communicating about monitoring
Do they even know there is monitoring?
References and related reading
Future reading
Summary

Chapter 3: Incident Response
What is an incident?
What is incident response?
Alerting
When do you alert?
How do you alert?
Alerting services
What is in an alert?
Who do you alert?
Being on call
Communication
Incident Command System (ICS)
Where do you communicate?
Recovering the system
Calling all clear
Summary

Chapter 4: Postmortems
What is a postmortem?
Why write a postmortem?
When to write a postmortem document
Carrying out incident analysis
How to write a postmortem document
Summary
Impact
Timeline
Root cause
Action items
Postmortems without action items
Appendix
Blameless postmortems
Holding a postmortem meeting
Analyzing past postmortems
MTTR and MTBF
Alert fatigue
Discussing past outages
Summary
References

Chapter 5: Testing and Releasing
Testing
What do you test?
Testing code
Testing infrastructure
Testing processes
Releasing
When to release
Releasing to production
Validating your release
Rollbacks
Automation
Continuous everything
Summary

Chapter 6: Capacity Planning
A quick introduction to business finance
Why plan?
Managing risk and managing expectations
Defining a plan
What is our current capacity?
When are we going to run out of capacity?
How should we change our capacity?
State and concurrency
Is your service limited by another service?
Scaling for events
Unpredictable growth - user-generated content
Preplanned versus autoscaling
Delivering
Execute the plan
Architecture - where performance changes come from
Tech as a profit center and procurement
Summary

Chapter 7: Building Tools
Finding projects
Defining projects
RDD
Example
Design documents
Planning projects
Example
Retrospectives and standups
Allocation
Building projects
Advice for writing code
Separation of concerns
Long-term work
Example OKRs
Notebooks
Documenting and maintaining projects
Summary

Chapter 8: User Experience
An introduction to design and UX
Real-world interaction design
User testing
Picking an experience
Designing the test
Finding people to test
Developer experience
Experience of tools
Performance budgets
Security
Authentication
Authorization
Risk profile
Phishing
ACM code of ethics
Summary
References

Chapter 9: Networking Foundations
The internet
Sending an HTTP request
DNS
dig
Ethernet and TCP/IP
Ethernet
IP
CIDR notation
ICMP
UDP
TCP
HTTP
curl and wget
Tools for watching the network
netstat
nc
tcpdump
Summary

Chapter 10: Linux and Cloud Foundations
Linux fundamentals
Everything is a file
Files, directories, and inodes
Sockets
Devices
/proc
Filesystem layout
What is a process?
Zombies
Orphans
What is nice?
syscalls
How to trace
Watching processes
Build your own
Cloud fundamentals
VMs
Containers
Load balancing
Autoscaling
Storage
Queues and Pub/Sub
Units of scale
Example architecture interview
Summary
References
Other Books You May Enjoy
Index