Blog
Our changelog, announcements, dev posts, and anything else we think you'll find interesting.
Fallacies Of Distributed Gomputing
Michael Hausenblas
There are any fallacies in distributed computing. This stuff will bite you sooner or later. For each fallacy, I address the problem with a general solution and a Go specific solution.
1. The network is reliable
In general: timeouts, error handling, retry logic.
Go language &stdlib:
Beyond stdlib:
2. Latency is zero
In general: cancelation, partial results
Go language & stdlib:
- context.WithCancel
- encoding/json.Decoder.Decode(Stream)
Beyond stdlib:
3. Bandwidth is infinite
In general: CDNs, protobuf rather than JSON
Go: golang/protobuf
4. The network is secure
In general: SSL/TLS, digital signatures, checksums
For go: crypto/tls, crypto/rsa, crypto/sha512, crypto/x509
Beyond stdlib:
5. Topology doesn't change
In general: DNS vs. IP addresses, TTL
In Go: be aware of the pure Go resolver
6. There is one administrator
In general: auditing, role-based access, immutability
Go language & stdlib:
- functional programming
- Immutability on Go
Beyond stdlib:
- casbin/casbin (authz lib/ACL, RBAC, ABAC)
7. Transport cost is zero
In general: cost of (un)marshalling
In Go stdlib:
- encoding/*
- performance of encoding/json.Marshal
Beyond stdlib:
8. The network is homogeneous
In general: interoperability on all levels.
In Go: broad and well-tested stdlib
Beyond stdlib:
Overview case studies
I'll be covering five open source distributed systems.
All stats taken on 2017-07-06:
Used cloc to generate raw stats:
Kubernetes
Kubernetes.io is a container orchestration platform.
- Google initiated, opinionated, some 70% market share - Like the kernel for a cluster operating system - Initially was depending on Docker (now: CRI-O) - Bring your own SDN, minimal core + plugins
Fallacies covered:
-
The network is reliable.
-
The network is secure.
-
Transport cost is zero.
Architecture overview:
Consul
Consul is a tool for service discovery and configuration
- Not one thing but many things to many people
- Can provide service discovery (microservices architecture)
- Can be used as a key-value store
- Uses Serf lib as gossip protocol (manage membership/broadcast messages)
Fallacies covered:
- Topology doesn't change.
Architecture overview:
etcd
etcd is a distributed, reliable key-value store
- Distributed setup realized via Raft
- Benchmarked at 10,000 writes/sec - Exposes HTTP and gRPC interfaces - Automatic TLS with optional client cert authentication
Fallacies covered:
- The network is reliable.
- Bandwidth is infinite.
- Transport cost is zero.
CockroachDB 23.4
CockroachDB is a distributed SQL database
- Primary design goals: scalability, strong consistency, survivability
- Every node in the cluster can act as SQL gateway, mapping and executing SQL statements to key-value operations
- Uses RocksDB (variant of Google's LevelDB storage lib) for persistence
Fallacies covered:
- Latency is zero.
- The network is homogeneous.
Architecture overview:
Minio
Minio is an object storage server compatible with Amazon S3 APIs
- Cloud-native/container-ready objects with up to 5TB in size
- Storage API covers local storage as well as network storage
- Uses erasure code algorithm to protect data
- New:
gateway
for multi-cloud (Azure, GCS, S3) access
Fallacies covered:
- The network is reliable
- Transport cost is zero
Architecture overview:
##Observations
Some observations from the code reviews carried out in order to prepare this talk:
- subjectively, the top three types of issues encountered were: timeouts, DNS, and resource exhaustion
- go doc is awesome; high level of coverage; right incentives such as godoc.org
- Go best practices, six years in also applies (think: operations)
- make it possible to replicate test and build pipeline locally
- ... and also, slightly controversial: Container Assisted Testing
##Conclusions
-
Go is a great language for a team to build (complex) distributed systems
-
Go scales: both in the
-
Go is a great language for a team to build (complex) distributed systems
-
Go scales: both in the human dimension and concerning machines
-
Any chance that we're gravitating towards a common lib for 'distributed gomputing'?
Take home message: be aware of the 8 fallacies and how to avoid them!
For complete slides: http://go-talks.appspot.com/github.com/mhausenblas/fallacies-of-distributed-gomputing/main.slide.
Liveblog by Linda Xie (@lindeexie)
Michael Hausenblas is a developer advocate for OpenShift and Kubernetes Hat Red Hat. His background is in large-scale data processing and container orchestration. He also contributes to open source software, mainly using Go.