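Reference: http://nil.csail.mit.edu/6.824/2020/notes/l-2pc.txt
Two core topics in distributed transactions:
Concurrency Control
Atomic Commit
The data touched by a single transaction is stored on different servers, which makes it harder to handle.
Correctness when multiple transactions execute concurrently
C (Consistency) and I (Isolation) must hold: a transaction should not observe changes made by other concurrent transactions, the same invariants must be preserved, and so on.
To a large extent, Isolated means serializable.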
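Taking whole transactions as the unit, the possible execution orders and their results are:
T1, T2: 11, 9
T2, T1: 10, 10
These are the only two correct results; reading 11,10 or 10,9 or anything similar is wrong.
The result must be the same as the result of some serial execution.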
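The notes above do not say what T1 and T2 actually do. As a minimal sketch, assume (as in the linked lecture notes) that x and y both start at 10, that T1 moves 1 from y to x, and that T2 reads x and y. Under those assumptions, the set of legal outputs is simply whatever the serial orders produce. The function names here are hypothetical.

```python
from itertools import permutations


def t1(db, out):
    # Assumed T1: move 1 from y to x.
    db["x"] += 1
    db["y"] -= 1


def t2(db, out):
    # Assumed T2: read both records and report what it saw.
    out.append((db["x"], db["y"]))


def serial_outputs(transactions, initial):
    """Run every serial order on a fresh copy of the data; collect the outputs."""
    results = set()
    for order in permutations(transactions):
        db, out = dict(initial), []
        for txn in order:
            txn(db, out)
        results.add(tuple(out))
    return results


LEGAL = serial_outputs([t1, t2], {"x": 10, "y": 10})


def is_serializable_output(observed):
    """True if T2's observed (x, y) pair matches some serial order."""
    return (observed,) in LEGAL
```

A concurrent execution in which T2 sees (11, 10) or (10, 9) matches neither serial order, so `is_serializable_output` rejects it.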
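For the anomalies that can arise here, see the notes on MySQL MVCC.
A transaction may abort partway through, which requires extra handling so that none of its partial effects remain.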
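Concurrency Control
Pessimistic Control: implemented mainly with locks
Holding locks longer than strictly necessary hurts performance, but it guarantees correctness.
Example: two-phase locking
Steps
1. Acquire the lock before using a record.
2. Hold the lock until the transaction commits or aborts (releasing earlier can break serializability).
Use when conflicts are frequent.
Note that every lock here is an exclusive lock, not a shared lock (even a read must take the exclusive lock).
Locks are held to the end so that an abort cannot have complicated consequences, such as another transaction having already read a value that is later rolled back.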
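The two locking rules can be sketched as a small single-threaded model. This is only an illustration: `LockManager`, `Transaction` and `WouldBlock` are hypothetical names, and a real system would make a conflicting transaction wait instead of raising an exception.

```python
class WouldBlock(Exception):
    """Raised when a record is locked by another transaction (hypothetical)."""


class LockManager:
    def __init__(self):
        self.owner = {}  # record -> id of the transaction holding its exclusive lock

    def acquire(self, txn_id, record):
        # Take the lock if it is free; otherwise see who holds it.
        holder = self.owner.setdefault(record, txn_id)
        if holder != txn_id:
            raise WouldBlock(f"{record} is held by {holder}")

    def release_all(self, txn_id):
        for record in [r for r, t in self.owner.items() if t == txn_id]:
            del self.owner[record]


class Transaction:
    def __init__(self, txn_id, locks, db):
        self.txn_id, self.locks, self.db = txn_id, locks, db
        self.writes = {}  # buffered updates, applied only on commit

    def get(self, record):
        self.locks.acquire(self.txn_id, record)  # rule 1: lock before use
        return self.writes.get(record, self.db[record])

    def put(self, record, value):
        self.locks.acquire(self.txn_id, record)  # reads and writes both lock
        self.writes[record] = value

    def commit(self):
        self.db.update(self.writes)
        self.locks.release_all(self.txn_id)  # rule 2: release only at the end

    def abort(self):
        self.writes.clear()  # nothing was applied, so nothing to undo
        self.locks.release_all(self.txn_id)
```

Because a transaction's updates stay buffered and its locks stay held until `commit` or `abort`, no other transaction can read a value that might later be rolled back.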
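Optimistic Control (OCC):
Use when conflicts are rare (locking itself costs time, so skip it when it is not needed).
By contrast, locking can lead to deadlock, for example when two transactions take the same locks in opposite orders:
get x, get y; get y, get x.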
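One common way to notice the "get x get y; get y get x" situation is to look for a cycle in a wait-for graph, where an edge A -> B means transaction A is waiting for a lock that B holds. The sketch below is a plain depth-first search; the function name and the dictionary representation are assumptions made for illustration, not something taken from the lecture.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph (txn -> list of txns) contains a cycle."""
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True  # reached a transaction already on the current path
        visiting.add(txn)
        for other in waits_for.get(txn, ()):
            if visit(other):
                return True
        visiting.discard(txn)
        done.add(txn)
        return False

    return any(visit(txn) for txn in list(waits_for))
```

With one transaction holding x and waiting for y, and the other holding y and waiting for x, the graph is `{"T1": ["T2"], "T2": ["T1"]}` and the function reports a deadlock; a system would then abort one of the two.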
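Two-phase commit
All-or-none: either every server commits the transaction or none does.
TC: Transaction Coordinator
Detailed flow:
1. The TC sends each server the operations that server needs to execute.
On receiving them, the server immediately tries to lock the data involved.
2. When the transaction reaches its end and the client asks to commit, the TC sends a prepare message to every server.
3. Each server replies yes or no depending on whether it is currently able to commit.
4. If every reply is yes, the TC sends commit; otherwise it sends abort.
5. After committing or aborting, each server releases its locks and returns an ack.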
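The five steps can be sketched in a single process with no failures and no network. Everything here (`Participant`, `run_transaction`, the `can_commit` flag used to force a "no" vote) is a hypothetical stand-in chosen for illustration; real servers would exchange these calls as messages.

```python
class Participant:
    def __init__(self, name, data, can_commit=True):
        self.name, self.data = name, data
        self.can_commit = can_commit  # hypothetical switch to simulate a "no" vote
        self.locked = set()           # records currently locked
        self.pending = {}             # txn_id -> writes not yet applied

    def execute(self, txn_id, writes):
        self.locked.update(writes)    # step 1: lock the data involved
        self.pending[txn_id] = dict(writes)

    def prepare(self, txn_id):
        # step 3: vote yes only if this server is able to commit
        return self.can_commit and txn_id in self.pending

    def commit(self, txn_id):
        writes = self.pending.pop(txn_id, {})
        self.data.update(writes)
        self.locked -= set(writes)    # step 5: unlock, then ack
        return "ack"

    def abort(self, txn_id):
        writes = self.pending.pop(txn_id, {})
        self.locked -= set(writes)
        return "ack"


def run_transaction(txn_id, work):
    """Coordinator side. `work` maps each Participant to the writes it performs."""
    for server, writes in work.items():
        server.execute(txn_id, writes)                       # step 1
    votes = [server.prepare(txn_id) for server in work]      # steps 2 and 3
    decision = "commit" if all(votes) else "abort"           # step 4
    for server in work:
        getattr(server, decision)(txn_id)                    # step 5
    return decision
```

A single "no" vote is enough to abort everywhere, which is what gives the all-or-none property.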
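Failure cases in two-phase commit
Server
1. B crashes before replying yes: B restarts quickly; the TC, having heard nothing for a long time, resends the prepare message, and B answers yes/no based on its state after the restart. Since B never promised to commit, answering no is always safe.
2. B crashes after replying yes/no: this is more complicated. To be able to recover, B must write the relevant log records before it replies yes to a prepare message; after the restart it uses the log to restore its previous state and carry on.
3. B crashes after committing, before it can return the ACK: once it has committed or aborted, B deletes all of its log records for the transaction. When the TC resends the same commit message, B finds no record of the transaction, skips execution, and simply returns the ACK.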
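TC
1. The TC crashes before sending the prepare messages
No harm done; it simply resends after restarting.
2. The TC crashes before sending the commit messages
It replays its persisted log to determine whether commit should be sent.
This may cause a server to receive a duplicate commit message.
The TC may delete its persisted log only after it has received acks from all servers.
Similarly, a server deletes all persisted information about the transaction after commit/abort, which is how it copes with the same message arriving a second time.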
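The server-side rules (log before voting yes, recover from the log after a restart, forget the transaction after commit, and just ack a repeated commit) can be sketched as follows. A plain dictionary stands in for the disk, and `DurableParticipant` is a hypothetical name; the sketch also ignores the problem of making "apply the writes" and "delete the log record" atomic with respect to a crash.

```python
class DurableParticipant:
    def __init__(self, disk):
        # `disk` is a dict that outlives this object, simulating persistent storage.
        self.disk = disk
        self.disk.setdefault("data", {})
        self.disk.setdefault("log", {})  # txn_id -> writes the server voted yes on

    def prepare(self, txn_id, writes):
        # Persist the intended writes BEFORE replying yes.
        self.disk["log"][txn_id] = dict(writes)
        return True

    def prepared_transactions(self):
        # After a restart, these are still waiting for the coordinator's decision.
        return sorted(self.disk["log"])

    def commit(self, txn_id):
        writes = self.disk["log"].pop(txn_id, None)
        if writes is not None:
            self.disk["data"].update(writes)
        # Unknown transaction: it was already finished, so just acknowledge again.
        return "ack"

    def abort(self, txn_id):
        self.disk["log"].pop(txn_id, None)
        return "ack"
```

Creating a new `DurableParticipant` over the same `disk` models a crash followed by a reboot: the prepared transaction is still there, and a commit message delivered twice changes the data only once.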
A distributed system demands close attention to performance everywhere; one small problem can set off a chain reaction and become the bottleneck.
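Lost messages
1. The TC has waited a long time without receiving a yes/no reply: it simply aborts the transaction.
2. A server sets a timeout to avoid holding locks for a long time, but once it has replied yes it can only wait (correctness comes first).
So people try hard to keep the window between sending yes/no and receiving commit as short as possible.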
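The timeout rules can be written down as two tiny decision functions. The state names "working" and "prepared" are assumptions introduced for this sketch: "working" means the server has not yet replied yes, "prepared" means it has.

```python
def coordinator_on_vote_timeout():
    """The TC gave up waiting for a yes/no reply: aborting is always safe,
    because no commit has been sent to anyone yet."""
    return "abort"


def participant_on_timeout(state):
    """What a server may do when it has heard nothing from the TC for too long."""
    if state == "working":
        return "abort"  # no promise made yet, so it may give up and free its locks
    if state == "prepared":
        return "block"  # it voted yes; it must keep its locks and wait for the TC
    raise ValueError(f"unknown state: {state}")
```

The "prepared" branch is the costly one, which is why the interval between voting and learning the outcome should be kept short.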
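Two-phase commit is not meant to solve availability: a crash or reboot at any step can leave the whole system waiting for a long time.
It is correct under failures but not available under failures.
An approach to availability
Combine different distributed protocols (as in the Lab 3 key/value database).
Back every node with a Raft group of several machines, so that if any one of them fails the node recovers quickly.
Lab 4 is built this way.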
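Some common questions
In two-phase commit, if the coordinator crashes and cannot recover, a worker keeps holding its locks and waits indefinitely, so this situation must be avoided as far as possible.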
Q: What should two-phase commit workers do if the transaction coordinator crashes?
A: If a worker has told the coordinator that it is ready to commit, then the worker cannot later change its mind. The reason is that the coordinator may (before it crashed) have told other workers to commit. So the worker has to wait (with locks held) for the coordinator to reboot and re-send its decision.
Waiting indefinitely with locks held is a real problem, since the locks can force a growing set of other transactions to block as well. So people tend to avoid two-phase commit, or they try to make coordinators reliable. For example, Google's Spanner replicates coordinators (and all other servers) using Paxos.
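Three-phase commit only works when the network is healthy and workers can tell whether the coordinator has really failed; under a network partition or similar faults it still goes wrong.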
Q: Why don't people use three-phase commit, which allows workers to commit or abort even if the coordinator crashes?
A: Three-phase commit only works if the network is reliable, or if workers can reliably distinguish between the coordinator being dead and the network not delivering packets. For example, three-phase commit won't work correctly if there's a network partition. In most practical networks, partition is possible.
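What logs are for:
1. Capturing the order in which events executed, which makes recovery straightforward.
2. Turning updates into sequential appends, which storage handles far more efficiently than random writes.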
Q: Why do logs appear so often in the designs we look at?
A: One reason is that a log captures the serial order that the system has chosen for transactions, so that e.g. all replicas perform the transactions in the same order, or a server considers transactions in the same order after a crash+reboot as it did before the crash.
Another reason is that a log is an efficient way to write data to hard disk or SSD, since both media are much faster at sequential writes (i.e. appends to the log) than at random writes.
A third reason is that a log is a convenient way for crash-recovery software to see how far the system got before it crashed, and whether the last transactions have a complete record in the log and thus can safely be replayed. That is, a log is a convenient way to implement crash-recoverable atomic transactions, via write-ahead logging.
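The third point, replaying only transactions that have a complete record, can be illustrated with a very small write-ahead log. The tuple layout `("write", txn_id, key, value)` / `("commit", txn_id)` and the function name `replay` are assumptions made for this sketch, and it applies writes in log order, which is only right if conflicting transactions were serialized (for example by locking) when the log was written.

```python
def replay(log):
    """Rebuild state from a log, applying only transactions with a commit record."""
    committed = {entry[1] for entry in log if entry[0] == "commit"}
    state = {}
    for entry in log:
        if entry[0] == "write" and entry[1] in committed:
            _, _, key, value = entry
            state[key] = value
    return state
```

For a log holding two writes and a commit record for t1 followed by a lone write for t2, replay keeps t1's updates and drops t2's, since the crash happened before t2's commit record reached the log.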