Translated article. Original post: http://antirez.com/news/94
Clarifications about Redis and Memcached
If you know me, you know I’m not the kind of guy who considers competing products a bad thing. I actually love for users to have choices, so I rarely do anything like comparing Redis with other technologies.
However, it is also true that in order to pick the right solution, users must be correctly informed.
This post was triggered by reading a blog post published by Mike Perham, whom you may know as the author of a popular library called Sidekiq, which happens to use Redis as its backend. So I would not consider Mike a person who is “against” Redis at all. Yet in his blog post, which you can find at http://www.mikeperham.com/2015/09/24/storing-data-with-redis/, he states that, for caching, “you should probably use Memcached instead [of Redis]”. So Mike simply believes Redis is not good for caching, and he argues his thesis in this way:
1) Memcached is designed for caching. 2) It performs no disk I/O at all. 3) It is multi threaded and can handle 100,000s of requests by scaling multi core.
I’ll address the above statements, and later provide further information that the above sentences do not capture and that is, in my opinion, more relevant to most caching users and use cases.
Memcached is designed for caching: I’ll skip this since it is not an argument. I can just as well say “Redis is designed for caching”. So in this regard they are exactly the same; let’s move to the next thing.
It performs no disk I/O at all: In Redis you can just disable disk I/O entirely if you want, which gives you a purely in-memory experience. And if you really need it, you can persist the database only when you are going to reboot, for example with “SHUTDOWN SAVE”. The bottom line here is that Redis persistence is an added value even when you don’t use it at all.
It is multi threaded: This is true, and one of my goals is to make Redis I/O threaded (like in memcached, where the data access itself is basically not threaded). However Redis, especially using pipelining, can serve an impressive number of requests per second per thread (half a million is a common figure with very intensive pipelining; without pipelining it is around 100,000 ops/sec). In the vanilla caching scenario where each Redis instance is the same, works as a master, has disk ops disabled, and sharding is up to the client like in the “memcached sharding model”, spinning up multiple Redis processes per system is not terrible. Once you do this, what you get is a shared-nothing multi threaded setup, so what counts is the number of operations you can serve per single thread. Last time I checked, Redis was at least as fast as memcached per thread. Implementations change over time, so the edge today may belong to one or the other, but I bet they provide similar performance since they both tend to maximize the resources they can use. Memcached multi threading is still an advantage since it makes things simpler to use and administer, but I think it is not a crucial part.
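The client-side sharding mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the instance addresses and the `shard_for` helper are hypothetical, and CRC32 modulo the instance count is one simple choice of mapping, not something the post prescribes.

```python
import zlib

# Hypothetical addresses: several independent Redis processes on one host.
INSTANCES = [
    "127.0.0.1:6379",
    "127.0.0.1:6380",
    "127.0.0.1:6381",
    "127.0.0.1:6382",
]


def shard_for(key: str, instances=INSTANCES) -> str:
    """Return the instance that owns `key`.

    Uses a stable hash (CRC32) modulo the number of instances, so every
    client maps the same key to the same process without coordination.
    """
    return instances[zlib.crc32(key.encode("utf-8")) % len(instances)]
```

With plain modulo hashing, changing the number of instances remaps most keys. That is often acceptable for a cache, and consistent hashing is the usual alternative when it is not.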
There is more. Mike talks of operations per second without mentioning the quality of those operations. The thing is, in systems like Redis and Memcached the cost of command dispatching and I/O dominates compared to actually touching the in-memory data structures. So in Redis, executing a simple GET, a SET, or a complex operation like ZRANK costs about the same. But with a complex operation you get a lot more work done from the point of view of the application. Maybe instead of fetching five cached values you can just send a small Lua script. So the actual “scalability” of the two systems has many dimensions, and what you can achieve per operation is one of those.
Of Mike’s concerns, the only valid one I can see is multi threading which, if we consider Redis in its special case of memcached replacement, may be addressed by running multiple processes, or simply by running just one, since it will be very hard to saturate one thread doing memcached-like operations.
The real differences
---
Now it’s time to talk about the real differences between the two systems.
- Memory efficiency
This is where Memcached used to be better than Redis. In a system designed to represent a plain string to string dictionary, it is simpler to make better use of memory. The difference is not dramatic, and I haven’t checked it in about 5 years, but it used to be noticeable.
However, if we consider the memory efficiency of a long running process, things are a bit different. Read the next section.
But again, to really evaluate memory efficiency, you should take into account that specially encoded small aggregate values in Redis are very memory efficient. For example, sets of small integers are represented internally as an array of 8, 16, 32 or 64 bit integers. Since the array is kept ordered, binary search can be used, so checking whether an element exists takes logarithmic time.
The same happens when you use hashes to represent objects instead of resorting to JSON. So the real memory efficiency must be evaluated with a use case at hand.
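To make the small-integer-set idea concrete, here is a minimal sketch of a set stored as a sorted, packed array of integers with binary-search lookups. It is an illustration, not Redis’s actual code: the `SmallIntSet` class is hypothetical, and it always uses 64-bit elements instead of picking the narrowest width that fits.

```python
from array import array
from bisect import bisect_left


class SmallIntSet:
    """A set of integers kept as a sorted, contiguous array."""

    def __init__(self):
        # "q" = signed 64-bit elements, stored without per-item object overhead.
        self._items = array("q")

    def add(self, value: int) -> None:
        # Find the insertion point; insert only if the value is not present,
        # so the array stays sorted and free of duplicates.
        i = bisect_left(self._items, value)
        if i == len(self._items) or self._items[i] != value:
            self._items.insert(i, value)

    def __contains__(self, value: int) -> bool:
        # Binary search: O(log n) membership test on the sorted array.
        i = bisect_left(self._items, value)
        return i < len(self._items) and self._items[i] == value

    def __len__(self) -> int:
        return len(self._items)
```

The trade-off is that inserts shift elements and are linear in the set size, which is why this layout only makes sense for small sets.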
- Redis LRU vs Slab allocator
Memcached is not perfect from the point of view of memory utilization. If you happen to have an application that dramatically changes the size of the cached values over time, you are likely to incur severe fragmentation, and the only cure is a reboot. Redis is a lot more deterministic from this point of view.
Moreover, Redis LRU was improved a lot lately, and is now a very good approximation of real LRU. More info can be found here: http://redis.io/topics/lru-cache. If I understand correctly, memcached LRU still expires according to its slab allocator, so sometimes the behavior may be far from real LRU, but I would like to hear what experts have to say about this. If you want to test Redis LRU, you can now do so using the redis-cli LRU testing mode available in recent versions of Redis.
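The general idea behind an approximated LRU can be sketched as follows: instead of tracking an exact global ordering, sample a few keys at eviction time and evict the least recently used one among them. The `SampledLRUCache` class below is a hypothetical toy that uses a logical clock; it does not reproduce Redis’s real implementation details.

```python
import random


class SampledLRUCache:
    """Toy cache that evicts the oldest key among a random sample."""

    def __init__(self, capacity: int, samples: int = 5, seed=None):
        self.capacity = capacity
        self.samples = samples
        self._data = {}        # key -> value
        self._last_used = {}   # key -> logical timestamp of last access
        self._clock = 0
        self._rng = random.Random(seed)

    def _touch(self, key) -> None:
        self._clock += 1
        self._last_used[key] = self._clock

    def get(self, key):
        if key not in self._data:
            return None
        self._touch(key)
        return self._data[key]

    def set(self, key, value) -> None:
        # Evict only when adding a new key to a full cache.
        if key not in self._data and len(self._data) >= self.capacity:
            self._evict_one()
        self._data[key] = value
        self._touch(key)

    def _evict_one(self) -> None:
        # Sample a handful of keys and drop the least recently used of them.
        count = min(self.samples, len(self._data))
        candidates = self._rng.sample(list(self._data), count)
        victim = min(candidates, key=self._last_used.__getitem__)
        del self._data[victim]
        del self._last_used[victim]
```

A larger sample brings the behavior closer to exact LRU at the cost of more work per eviction. In Redis the sample size is tunable through the `maxmemory-samples` setting.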
- Smart caching
If you want to use Redis for caching, and use it à la memcached, you are truly missing something. This is the biggest mistake in Mike’s blog post in my opinion. People are switching to Redis more and more because they discovered that they can represent their cached data in more useful ways. Want to retain the latest N items of something? Use a capped list. Want to keep a cached popularity index? Use a sorted set, and so forth.
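The two patterns above can be modeled in plain Python to show their semantics. This is an in-process analogy, not Redis client code: the helper names are hypothetical, the capped list mirrors what LPUSH followed by LTRIM gives you, and the counter stands in for a sorted set updated with ZINCRBY and read back with ZREVRANGE.

```python
from collections import Counter, deque


def new_capped_list(max_items: int) -> deque:
    """A list that keeps only the latest `max_items` entries."""
    # With maxlen set, appendleft() on a full deque drops the oldest
    # entry from the right end, like LPUSH followed by LTRIM 0 N-1.
    return deque(maxlen=max_items)


def record_hit(scores: Counter, item: str, amount: int = 1) -> None:
    """Increase an item's popularity score (analogous to ZINCRBY)."""
    scores[item] += amount


def top_items(scores: Counter, n: int) -> list:
    """Return the n most popular items, highest score first."""
    return [item for item, _ in scores.most_common(n)]
```

In Redis these structures live in the cache server itself, so every application process sees the same latest-items list and the same ranking without recomputing them.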
- Persistence and Replication
If you need those, they are very important assets. For example, with replication, scaling a huge load of reads is very simple. The same goes for restarts with persistence, the ability to take cache snapshots over time, and so forth. It’s totally fair to have use cases where both features are completely irrelevant. What I want to say here is that there are “pure caching” use cases where persistence and replication are important.
- Observability
Redis is very observable. It has detailed reporting about a ton of internal metrics. You can SCAN the dataset, observe the expiration of objects, tune the LRU algorithm, give names to clients and see them reported in CLIENT LIST, use “MONITOR” to debug your application, and do many other advanced things. I believe this to be an advantage.
- Lua scripting
I believe Lua scripting to be an impressive help in many caching use cases. For example, if you have a cached JSON blob, with a Lua script you can extract a single field and return it to the client instead of transferring everything (conceptually, you can do the same by using Redis hashes directly to represent objects).
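A minimal sketch of the JSON field extraction described above follows. It assumes a client object exposing an `eval(script, numkeys, *keys_and_args)` method, as redis-py does. The helper name `get_json_field` is hypothetical, and the script only handles top-level string or number fields.

```python
# Lua script run inside Redis: read the cached blob, decode it with the
# bundled cjson library, and return just one top-level field.
EXTRACT_FIELD_LUA = """
local blob = redis.call('GET', KEYS[1])
if not blob then return nil end
local value = cjson.decode(blob)[ARGV[1]]
if value == nil then return nil end
return tostring(value)
"""


def get_json_field(client, key: str, field: str):
    """Fetch one field of a cached JSON object without transferring the blob.

    `client` is assumed to be a redis-py style connection; the key is passed
    via KEYS and the field name via ARGV.
    """
    return client.eval(EXTRACT_FIELD_LUA, 1, key, field)
```

Only the requested field crosses the network, which matters when the blob is large and the client needs a small part of it.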
Conclusions
---
Memcached is a great piece of software. I have read the source code multiple times, it was a revolution in our industry, and you should check whether it is a better bet for you than Redis. However, things must be evaluated for what they are, and in the end I was a bit annoyed to read Mike’s report and very similar reports over the years. So I decided to show you my point of view. If you find anything factually incorrect, ping me and I’ll update the blog post accordingly with “EDIT” sections.