Hi everyone. Welcome to the sixth chapter in our Tencent Cloud Developer Associate course, distributed caching and message queuing. At the end of this chapter, you'll be able to understand the concepts and features of caching, understand the concepts and features of message queuing, understand the use cases of TencentDB for Redis, and understand the use cases of Cloud Message Queue, CMQ, and Cloud Kafka, CKafka. You will also learn how to use TencentDB for Redis, CMQ, and CKafka. In this chapter, we'll cover five sections: overview of caching, TencentDB for Redis, overview of message queuing, CMQ, and CKafka. Let's get started with section one, overview of caching. In this video, we'll cover why caching is needed, provide an overview of caching, compare the local cache with the distributed cache, and introduce common distributed cache services. Let's consider this question: why do we need caching? To answer this question, we first need to analyze the performance differences between different storage media in terms of data reads and writes. In modern computers, CPUs read and write data much faster than memory, and memory is much faster than hard drives. Of course, the higher the speed of the storage medium, the higher the hardware cost. Therefore, based on cost and performance considerations, common caching systems in the software development field are basically memory-based: the hardware cost is greatly reduced compared to CPU-based caching, while the read and write speed is a quantum leap compared to traditional hard disks. This figure shows the traditional application architecture. If a user wishes to obtain data, the user first sends a request to the server, and the server directly operates the database after receiving the request, takes out the corresponding data, and returns it to the user. But one of the most important performance bottlenecks of this application architecture is the operation of the database.
And its limitations are mainly reflected in the following two aspects. First, the data read and write speed is slow, since traditional databases have to store data on the hard disk. Second, it is difficult for traditional databases to cope with high numbers of concurrent access requests. So how can we break through the limitations of the traditional application architecture? Caching is the solution. Caching is essentially a technology that speeds up the system response. It was first used in the design of CPUs, and later widely used in software development. For example, some large websites use memory as a cache on their servers to accelerate data reads. The types of data suitable for caching mainly include hot data, such as hot news and hot searches; data with low real-time requirements, such as the website homepage and news ratings; and data with simple business logic, such as product details and personal information. This figure shows the application system with a cache added between the web server and the database. As you can see, frequently queried data is saved in the cache, so that when a data request comes from the user side, the server side first checks whether the requested data is in the cache. If it is, the data is returned to the user directly. If not, the data is queried from the traditional database, returned to the client, and at the same time stored in the cache, so that the next request for the same data can fetch it directly from the cache. In this way, the number of direct database operations is greatly reduced, while the performance advantage of memory greatly improves the response speed of the system. The cache can be divided into two types: the local cache and the distributed cache. For the local cache, the cache elements are local to a specific application server instance and cannot be shared.
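The read path just described (check the cache first, fall back to the database on a miss, then populate the cache) can be sketched in a few lines of Python. Note that the `cache` dict and the `db_query` stub below are hypothetical stand-ins for a real in-memory cache and a slow database; they are not any specific product's API.

```python
# The `cache` dict stands in for an in-memory cache such as Redis,
# and `db_query` is a stub for a slow database lookup; both names
# are illustrative, not a real product API.
cache = {}

def db_query(key):
    # Simulate a slow, disk-based database read.
    return f"value-for-{key}"

def get_with_cache(key):
    # 1. Check the cache first.
    if key in cache:
        return cache[key]            # cache hit: return directly
    # 2. On a miss, query the database,
    value = db_query(key)
    # 3. store the result in the cache for later requests,
    cache[key] = value
    # 4. and return it to the client.
    return value

print(get_with_cache("product:42"))  # miss: reads the database
print(get_with_cache("product:42"))  # hit: served from the cache
```

This read-through pattern is exactly why repeated requests for the same hot data stop touching the database after the first miss.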
In contrast, a distributed cache is a cache component or service separated from the local application servers, so that multiple application servers can share the cache directly. This table compares the local cache with the distributed cache. Since the cache elements for the local cache belong to a specific application instance, when the data changes, the old version of the cache will not be recycled. Therefore, the local cache has poor consistency. Meanwhile, a distributed cache provides the logical view and state of a single cache when deployed on a cluster consisting of multiple nodes. In most cases, an object in a distributed cache exists on a single node in the cluster. Using a hashing algorithm, the cache engine can always determine which particular node a key-value pair is located on. Since the entire cluster always has a single, well-defined state, there is never an inconsistency for the distributed cache. In terms of overhead, the local cache may affect garbage collection and system performance, resulting in a high overhead. In contrast, the distributed cache has a lower overhead, which is mainly generated by network latency and object serialization. In addition, the local cache also has poor reliability compared to the distributed cache. Since the local cache uses the same heap space as the application, great care must be taken to determine the maximum amount of memory the cache can use; if the application runs out of memory, it is difficult to recover. On the other hand, distributed caches run as independent processes on multiple nodes, so a single point of failure will not cause the cache to fail: missing cache elements will be reloaded onto a surviving node the next time the cache fails to hit. The worst consequence of an overall cache failure in the distributed cache case is to degrade system performance, rather than cause an overall system failure. To summarize, for smaller and predictable access to invariant objects, the local cache is an ideal solution.
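The key-to-node mapping mentioned above can be illustrated with a short sketch. The node names and the plain hash-modulo scheme below are illustrative assumptions for clarity; production cache clients typically use consistent hashing instead, so that adding or removing a node remaps only a fraction of the keys.

```python
import hashlib

# Hypothetical three-node cache cluster; the node names are made up
# for illustration only.
NODES = ["cache-node-0", "cache-node-1", "cache-node-2"]

def node_for_key(key):
    # Hash the key deterministically, then map the digest onto one
    # of the nodes. (A plain modulo is used here for brevity; real
    # clients often use consistent hashing instead.)
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# The same key always maps to the same node, which is why the cache
# engine always knows where a given key-value pair lives.
print(node_for_key("user:1001"))
print(node_for_key("user:1001") == node_for_key("user:1001"))  # True
```

Because the mapping is a pure function of the key, every application server computes the same node for the same key, which is what gives the cluster its single, consistent view.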
This is because it outperforms distributed caching in terms of performance. However, for cases where the number of objects to be cached is unknown and large, and read consistency is required, a distributed cache is a better solution. Applications can apply both types of caching, and the choice depends on the application scenarios. Let's look at some common distributed cache services. For example, Memcached is a distributed caching system originally developed by Brad Fitzpatrick for LiveJournal, but currently used by many websites. It is open source software distributed under a Berkeley Software Distribution, BSD, license. Redis is an open source, network-enabled, memory-based, and optionally persistent key-value pair storage database. In terms of data storage types, Redis has more data structures and supports richer data operations than Memcached. With Memcached, you usually need to take the data to the client to make such changes and then set it back, which greatly increases the number of network inputs and outputs and the size of the data transferred. In Redis, these complex operations are usually as efficient as a normal get or set. Therefore, if you need the cache to support more complex structures and operations, Redis would be a better choice. In terms of single-core performance, Redis has a higher performance per core than Memcached when storing small data. Since Redis uses only a single core, while Memcached can use multiple cores, Memcached performs better than Redis when storing larger datasets. In terms of additional features, Redis is more feature-rich, as it supports publish and subscribe modes, has better support for master-replica partitioning, and can be easily serialized, while Memcached has better support for multi-threaded services.
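The difference between Memcached's take-the-data-to-the-client style and Redis's server-side data structure commands can be contrasted with a small simulation. The dict-based `store`, the function names, and the round-trip counting below are illustrative stand-ins, not real client APIs; appending to a Redis list would really be done with a single command such as RPUSH.

```python
# A dict stands in for the cache server's memory; round trips are
# counted to contrast the two interaction styles. All names here
# are illustrative, not real Memcached or Redis client calls.
store = {"recent:views": ["a", "b"]}

def memcached_style_append(key, item):
    # Memcached stores opaque blobs, so modifying a list means:
    # fetch the whole value, change it on the client, write it back.
    value = list(store[key])        # GET (first network round trip)
    value.append(item)              # modify on the client side
    store[key] = value              # SET (second network round trip)
    return 2                        # round trips used

def redis_style_append(key, item):
    # Redis understands lists natively, so a single RPUSH-like
    # command appends on the server without shipping the whole list.
    store[key].append(item)
    return 1                        # single round trip

trips = memcached_style_append("recent:views", "c")
trips += redis_style_append("recent:views", "d")
print(store["recent:views"], trips)  # ['a', 'b', 'c', 'd'] 3
```

As the list grows, the Memcached-style path transfers the entire value over the network on every update, which is exactly the extra input/output and data size cost mentioned above.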