Investing heavily in new servers and extra computing power for your data center does little good if you cannot maximize the return on that investment. Have you ever considered that software can be the bottleneck in your data center's performance? I came across an interesting case about how Tencent achieved significant gains in storage system performance through software and infrastructure optimization.
Most of you are probably familiar with Tencent, one of China’s top Internet service providers. Its popular products, such as QQ instant messenger* and Weixin*, as well as its online games, have become household names among active online users in the country.
With its popular social media products and a user base in the hundreds of millions, it is not surprising that Tencent needs to process and store huge volumes of data such as images, video, mail and documents created by its users. If you are a user of Tencent’s products, you are likely contributing your photos and downloading your friends’, too. To manage these needs, Tencent uses a self-developed file system, the Tencent File System* (TFS*).
Previously, Tencent relied on a traditional triple-redundancy backup solution, which used storage inefficiently. Storage media was found to be a major cost factor of TFS. As a result, Tencent decided to implement an erasure-code solution using the open source Jerasure* library running on Intel® architecture-based (IA-based) servers.
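To make the difference between triple redundancy and erasure coding concrete, here is a minimal sketch of the idea. It uses a single XOR parity block, the simplest possible erasure code, not the codes Jerasure* actually provides, and the block contents and the 10 + 2 layout are hypothetical values picked for illustration; the case summary does not state the layout Tencent used.

```python
from fractions import Fraction
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def storage_overhead(k, m):
    """Raw bytes stored per byte of user data with k data and m parity blocks."""
    return Fraction(k + m, k)


# Hypothetical data: four equal-sized data blocks plus one XOR parity block.
data_blocks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data_blocks)

# Simulate losing block 2, then rebuild it from the survivors and the parity.
survivors = [blk for i, blk in enumerate(data_blocks) if i != 2]
rebuilt = xor_blocks(survivors + [parity])

# Compare raw storage needed per byte of user data.
replication_overhead = Fraction(3)           # three full copies
erasure_overhead = storage_overhead(10, 2)   # hypothetical 10 data + 2 parity
saving = 1 - erasure_overhead / replication_overhead

print("rebuilt block:", rebuilt)
print("replication overhead:", float(replication_overhead))
print("erasure-code overhead:", float(erasure_overhead))
print("raw storage saved:", f"{float(saving):.0%}")
```

The trade-off is that a lost block has to be recomputed from the surviving blocks rather than simply read from another copy, which is why encoding and decoding speed matters so much in this kind of system.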
As Tencent engineers validated the new solution, they noticed that computing performance was lower than the I/O throughput of the storage and network subsystems. In other words, the storage servers could not process the data as fast as the I/O subsystems could move it. Adding more compute power might appear to be the obvious solution, but it is a costly one. Instead, Tencent used software optimization tools and libraries such as Intel® VTune™ Amplifier XE and Intel® Intelligent Storage Acceleration Library to identify inefficient code in its system and optimize it for Intel® Xeon® processor-based servers. The optimization was effective enough that the system bottleneck moved to the I/O subsystems. Tencent then addressed this by migrating to a 10 Gigabit network using the Intel® Ethernet 10 Gigabit Converged Network Adapter.
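The way a bottleneck moves from one subsystem to another can be sketched with a simple model in which a pipeline runs only as fast as its slowest stage. The stage names and all throughput figures below are hypothetical, chosen only to mirror the sequence of events described above; they are not Tencent's measurements.

```python
def bottleneck(stage_throughput_mb_s):
    """Return (stage_name, throughput) for the slowest stage of a pipeline."""
    name = min(stage_throughput_mb_s, key=stage_throughput_mb_s.get)
    return name, stage_throughput_mb_s[name]


# Hypothetical throughput per stage, in MB/s.
baseline = {"erasure_coding": 80, "disk_io": 400, "network": 110}

# Hypothetical effect of optimizing the erasure-coding code path.
after_software_tuning = {**baseline, "erasure_coding": 1600}

# Hypothetical effect of moving to a faster network.
after_network_upgrade = {**after_software_tuning, "network": 1100}

for label, stages in [
    ("baseline", baseline),
    ("after software tuning", after_software_tuning),
    ("after network upgrade", after_network_upgrade),
]:
    stage, rate = bottleneck(stages)
    print(f"{label}: limited by {stage} at {rate} MB/s")
```

In this toy model, speeding up any stage other than the current bottleneck leaves end-to-end throughput unchanged, which is why profiling before buying hardware pays off.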
As a result of this cost-effective optimization effort, Tencent was able to get the most out of the storage system it deployed. Tencent found that the optimized erasure-code solution reduced storage space by 60 percent and improved storage performance by about 20 times, while the I/O performance of TFS improved by 2.8 times. With cold data now being processed using the new TFS system, Tencent has saved significant server resources and raised the performance-to-price ratio of its storage.
The new solution did more than improve performance and user experience. Tencent is also saving hundreds of kilowatts of energy because it no longer needs to purchase thousands of servers to meet its storage needs.
The next time you use a Tencent product, you will know the effort Tencent engineers have put into improving your experience. If you would like to know in detail how Tencent optimized its software and removed the bottlenecks in its storage system, and what results it achieved, you can read the complete case study.
Do you have any interesting software optimization stories to share?