Meta, the company that owns Facebook, WhatsApp and Instagram, among other projects, has unveiled the AI Research SuperCluster (RSC), which it says is already among the fastest artificial intelligence (AI) supercomputers running today and which it expects to be the most powerful in the world when construction is completed in mid-2022.
Development of the machine began a year and a half ago, when its cooling, power, networking and cabling systems were designed from scratch. Phase one is currently running and comprises 760 Nvidia DGX A100 systems, each with eight A100 GPUs and two 64-core AMD CPUs, for a total of 6,080 GPUs.
Even in its first phase, it already runs some workloads up to 20 times faster than Meta's previous system, helped by its high-performance Nvidia Quantum InfiniBand network, which provides up to 200 gigabits per second (Gb/s) of bandwidth, and by 175 petabytes of storage. It currently delivers 1.895 exaflops of computing power, and estimates are that by the middle of the year it will reach almost five exaflops, roughly the speed of more than 100,000 average desktop computers.
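The figures above can be cross-checked with some simple arithmetic. The sketch below reads the phase-one figure as 1.895 exaflops, derives the implied throughput per GPU, and then projects it onto the 16,000 GPUs planned for phase two. It assumes performance scales linearly with GPU count, which is a simplification made here for illustration and not something Meta has stated; the constant and function names are likewise illustrative.

```python
# Back-of-the-envelope check of the article's figures.
# Assumption (ours, not Meta's): throughput scales linearly with GPU count.

DGX_SYSTEMS = 760            # phase-one DGX A100 systems (from the article)
GPUS_PER_SYSTEM = 8          # A100 GPUs per DGX A100 system (from the article)
PHASE_ONE_FLOPS = 1.895e18   # 1.895 exaflops (from the article)
PHASE_TWO_GPUS = 16_000      # planned GPU count for phase two (from the article)


def phase_one_gpus() -> int:
    """Total number of GPUs in phase one."""
    return DGX_SYSTEMS * GPUS_PER_SYSTEM


def flops_per_gpu() -> float:
    """Implied throughput per GPU, in operations per second."""
    return PHASE_ONE_FLOPS / phase_one_gpus()


def projected_phase_two_flops() -> float:
    """Phase-two throughput under the linear-scaling assumption."""
    return flops_per_gpu() * PHASE_TWO_GPUS


if __name__ == "__main__":
    print(f"Phase-one GPUs: {phase_one_gpus()}")
    print(f"Implied per-GPU throughput: {flops_per_gpu() / 1e12:.1f} teraflops")
    print(f"Projected phase two: {projected_phase_two_flops() / 1e18:.2f} exaflops")
```

Under these assumptions, phase one has 6,080 GPUs at about 312 teraflops each, and 16,000 GPUs would deliver just under five exaflops, which is consistent with the "almost five" estimate.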
In the report published by the firm led by Mark Zuckerberg, it was explained that, although AI is currently used mainly for tasks such as translating text between languages or detecting potentially harmful content, next-generation applications will require supercomputers capable of carrying out up to five quintillion operations per second.
With this power, the goal is to build better AI models so that next-generation systems can work in many languages at the same time, perform voice translation, analyze text, images and video, and even help develop augmented reality tools.
Because the metaverse is one of Meta's main bets, the company also intends the supercomputer to support the development of technologies for this new platform.
KEY FACT
Before the end of 2022, Meta plans to complete phase two of the RSC, which would then operate with 16,000 GPUs and be able to train natural language processing models with more than a trillion parameters, even on data sets as large as an exabyte.