Tachyum Prodigy Native AI Supports TensorFlow and PyTorch

August 26, 2020 · 5 min read

SANTA CLARA, Calif., August 26, 2020 – Tachyum™ Inc. today announced that it has further expanded the capabilities of its Prodigy Universal Processor through support for TensorFlow and PyTorch environments, enabling a faster, less expensive and more dynamic solution for the most challenging artificial intelligence/machine learning workloads.

Analysts predict that AI revenue will surpass $300 billion by 2024, with a compound annual growth rate (CAGR) of up to 42 percent through 2027. Technology giants are investing heavily in AI to make the technology more accessible for enterprise use cases. These range from self-driving vehicles to more sophisticated and control-intensive disciplines like Spiking Neural Nets, Explainable AI, Symbolic AI and Bio AI. When deployed into AI environments, Prodigy is able to simplify software processes, accelerate performance, save energy and better incorporate rich data sets to allow for faster innovation.

Proprietary programming environments like CUDA are inherently hard to learn and use. With open source solutions like TensorFlow and PyTorch, a hundred times more programmers can use the frameworks to code large-scale ML applications on Prodigy. By supporting deep learning environments that make it easier to learn, build and train diversified neural networks, Tachyum is able to move beyond the limitations facing those working exclusively with NVIDIA’s CUDA or with OpenCL.

In much the same way that external floating-point coprocessors and vector coprocessor chips have been internalized into the CPU, Tachyum is making external matrix coprocessors for AI an integral part of the CPU. By integrating matrix operations into Prodigy, Tachyum is able to provide high-precision neural network acceleration up to 10 times faster than other solutions. Tachyum’s support of 16-bit floating point and lower-precision data types improves performance and saves energy in applications such as video processing. Faster than the NVIDIA A100, Prodigy uses compressed data types to allow larger models to fit in memory. Instead of 20GB of shared coherent memory, Tachyum allows 8TB per chip and 64TB per node.
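As a rough illustration of why lower-precision data types let larger models fit in memory, the Python sketch below counts how many model weights fit in the two memory capacities quoted above. The byte widths are the standard sizes of 32-bit floats, 16-bit floats and 8-bit integers. The function name `max_params`, the decimal units (1 GB = 10^9 bytes) and the simplification that only weights occupy memory are assumptions made for illustration, not Tachyum figures.

```python
# Illustrative only: counts weights that fit in a given memory capacity.
# Assumes decimal units and that weights are the only thing stored.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

GB = 10**9
TB = 10**12


def max_params(memory_bytes: int, dtype: str) -> int:
    """Return how many parameters of the given data type fit in memory_bytes."""
    return memory_bytes // BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    for label, capacity in (("20 GB", 20 * GB), ("8 TB", 8 * TB)):
        for dtype in BYTES_PER_PARAM:
            count = max_params(capacity, dtype)
            print(f"{label:>5} {dtype}: {count:,} parameters")
```

Halving the width of each weight doubles the number of weights that fit in the same memory, which is the effect the paragraph above attributes to 16-bit and lower-precision types.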

During off-peak hours, idle Prodigy-powered universal servers in hyperscale data centers will deliver 10x more AI neural network training and inference resources than are currently available, CAPEX-free (i.e. at low cost, since the Prodigy-powered universal computing servers are already bought and paid for). Tachyum’s Prodigy also enables edge computing and IoT products, which will have onboard high-performance AI inference optimized to exploit Prodigy-based AI training from either the cloud or the home office.

“Business and trade publications are predicting just how important AI will become in the marketplace, with estimates of more than 50 percent of GDP growth coming from it,” said Dr. Radoslav Danilak, Tachyum founder and CEO. “What that means is that the less than 1 percent of data processed by AI today will grow to as much as 40 percent and the 3 percent of the planet’s power used by datacenters will grow to 10 percent in 2025. There is an immediate need for a solution that offers low power, fast processing and ease of use and implementation. By incorporating open source frameworks like TensorFlow and PyTorch, we are able to accelerate AI and ML into the world with human-scale computing coming in 2 to 3 years.”

Tachyum’s Prodigy can run HPC applications, convolution AI, explainable AI, general AI, bio AI and spiking neural networks, as well as normal data center workloads on a single homogeneous processor platform with its simple programming model. Using CPU, GPU, TPU and other accelerators in lieu of Prodigy for these different types of workloads is inefficient. A heterogeneous processing fabric, with unique hardware dedicated to each type of workload (e.g. data center, AI, HPC), results in underutilization of hardware resources, and a more challenging programming environment. Prodigy’s ability to seamlessly switch among these various workloads dramatically changes the competitive landscape and the economics of data centers.

Prodigy significantly improves computational performance, energy consumption, hardware (server) utilization and space requirements compared to existing chips provisioned in hyperscale data centers today. It will also allow edge developers for IoT to exploit its low power, high performance and simple programming model to deliver AI to the edge.

Prodigy is truly a universal processor. In addition to native Prodigy code, it also runs legacy x86, ARM and RISC-V binaries. And, with a single, highly efficient processor architecture, Prodigy delivers industry-leading performance across data center, AI, and HPC workloads. Prodigy, the company’s flagship Universal Processor, will enter volume production in 2021. In April, the Prodigy chip successfully proved its viability with a complete chip layout exceeding speed targets. In August, the processor correctly executed short programs, with results automatically verified against the software model, while exceeding the target clock speeds. The next step is to obtain a manufactured, fully functional FPGA prototype of the chip later this year, which is the last milestone before tape-out.

Prodigy outperforms the fastest Xeon processors at 10x lower power on data center workloads, and outperforms NVIDIA’s fastest GPU on HPC, AI training and inference. A mere 125 HPC Prodigy racks can deliver 32 tensor EXAFLOPS. Prodigy’s 3X lower cost per MIPS and 10X lower core power translate to a 4X lower data center Total Cost of Ownership (TCO), enabling billions of dollars of savings for hyperscalers such as Google, Facebook, Amazon, Alibaba, and others. Since Prodigy is the world’s only processor that can switch between data center, AI and HPC workloads, unused servers can be used as a CAPEX-free AI or HPC cloud, because the servers have already been amortized.
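The per-rack throughput implied by the 125-rack figure follows from simple division, sketched below in Python. The helper name `per_rack_petaflops` is hypothetical, and the calculation assumes the 32 tensor EXAFLOPS are spread evenly across the racks, which the announcement does not state.

```python
from fractions import Fraction

PETAFLOPS_PER_EXAFLOP = 1000


def per_rack_petaflops(total_exaflops: int, racks: int) -> Fraction:
    """Evenly divide a total throughput in exaflops across racks, in petaflops."""
    return Fraction(total_exaflops * PETAFLOPS_PER_EXAFLOP, racks)


if __name__ == "__main__":
    # Figures taken from the claim above: 125 racks, 32 tensor exaflops.
    print(per_rack_petaflops(32, 125), "tensor petaflops per rack")  # prints 256
```

Under that even-split assumption, each rack would account for 256 tensor petaflops.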

To see videos of the latest results, please go to the Resources page.

Follow Tachyum

https://twitter.com/tachyum

https://www.linkedin.com/company/tachyum

https://www.facebook.com/Tachyum/

About Tachyum

Tachyum is disrupting data centers, HPC and AI markets by providing universality, industry-leading performance, cost and power, while enabling data centers that are more powerful than the human brain. Tachyum was co-founded by Dr. Radoslav Danilak, and its flagship product Prodigy, the world’s first and only universal processor, begins production in 2021. Prodigy brings unprecedented value, targeting a $50B market that is growing at 20% per year. With data centers currently consuming over 3% of the planet’s electricity, and a projected 10% by 2025, low-power Prodigy is critical for the continued doubling of worldwide data center capacity every 4 years. Tachyum has offices in the USA and Slovakia, EU.