Technology
What the technology is
Tumbling Dice has developed a new class of compute architecture designed to overcome the limitations of the traditional Von Neumann model.
Instead of relying on heavy, power-hungry GPU pipelines, our approach brings computation closer to the data, enabling massively parallel execution with far lower overhead. This makes the technology ideal for modern AI workloads, where concurrency and efficiency matter more than raw brute force.
Unlike fixed-function accelerators, which are optimised for specific model families, our architecture is flexible and model-agnostic. It adapts to emerging AI approaches without requiring new silicon, giving us a broader application footprint and a faster innovation cycle.
By combining lightweight model structures with hardware-native execution, we deliver fast, predictable inference on compact, low-power devices, thereby unlocking AI capability in environments where GPUs and specialised ASICs are impractical or uneconomical.
Why it works
Most traditional processors chase performance by pushing clock speeds into the gigahertz range.
Our FPGA-based accelerator takes a different approach. Even running at 60 MHz on a low-end FPGA architecture, it delivers over four times the throughput of a modern superscalar CPU design while using around one-twentieth of the power.
The advantage is not raw frequency. It is architecture.
Instead of one fast processor core working sequentially, the accelerator executes thousands of hardware operations in parallel, each tailored to the workload. This deep concurrency means more work gets done per cycle, even at dramatically lower clock speeds.
The result is a platform that is faster, cooler, and far more efficient. Not because it works harder, but because it works smarter.
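The arithmetic behind this trade-off can be sketched in a few lines. The operation counts below are illustrative assumptions, not measured figures; only the 60 MHz clock and the "over four times the throughput" claim come from the text above.

```python
# Illustrative sketch: effective throughput = operations per cycle x clock rate.
# A deeply parallel design can outrun a fast sequential core despite a
# clock running fifty times slower.

def throughput_gops(ops_per_cycle: int, clock_hz: float) -> float:
    """Effective throughput in billions of operations per second."""
    return ops_per_cycle * clock_hz / 1e9

# A superscalar CPU: a handful of instructions per cycle at a high clock
# (8 ops/cycle and 3 GHz are assumed values for this sketch).
cpu = throughput_gops(ops_per_cycle=8, clock_hz=3.0e9)

# The accelerator: thousands of hardware operations per cycle at a
# modest 60 MHz clock (2048 ops/cycle is an assumed value).
fpga = throughput_gops(ops_per_cycle=2048, clock_hz=60e6)

print(f"CPU:  {cpu:.1f} GOPS")            # 24.0 GOPS
print(f"FPGA: {fpga:.1f} GOPS ({fpga / cpu:.1f}x)")  # 122.9 GOPS (5.1x)
```

Even with these rough numbers, the parallel design comes out more than four times ahead, which is why the advantage survives a dramatically lower clock frequency.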
Features & benefits
High Throughput at Low Clock Speed
Our architecture delivers exceptional performance even on low-end FPGA platforms. By exploiting deep parallelism, the system achieves high throughput without relying on high clock frequencies or power-hungry cores.
Available Technology Pathways
The platform runs today on widely available FPGAs and can be migrated to ASIC for even greater performance, efficiency, and cost reduction. This ensures a clear roadmap from prototype to mass deployment.
Ultra-Low Energy and Water Usage
Running at a fraction of the power of GPU-class hardware, the platform cuts direct energy consumption and the cooling load that drives water usage in conventional deployments. The result is lower operating costs and a smaller environmental footprint.
Reconfigurable Hardware
The system can be reprogrammed to support new models, new workloads, or updated algorithms without replacing hardware. This reduces upgrade costs and extends the lifetime of deployed systems.
Scalable by Design
The architecture scales from small, low-power devices to large multi-FPGA or ASIC deployments. This flexibility allows the same core technology to support everything from embedded systems to high-performance workloads.
Low Cost
By combining low-end FPGA compatibility with a clear ASIC path, the technology offers a compelling cost profile, both in terms of upfront hardware and ongoing operational expenses.
Edge-AI Ready
AI is moving out of the cloud and into everyday devices. Our accelerator is designed for this shift: compact, efficient, and capable of running advanced models close to where data is generated.
Protected IP
The core architecture is patent-pending, securing the company’s competitive advantage and protecting future commercialisation.
How we compare to GPUs
High-end GPUs dominate in datacentre environments. They are engineered for maximum raw throughput, using thousands of cores, high clock speeds, and large power budgets to train and run very large AI models. They excel at this, but they come with significant energy and cooling demands.
Our approach is different and optimised for a different purpose: efficient, low-power inference close to where data is generated, rather than maximum raw throughput in the datacentre.
Where it fits
Our accelerator is designed for real-world deployment across a wide range of environments.
Its efficiency, scalability, and reconfigurable architecture make it suitable for multiple markets, from compact embedded devices to larger low-power server installations.
Edge Devices
Ideal for running small language models (SLMs) and other AI workloads directly on devices where power, heat, and space are limited. This enables meaningful AI capability without relying on cloud connectivity or datacentre infrastructure.
Low-Power Server Deployments
While not designed to replace high-end GPUs in large datacentres, the technology can be deployed in server-farm environments for inference workloads. Its efficiency reduces energy use, cooling requirements, and overall environmental impact, offering a greener alternative to traditional CPU- and GPU-based clusters.
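A back-of-envelope sketch shows the scale of the saving. The 400 W GPU-class board power and the 24/7 duty cycle are assumptions chosen for illustration; only the "around one-twentieth of the power" figure comes from the text above.

```python
# Rough annual energy comparison for one inference unit, under assumed
# wattages. Not measured data; an illustration of the 1/20th power claim.

GPU_WATTS = 400.0                 # assumed GPU-class board power
ACCEL_WATTS = GPU_WATTS / 20.0    # around one-twentieth of the power
HOURS_PER_YEAR = 24 * 365

def annual_kwh(watts: float) -> float:
    """Energy drawn over a year of continuous operation, in kWh."""
    return watts * HOURS_PER_YEAR / 1000.0

gpu_kwh = annual_kwh(GPU_WATTS)
accel_kwh = annual_kwh(ACCEL_WATTS)
print(f"GPU-class card: {gpu_kwh:,.0f} kWh/year")     # 3,504 kWh/year
print(f"Accelerator:    {accel_kwh:,.0f} kWh/year")   # 175 kWh/year
print(f"Saving:         {gpu_kwh - accel_kwh:,.0f} kWh/year per unit")
```

Under these assumptions, each unit saves over 3,300 kWh a year before counting the reduced cooling load, which multiplies the effect across a server farm.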
A Flexible Footprint
From handheld devices to rack-mounted systems, the architecture adapts to the environment. It delivers consistent advantages wherever power, heat, and efficiency matter more than brute-force compute.
Embedded Systems
The low-power, reconfigurable design makes it a strong fit for industrial, automotive, medical, and consumer products that require fast, reliable inference on-device. It supports long product lifecycles and can be updated without replacing hardware.
FPGA Today, ASIC Tomorrow
The platform runs on widely available FPGAs, enabling rapid development and early deployment. A clear ASIC roadmap provides a path to higher performance, lower cost, and large-scale commercial rollout.