Google Cloud provides leading infrastructure, platform capabilities, and industry solutions. We deliver enterprise-grade cloud solutions that leverage Google’s cutting-edge technology to help companies operate more efficiently and adapt to changing needs, giving customers a foundation for the future. Customers in more than 150 countries use Google Cloud as their trusted partner to solve their most critical business problems.
Rebellions introduces REBEL-Quad and REBEL-IO—chiplet-based inference accelerators purpose-built for hyperscale and enterprise AI workloads. With a multi-chiplet architecture, HBM3e, and UCIe-Advanced interconnect, the system delivers exceptional compute density and energy efficiency, outperforming leading GPUs in throughput-per-watt across models like LLaMA-70B and 405B. The demo highlights single-card inference performance, demonstrating high efficiency and low power consumption even for large models. Powered by a production-ready software stack compatible with PyTorch, vLLM, and Triton, Rebellions offers a sustainable, deployable alternative for next-gen AI infrastructure.

Jinwook Oh
Jinwook Oh is the Co-Founder and Chief Technology Officer of Rebellions, an AI chip company based in South Korea. After earning his Ph.D. from KAIST (Korea Advanced Institute of Science and Technology), he joined the IBM TJ Watson Research Center, where he contributed to several AI chip R&D projects as a Chip Architect, Logic Designer, and Logic Power Lead. At Rebellions, he has overseen the development and launch of two AI chips, with a third, REBEL, in progress. Jinwook's technical leadership has been crucial in establishing Rebellions as a notable player in AI technology within just three and a half years.
Rebellions
Website: https://rebellions.ai/
Rebellions develops and mass-produces AI accelerators optimized for Large Language Models, Multi-Modal Models, and scale-out inference workloads, delivering industry-leading energy efficiency, scalability, and deployment readiness. Its flagship REBEL chip features a modular chiplet architecture, UCIe connectivity, 144GB HBM3E memory, and REBEL-IO for hyperscale, exa-scale AI inference clusters.
The platform is supported by a mature software stack designed for seamless integration with existing datacenter infrastructure and popular AI frameworks. This builds on the proven mass-production experience of ATOM, launched in 2023 and deployed in production data centers.
Backed by SK Telecom, SK hynix, Aramco’s Wa’ed Ventures, and KT, and strengthened by its merger with SK SAPEON, Rebellions is positioned as Asia’s leading independent AI semiconductor platform for sovereign AI and next-generation datacenter deployments.
Following the announcement of the MLCommons Q3 MLPerf Inference results on the keynote stage on the morning of Tuesday, September 9, Miro Hodak, Senior Member of Technical Staff, AI Performance Engineering at AMD, will deliver a detailed analysis of the results, followed by a Q&A session with the audience.

Miro Hodak
Miro Hodak is a Principal Member of Technical Staff at AMD, where he focuses on AI performance and benchmarking. Before joining AMD, he served as an AI Architect at Lenovo, and before that he was a physics professor at North Carolina State University.
Miro has been actively involved with MLPerf and MLCommons since 2020, contributing to the development of multiple MLPerf benchmarks and submitting results across several rounds of Inference and Training. Since 2023, he has served as co-chair of the MLPerf Inference Working Group.
He has authored peer-reviewed publications in fields ranging from artificial intelligence and computer science to materials science, physics, and biochemistry, with his work cited over 2,500 times.
Large language models can now power capable software agents, yet real‑world success comes from disciplined engineering rather than flashy frameworks. Most reliable agents are built from simple, composable patterns instead of heavy abstractions.
The talk will introduce patterns for adding complexity and autonomy only when it pays off. Attendees will leave with a practical decision framework for escalating from a single prompt to multi‑step agents, along with guardrails for shipping trustworthy, cost‑effective agents at scale.
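As a minimal sketch of the escalation the abstract describes (not code from the talk), the contrast between a single prompt and a multi‑step agent can be illustrated in a few lines. Here `call_model` is a hypothetical stand‑in for any LLM API, stubbed so the example runs; the bounded step budget is one example of the kind of guardrail the talk refers to.

```python
def call_model(prompt: str) -> str:
    # Stub standing in for a real LLM API call; a real model would
    # decide between answering and requesting a tool.
    if "RESULT" in prompt:
        return "FINAL:5"
    return "TOOL:add 2 3"

def run_tool(call: str) -> str:
    # One whitelisted tool: integer addition.
    _, a, b = call.split()
    return str(int(a) + int(b))

def single_prompt(task: str) -> str:
    # Level 1: one model call, no tools, no loop.
    return call_model(task)

def agent_loop(task: str, max_steps: int = 5) -> str:
    # Level 2: a bounded tool-use loop. The explicit step budget and
    # the tool whitelist are the guardrails here.
    prompt = task
    for _ in range(max_steps):
        reply = call_model(prompt)
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:")
        if reply.startswith("TOOL:"):
            result = run_tool(reply.removeprefix("TOOL:"))
            prompt = f"{task}\nRESULT: {result}"
    raise RuntimeError("step budget exhausted")
```

The escalation decision then reduces to a question the talk's framework makes explicit: does the task need intermediate tool results at all? If `single_prompt` suffices, the loop is unnecessary complexity.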
