2026-06-18, A113 (capacity 64)
The "Bigger is Better" era of AI is hitting a physical limit. While trillion-parameter models dominate the cloud, the real-world demand for private, low-latency, and energy-efficient intelligence is growing at the edge. Enter LFM 2.5, the latest flagship from Liquid AI. Built on a hybrid "Liquid" architecture rather than standard Transformers, LFM 2.5-1.2B-Thinking achieves frontier-grade reasoning in a sub-1GB RAM footprint.
In this 15-minute lightning talk, we will explore the shift from "System 1" (probabilistic chat) to "System 2" (deliberative reasoning) on consumer hardware. We will dissect how LFM 2.5 uses Linear Input-Varying (LIV) operators to achieve 2x CPU throughput over Llama 3.2 and Qwen, enabling 300+ tokens/sec on mobile NPUs. Finally, we will demonstrate a "Reasoning Trace" running locally on Fedora using vLLM and llama.cpp, showing that you don't need a data center to build a "thinking" agent.
A Software Engineer at Red Hat who loves deconstructing complex problems and collaborating with partners to build stable solutions. When he's not fixing LLM hallucinations, he's an avid explorer traveling the globe in search of new places and perspectives.