DevConf.CZ 2025

Anish Asthana

Anish is an engineering manager at Red Hat in the OpenShift AI organization. He is working on making machine learning easier for the wider community by building out a platform with cloud capabilities at its core. Most recently, his interests have focused on the Training and Experimentation space via components in Kubeflow. He has previously invested heavily in areas such as monitoring, scalability, and reliability.


Company or affiliation

Red Hat

Job title

Manager, Engineering


Sessions

06-13
13:15
35min
Generative AI Model Data Pre-Training on Kubernetes: A Use Case Study
Anish Asthana

Large Language Models (LLMs) require preprocessing vast amounts of data, often petabytes, a process that can span days due to its complexity and scale. This talk demonstrates how Kubeflow Pipelines (KFP) simplify LLM data processing with flexibility, repeatability, and scalability. These pipelines are used daily at IBM Research to build indemnified LLMs tailored for enterprise applications.
Different data preparation toolkits are built on Kubernetes, Rust, Slurm, or Spark. How would you choose one for your own LLM experiments or enterprise use cases, and why should you consider Kubernetes and KFP?
This talk describes how the open source Data Prep Toolkit leverages KFP and KubeRay to orchestrate scalable pipelines for steps such as deduplication, content classification, and tokenization.
We share challenges, lessons, and insights from our experience with KFP, highlighting its applicability for diverse LLM tasks, such as data preprocessing, RAG retrieval, and model fine-tuning.

Artificial Intelligence and Data Science
D105 (capacity 300)
06-14
12:30
80min
Git Fundamentals for Open Source!
Urvashi Mohnani, Anish Asthana

Contributing to open-source projects is an excellent way to improve your skills, gain real-world experience, and engage with a global developer community. With so many contributors from different backgrounds, a version control system such as Git is essential for collaborating smoothly. This hands-on workshop will equip you with the Git skills needed to contribute effectively to open-source projects.

We’ll start with core concepts such as repositories, forks, feature branches, commit best practices, and remotes before diving into real-world scenarios like rebasing, merge conflict resolution, interactive commits, and handling pull requests. Additionally, we’ll explore collaborative workflows like feature branching, GitHub discussions, and automated checks for CI/CD.

Whether you’re new to open-source or looking to refine your Git workflow, this workshop will give you the confidence to contribute to projects of any scale. Join us to level up your version control skills and take full advantage of Git’s capabilities!

Open Track
A218 (capacity 20)