Research Summary

Check out my Google Scholar profile for a more up-to-date list of publications.

Multi-Chiplet Architectures

Emerging 2.5D and 3D multi-chiplet architectures create coupled challenges across temperature, data movement, workload mapping, and hardware specialization. My work in this area spans thermal-management challenges in heterogeneous integration, multi-fidelity thermal modeling for fast design exploration and runtime management (MFIT), thermally-aware scheduling for heterogeneous multi-chiplet PIM systems (THERMOS), co-simulation of deep learning workloads on chiplet platforms (CHIPSIM), energy-efficient heterogeneous chiplet inference architectures (HeMu), and lossless exponent coding for inter-chiplet communication in hybrid LLMs (LEXI).

Together, these projects improve how chiplet-based systems are modeled, scheduled, and architected for efficient AI inference under thermal, bandwidth, and energy constraints.

[Figures: 2.5D chiplets; 3D chiplets; 64-chiplet heatmap]

Related Published Papers

  1. Miao Sun, Alish Kanani, Kaushik Shroff, and Umit Y. Ogras, “LEXI: Lossless Exponent Coding for Efficient Inter-Chiplet Communication in Hybrid LLMs,” In DAC 2026. Paper
  2. Lukas Pfromm, Alish Kanani, Harsh Sharma, Janardhan Rao Doppa, Partha Pratim Pande, and Umit Y. Ogras, “CHIPSIM: A Co-Simulation Framework for Deep Learning on Chiplet-Based Systems,” In IEEE OJ-SSCS, 2025. Paper, Code
  3. Harsh Sharma, Alish Kanani, Janardhan Rao Doppa, Umit Y. Ogras and Partha Pratim Pande, “HeMu: Energy-Efficient DNN Inferencing via Heterogeneous-Multi-Chiplet Architectures,” In IEEE TCAD 2025. Paper
  4. Alish Kanani, Lukas Pfromm, Harsh Sharma, Janardhan Rao Doppa, Partha Pratim Pande, and Umit Y. Ogras, “THERMOS: Thermally-Aware Multi-Objective Scheduling of AI Workloads on Heterogeneous Multi-Chiplet PIM Architectures,” In ESWEEK - TECS, 2025. Paper, Code
  5. Lukas Pfromm*, Alish Kanani*, Harsh Sharma, Parth Solanki, Eric Tervo, Jaehyun Park, Janardhan Rao Doppa, Partha Pratim Pande, and Umit Y. Ogras, “MFIT: Multi-Fidelity Thermal Modeling for 2.5D and 3D Multi-Chiplet Architectures,” In ACM TODAES 2025. Paper, Blog, Code
  6. Jaehyun Park, Alish Kanani, Lukas Pfromm, Harsh Sharma, Parth Solanki, Eric Tervo, Janardhan Rao Doppa, Partha Pratim Pande, and Umit Y. Ogras, “Thermal Modeling and Management Challenges in Heterogeneous Integration: 2.5D Chiplet Platforms and Beyond,” In VTS 2024. Paper

* Equal contributions

Hardware Accelerators for ML Models

My accelerator research targets efficient execution of modern ML models across edge and datacenter systems. Recent work includes DUET, a disaggregated accelerator for hybrid Mamba-Transformer LLM inference with prefill/decode-specific packages, and eMamba, an efficient acceleration framework for Mamba models in edge computing. My earlier work also explored automated FPGA acceleration for tree-based ML models through LightFPGA.

[Figure: DUET disaggregated accelerator system]

Related Published Papers

  1. Alish Kanani, Sangwan Lee, Han Lyu, Jiahao Lin, Jaehyun Park, and Umit Y. Ogras, “DUET: Disaggregated Hybrid Mamba-Transformer LLMs with Prefill and Decode-Specific Packages,” In DAC 2026. Paper
  2. Jiyong Kim, Jaeho Lee, Jiahao Lin, Alish Kanani, Miao Sun, Umit Y. Ogras, and Jaehyun Park, “eMamba: Efficient Acceleration Framework for Mamba Models in Edge Computing,” In ESWEEK - TECS, 2025. Paper
  3. Alish Kanani*, Swar Vaidya*, and Harshit Agarwal, “LightFPGA: Scalable and Automated FPGA Acceleration of LightGBM for Machine Learning Applications,” In VDAT 2021. Paper, Slides, Code

Runtime Optimization using Machine Learning

Dynamic runtime optimization focuses on balancing latency, energy, temperature, and resource utilization in heterogeneous computing systems. My work in this area includes THERMOS, which performs thermally-aware multi-objective scheduling for AI workloads on heterogeneous multi-chiplet PIM architectures, and runtime monitoring techniques that make ML-based schedulers more robust for Domain-Specific SoCs (DSSoCs). Together, these frameworks improve performance and energy efficiency while managing dynamic workloads, resource contention, and thermal constraints.

[Figure: robust task scheduling]

Related Published Papers

  1. Alish Kanani, Lukas Pfromm, Harsh Sharma, Janardhan Rao Doppa, Partha Pratim Pande, and Umit Y. Ogras, “THERMOS: Thermally-Aware Multi-Objective Scheduling of AI Workloads on Heterogeneous Multi-Chiplet PIM Architectures,” In ESWEEK - TECS, 2025. Paper, Code
  2. Alper A. Goksoy, Alish Kanani, Satrajit Chatterjee, and Umit Y. Ogras, “Runtime Monitoring of ML-Based Scheduling Algorithms Toward Robust Domain-Specific SoCs,” In ESWEEK - TCAD, 2024. Paper, Blog

Approximate Computing and Applications

Approximate computing trades accuracy for efficiency, which makes it well suited to energy-constrained systems and error-resilient applications.
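To make the accuracy-for-efficiency trade concrete, the Python sketch below models a lower-part OR adder (LOA), a well-known generic approximate adder. It is an illustration only, not ACA-CSU or any other design listed on this page. The function names `loa_add` and `mean_abs_error` and the `width` and `approx_bits` parameters are hypothetical choices made for this example.

```python
def loa_add(a: int, b: int, width: int = 8, approx_bits: int = 4) -> int:
    """Behavioral model of a lower-part OR adder (illustrative assumption).

    The upper (width - approx_bits) bits are added exactly; the lower
    approx_bits are approximated with a bitwise OR, which in hardware
    replaces the full-adder cells of the lower part with OR gates.
    """
    mask = (1 << width) - 1
    low_mask = (1 << approx_bits) - 1
    a &= mask
    b &= mask

    # Approximate lower part: OR instead of a carry-propagating add.
    low = (a | b) & low_mask

    # Carry into the exact upper part: AND of the lower parts' top bits.
    if approx_bits > 0:
        carry = (a >> (approx_bits - 1)) & (b >> (approx_bits - 1)) & 1
    else:
        carry = 0

    # Exact upper part (may produce one extra carry-out bit).
    high = (a >> approx_bits) + (b >> approx_bits) + carry
    return (high << approx_bits) | low


def mean_abs_error(width: int = 8, approx_bits: int = 4) -> float:
    """Mean absolute error of loa_add over all operand pairs of this width."""
    n = 1 << width
    total = 0
    for a in range(n):
        for b in range(n):
            total += abs(loa_add(a, b, width, approx_bits) - (a + b))
    return total / (n * n)
```

With `approx_bits=0` the model reduces to exact addition. Raising `approx_bits` replaces more of the carry chain with OR gates, so the average error returned by `mean_abs_error` grows as the modeled circuit gets cheaper.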

My projects in this area include ReARM, a reconfigurable approximate multiplier, and ACA-CSU, a carry-selection-based adder, both aimed at optimizing power and performance for tasks such as image processing. ApproxBioWear extends this work to efficient arithmetic for wearable biomedical devices, and a further study examines the feasibility of approximation in communication systems. Most recently, Ellora applied these concepts to low-power OFDM-based radar processors.

[Figures: ACA-CSU; ApproxBioWear; approximate communication; approximate radar]

Together, these projects apply approximation to improve energy efficiency across arithmetic circuits, biomedical wearables, communication systems, and radar processing.

Related Published Papers

  1. Rajat Bhattacharjya, Alish Kanani, A Anil Kumar, Manoj Nambiar, M Girish Chandra, and Rekha Singhal, “Ellora: Exploring Low-Power OFDM-based Radar Processors using Approximate Computing” – LASCAS 2024. Paper
  2. Ish Kool, Alish Kanani, and Rajat Bhattacharjya, “Approximating Communication Systems: Reality or Fantasy?” – HiPC 2021. Paper, Slides
  3. Alish Kanani, Rajat Bhattacharjya and Dip Sankar Banerjee, “ApproxBioWear: Approximating Additions for Efficient Biomedical Wearable Computing at the Edge” – EMBC 2021. Paper, Slides
  4. Alish Kanani, Jigar Mehta and Neeraj Goel, “ACA-CSU: A Carry Selection Based Accuracy Configurable Approximate Adder Design” – ISVLSI 2020. Paper, Slides, Code
  5. Rajat Bhattacharjya, Alish Kanani and Neeraj Goel, “ReARM: A Reconfigurable Approximate Rounding-Based Multiplier for Image Processing” – VDAT 2020. Paper, Slides