Karthik Shivashankar

AI and Software Engineering Researcher

Summary

My research bridges Artificial Intelligence and Software Engineering, focusing on the dual challenge of technical debt: I apply machine learning to automatically identify and manage technical debt, while also developing tools to analyze and mitigate debt within ML systems. My primary contributions include BEACon-TD, a transformer-based NLP framework for classifying technical debt from developer communications, and two static analyzers: MLScent, which detects 76 ML-specific anti-patterns, and PyExamine, a multi-level detector for 49 Python code smells. My work translates this research into open-source tools that provide tangible solutions to critical challenges in the software development lifecycle.

Education

Philosophiae Doctor (PhD) in Informatics

University of Oslo, Norway | Oct 2025

Thesis: The Dual Role of Machine Learning in Technical Debt Management: Applying Machine Learning for Identification While Analyzing Debt within ML Systems.

MSc in Electronics Engineering (EuroMasters, with Distinction)

University of Surrey, Guildford, UK | 2017 – 2019

Dissertation (Distinction): Deep Learning for 4D Augmented Reality. Developed a compact representation of 4D video sequences using 3D Deep Learning models (VAEs) to enable efficient rendering on Mixed Reality platforms like HoloLens.

Research Experience

PhD Researcher

University of Oslo, Norway | 2021 – 2025

My research involved the end-to-end development of AI-driven tools to automate and enhance software quality analysis.

  • Automated NLP-based Quality Analysis: Engineered BEACon-TD, a novel framework using fine-tuned transformer models to classify 13 distinct types of technical debt directly from unstructured text in issue trackers. This work was operationalized in the open-source TD-Suite framework.
  • ML-Specific Code Quality & Maintainability: Addressed the lack of specialized quality tools for AI systems by building MLScent, a static analyzer that detects 76 unique ML-specific anti-patterns in code written with frameworks such as TensorFlow and PyTorch. This was complemented by a systematic literature review consolidating knowledge on maintainability in ML.
  • Advanced Python Code Smell Detection: To overcome the limitations of existing linters, I developed PyExamine, a multi-level static analysis tool for Python that detects 49 distinct smells at the code, structural, and architectural levels.
  • Enhanced Code Maintainability with LLMs: Investigated the use of fine-tuned Large Language Models to proactively refactor Python code, demonstrating measurable improvements in objective maintainability metrics while preserving functionality.

Industry Experience

AI Engineer

Fantastec Sports Technologies, London, UK | June 2018 – Oct 2021

  • Automated digital asset creation by architecting a computer vision pipeline using Python, OpenCV, and Dlib, significantly reducing manual production time and costs.
  • Constructed a scalable data pipeline on AWS with Apache Spark and Scikit-Learn to analyze user behavior, enabling a predictive in-app asset recommendation system.
  • Designed a Monte Carlo simulation in MATLAB to forecast game dynamics for a blockchain application, delivering key data that guided product strategy and feature development.

Technical Skills

Languages

Python, SQL, MATLAB

AI/ML Frameworks

PyTorch, Scikit-learn, Hugging Face, Pandas, OpenCV

Core Competencies

Natural Language Processing (NLP), Deep Learning, Static Code Analysis, Empirical Software Engineering, MLOps, Technical Debt Management

Tools & Platforms

Git, Docker, CI/CD, AWS, Apache Spark

Key Publications & Open Source Projects

A full list of publications is available on my Google Scholar profile.

Maintainability and Scalability in Machine Learning: Challenges and Solutions | ACM Computing Surveys, 2025

PyExamine: A Comprehensive, Un-Opinionated Smell Detection Tool for Python | MSR 2025 [GitHub]

MLScent: A tool for Anti-pattern detection in ML projects | CAIN 2025 [GitHub]

BEACon-TD: Classifying Technical Debt and its types across diverse software projects issues using transformers | Journal of Systems and Software, 2025

Better Python Programming for all: With the focus on Maintainability | [arXiv]

TD-Suite: All Batteries Included Framework for Technical Debt Classification | [GitHub]

Datasets & Models

Trained models and datasets are available on my Hugging Face profile.