Niveditha Srikanth
DATA SCIENTIST  •  AI ENGINEER  •  BIOINFORMATICS  •  RESEARCHER  •  INNOVATOR
ML ENGINEER  •  CLOUD DATA ENGINEER  •  ACADEMIC PEER REVIEWER  •  HEALTHCARE ANALYST
Biotechnology by origin, data science by choice, AI engineering by direction

Who Am I?

I'm a data scientist with a background in biotechnology and hands-on experience building fraud detection pipelines at a Fortune 500 healthcare company. I work across the full data stack, from writing Pandas code on AWS SageMaker to designing anomaly triggers and outlier detection that surface suspicious billing patterns at scale.

My research runs in parallel: 12+ publications, 2 best paper and presentation awards across IEEE and Springer conferences, and independent bioinformatics work on immune checkpoint gene discovery using PCA and network analysis in R. I move between research and engineering without losing the thread of either, and I am always more interested in what I have not figured out yet than in what I already know.

What I Specialise In?

Python
R
SQL
OpenAI GPT
PyTorch
TensorFlow
CI/CD
AWS
Git
Docker
Tableau

Machine Learning & AI: Anomaly Detection, Fraud Detection Pipelines, Supervised & Unsupervised Learning, Feature Engineering, Model Evaluation, Deep Learning (CNNs, Sequence Models), NLP & Transformers (Foundations)

Data Science & Analytics: Exploratory Data Analysis (EDA), Statistical Analysis, Hypothesis Testing, Data Cleaning & Preprocessing, Dimensionality Reduction (PCA), Clustering (Mclust, K-Means), KPI Tracking, Data Visualization & Reporting

Programming & Core Tools: Python (Pandas, NumPy, Scikit-learn), SQL (Athena, BigQuery, PostgreSQL, MySQL), R (ggplot2, mclust), Jupyter Notebooks, Microsoft Excel

Cloud & ML Systems: AWS (S3, Athena, Glue, Redshift Serverless, Lambda, SageMaker), ETL Pipelines, End-to-End ML Pipelines, Model Training & Deployment Workflows, Experimentation

Bioinformatics: Gene Expression Analysis, PCA-based Analysis, Mclust Clustering, Pathway Enrichment (Enrichr, KEGG), Network Analysis (STRING, Cytoscape)

Tools & Workflow: Tableau, Power BI, Looker Studio, Domo, Git, GitHub, Docker (Basics)

What I've Learned and Contributed?

Data Science Associate Analyst
The Cigna Group – Cigna Healthcare
Jan 2026 – Apr 2026
Bengaluru, Karnataka, India
Key Contributions:
  • Built 12+ anomaly detection triggers across healthcare claims and provider data; designed a composite risk scoring framework using log-scaled transformations and percentile-based normalization (CUME_DIST) — improving detection coverage by 20% and contributing to an estimated $400K–$700K in potential savings
  • Applied BIRCH clustering and LOF to segment providers; engineered 15+ statistical features identifying 18% high-risk providers and improving anomaly lift by 11%
  • Collaborated with fraud investigators and business stakeholders to translate model outputs into prioritized investigation leads
  • Engineered 40+ features from 10M+ claim records; improved model precision by 13% and deployed scoring pipelines on AWS (S3, Athena, SageMaker), flagging 17% of claims as high-risk
Python · AWS SageMaker · BIRCH · LOF · SQL / Athena · Pandas · FWA Detection · Power BI
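The composite scoring idea in the first contribution can be illustrated with a minimal sketch. It assumes, hypothetically, that each provider has a few skewed metrics, that those metrics are log-scaled and combined with fixed weights, and that the combined value is normalized with a CUME_DIST-style percentile. The metric names, weights, and toy records are illustrative assumptions and are not taken from the production framework.

```python
import math
from bisect import bisect_right


def cume_dist(values):
    """Share of values <= each value, like SQL CUME_DIST() over an ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


def composite_risk(rows, weights):
    """Weighted sum of log-scaled metrics, normalized to (0, 1] by percentile."""
    raw = [
        sum(w * math.log1p(row[metric]) for metric, w in weights.items())
        for row in rows
    ]
    return cume_dist(raw)


# Hypothetical provider metrics and weights, for illustration only.
providers = [
    {"billed_amount": 1_200.0, "claims_per_patient": 1.5},
    {"billed_amount": 950_000.0, "claims_per_patient": 9.0},
    {"billed_amount": 15_000.0, "claims_per_patient": 2.0},
    {"billed_amount": 15_000.0, "claims_per_patient": 4.0},
]
weights = {"billed_amount": 0.6, "claims_per_patient": 0.4}

risk_scores = composite_risk(providers, weights)
print(risk_scores)  # [0.25, 1.0, 0.5, 0.75]
```

The log step compresses heavy-tailed dollar amounts before they are summed, so a single extreme metric does not dominate the combined value. The percentile step then makes scores comparable across triggers.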
Undergraduate Research Assistant – Computational Bioinformatics (ML)
Bioinformatics Lab, REC · Supervised by Dr. Sujata Roy
Jan 2025 – Apr 2025
Chennai, Tamil Nadu, India
Key Contributions:
  • Built an unsupervised ML pipeline on a ~49,000-feature gene expression dataset (GEO: GSE57329), applying PCA + GMM (mclust, G=9) and achieving a silhouette score of 0.81
  • Reduced feature space by 98% (49,000 to ~1,000 genes) using sequential statistical filtering
  • Identified clusters enriched for immune-related pathways; highlighted hub genes (CD4, CXCL10, FMO3) via PPI analysis and pathway enrichment (KEGG, Reactome)
R · mclust · PCA · ggplot2 · STRING · Cytoscape · KEGG
AI Quantalytics – VihaanAI CyberLabs Pvt Ltd.
May 2023 – Jan 2025
Remote
Key Contributions:
  • Constructed a Fully Residual CNN for brain tumor segmentation; integrated MDRNNs to enhance NLP model performance for malware classification and family prediction
  • Developed and evaluated ML models for classification, clustering, and anomaly detection; built 3+ ETL pipelines
  • Conducted statistical analysis across RL, Distributed ML, and NLP research tracks supporting 5+ projects
  • Contributed to peer-reviewed papers for Computers in Biology and Medicine, Springer, and a patent filing
Python · CNN · NLP · MDRNN · ETL Pipelines · Scikit-learn · TensorFlow
Crayon Data Pvt Ltd.
May 2023 – Aug 2023
Chennai, Tamil Nadu, India
Key Contributions:
  • Contributed to NLP pipelines processing 100K+ customer records using TF-IDF/Word2Vec
  • Supported BERT-based text classification achieving 85% accuracy on internal benchmarks
  • Applied PCA and t-SNE for dimensionality reduction; developed sentiment and topic trend visualizations
Python · NLP · TF-IDF · Word2Vec · BERT · PCA · t-SNE
Centre of Excellence in Biofilms, REC · Dr. Saravanan Periasamy
June 2023 – July 2023
Chennai, Tamil Nadu, India
Key Contributions:
  • Synthesized EPS-based nanocomposites (AgNO₃ + SDS) from Bacillus amyloliquefaciens to inhibit oral biofilm
  • Validated antimicrobial efficacy via SEM imaging and UV–Vis spectrophotometry; demonstrated 65% reduction in bacterial adhesion
  • Contributed findings to a published book chapter on nanomaterial-based biofilm interventions
SEM Imaging · UV-Vis Spectrophotometry · Nanocomposites · Biofilm Analysis
All Mind AI (ZRAE Global), Zion Robotics
June 2020 – Jan 2021
Chennai, Tamil Nadu, India
Key Contributions:
  • Assisted in building a deep learning pipeline for forensic speech signal segmentation across 3 benchmark datasets (~2,900 audio samples)
  • Contributed to experimental documentation supporting a paper submission on AI-based speech segmentation
Deep Learning · Speech Processing · Python · Signal Segmentation
Micro-Degree Credit Linked Degree Program — Computer Science
IIT Guwahati, Assam, India
June 2024 – Oct 2025
Guwahati, Assam
Relevant Coursework:
  • MTH101 Mathematics for Computer Science
  • CSE101 Introduction to Computer Science & Programming
  • CSE201 Data Structures and Algorithms
  • CSE202 Database Systems
  • CSE301 Principles of Computer System Design & Architecture
  • CSE302 Capstone Project
Data Structures · Algorithms · Database Systems · Computer Architecture · Python
Bachelor of Technology — Biotechnology
Rajalakshmi Engineering College, Tamil Nadu, India
Oct 2021 – May 2025
Chennai, Tamil Nadu
Relevant Coursework:
CS / DS / Maths
  • MA19153 Applied Calculus
  • MA19251 Differential Equations & Vector Calculus
  • MA19353 Transforms and Numerical Methods
  • MA19453 Probability and Statistics
  • GE19211 Problem Solving & Programming in Python
  • CS19411 Python Programming for Machine Learning
Biotechnology (core)
  • BT19702 Bioinformatics
  • BT19201 Biochemistry
  • BT19301 Microbiology
  • BT19502 Molecular Biology
  • BT19303 Cell Biology
  • BT19602 Genetic Engineering
  • BT19504 Immunology
Bioinformatics · Python · ML Foundations · Statistics · Molecular Biology

What I've Built?

CVA Prognosis
Cerebrovascular Accident Prognosis using Supervised ML
Compared Random Forest, XGBoost, and Decision Tree on clinical data to predict stroke prognosis. Applied label encoding, mean imputation, and hyperparameter tuning via grid search, evaluated on AUC-ROC.
Python scikit-learn Random Forest +3
  • 99.93% accuracy (Random Forest)
  • 97.94% accuracy (XGBoost)
  • 43,400 patient records used for training and testing
Gene Expression
Mclust-Based Gene Expression Clustering for Immune Checkpoint Analysis
Applied PCA + Mclust clustering on gene expression data to identify immune checkpoint regulators in T2D-associated atherosclerosis, validated with t-SNE and STRING network hub gene ranking.
R Mclust PCA +5
  • 49,000+ genes processed
  • 9-cluster VVV model (BIC/ICL validated)
  • 3 hub genes identified (Cd4, Cxcl10, Fmo3)
Healthcare Fraud
Healthcare Claims Fraud & Anomaly Detection Pipeline
Detected fraudulent insurance claims combining rule-based SQL flagging, provider outlier scoring with BIRCH & LOF, and anomaly ranking surfaced through a Power BI dashboard.
Python PostgreSQL SQL +4
  • 1M+ claims processed
  • ~3% of claims flagged by fraud rules
  • Top 1% (~128K claims) priority-queued
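For readers unfamiliar with LOF, the outlier-scoring step can be sketched in plain Python. This is a minimal illustration of the Local Outlier Factor definition, not the project's code, which would more likely use a library implementation. The 2-D points are hypothetical provider features, and the sketch assumes no duplicate points.

```python
import math


def lof_scores(points, k):
    """Local Outlier Factor per point; values well above 1 suggest outliers."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_distance, neighbors = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]  # distance to the k-th nearest neighbour
        k_distance.append(kd)
        # Ties at the k-distance are included, as in the LOF definition.
        neighbors.append([j for j in order if dist[i][j] <= kd])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: average neighbour density relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


# Hypothetical feature vectors: five similar providers and one extreme one.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (8, 8)]
scores = lof_scores(points, k=2)
most_anomalous = max(range(len(points)), key=lambda i: scores[i])
```

Here `most_anomalous` is the index of the isolated point. The five clustered points score close to 1 because their local density matches that of their neighbours.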
Log Analytics
Log Analytics Pipeline on AWS
End-to-end serverless pipeline ingesting raw web server logs into S3, transforming via PySpark Glue ETL, and exposing 5-minute error-rate aggregations as Athena-queryable tables with real-time Power BI visualization.
Python AWS S3 PySpark +5
  • 5M+ log records processed
  • ~90% detection accuracy
  • 2 Glue ETL jobs (parse + aggregate)
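The 5-minute error-rate aggregation can be sketched without Spark. This is a minimal standard-library sketch, not the Glue job itself. It assumes that parsed log records are (timestamp, HTTP status) pairs and that an "error" means a 5xx status, and neither assumption is stated in the project description.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 300  # 5-minute tumbling windows


def window_start(ts):
    """Floor a timezone-aware timestamp to the start of its 5-minute window."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % WINDOW_SECONDS, tz=timezone.utc)


def error_rates(records):
    """Map each window start to the share of requests with a 5xx status."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for ts, status in records:
        bucket = window_start(ts)
        totals[bucket] += 1
        if status >= 500:  # assumption: only server errors count as errors
            errors[bucket] += 1
    return {bucket: errors[bucket] / totals[bucket] for bucket in sorted(totals)}


# Hypothetical parsed log records.
records = [
    (datetime(2024, 1, 1, 12, 1, 0, tzinfo=timezone.utc), 200),
    (datetime(2024, 1, 1, 12, 3, 30, tzinfo=timezone.utc), 500),
    (datetime(2024, 1, 1, 12, 4, 59, tzinfo=timezone.utc), 200),
    (datetime(2024, 1, 1, 12, 5, 0, tzinfo=timezone.utc), 503),
]
rates = error_rates(records)
```

With these records, the 12:00 window has one error in three requests and the 12:05 window has one error in one request.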
Paper Clustering
Scientific Paper Clustering for Similarity Discovery
Grouped arXiv abstracts by semantic similarity using NLP embeddings, with per-cluster topic modelling and a comparison of preprocessing strategies across vectorisation variants, enabling literature discovery at scale.
Python TF-IDF K-Means +3
  • 10K+ research papers
  • 40% improvement in literature retrieval efficiency
  • Multi-language preprocessing tested
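The similarity step behind this kind of clustering can be sketched with TF-IDF vectors and cosine similarity. This is a minimal pure-Python sketch, not the project's pipeline. It assumes whitespace tokenisation and a smoothed IDF, and the three abstracts are invented for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (term -> weight) with a smoothed IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical abstracts, for illustration only.
abstracts = [
    "graph neural networks for molecule property prediction",
    "molecule property prediction with graph networks",
    "bayesian inference for marketing budget allocation",
]
vecs = tfidf_vectors(abstracts)
sim_related = cosine(vecs[0], vecs[1])
sim_unrelated = cosine(vecs[0], vecs[2])
```

The first two abstracts share most of their terms, so `sim_related` is larger than `sim_unrelated`. A clustering step such as K-Means would then group vectors like these.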
Bayesian MMM
Bayesian Marketing Spend Optimization (MMM)
Built a Marketing Mix Model using Bayesian inference, engineering CPC/ROAS features to quantify multi-channel marketing ROI and recommend optimal budget allocation across 5 channels.
R rstanarm tidyverse +3
  • 900 multi-channel ad records
  • 5 marketing channels modelled
  • Shift 10–15% budget from low-ROI channels
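The CPC and ROAS features mentioned above follow standard definitions: cost per click is spend divided by clicks, and return on ad spend is revenue divided by spend. The sketch below covers only that feature step, not the Bayesian model. The record layout and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChannelRecord:
    channel: str
    spend: float
    clicks: int
    revenue: float


def cpc(row):
    """Cost per click: spend divided by clicks."""
    return row.spend / row.clicks if row.clicks else float("nan")


def roas(row):
    """Return on ad spend: revenue divided by spend."""
    return row.revenue / row.spend if row.spend else float("nan")


# Hypothetical channel totals, for illustration only.
rows = [
    ChannelRecord("search", spend=500.0, clicks=250, revenue=2000.0),
    ChannelRecord("display", spend=300.0, clicks=600, revenue=450.0),
]
features = {row.channel: {"cpc": cpc(row), "roas": roas(row)} for row in rows}
ranked = sorted(features, key=lambda name: features[name]["roas"], reverse=True)
```

Ranking channels by ROAS gives a first, model-free view of where budget is working. A Bayesian model adds uncertainty estimates on top of features like these.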
More on GitHub

What I've Contributed to the World?

Niveditha A, Shreyanth S, Kathiroli V, Agarwal P and Ram Abishek S
2023
Priyanka Agarwal, Niveditha S, Shreyanth S, Sarveshwaran R and Rajesh P K
2023
Shreyanth S, Harshitha D S and Niveditha S
2023
Shreyanth S, Suwetha P, Kathiroli V, Niveditha S and Jayaprakash Harshitha
2023
Niveditha S and Bhalashri Sethuraman
2023
Memory-Augmented Deep Recurrent Neural Networks for Long-Term Dependency Learning in Natural Language Processing
Shreyanth S, Karthikeyan S, Prianka RR and Niveditha S
Not yet Published
An Advanced Fully Residual Convolutional Neural Network for Segmentation and Classification of Brain Tumors Across Diverse Medical Image Modalities — Computers in Biology and Medicine
Karthikeyan S, Shreyanth S, Niveditha S, Naveen S, Santhi G B and Gopirajan PV
Not yet Published
Shreyanth S, Renangi S, Dr. Jayachandran Shanmuga Sundaram, Rajesh Perinkulam Krishnan, Niveditha Srikanth, Manpreet Singh, Dr. Ashok Kumar Katta
UK Design Patent, Design number: UK 6291782, Year 2023
Sarveshwaran R, Karthikeyan S, Meenalosini V Cruz, Shreyanth S, Niveditha S and PK Rajesh
Proceedings of Ninth International Congress on Information and Communication Technology — Springer, 2024
S. Shreyanth, D. S. Harshitha, Priyanka Agarwal, V. Kathiroli & S. Niveditha
Perspective and Strategies on Newage Education and Creative Learning — Springer, 2024
R. Delshi Howsalya Devi, G. K. Chithra, S. Sharmila, Pk Rajesh, S. Niveditha
Big Data & Edge Intelligence for Enhanced Cyber Defence — CRC Press, 2024
Niveditha S, Shreyanth S, Delshi Howsalya Devi R, Sarveshwaran R and Rajesh P K
Big Data & Edge Intelligence for Enhanced Cyber Defence — CRC Press, 2024
Niveditha S, Shobana D, Visudha S, and Yazhini P M
Machine Learning Models and Architectures for Biomedical Signal Processing — Elsevier, 2025
Anand Ravichandran, Shobana D, Visudha S, Yazhini, Niveditha S and Saravanan P
Cutting-Edge Applications of Nanomaterials in Biomedical Sciences — IGI Global, 2024
Functionalisation Strategies of Silver Nanoparticles
Rajasekar P, Niveditha S, Visudha S, Yazhini P M and Shobana D
Springer Nature, Yet to be published

What I've Accomplished?

Issued by IEEE 12th International Conference on Communication Systems and Network Technologies, Bhopal. Recognized for "Kernelized Deep Networks for Speech Signal Segmentation Using Clustering and AI in Neural Networks"
Issued by Kalinga University at IEEE World Conference on Communication & Computing, Raipur. Recognized for "Cerebrovascular Accident Prognosis using Supervised Machine Learning Algorithms"
Issued by the Department of Pharmaceutical Technology, UCE, BIT Campus, Anna University, Tiruchirappalli. Recognized for "A Multi-functional Aqueous Phytochemical Formulation for Minimalist Skincare and Urticaria Management"
ENIGMA 2023 — Second Prize in Poster Competition (October 2023)
Issued by Sri Manakula Vinayagar Engineering College, Puducherry. Recognized for "Optimization of the Ex-vivo Typodont Model for Dental Biofilm Associated Infections"
Part of U&I's education initiative since 2020, contributing to crowdfunding campaigns that raised USD 265,000 and serving on the COVID-era "Breathe India" crisis response team. Promoted to Learning Circle Leader in 2022, leading volunteers, designing teaching strategies, and mentoring emerging leaders within the program
Contributed to the club's outreach and awareness efforts through technical poster design, backend content creation, and building presentation decks used by the front-facing team. Also crafted captions for Instagram posts, supporting the club's social media presence on health and humanitarian topics
Contributed scientific blog posts to Scioverleaf, a peer-run science communication platform covering research across biology, genomics, medicine, and emerging sciences. Authored pieces on topics like T2DM, molecular science, and galactic origins, visible under the "Recent Blog" section of this page
A college-level national conference on innovations in lifestyle disease management. Handled the technical and coordination side of managing logistics, supporting content creation, and overseeing the event's operational flow
Member of IEEE (Institute of Electrical and Electronics Engineers) — #98761426
Editorial Board Member for PriMera Scientific Engineering Journal (ISSN: 2834-2550)
Reviewer for Medicon Engineering Themes Journal (ISSN: 2834-7218)
Reviewed for 10+ IEEE/Springer-organized conferences indexed in Scopus — EASCT 2023, AIKIIE 2023, ICAIA 2023, ICDSIS 2024

Contact