Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Posts
An Introduction to Transformers
Published:
Although there are many attention and transformer tutorials and papers online, I found myself working through a number of them, picking up different bits of information from each, to better understand transformers for my applications. I have collated my notes here in the hopes that they will help someone else. Read more
abstracts
portfolio
preprints
publications
Application of the DeepSense Deep Learning Framework to Determination of Activity Context from Smartphone Data
Published in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 2019
Abstract: Current methods of assessing health are infrequent, costly, and require advanced medical equipment. 92% of US adults carry mobile phones, and 77% carry smartphones with advanced sensors (Smith, 2017). Smartphone apps are already being used to identify disease (e.g., skin cancer), but these apps require active participation by the user (e.g., uploading images). The goal of this research is to develop algorithms that enable continuous and real-time assessment of individuals by leveraging data that is passively and unobtrusively captured by cellphone sensors. Our first step to accomplish this is to identify the activity context in which the device is used, as this affects the accuracy and reliability of sensor data for measuring and inferring a user’s health; data should be interpreted differently when the user is walking or running versus on a plane or bus. To do this, we use DeepSense, a deep learning approach to feature learning first developed by Yao, Hu, Zhao, Zhang, and Abdelzaher (2017). Here we present six experiments validating our model on: (1) a baseline implementation of DeepSense on the same data used by Yao et al. (2017), achieving a balanced accuracy (BA) of 95% over the six main contexts; (2) its ability to classify context using a different publicly available dataset (the ExtraSensory dataset) using the same 70/30 train/test split used by Vaizman et al. (2018), with a BA of 75%; (3) its ability to achieve improved classification when training on a single user, with a BA of 78%; (4) its ability to achieve accurate classification of a new user, with a BA of 63%; (5) its improvement to 70% BA for new users when we considered phone placement to remove confounding information; and (6) its ability to accurately classify contexts over all 51 contexts collected by Vaizman et al., achieving a BA of 80% on 9 contexts, 75% on 12, and 70% on 17. We are now working to improve these results by adding other sensors available through smartphone data collection included in the ExtraSensory dataset (e.g., microphone). This will allow us to more accurately assess minor deviations in user behaviors that could indicate changes in health or injury status by accurately accounting for irrelevant, inaccurate, or misleading readings due to contextual effects that may confound interpretation. Read more
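The balanced accuracy (BA) quoted throughout is the mean of per-class recalls, so rare contexts count as much as common ones. A minimal sketch of the metric, using scikit-learn and made-up labels purely for illustration:

```python
# Minimal sketch of the balanced-accuracy metric quoted above: the mean of
# per-class recall, so that rare contexts count as much as common ones.
# The labels below are made up purely for illustration.
from sklearn.metrics import balanced_accuracy_score

y_true = ["walking", "walking", "bus", "bus", "bus", "running"]
y_pred = ["walking", "bus",     "bus", "bus", "walking", "running"]

print(balanced_accuracy_score(y_true, y_pred))  # mean of per-class recalls
```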
Improving the Performance of Fine-Grain Image Classifiers via Generative Data Augmentation
Published in arXiv Preprint, 2020
Abstract: Recent advances in machine learning (ML) and computer vision tools have enabled applications in a wide variety of arenas such as financial analytics, medical diagnostics, and even within the Department of Defense. However, their widespread implementation in real-world use cases poses several challenges: (1) many applications are highly specialized, and hence operate in a sparse data domain; (2) ML tools are sensitive to their training sets and typically require cumbersome, labor-intensive data collection and data labelling processes; and (3) ML tools can be extremely “black box,” offering users little to no insight into the decision-making process or how new data might affect prediction performance. To address these challenges, we have designed and developed Data Augmentation from Proficient Pre-Training of Robust Generative Adversarial Networks (DAPPER GAN), an ML analytics support tool that automatically generates novel views of training images in order to improve downstream classifier performance. DAPPER GAN leverages high-fidelity embeddings generated by a StyleGAN2 model (trained on the LSUN cars dataset) to create novel imagery for previously unseen classes. We experimentally evaluate this technique on the Stanford Cars dataset, demonstrating improved vehicle make and model classification accuracy and reduced requirements for real data using our GAN-based data augmentation framework. The method’s validity was supported through an analysis of classifier performance on both augmented and non-augmented datasets, achieving comparable or better accuracy with up to 30% less real data across visually similar classes. To support this, we developed a novel augmentation method that can manipulate semantically meaningful dimensions (e.g., orientation) of the target object in the embedding space. Read more
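The core augmentation idea, nudging an image embedding along a semantically meaningful direction (e.g., orientation) and regenerating the image, can be sketched generically. This is not the DAPPER GAN implementation; the generator and the learned direction below are placeholder stand-ins:

```python
# Generic sketch of latent-space augmentation: nudge an embedding along a
# semantically meaningful direction (e.g., orientation) and regenerate the
# image. `generator` and `orientation_direction` are placeholders, not the
# actual DAPPER GAN / StyleGAN2 components.
import numpy as np

def augment_in_latent_space(generator, w, orientation_direction, steps=(-2.0, -1.0, 1.0, 2.0)):
    """Return novel views of the image encoded by latent vector `w`."""
    views = []
    for alpha in steps:
        w_shifted = w + alpha * orientation_direction  # move along the semantic axis
        views.append(generator(w_shifted))             # synthesize the shifted view
    return views

# Usage sketch with dummy stand-ins:
rng = np.random.default_rng(0)
dummy_generator = lambda w: w                      # stand-in for an image synthesizer
w = rng.normal(size=512)                           # a latent/embedding vector
direction = rng.normal(size=512)                   # a learned "orientation" direction
augmented = augment_in_latent_space(dummy_generator, w, direction)
print(len(augmented))  # 4 novel views
```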
Machine Learning Techniques for Reconstruction and Segmentation of Nanoparticle Interferometric Signatures
Published in Boston University, 2022
Abstract: Single particle interferometric reflectance imaging sensor (SP-IRIS) allows for label-free biological nanoparticle detection. This imaging technique allows collection of a 3D defocus profile of interferometric signatures of particles on a reflecting surface. However, automated reconstruction of information from SP-IRIS data and identification of particles remains a challenge. In this work, we develop machine learning techniques to perform accurate reconstruction of defocus profiles from undersampled SP-IRIS data, and machine learning techniques to segment particles of interest from background in SP-IRIS data. We test the performance of a direct, single-signal fully connected neural network-based reconstruction technique, an untrained non-convolutional network for single-signal reconstruction, and a 2D convolutional neural network reconstruction model for reconstruction of SP-IRIS signal defocus profiles from undersampled data. We then test a UNet-based segmentation model to segment particle signals from background signals. Lastly, we propose a novel combined reconstruction and segmentation model which can perform both reconstruction and segmentation on undersampled SP-IRIS data. Read more
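As a rough illustration of the first approach, a fully connected network can map an undersampled defocus profile to a full-resolution one. The sketch below uses PyTorch with illustrative profile lengths and layer widths, not the values used in this work:

```python
# Minimal PyTorch sketch of a fully connected reconstruction network that maps
# an undersampled defocus profile to a full-resolution one. Profile lengths and
# layer widths are illustrative only, not the values used in this work.
import torch
import torch.nn as nn

class DefocusProfileReconstructor(nn.Module):
    def __init__(self, n_undersampled=16, n_full=64, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_undersampled, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_full),
        )

    def forward(self, x):
        return self.net(x)

model = DefocusProfileReconstructor()
undersampled = torch.randn(8, 16)        # batch of 8 undersampled profiles
reconstructed = model(undersampled)      # shape: (8, 64)
print(reconstructed.shape)
```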
Topological Data Analysis of Electroencephalogram Signals for Pediatric Obstructive Sleep Apnea
Published in IEEE EMBC, 2023
Abstract: Topological data analysis (TDA) is an emerging technique for biological signal processing. TDA leverages the invariant topological features of signals in a metric space for robust analysis of signals even in the presence of noise. In this paper, we leverage TDA on brain connectivity networks derived from electroencephalogram (EEG) signals to identify statistical differences between pediatric patients with obstructive sleep apnea (OSA) and pediatric patients without OSA. We leverage a large corpus of data, and show that TDA enables us to see a statistical difference between the brain dynamics of the two groups. Read more
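One common way to apply TDA to a connectivity network, and only an assumed pipeline rather than necessarily the one used in the paper, is to convert a channel correlation matrix into a distance matrix and compute its persistence diagram, e.g. with the ripser package:

```python
# Sketch of a common TDA pipeline for connectivity networks: convert an EEG
# channel correlation matrix into a distance matrix and compute its persistence
# diagram. Uses the `ripser` package; this is an assumed pipeline, not
# necessarily the one used in the paper.
import numpy as np
from ripser import ripser

rng = np.random.default_rng(0)
eeg = rng.normal(size=(19, 5000))          # 19 channels x 5000 samples (synthetic)
corr = np.corrcoef(eeg)                    # channel-by-channel connectivity
dist = 1.0 - np.abs(corr)                  # dissimilarity between channels
np.fill_diagonal(dist, 0.0)

diagrams = ripser(dist, distance_matrix=True, maxdim=1)["dgms"]
print(len(diagrams[0]), "H0 features,", len(diagrams[1]), "H1 features")
```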
Detection of Sleep Oxygen Desaturations from Electroencephalogram Signals
Published in IEEE EMBC, 2024
Abstract: In this work, we leverage machine learning techniques to identify potential biomarkers of oxygen desaturation during sleep exclusively from electroencephalogram (EEG) signals in pediatric patients with sleep apnea. A machine learning technique that can identify EEG signals from patients with sleep apnea, as well as latent EEG signals that come from subjects who experience oxygen desaturations but are recorded outside of desaturation events, would be a strong step toward a brain-based biomarker for sleep apnea and easier diagnosis of this disease. We leverage a large corpus of data and show that machine learning enables us to classify EEG signals as occurring or not occurring during oxygen desaturations with an average balanced accuracy of 66.8%. We furthermore investigate the ability of machine learning models to identify subjects who experience oxygen desaturations from EEG data that does not occur during oxygen desaturations. We conclude that there is a potential biomarker for oxygen desaturation in EEG data. Read more
Automated Quantification of Periodic Discharges in Human Electroencephalogram
Published in Biomedical Physics and Engineering Express, 2024
Abstract: Periodic discharges (PDs) are pathologic patterns of epileptiform discharges repeating at regular intervals, commonly detected in human electroencephalogram (EEG) signals in patients who are critically ill. The frequency and spatial extent of PDs are associated with the tendency of PDs to cause brain injury, yet existing automated algorithms do not quantify these properties. The present study introduces an algorithm for quantifying the frequency and spatial extent of PDs. The algorithm quantifies the evolution of these parameters within a short (10–14 second) window, with a focus on lateralized and generalized periodic discharges. We test our algorithm on 300 ‘easy’, 300 ‘medium’, and 240 ‘hard’ examples (840 total epochs) of periodic discharges, as quantified by interrater consensus from human experts analyzing the given EEG epochs. We observe 95.0% agreement (95% confidence interval (CI) [94.9%, 95.1%]) between algorithm outputs and reviewer clinical judgement for easy examples, 92.0% agreement (95% CI [91.9%, 92.2%]) for medium examples, and 90.4% agreement (95% CI [90.3%, 90.6%]) for hard examples. The algorithm is also computationally efficient, running in 0.385 ± 0.038 seconds per epoch using our provided implementation. The results demonstrate the algorithm’s effectiveness in quantifying these discharges and provide a standardized and efficient approach for PD quantification as compared to existing manual approaches. Read more
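As a generic illustration (not the paper's algorithm), the frequency of a periodic pattern within a short window can be estimated by counting prominent peaks in a channel; the sampling rate, signal, and thresholds below are assumptions for demonstration only:

```python
# Generic sketch (not the paper's algorithm) of estimating periodic-discharge
# frequency within a short EEG window by counting prominent peaks in one channel.
import numpy as np
from scipy.signal import find_peaks

fs = 200                                   # sampling rate in Hz (assumed)
t = np.arange(0, 12, 1 / fs)               # a 12-second window
eeg = np.sin(2 * np.pi * 1.5 * t) + 0.3 * np.random.default_rng(0).normal(size=t.size)

peaks, _ = find_peaks(eeg, prominence=0.5)
discharge_rate_hz = len(peaks) / (t[-1] - t[0])
print(f"estimated discharge frequency: {discharge_rate_hz:.2f} Hz")
```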
Sleep Staging from Airflow Signals Using Fourier Approximations of Persistence Curves
Published in arXiv Preprint, 2024
Abstract: Sleep staging is a challenging task, typically performed manually by sleep technologists based on electroencephalogram and other biosignals of patients taken during overnight sleep studies. Recent work aims to leverage automated algorithms to perform sleep staging not based on electroencephalogram signals, but rather based on the airflow signals of subjects. Prior work uses ideas from topological data analysis (TDA), specifically Hermite function expansions of persistence curves (HEPC), to featurize airflow signals. However, finite-order HEPC captures only partial information. In this work, we propose Fourier approximations of persistence curves (FAPC), and use this technique to perform sleep staging based on airflow signals. We analyze performance using an XGBoost model on 1155 pediatric sleep studies taken from the Nationwide Children’s Hospital Sleep DataBank (NCHSDB), and find that FAPC methods provide complementary information to HEPC methods, leading to a 4.9% increase in performance over baseline methods. Read more
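A persistence (Betti) curve counts how many topological features are alive at each filtration value, and a truncated Fourier series of that curve yields a fixed-length feature vector. The sketch below illustrates the general idea only; the exact FAPC definition used in the paper may differ:

```python
# Minimal sketch of featurizing a persistence diagram via a Fourier
# approximation of its persistence (Betti) curve. The exact FAPC definition
# used in the paper may differ; this only illustrates the general idea.
import numpy as np

def betti_curve(diagram, grid):
    """Number of (birth, death) pairs alive at each grid value."""
    births, deaths = diagram[:, 0], diagram[:, 1]
    return np.array([np.sum((births <= x) & (x < deaths)) for x in grid])

def fourier_features(curve, n_coeffs=8):
    """First few Fourier coefficients of the curve as a fixed-length feature vector."""
    coeffs = np.fft.rfft(curve)
    return np.concatenate([coeffs[:n_coeffs].real, coeffs[:n_coeffs].imag])

diagram = np.array([[0.0, 0.8], [0.1, 0.4], [0.2, 0.9]])   # toy persistence diagram
grid = np.linspace(0.0, 1.0, 128)
features = fourier_features(betti_curve(diagram, grid))
print(features.shape)  # (16,) — features to feed an XGBoost-style classifier
```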
talks
Deep Learning for Maritime Imagery
Published:
This talk was given during the 2019 Submarine Technology Symposium at the Johns Hopkins Applied Physics Lab in Laurel, MD. More information can be found here. Read more
teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post. Read more
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post. Read more