2024 Projects
2024 projects coming soon
2023 Projects
Advisors. Niha Bhaskar, Amari Lewis, Kristen Vaccaro, Joe Politz, and Mia Minnes.
Title. Analyzing and extending the impact of a first-year mentoring and academic program in computing
The CSE Peer-led Academic Cohort Experience (PACE, https://pace.ucsd.edu/) program provides weekly mentoring and research exploration sessions for first-year CSE students at UCSD. The 2022-2023 academic year was its first implementation. This year we explored topics including how the choice of training data can introduce bias in machine learning, Bluetooth signal tracking and its impact on privacy, and block-based programming applications in CS education and industry robotics. In summer 2023, we plan to analyze quantitative and qualitative data collected about students’ experience in the program over this year, and to design new research-based modules for cohorts of students to explore in the upcoming school year.
Research-focused opportunity: Analyze interviews, survey responses, and other data from students’ experience in the first year of the program to inform improvements and understand what was effective (and not) about the program.
Curriculum development opportunity: Create research-based activity modules on cutting-edge computing topics and the intersection of computing and society for the next group of students.
Advisor. Kristen Vaccaro
Title. Reducing Use of Microaggressions Online
In this project we are designing around microaggressions. This year students have worked on this problem from a number of directions. Some have worked on machine learning: finding data sets of microaggressions, identifying data quality issues, and training an NLP model. Building on prior work, this effort is particularly interested in modeling intersectionally-targeted microaggressions. Others have done design work, exploring a variety of tools and systems we could build. In summer 2023, we plan to use the models to characterize the use of microaggressions in different online communities: which online spaces are more or less toxic? We also plan to conduct human-subjects experiments to determine whether the tools and interventions we have designed actually reduce microaggression use.
Depending on student interest, you might choose to be more involved in modeling or human-subjects experiments. Both will involve extensive data analysis.
Advisor. Imani Munyaka
Title. Security, Privacy, and Societal Change
This project will explore the role of privacy and security for changemakers. We will investigate the needs of changemakers through a mixed-methods approach that includes surveys, interviews, policy analysis, and technical analysis. The student collaborator will be asked to investigate the security of software mechanisms using Python and, if time allows, to use R and Python to analyze text and quantitative survey results. Due to the nature of this work, details are not posted publicly. Interested students should reach out for more information.
Advisor. Gary Cottrell
Title. Aiding in Identifying Natural Molecules for Medicine
In collaboration with William Gerwick at Scripps Institution of Oceanography, we have been developing systems to speed up structure determination from NMR spectra of small molecules extracted from Natural Products (NPs) (Zhang et al., 2017; Li et al., 2020; Reher et al., 2020). Approximately 70% of all approved drugs are NPs, their analogues, or a chemical modification of an existing NP (Newman & Cragg, 2016). In addition to these academic and societal benefits, natural products research (NPR) provides a powerful incentive for the conservation and sustainable use of biodiversity and biodiverse habitats (Kursar et al., 2006). A bottleneck in this research is determining the structure of a new molecule. A molecule is analyzed by acquiring its NMR spectrum. However, it then takes a skilled researcher approximately two weeks to infer the structure from the spectrum. Our goal is to learn a mapping from the NMR spectra of natural products (sometimes called the “fingerprint” of a molecule) to their structure. We have been developing advanced deep learning techniques to do this.
We are developing improvements over our previous methods to more specifically produce the structure of a molecule in terms of SMILES strings. The student would learn how to train deep networks for this task.
Advisor. Ndapa Nakashole
Title. Few Shot Text Classification of Clinical Text
A challenge that arises when automatically analyzing natural language in clinical text is that clinicians are free to use their choice of words to describe patient conditions, medications, and other items in the reports. If we analyze the data in its raw form, the results can be misleading due to false positives and false negatives arising from inconsistencies in the data. This lack of uniformity necessitates data normalization so that across different patient reports, even in the face of polysemy, abbreviations, spelling errors, or other variations, the same concepts are mapped to the same name. This project will study entity detection and normalization in biomedical text data in order to link mentions of entities such as symptoms and medications to their formal names in a biomedical ontology, the Unified Medical Language System (UMLS).
Labeled data is limited in the medical domain, which arises naturally due to rare events. We will therefore develop specialized algorithms for few-shot and zero-shot classification, where the goal is to perform classification with zero or only a few examples per class.
In this project the student will learn about and extend prototypical networks (Snell et al., 2017) for few-shot classification.
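To make the few-shot idea concrete, here is a minimal sketch of the classification rule used by prototypical networks: each class is represented by the mean of its embedded support examples, and a query is assigned to the class with the nearest prototype. The sketch assumes the text has already been embedded into vectors; the embeddings, the labels ("fever", "cough"), and all function names are hypothetical and are not taken from the project. A real model would learn the embedding with a neural network.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def class_prototypes(support: List[Tuple[Vector, str]]) -> Dict[str, List[float]]:
    """Average the embedded support examples of each class into one prototype."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for vec, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query: Vector, prototypes: Dict[str, List[float]]) -> str:
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Hypothetical 2-D embeddings of a few labeled mentions (the "support set").
support_set = [
    ([0.9, 0.1], "fever"),
    ([1.1, -0.1], "fever"),
    ([-1.0, 0.2], "cough"),
]
prototypes = class_prototypes(support_set)
predicted = classify([0.8, 0.3], prototypes)
```

Because a class needs only a handful of examples to form a prototype, new classes can be added without retraining a classifier head, which is what makes this rule attractive when labels are scarce.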
Advisors. Caleb Stanford, Tal Garfinkel, Deian Stefan
Title. Securing and Auditing the Rust Programming Language
The Rust programming language is among the fastest-growing languages, and is increasingly used in industry by companies including Amazon, Google, Microsoft, and Mozilla. However, as the Rust community and ecosystem grow, it has become increasingly important to ensure that Rust code that is downloaded and run by users is safe and does not act maliciously. This project will focus on making the Rust programming language more secure by improving the tool support for auditing and running foreign Rust code. Prior experience with Rust is not necessary! Interest in programming languages, security, or usability/developer experience is helpful. The interested student will write tools that interface with the Rust compiler and operating system and present visually helpful information that helps users find dangerous Rust code and prevent it from executing.
Advisors. Ruanqianqian (Lisa) Huang, Sorin Lerner
Title. Sensemaking of Computer Programs and Their Behavior
Programmers write computer programs to accomplish a variety of goals, such as building a phone app, training a machine learning classifier, or solving a complex math equation. However, how programmers interpret a program’s behavior does not always match reality, which leads to software defects. Existing mechanisms such as software visualization and debuggers aim to deliver information about a program’s behavior to help programmers understand it. Still, the effectiveness of existing techniques and the room for improvement remain under-explored. To help programmers better understand the behavior of computer programs and write higher-quality software, we will reflect on existing techniques for understanding computer programs, brainstorm possible improvements based on their strengths and limitations as well as human cognitive processes in programming, and deploy those improvements.
Advisor. Julian McAuley
Title. Building interactive systems to aid users in various stages of story writing
Creative writing is a challenging task even for humans. People often find it difficult to write prose or stories that are engaging. Previous work from the field of HCI (Bharadwaj et al., 2019; Nilforoshan et al., 2019) has shown that assistive tools such as checklists and guideline-specific feedback help users assure the quality of creative writing. With recent advances in conversational AI, we hypothesize that a conversational assistive system can play a major role in guiding a user to successfully write a story. This guidance will include finding appropriate story templates, suggesting plots and narratives, maintaining character checklists, etc. Our goal is to combine state-of-the-art dialog systems (Roller et al., 2020) with common story datasets (Mostafazadeh et al., 2016) and conduct user studies to measure the efficacy of such a system.
The student will learn to build an end-to-end dialog system with various natural language processing tasks and gain experience in how to conduct user studies.
Advisor. Leon Bergen
Title. Probing Methods for Neural Language Models
Neural language models such as BERT (Devlin et al., 2019), Transformer-XL (Dai et al., 2019), and GPT-3 (Brown et al., 2020) have achieved success in both text prediction and downstream tasks such as question-answering, text classification, and summarization. The strong performance of these models raises scientific questions about the knowledge they have acquired, in particular whether these models have acquired linguistic knowledge that is as abstract and general as that of humans. However, previous work has shown that different methods of probing these models’ knowledge (probing methods) produce conflicting results. This has limited our ability to draw strong conclusions about what these models know. Our objective is to map out the space of probing methods for neural language models, and understand why these methods produce varying results. We will determine which methods have the greatest internal and external validity.
The student will learn how to set up computational experiments on neural language models, and how to use these experiments to gain scientific understanding of models.
Advisor. Rose Yu
Title. Generative Models for Drug Discovery
Modern drug discovery is a long and expensive process, often requiring billions of dollars and years of effort. Accelerating the process and reducing its cost would have clear economic and human benefits. In early-stage drug discovery, the goal is to find a compound that has a high binding affinity to a designated protein target. In this project, we will design deep generative models to synthesize promising drug candidates, potentially circumventing much of the customary experimental work.
The student will work closely with an interdisciplinary team of professors and undergrads from chemistry, pharmacy, and computer science. The student will learn to extend our recently published deep generative model to improve the efficacy and efficiency of the generated drug candidates.
Advisor. Jingbo Shang
Title. Extracting Emerging Phrases from Massive Text Corpora
We have demonstrated a path to quality phrase extraction from massive text corpora using distant supervision from existing knowledge bases (e.g., Wikipedia) (Shang et al., 2018). This data-driven method unifies multiple statistical signals using ensemble learning techniques, but requires scanning over the entire corpus every time a new document is added. Also, this method is effective only when the phrases are frequent. Our goal in this project is to develop a novel method that more easily incorporates new documents and infrequent, emerging phrases. Recent advances in neural language models, as well as statistical methods that detect trends of phrases, will be employed.
The student would learn how to use parallel computing and train deep networks for this task, as well as develop new deep learning models for text processing.
Advisors. Niema Moshiri and Tajana Rosing
Title. Accelerating long sequencing in hardware with applications to healthcare
This project will focus on accelerating long sequencing in hardware. This is very important when trying to identify genetic and other diseases, such as COVID-19. Our prior work in accelerating COVID-19 molecular surveillance has already had immense public health impact, such as showing early detection of new SARS-CoV-2 variants directly from wastewater (non-invasive, rather than requiring individuals to come in and test). This work will focus on using a new type of brain-inspired machine learning, called hyperdimensional computing, and accelerating it by developing new hardware, so that our system can tolerate many more errors in analysis while still delivering correct results multiple orders of magnitude faster.
Advisors. Xiaofan Yu and Tajana Rosing
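The error tolerance mentioned above can be illustrated with a small software sketch of hyperdimensional computing. It is not the project's hardware design: the dimensionality, the bipolar (+1/-1) encoding, the 30% noise level, and the names such as "variant_a" are all assumptions chosen for illustration. Items are stored as random high-dimensional vectors, and a heavily corrupted copy of one item is still matched to the correct entry because unrelated random hypervectors are nearly orthogonal.

```python
import random
from typing import Dict, List

DIM = 10_000  # hypervector dimensionality (assumed for this sketch)


def random_hypervector(rng: random.Random) -> List[int]:
    """Draw a random bipolar hypervector with components in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def similarity(a: List[int], b: List[int]) -> float:
    """Normalized dot product: 1.0 for identical vectors, near 0 for unrelated ones."""
    return sum(x * y for x, y in zip(a, b)) / DIM


def flip_noise(vec: List[int], fraction: float, rng: random.Random) -> List[int]:
    """Return a copy of vec with the given fraction of components sign-flipped."""
    noisy = list(vec)
    for i in rng.sample(range(DIM), int(fraction * DIM)):
        noisy[i] = -noisy[i]
    return noisy


def nearest(query: List[int], memory: Dict[str, List[int]]) -> str:
    """Return the name of the stored hypervector most similar to the query."""
    return max(memory, key=lambda name: similarity(query, memory[name]))


rng = random.Random(0)
# Hypothetical item memory: one random hypervector per stored pattern.
memory = {name: random_hypervector(rng) for name in ("variant_a", "variant_b", "variant_c")}
# Corrupt 30% of the components of one stored pattern, then look it up.
noisy_query = flip_noise(memory["variant_b"], 0.30, rng)
best_match = nearest(noisy_query, memory)
```

The operations involved are element-wise multiplications and sums over long vectors, which is the kind of regular, parallel workload that lends itself to hardware acceleration.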
Title. Federated and Lifelong Learning in Large-Scale Sensor Networks for Environmental Monitoring
With recent advances in lightweight machine learning and hardware support, embedding intelligence into small sensor devices has become the trend for the next generation of Internet of Things. In environmental monitoring, sensory data can be first processed with local learning algorithms instead of directly transmitted to the cloud. In-situ learning enables timely reaction (e.g., schedule adjustment or parameter calibration) and saves energy consumption during unnecessary operation. However, multiple challenges remain at the algorithmic level: (i) the algorithm needs to adapt quickly to the environmental changes while being robust to forgetting, and (ii) distributed algorithms on multiple devices need to learn collaboratively while preserving personal patterns. All computation is subject to resource and energy constraints on sensor devices.
In this project, the student will learn to set up a sensor platform using a Raspberry Pi or Arduino board. The student will gain experience in developing new algorithms for federated and lifelong learning in large-scale sensor networks, with a focus on environmental monitoring.
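As a rough illustration of how devices can learn collaboratively without sending raw data to the cloud, the sketch below implements a simple federated-averaging loop for a two-parameter linear model. It is a toy under stated assumptions, not the project's algorithm: the model, the learning rate, the sensor readings, and the function names are hypothetical, and it does not address lifelong learning, forgetting, or per-device personalization.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # one (x, y) reading collected on a device


def mean_squared_error(weights: Sequence[float], data: List[Sample]) -> float:
    """Average squared error of the model y = w0 + w1 * x on the given data."""
    w0, w1 = weights
    return sum((w0 + w1 * x - y) ** 2 for x, y in data) / len(data)


def local_update(weights: Sequence[float], data: List[Sample],
                 lr: float = 0.1, steps: int = 5) -> List[float]:
    """Run a few gradient-descent steps on one device's private data."""
    w0, w1 = weights
    n = len(data)
    for _ in range(steps):
        g0 = sum((w0 + w1 * x - y) for x, y in data) / n
        g1 = sum((w0 + w1 * x - y) * x for x, y in data) / n
        w0 -= lr * g0
        w1 -= lr * g1
    return [w0, w1]


def federated_average(client_weights: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Combine device models, weighting each by its number of samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical readings on two devices, generated from y = 2x + 1.
device_a = [(0.0, 1.0), (0.5, 2.0)]
device_b = [(1.0, 3.0)]
devices = [device_a, device_b]

global_weights = [0.0, 0.0]
for _ in range(20):  # communication rounds
    updates = [local_update(global_weights, data) for data in devices]
    global_weights = federated_average(updates, [len(data) for data in devices])
```

Only model weights cross the network in each round; the readings themselves stay on the device, which matters when transmission is the dominant energy cost.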
Advisor. Michael Coblenz
Title. Better Programming Languages for More Effective Science
Scientists of all kinds write software to analyze data and model the real world. Climate scientists, in particular, create sophisticated models of the environment to understand and predict the effects of climate change. Typically, they use common languages, such as Python or R, for this analysis. Writing software is hard even for trained software engineers; many of these scientists have their primary expertise in other areas, making software engineering particularly challenging. In this project, I am interested in understanding how to design programming languages that make scientists as effective as possible.
Specific role of the student: The student will observe and interview scientists who do programming work, and then analyze the resulting data to understand how we might design better programming languages for them.
Advisor. Pat Pannuto
Title. Virtual Energy Auditor
This project aims to build an augmented reality (AR) application to “see” energy inefficiencies in the built environment. Imagine walking through a home or building and seeing every appliance, light bulb, and other electricity consumer annotated with its estimated energy use, the potential energy savings if it were replaced with a new high-efficiency device, and the expected payback period based on actual use and local incentive opportunities. The summer project would take our recent work on automatic, camera-based detection of inefficient lighting and integrate it into a phone/tablet AR app. For motivated students, there is further opportunity to explore autonomous labeling of appliances or other salient features of energy use in the built environment.
Advisor. Nuno Vasconcelos, Department: ECE
Title. Iterative dataset collection
While a wide spectrum of automated tools has been created for building deep learning models in the past decade, dataset collection has remained a largely manual process with little systematic effort to account for bias in raw data or human annotations. The goal of this project is to build an iterative framework for dataset collection, annotator teaching, and model training. Under this unified framework, new examples are automatically selected for human annotation, cleaned for label bias, and added to the dataset progressively. Neural network models are trained on each iteration of data, and model explanation techniques are used to create teaching examples that reduce the bias of crowdsourced annotators. The whole framework aims to produce datasets that are optimal for machine learning under multiple objectives, including classification accuracy and fairness. The research is connected to topics like active learning and human-in-the-loop AI systems. The project aims for a top-tier conference publication.
Student Responsibilities: Software development in Python, Linux and at least one popular deep learning framework such as PyTorch. Students will also learn the basics of computer vision and natural language processing.
Advisor. Nuno Vasconcelos, Department: ECE
Title. Customizing radiation cancer treatment with deep learning
Brachytherapy is a treatment in which a radioactive source is used to deliver radiation internally to treat cancers such as cervical cancer. Currently, clinicians manually tune treatment parameters to customize the radiation to individual patients’ anatomy. This process can take over an hour, which is problematic because patients are waiting in discomfort and often under sedation for this to occur. Deep learning can identify anatomical features that relate to ideal, customized radiation treatments by learning from past patient imaging and treatment data. In this project, we will generate new networks and inputs and/or modify existing networks to accurately predict radiation treatment parameters. The end goal is to automate the treatment customization process to ensure high-quality radiation treatments can be produced in a matter of minutes with a single button-click. This project will involve working with a team of medical physicists (including Dr. Sandra Meyers), radiation oncologists and electrical engineers, and is a collaboration between the Vasconcelos and Meyers labs. The project aims for a top-tier conference or journal publication.
Student Responsibilities: Software development in Python, Linux and at least one popular deep learning framework such as PyTorch.
Advisor. Dinesh Bharadia, Department: ECE
Title. Compute and memory efficient SLAM by leveraging RF-signals
Project Description: Localization and mapping for robots is a fundamental requirement to enable downstream applications of autonomous navigation or exploration. Unfortunately, SLAM (Simultaneous Localization and Mapping) is challenging on low-compute hardware such as that found on drones or AR/VR headsets. In recent research, we have observed marked improvements (4x) in compute and memory efficiency when incorporating WiFi and UWB (ultra-wideband) based sensing into the SLAM framework. Additionally, these ideas can be extended further to allow more effective collaborative SLAM for a team of robots. This project will involve: extending the current framework, which relies on a multi-antenna WiFi/UWB receiver on the robot, to a single-antenna receiver; leveraging inter-robot WiFi/UWB measurements to improve mapping and localization in multi-robot collaborative efforts; and large-scale system integration of the current research to incorporate aspects of real-time mapping and navigation.
Student Responsibilities: Strong fundamental knowledge in signal processing, probability, and linear algebra; Experience with Robot Operating System (ROS), hardware/systems integration; Proficiency in Python/C++
Advisor. Dinesh Bharadia, Department: ECE
Title. Smartphone Enabled Ubiquitous Indoor Navigation and Mapping
Localization in GPS-denied environments has been a long-researched problem, falling under the broader field of Simultaneous Localization and Mapping (SLAM). Aided by multi-sensor fusion (cameras, lidars, inertial measurement sensors), computer-vision based feature detection, and graph-based optimization, significant progress has enabled robots to autonomously navigate unfamiliar environments. Unfortunately, these techniques cannot easily be applied with smartphones to localize users due to the lack of sophisticated sensors and compute power. An attractive alternative to vision-based localization is to use WiFi signals. In simple terms, the WiFi signals propagating in the environment create unique signatures that can be used to localize the user. But the current state-of-the-art systems, which have localization errors of up to 50 cm, either require extensive mapping of the environment or perform well only in the absence of multiple propagation paths. The aim of this project is to overcome the above challenges and develop a robust smartphone application to enable the mapping and localization of an environment. Mapping will consist of developing algorithms to generate 2D floor-plans or 3D point-clouds using an RGB-D camera. Localization will involve developing a data-collection pipeline and methods to perform centimeter-level reverse localization of deployed WiFi access points and decimeter-level forward localization of users.
Student Responsibilities: Strong experience in Python programming, basic data processing, and Android/iOS programming. Preferred qualifications are background knowledge in SLAM, wireless communications, or image/signal processing, and a MATLAB/Python background.
Advisor. Dinesh Bharadia, Department: ECE
Title. Ubiquitous Ultra-wideband Based 6 DoF Tracking and Localization
A myriad of IoT applications, ranging from equipment tracking in hospitals, logistics, and construction to tracking within large indoor spaces, demand cm-accurate sensor localization that is robust to blockages from hands, furniture, or other obstacles in the environment. To meet this need, UWB-based localization and tracking has recently become popular. Its popularity is driven by its cm-accurate localization despite occlusions in the environment, an accuracy that arises from its large bandwidth. Despite almost two decades of research on UWB-based localization, few implementations targeting the above applications exist. We find that the high latency of two-way ranging (TWR) and/or the high energy consumption of concurrent ranging algorithms are the major culprits that prevent a system from deploying multiple tags and readers in the environment. In this project we seek to fill the gaps in current UWB technologies and extend them to track all 6 degrees of freedom, as required in VR systems.
Student Responsibilities: Knowledge about digital signal processing techniques, embedded system development, strong PCB design skills and firmware design and programming background
Advisor. Tania Morimoto, Department: MAE
Title. Handheld input interface with stiffness feedback capability
We are interested in designing a handheld input interface that can be used to teleoperate surgical robots and that can convey stiffness information about the robot environment to users via jamming methods. The input interface would ideally be at a handheld scale and still be able to display sufficiently large stiffnesses to users.
Student Responsibilities: Participating student researchers will survey existing technologies and select the appropriate jamming method to be applied in the device design. Students will be involved in the design, fabrication, and validation of the chosen device design.
Advisor. Tania Morimoto, Department: MAE
Title. Achieving fiber jamming using novel actuation methods
Fiber jamming is a method used in soft robotics to modulate the stiffness of actuators and other robotic components. Stiffness modulation allows soft robots to leverage the advantages of their compliance, while also enabling them to increase their stiffness when needed to resist large forces. We are interested in investigating new actuation methods, beyond the conventional use of vacuum loads, to achieve fiber jamming.
Student Responsibilities: Participating students will perform experiments/analyses on the chosen actuation method/s. Students will also be responsible for designing and fabricating fiber jamming structures for potential applications.
Advisor. Tania Morimoto, Department: MAE
Title. Ergonomic design of a flexible joystick capable of stiffness feedback and a study on the effect of additional tactile feedback on stiffness perception
In our lab, we have designed a flexible joystick that can be used to control flexible surgical robots and can provide haptic feedback to users in the form of stiffness information about the remote robot environment. In this project, we are interested in improving the design of the flexible joystick to make it more ergonomic. We are also interested in adding pneumatic pouches that can provide tactile feedback to users, and in investigating their effect on stiffness perception during use of the flexible joystick.
Student Responsibilities: Students will be responsible for the design of the improved flexible joystick. Participating students will also fabricate pneumatic pouches and investigate the effect of their integration into the flexible joystick on the stiffness perception by the users.
Advisor. Nikolay Atanasov, Department: ECE
Title. Python Robotics
This project focuses on implementing baseline robotics algorithms for localization, mapping, planning, and control, and integrating them in the PyBullet (https://pybullet.org/) physics simulator. Specific algorithms that will be considered include occupancy-grid mapping, particle-filter localization, A* motion planning, and proportional-derivative control for an Ackermann-drive robot. The objective is to document the algorithm implementations and provide visualization and accessible demonstrations of the algorithm operation. The project is inspired by the Pacman project (http://ai.berkeley.edu/project_overview.html) and the Python Robotics project (https://github.com/AtsushiSakai/PythonRobotics) and aims to create a 3D robotics version integrated in the PyBullet simulator. To achieve this, the developed algorithms, demos, visualization, and documentation will be provided on a website with a Jupyter (https://jupyter.org/) and Google Colab (https://colab.research.google.com/) interface.
Student Responsibilities: Learn about the required mathematical background in robotics (position, orientation, kinematics, probability distributions, etc.), and study and implement one or several core robotics algorithms for localization, mapping, motion planning, and control in Python. Experience with object-oriented programming, data structures, and algorithms is required. Experience in robotics, e.g., at the level of Probabilistic Robotics by Thrun, Burgard, and Fox, is preferred but not required.
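As a flavor of the kind of baseline algorithm the project describes, here is a minimal sketch of A* motion planning on a small occupancy grid. It is a standalone illustration, not project code and not part of PyBullet: the grid, the 4-connected motion model with unit step cost, and the function name `astar` are assumptions made for this example.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column) index into the occupancy grid


def astar(grid: List[List[int]], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Find a shortest 4-connected path on a grid where 1 marks an occupied cell."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell: Cell) -> int:
        # Manhattan distance never overestimates the cost of 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from: Dict[Cell, Cell] = {}
    cost_so_far: Dict[Cell, int] = {start: 0}

    while frontier:
        _, cost, current = heapq.heappop(frontier)
        if current == goal:
            # Walk the parent links back to the start to recover the path.
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if cost > cost_so_far[current]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue  # outside the map or occupied
            new_cost = cost + 1
            if new_cost < cost_so_far.get((nr, nc), float("inf")):
                cost_so_far[(nr, nc)] = new_cost
                came_from[(nr, nc)] = current
                heapq.heappush(frontier, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)))
    return None  # the goal is unreachable


# Hypothetical map: a wall across the middle row with a single gap at column 2.
occupancy_grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
path = astar(occupancy_grid, start=(0, 0), goal=(2, 0))
```

In a simulator, the returned list of cells would typically be converted to waypoints in world coordinates and handed to a controller such as the proportional-derivative controller mentioned above.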