STEM2SHTEM 2024
In 2024, from June to August, 63 high school students attended the STEM to SHTEM (Science, Humanities, Technology, Engineering and Mathematics) summer program hosted by Prof. Tsachy Weissman and the Stanford Compression Forum. During the program, the students pursued fun research projects in various domains under the supervision of 16 mentors. A YouTube playlist and the entire collection of the students’ reports can be found below.
STEM2SHTEM 2024 Playlist
- By: Alim A. Oraz, Deeksha Ravi, Hind Essalhi, Md. Shafin Jubayer, Puttipong (Nai) Kong-In
- Mentor: Rocky An
Abstract
Neglected tropical diseases (NTDs) such as schistosomiasis impact over one billion people worldwide, primarily in low- and middle-income countries, as highlighted by the World Health Organization. Despite their widespread prevalence and severe health consequences, these diseases remain poorly understood and significantly underfunded. This research seeks to address the gap in funding and treatment by developing sustainable solutions that integrate environmental and ecological considerations, especially in regions with high co-infection rates. We examine two strategies: mass drug administration (MDA) and sustainable vector control through vegetation removal. While MDA campaigns have proven cost-effective in preventing NTDs, their true costs and benefits extend beyond immediate drug expenses. Through cost-benefit analyses and life cycle sustainability assessments, this study evaluates the long-term effectiveness of these campaigns. Additionally, we examine the scalability and environmental sustainability of a targeted vector control strategy, vegetation removal, to reduce the prevalence of vector-borne NTDs like schistosomiasis. The outcomes of this research include an optimized strategy for reducing disease prevalence, improving long-term health outcomes, and promoting socio-economic development in affected regions, ultimately advancing global health equity.
Computational Drug Optimization: Using Machine Learning to Inhibit EZH2 (PDF, 2.89 MB)
- By: Adarsh Khullar, April Surac, Ivy Wang, Merab Miller
- Mentors: David Candes, Juan Almanza
Abstract
Enhancer of zeste homolog 2 (EZH2), a histone-lysine N-methyltransferase enzyme encoded by the EZH2 gene, has recently become a key target in drug discovery due to its carcinogenic properties. EZH2 primarily functions as a gene silencer through its role in transcriptional repression. A part of the polycomb repressive complex 2 (PRC2), it also plays a key role in stem cell pluripotency and cell differentiation. When mutated or overexpressed, however, EZH2 has been linked to the excessive inhibition of tumor suppressor genes, resulting in the growth of various cancers. Additionally, dysregulation of EZH2 is tied to accelerated cell proliferation as well as prolonged cell survival, both of which are telltale biomarkers of cancer development.
In this bioinformatics project, we develop and evaluate machine learning models to predict the activity of potential EZH2 inhibitors. We aim to create a framework for computational drug discovery and novel cancer therapies. Using a curated bioactivity dataset from the ChEMBL database, we preprocess our data and perform exploratory analysis based on Lipinski’s Rule of Five. We then remove low-variance features and split the data into training and testing sets to train our models. Ultimately, we employ various regression algorithms to predict pIC50 values, an indicator of inhibitory potency, and assess model performance through metrics such as R-squared and RMSE, visualized with scatter plots and bar charts.
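The quantities named in this abstract can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, the Rule of Five check uses the commonly cited thresholds with at most one violation allowed, and the IC50 input is assumed to be in nanomolar. The regression models themselves are omitted; only the target transformation and the two evaluation metrics are shown.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nm * 1e-9)


def passes_rule_of_five(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """Check Lipinski's Rule of Five using the commonly cited thresholds."""
    violations = sum([
        mol_weight > 500,   # molecular weight above 500 Da
        logp > 5,           # octanol-water partition coefficient above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ])
    return violations <= max_violations


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

A higher pIC50 means a more potent inhibitor, which is why models are usually trained on pIC50 rather than on raw IC50 values that span several orders of magnitude.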
Optimizing Large Language Models: Learning from Mistakes in Gameplay (PDF, 834.8 KB)
- By: Federica D’Alvano Kirakidis, Lily Gao, Aaron George, Alex Huang, Niv Levy
- Mentors: Prof. Benjamin Van Roy, Yifan Zhu, Henry Widjaja
Abstract
In recent years, there has been a surge in research and public interest in Large Language Models (LLMs), which have demonstrated remarkable potential across diverse applications and domains. This paper provides a comprehensive survey of the applications of LLMs, particularly focusing on their roles and capabilities within multi-agent systems (MAS). We utilized Gemini 1.5 Flash by Google to introduce a benchmark for evaluating LLM learning based on mistakes in previous data. Our findings reveal significant variations in LLM performance across different prompt engineering strategies, enhancing our understanding of their strategic thinking in relation to learning through game data. Additionally, we explore the complexities of extending LLM-based self-supervised learning to MAS, emphasizing coordination and communication among agents. By identifying underexplored areas and promising research directions, this survey lays the groundwork for innovative research at the intersection of LLMs, game logic, and MAS, advancing toward Artificial General Intelligence (AGI).
Simulating Evolutionary Processes Using Genetic Algorithms and Variable Constraints (PDF, 1.05 MB)
- By: Ho Lok Cheung, Annum Hashmi, Devaki Rawal, Lundi Moyo
- Mentor: Samuel Do
Abstract
Our project uses the evolution of artificial life in a virtual environment to simulate natural evolution. By observing naturally evolved organisms, we investigated the impacts of environmental conditions and evolutionary processes under changing constraints.
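As a rough illustration of the general technique, a genetic algorithm with an environment-dependent fitness function might look like the sketch below. Everything here is an assumption for illustration, not the project's simulation: the real-valued genome, the `make_fitness` helper, the `target` parameter standing in for an environmental constraint, and all default settings are hypothetical.

```python
import random


def make_fitness(target):
    """Hypothetical environment: genomes score higher the closer each gene is to `target`."""
    def fitness(genome):
        return -sum((gene - target) ** 2 for gene in genome)
    return fitness


def evolve(fitness, genome_length=8, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Run a simple genetic algorithm; return the best genome and the best fitness per generation."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(genome_length)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # Selection: rank by fitness and keep the top half unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            # Crossover: splice two distinct parents at a random cut point.
            parent_a, parent_b = rng.sample(survivors, 2)
            cut = rng.randrange(1, genome_length)
            child = parent_a[:cut] + parent_b[cut:]
            # Mutation: occasionally perturb a gene with Gaussian noise.
            child = [gene + rng.gauss(0, 0.1) if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    best_history.append(fitness(population[0]))
    return population[0], best_history
```

Calling `evolve(make_fitness(0.5))` and then `evolve(make_fitness(0.9))` is one way to compare how the same process adapts under different constraints. Because the surviving half is carried over unmodified, the best fitness never decreases from one generation to the next.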