STEM2SHTEM 2023
In 2023, from June to August, 63 high school students attended the STEM to SHTEM (Science, Humanities, Technology, Engineering and Mathematics) summer program hosted by Prof. Tsachy Weissman and the Stanford Compression Forum. During the program, the students pursued fun research projects in various domains under the supervision of 34 mentors. A YouTube playlist and the entire collection of the students’ reports can be found below.
- By: Adit Chintamaneni, Rini Khandelwal, Kayla Le, Sitara Mitragotri, Jessica Kang
- Mentors: Lara Arikan, Tsachy Weissman
Abstract
Data compression is a crucial technique for reducing the storage and transmission costs of data. As the amount of data that is consumed and produced continues to expand, it is essential to explore more efficient compression methodologies. The concept of semantics offers an interesting new approach to compression, enabled by recently developed technology. In short, we sought to discover whether the most important features of an image could be compressed into text, and whether this text could be reconstructed by a decompressor into a new image with a high level of semantic closeness to the original image. The dataset of compressed images is composed of five common image categories: single person, group of people, single object, group of objects, and landscape. Each image was compressed through the following pipeline: image-to-text conversion, text compression and file size determination, file decompression and text recovery, and text-to-image conversion. This pipeline enables any image to be compressed into a few dozen bytes. When examining image-to-text compressors, we experimented with both human and artificial intelligence (AI) powered procedures. We selected the text-to-image model DALL-E 2 as our decompressor. We released multiple surveys to assess structural fidelity and semantic closeness between original images and reconstructed images. We also included compressed JPEGs and WebPs to benchmark performance. Human and AI reconstructions received lower structural fidelity scores than WebP and JPEG images. Individually, images reconstructed from human captions were perceived to have higher structural fidelity and semantic closeness to the original images than images reconstructed from AI captions. Participants' textual descriptions of both human and AI reconstructions had high semantic fidelity scores relative to their descriptions of the original images. This demonstrates that the proposed pipeline is a viable semantic compression mechanism.
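The middle steps of the pipeline above (text compression and file size determination, then file decompression and text recovery) can be illustrated with a minimal Python sketch. It assumes the caption is compressed with zlib from the standard library; the abstract does not name the text compressor that was used, and the caption string and function names here are hypothetical.

```python
import zlib

def compress_caption(caption: str) -> bytes:
    """Encode the caption as UTF-8 and compress it (zlib is an assumption)."""
    return zlib.compress(caption.encode("utf-8"), 9)

def decompress_caption(blob: bytes) -> str:
    """Recover the exact caption text from the compressed bytes."""
    return zlib.decompress(blob).decode("utf-8")

# Hypothetical caption standing in for the output of the image-to-text step.
caption = "A golden retriever sitting on a sandy beach at sunset"

blob = compress_caption(caption)
size_bytes = len(blob)          # file size determination
recovered = decompress_caption(blob)  # text recovery, input to text-to-image

print(f"compressed size: {size_bytes} bytes")
print(f"lossless round trip: {recovered == caption}")
```

The text stage is lossless, so any loss in the pipeline comes from the image-to-text and text-to-image steps. For very short captions, a general-purpose compressor's fixed overhead can outweigh its savings, so the compressed size may not be smaller than the raw text.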
- By: Aden Lee, Allan Jiang, Kim-Nga Shenoy, Vihaan Kodeboyina
- Mentors: Junjie Luo, Kepler Boyce
Abstract
With the development of synthetic biology, it is often desirable to create genetic circuits with multiple genetic elements, whether to achieve highly specific and accurate control of living organisms or to construct complex metabolic pathways. Traditional approaches involve docking these genetic elements on different chromosomes, or integrating them at distant loci on the same chromosome, and then recombining them. Because the traditional genetic approaches are constrained by the fundamental laws of genetics, the turnover time increases linearly with the number of genetic elements in the circuit, and the cost of maintaining all the genetic elements increases dramatically with that number.
The Schnitzer Lab developed a recombination tool that can recombine two transgenes at the same docking site. This approach greatly accelerates the construction of intricate gene circuits and allows for the synthesis of biological strains with numerous genetic elements, leading to the efficient attainment of complex functionalities.
Based on the newest version of the Super Recombination system, SuRe 3.0, which uses 3 orthogonal adaptor pairs that suffice to recombine any number of genes, we created a computational program that finds the quickest process to recombine multiple genetic elements. The turnover time for the recombination is proportional to the logarithm of the number of transgenes to be recombined. Our application first assesses whether the genes can be recombined. If recombination is possible, it determines the quickest recombination tree by finding the shortest path. The application allows researchers in biology to design an optimized recombination process automatically.
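The logarithmic turnover time can be illustrated with a simplified Python sketch. It assumes that any two constructs can be recombined in a single round and that all pairs in a round proceed in parallel; it ignores the adaptor-pair compatibility rules of SuRe 3.0, which the abstract does not detail, and it is not the authors' shortest-path program. The function name `plan_rounds` is hypothetical.

```python
import math

def plan_rounds(elements):
    """Plan pairwise recombination rounds for a list of genetic elements.

    Returns a list of rounds; each round is a list of (left, right) pairs
    that are recombined in parallel. Assumes any two constructs can be
    paired (a simplification).
    """
    current = [(e,) for e in elements]  # each construct starts with one element
    rounds = []
    while len(current) > 1:
        pairs, merged = [], []
        for i in range(0, len(current) - 1, 2):
            pairs.append((current[i], current[i + 1]))
            merged.append(current[i] + current[i + 1])
        if len(current) % 2 == 1:
            merged.append(current[-1])  # unpaired construct waits a round
        rounds.append(pairs)
        current = merged
    return rounds

# Hypothetical transgene labels.
genes = ["A", "B", "C", "D", "E"]
plan = plan_rounds(genes)
for number, pairs in enumerate(plan, start=1):
    print(f"round {number}: {pairs}")
print(f"rounds needed: {len(plan)}, ceil(log2(n)): {math.ceil(math.log2(len(genes)))}")
```

Under these assumptions the number of rounds is the ceiling of log2(n), because each round roughly halves the number of separate constructs. A sequential approach that adds one element per round would instead need n - 1 rounds.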
Behavior Cloning (BC) of Human Policy via Logged Data (PDF, 63.6 KB)
- By: Aashna Kumar, Evelyn Jin, Hooriya Faisal, Samuel Sosa, Tyler Paik
- Mentors: Zhengyuan Zhou, Junyao Chen, Dailin Ji, Ni Yan, Ethan Cao
Abstract
A human decision policy can be learned by machine learning (ML) models using logged data. Our research aims to train a convolutional neural network (CNN) that can predict the next action of a user given the current game state in the Snake game. Predicting the user's next action in this way is called behavior cloning. We collected the logged data both manually and with heuristics that replicate high-scoring rounds. The collected data serves as our dataset, consisting of input-output pairs representing the game state and the corresponding actions taken by the human players. After training, our CNN reached an accuracy of 93% on the testing dataset.
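The structure of behavior cloning from logged state-action pairs can be shown with a small Python sketch. To stay self-contained, it swaps the CNN for a tabular majority-vote policy, and it assumes a toy state encoding (the direction of the food relative to the snake's head) rather than the full game state the project used. All names and data below are hypothetical.

```python
from collections import Counter, defaultdict

def fit_policy(logged_pairs):
    """Learn a state -> action table by majority vote over logged pairs.

    This tabular policy stands in for the CNN described in the abstract.
    """
    counts = defaultdict(Counter)
    for state, action in logged_pairs:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}

def predict(policy, state, default="UP"):
    """Return the cloned action, or a default for states never logged."""
    return policy.get(state, default)

def accuracy(policy, pairs):
    """Fraction of held-out pairs whose logged action is predicted."""
    correct = sum(predict(policy, s) == a for s, a in pairs)
    return correct / len(pairs)

# Toy state: (dx, dy) sign of the food position relative to the head.
train = [
    ((1, 0), "RIGHT"), ((1, 0), "RIGHT"), ((1, 0), "UP"),
    ((0, 1), "UP"), ((-1, 0), "LEFT"),
]
test = [((1, 0), "RIGHT"), ((0, 1), "UP"), ((0, -1), "DOWN")]

policy = fit_policy(train)
acc = accuracy(policy, test)
print(f"held-out accuracy: {acc:.2f}")
```

The same train-then-evaluate loop applies when the lookup table is replaced by a neural network: the logged pairs are the supervised dataset, and accuracy on held-out pairs measures how closely the model imitates the player.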
- By: Alys Jimenez Peñarrieta, Davyn Paringkoan, Nyali Latz-Torres, Yasmeen Galal, Karen Zhang
- Mentor: Suyeon Choi
Abstract
Augmented Reality (AR) and Virtual Reality (VR) have emerged as transformative perspective tools for medical surgeries. These technologies have the potential to enhance surgical precision, drastically improve patient outcomes, and revolutionize medical training. Furthermore, they can alter the way medical education is approached. However, AR/VR assisted surgeries raise critical policy, accessibility, and privacy concerns, given the information necessary about surroundings and the potential inequities of VR. This research paper provides a comprehensive review of existing literature. The results demonstrate how the mechanisms behind VR improve healthcare.
In addition to our literature review, our team programmed an educational brain anatomy simulation for elementary and middle school students. Educational VR programs could be an effective way of teaching as they are more engaging than traditional teaching mediums and they help students to visualize concepts, which is likely to lead to an improvement in learning retention. We used the program Unity to construct a 3D model of the brain. The different sections of the brain were labeled and color coordinated. When a student clicked on a label, it would take them to a screen with more information regarding what that section of the brain does. In addition, we made a PDF document with the same information, but with 2-D visuals.
We distributed our VR product to a test group, and our PDF document to a separate group. Both groups consisted of 20 elementary school children who are going into the same grade and attend the same summer camp. After each group was given an hour and a half to read the PDF or explore the program, they were given a short test on the information presented. The results demonstrate that VR programs can be an effective tool to teach anatomy and medical concepts.
On the Detection and Prediction of Seizures using EEG (PDF, 328.65 KB)
- By: Fatima Ansari and Aren Wang
- Mentor: Joanna Sands
Abstract
Seizures are abrupt, rapid bursts of electrical activity within the brain. Those with epilepsy, a central nervous system disorder, suffer repeated seizures that appear to occur randomly and without warning. Frequent seizures may cause physical injury or even death. A device that can quickly detect and respond to the onset of a seizure may lessen these risks. The most commonly used instrument to detect such an event is an electroencephalogram (EEG), which is noninvasive and consists of graphs from multiple channels. These graphs reveal the brain’s electrical activity. EEG can be used to distinguish different seizure types and epilepsy types (focal or generalized, idiopathic or symptomatic, or a symptom of a larger epilepsy syndrome), and thus to inform the choice of antiepileptic treatment and the prediction of prognosis.