Voice assistant helping elderly users navigate iOS through
real-time screen understanding and spatial audio guidance.
Achieved 92% task completion rate using vision-language models and
personalized interaction memory.
Voice AI for Non-Technical Seniors
Elderly users often struggle with smartphone interfaces: small
text, complex navigation, changing app layouts, and features
they've never seen before. Assistify is an iOS voice assistant
that helps seniors use their phones through natural
conversation. Users describe what they want to accomplish ("I
want to text my daughter"), and the system provides step-by-step
audio guidance while understanding the current screen
context—eliminating the gap between "I need help" and "here's
what to tap next."
Real-Time Screen Understanding and Guidance
The system uses vision-language models (Gemini API) to analyze
screenshots in real time, identifying UI elements, reading
on-screen text, and mapping the layout. When a user asks for
help, Assistify generates contextualized instructions: "Tap the
green button in the bottom right corner" or "Scroll down and
look for the Settings icon." This requires understanding both
spatial relationships (where things are) and semantic meaning
(what they do)—bridging computer vision and language
understanding.
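The mapping from element positions to spoken directions can be sketched in a few lines. This is an illustrative Python stand-in, not the production iOS code: it assumes the vision-language model returns detected elements with normalized bounding-box centers (0 to 1, origin at the top left), and the helper names `describe_position` and `guidance` are hypothetical.

```python
def describe_position(x: float, y: float) -> str:
    """Map a normalized screen coordinate (0..1, top-left origin)
    to a spoken region like 'the bottom right corner'."""
    vert = "top" if y < 1 / 3 else ("bottom" if y > 2 / 3 else "middle")
    horiz = "left" if x < 1 / 3 else ("right" if x > 2 / 3 else "center")
    if (vert, horiz) == ("middle", "center"):
        return "the center of the screen"
    if horiz == "center":
        return f"the {vert} of the screen"
    if vert == "middle":
        return f"the {horiz} side of the screen"
    return f"the {vert} {horiz} corner"

def guidance(element: dict) -> str:
    """Turn one detected UI element (label, optional color, center
    coordinates) into a spoken instruction."""
    where = describe_position(element["cx"], element["cy"])
    color = f"{element['color']} " if element.get("color") else ""
    return f"Tap the {color}{element['label']} button in {where}."
```

For example, a green "Send" button centered at (0.9, 0.92) yields "Tap the green Send button in the bottom right corner." The semantic half of the problem (knowing the button sends a message) is what the vision-language model contributes; the spatial half reduces to mappings like this.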
Spatial Audio for Always-Available Assistance
Rather than requiring users to open the app explicitly,
Assistify uses spatial audio to create distinct voice positions:
the assistant speaks from "left," the user from "center." This
lets users talk to the assistant anytime—even while using other
apps—because their voice is directionally distinct from other
audio. Combined with background screen recording, the system
maintains context across the entire phone experience, not just
when the app is in the foreground.
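Placing a voice at a fixed direction comes down to a pan law. On iOS this would be handled by the system's spatial audio APIs, but the underlying idea can be sketched in Python: constant-power panning splits a mono signal into left/right gains that keep perceived loudness steady as the source moves. The function names here are illustrative.

```python
import math

def pan_gains(azimuth: float) -> tuple[float, float]:
    """Constant-power stereo pan.

    azimuth in [-1, 1]: -1 is hard left, 0 is center, +1 is hard right.
    Returns (left_gain, right_gain) with left^2 + right^2 == 1,
    so total power stays constant across positions.
    """
    theta = (azimuth + 1) * math.pi / 4  # map [-1, 1] -> [0, pi/2]
    return math.cos(theta), math.sin(theta)

def to_stereo(samples: list[float], azimuth: float) -> list[tuple[float, float]]:
    """Render mono samples as stereo frames at the given position."""
    left, right = pan_gains(azimuth)
    return [(s * left, s * right) for s in samples]
```

Pinning the assistant's voice at a leftward azimuth while other audio stays centered is what makes it directionally distinct even while another app is playing sound.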
RAG-Powered Personalization Memory
The system stores user preferences, common tasks, and
interaction history in a pgvector database, enabling
personalized responses over time. If a user frequently texts
their daughter, Assistify learns that context and can say "Would
you like to text Sarah?" instead of a generic "Would you like to
send a message?" This personalization layer, implemented via RAG
(retrieval-augmented generation), pulls relevant memories into
each conversation, making interactions feel natural and
contextual rather than robotic.
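The retrieval step reduces to a nearest-neighbor search over embedded memories. In production, pgvector performs this in SQL (e.g., ordering by a distance operator over an embedding column); the in-memory class below is a minimal Python stand-in that shows the same logic, with toy embeddings in place of real model output. All names here are illustrative.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class MemoryStore:
    """In-memory stand-in for a pgvector table of (embedding, memory) rows."""
    def __init__(self):
        self.rows: list[tuple[list[float], str]] = []

    def add(self, embedding: list[float], text: str) -> None:
        self.rows.append((embedding, text))

    def top_k(self, query: list[float], k: int = 3) -> list[str]:
        """Return the k stored memories most similar to the query."""
        ranked = sorted(self.rows, key=lambda r: cosine(query, r[0]), reverse=True)
        return [text for _, text in ranked[:k]]

def build_prompt(request: str, request_emb: list[float], store: MemoryStore) -> str:
    """Prepend retrieved memories to the user's request before generation."""
    context = "\n".join(f"- {m}" for m in store.top_k(request_emb, k=2))
    return f"Known about this user:\n{context}\n\nUser request: {request}"
```

With a memory like "frequently texts her daughter Sarah" retrieved for a "send a message" request, the generation step can personalize its suggestion rather than fall back to a generic prompt.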
Achieving 92% Task Completion in User Testing
We tested Assistify with elderly participants across common
tasks: sending texts, making calls, finding apps, adjusting
settings, and navigating menus. The system achieved 92%
successful task completion; most failures stemmed from ambiguous
user instructions ("open the thing") rather than from system
limitations. The work demonstrated that voice AI paired
with visual understanding could dramatically improve
accessibility—not by simplifying the phone, but by making
complexity navigable through conversation.