I'm Jeremy, a second-year Ph.D. candidate at MIT, lucky to be working with Manya Ghobadi on systems for machine learning. Broadly, I'm interested in computer systems, particularly operating systems, storage systems, and ML infrastructure. I'm often exploring how AI can help solve problems in these areas, whether on its own or working alongside humans.
I spent my undergraduate years at Columbia University, where I worked with Asaf Cidon on applying Linux eBPF to OS abstractions that don't align with hardware trends. Alongside research, I spent much of my time as a teaching assistant for introductory and advanced systems courses.
publications
- Checkmate: Zero-overhead Model Checkpointing via In-Network Gradient Replication
  NSDI '26
  pdf · code
- cache_ext: Customizing the Page Cache with eBPF
  SOSP '25
  pdf · code
preprints
- PeeR: First-Class Scheduling for Latency Critical eBPF Applications
  Under Submission
- BPF-oF: Storage Function Pushdown Over the Network
  Under Submission
  pdf
teaching
- COMS 4118 Operating Systems
- COMS 4157 Advanced Systems Programming
- COMS 3157 Advanced Programming
- COMS 4995 From Algorithm to Development
- COMS 1004 Intro Java Labs
Checkmate enables per-iteration checkpointing in DNN training with no training slowdown. In data-parallel training, gradients already flow through the network, so Checkmate multicasts them to a shadow cluster that maintains a live copy of the model. Training never has to pause to write state to storage.
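As a toy illustration of that idea (not Checkmate's actual implementation), the sketch below feeds the same averaged gradient to a trainer and to a shadow copy each iteration. The function names, the plain SGD update, and the choice to replicate the averaged gradient are all assumptions made for the example.

```python
import random

def sgd_step(weights, grad, lr=0.1):
    """Apply one plain SGD update and return the new weights."""
    return [w - lr * g for w, g in zip(weights, grad)]

def average(grads):
    """Element-wise mean of per-worker gradients (stands in for all-reduce)."""
    n = len(grads)
    return [sum(col) / n for col in zip(*grads)]

def train_with_shadow(steps=5, workers=4, dim=3, seed=0):
    rng = random.Random(seed)
    trainer = [0.0] * dim  # weights held by the training cluster
    shadow = [0.0] * dim   # live copy held by the (hypothetical) shadow cluster
    for _ in range(steps):
        # Each data-parallel worker produces a local gradient.
        local = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(workers)]
        reduced = average(local)
        # "Multicast": the same gradient reaches both sides, so the shadow
        # applies the identical update without the trainer writing any state.
        trainer = sgd_step(trainer, reduced)
        shadow = sgd_step(shadow, list(reduced))
    return trainer, shadow

trainer, shadow = train_with_shadow()
```

Because both sides apply the same update to the same starting weights, the shadow copy matches the trainer after every iteration, which is what makes it usable as a checkpoint.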
cache_ext is an eBPF framework that lets developers customize the Linux page cache's eviction policy without modifying the kernel. Different applications can plug in their own policies tailored to their workloads, while the page cache still shares memory across processes and keeps policies from interfering with each other.
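A rough userspace analogy of a pluggable eviction policy, assuming a made-up interface (`on_access`, `on_evict`, `pick_victim`); this is not the cache_ext API, and real policies run as eBPF programs inside the kernel.

```python
from collections import OrderedDict

class LRUPolicy:
    """Evicts the least recently used page."""
    def __init__(self):
        self.order = OrderedDict()  # oldest access first, newest last

    def on_access(self, key):
        self.order[key] = True
        self.order.move_to_end(key)

    def on_evict(self, key):
        self.order.pop(key, None)

    def pick_victim(self):
        return next(iter(self.order))

class MRUPolicy(LRUPolicy):
    """Evicts the most recently used page, which can suit scan-heavy workloads."""
    def pick_victim(self):
        return next(reversed(self.order))

class Cache:
    """Fixed-capacity cache that delegates eviction decisions to a policy."""
    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy
        self.pages = {}

    def get(self, key, load):
        if key not in self.pages:
            if len(self.pages) >= self.capacity:
                victim = self.policy.pick_victim()
                del self.pages[victim]
                self.policy.on_evict(victim)
            self.pages[key] = load(key)
        self.policy.on_access(key)
        return self.pages[key]

def resident_after(policy, accesses, capacity=2):
    """Replay an access sequence and return the keys left in the cache."""
    cache = Cache(capacity, policy)
    for key in accesses:
        cache.get(key, load=lambda k: f"data-{k}")
    return sorted(cache.pages)
```

Replaying the accesses `a, b, c` with capacity 2 leaves `b` and `c` resident under `LRUPolicy` but `a` and `c` under `MRUPolicy`: the same cache behaves differently depending on the policy plugged in.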
PeeR makes eBPF programs preemptable and schedulable. Today, eBPF programs run to completion in softirq context, invisible to the scheduler, which creates isolation problems. PeeR adds cooperative preemption at helper calls and integrates with sched_ext for first-class scheduling.
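Cooperative preemption at fixed points can be mimicked in userspace with generators. In this sketch, each `yield` stands in for a helper call where a program may be descheduled, and a simple round-robin loop stands in for the scheduler. None of this reflects PeeR's real mechanism; the names are invented for the example.

```python
from collections import deque

def program(name, helper_calls, log):
    """A toy 'program' that yields control at every helper-call boundary."""
    for i in range(helper_calls):
        log.append((name, i))  # work done up to this helper call
        yield                  # preemption point: the scheduler may switch here

def run_round_robin(programs):
    """Run programs one helper call at a time until all have finished."""
    queue = deque(programs)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run until the next preemption point
            queue.append(current)  # not finished: requeue behind the others
        except StopIteration:
            pass                   # finished: drop from the queue

log = []
run_round_robin([program("A", 2, log), program("B", 2, log)])
```

With two programs of two helper calls each, the log interleaves as A, B, A, B rather than running A to completion before B starts.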
BPF-oF is a remote-storage pushdown protocol built on NVMe-oF. Instead of bouncing dependent I/O requests back and forth between client and storage server, applications push custom eBPF functions to execute directly on the remote server.
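To make the round-trip savings concrete, here is a small simulation that counts network round trips for a pointer-chasing lookup, once driven from the client and once pushed down to the server. `StorageServer`, `pushdown`, and the block layout are hypothetical; the real protocol ships eBPF functions over NVMe-oF rather than Python callables.

```python
class StorageServer:
    """Toy remote store that counts network round trips."""
    def __init__(self, blocks):
        self.blocks = blocks
        self.round_trips = 0

    def read(self, block_id):
        """Client-driven read: one round trip per block."""
        self.round_trips += 1
        return self.blocks[block_id]

    def pushdown(self, fn, start):
        """Run fn on the server against local blocks: one round trip total."""
        self.round_trips += 1
        return fn(self.blocks.__getitem__, start)

def lookup(read_block, block_id):
    """Follow 'next' pointers until reaching a block that holds a value."""
    block = read_block(block_id)
    while "value" not in block:
        block = read_block(block["next"])
    return block["value"]

blocks = {0: {"next": 1}, 1: {"next": 2}, 2: {"value": "hit"}}

client_side = StorageServer(blocks)
client_result = lookup(client_side.read, 0)       # each hop crosses the network

pushed = StorageServer(blocks)
pushed_result = pushed.pushdown(lookup, 0)        # hops stay on the server
```

Both calls return the same value, but the client-driven lookup takes three round trips for this three-block chain while the pushed-down version takes one.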
COMS 4118 Operating Systems: Columbia's foundational operating systems course. Covers process scheduling, virtual memory, synchronization, and file systems, with assignments that involve hacking the Linux kernel.
COMS 4157 Advanced Systems Programming: A course focused on how foundational systems tools are built. Covers the implementation of git, containers, debuggers, and linkers.
COMS 3157 Advanced Programming: Columbia's introductory systems course, serving 300–400 students per semester. Covers C, UNIX, make, git, sockets, and HTTP, building from UNIX basics to networked applications.
COMS 4995 From Algorithm to Development: A competitive programming class implementing algorithms in C, C++, Java, and Python.
COMS 1004 Intro Java Labs: Guided small groups of students through introductory Java assignments, focusing on basic OOP design.