ClusterPilot is live, and a kernel got 25x faster


Hey friends,

Big fortnight. A tool I had been putting off building for years shipped to PyPI, and a GPU kernel I thought was fine turned out to be reading mostly zeros.

What I’ve Been Up To

ClusterPilot is out. I built it because I was genuinely tired of the loop: write a SLURM script, rsync files, refresh squeue, wait, repeat. If you work on HPC clusters, you know the loop. It is a terminal app: you describe your job in plain language, and it generates a cluster-aware SLURM script using AI, handles upload, submission, and monitoring, and syncs results back when the job finishes. Logs update in real time over SSH. Optional push notifications mean you can close the laptop and actually walk away. It is MIT-licensed, free to self-host with your own API key, and pip install clusterpilot works. I have also set up a Fly.io instance with a capped key if you want to try it without any setup. The conda-forge package is in review.

On the PhD side: my Monte Carlo kernel was faithfully multiplying all 216 entries of a coupling matrix per spin flip, for a cubic lattice where exactly 6 of those entries are nonzero. The kernel did not know that. And the larger the system, the worse it gets. Swapping in a precomputed sparse neighbour list cut the work to just the 6 values that actually matter: 25.8x speedup, bit-identical outputs under deterministic RNG, and a run that used to take 7 hours now finishes in 16 minutes. The dense path stays as a fallback for long-range systems. Obvious in retrospect. Most things are.
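The change is easy to sketch. Here is a minimal toy version in NumPy, not my actual kernel: the lattice size, the J=1 coupling, and all the names are illustrative, but the shape of the fix is the same. On a 6x6x6 lattice the dense row has exactly 216 entries, of which 6 are nonzero.

```python
import numpy as np

L = 6            # lattice side; N = L**3 = 216 sites, matching the dense row length
N = L ** 3
rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=N)

def site(x, y, z):
    """Flatten 3D coordinates (with periodic boundaries) to a site index."""
    return ((x % L) * L + (y % L)) * L + (z % L)

# Precomputed neighbour list: the 6 nearest neighbours of each site,
# built once before the Monte Carlo loop starts.
neighbours = np.empty((N, 6), dtype=np.int64)
for x in range(L):
    for y in range(L):
        for z in range(L):
            i = site(x, y, z)
            neighbours[i] = [site(x + 1, y, z), site(x - 1, y, z),
                             site(x, y + 1, z), site(x, y - 1, z),
                             site(x, y, z + 1), site(x, y, z - 1)]

# Dense coupling matrix: 1.0 on neighbour entries, zero everywhere else.
J = np.zeros((N, N))
for i in range(N):
    J[i, neighbours[i]] = 1.0

i = 17  # candidate spin to flip

# Dense path: reads all 216 entries of row i, most of them zeros.
field_dense = J[i] @ spins

# Sparse path: reads exactly the 6 values that matter.
field_sparse = spins[neighbours[i]].sum()

assert field_dense == field_sparse
```

Both paths give the same local field, so accept/reject decisions, and therefore trajectories under a fixed RNG, are unchanged; the sparse path just skips the 210 zero multiplications. The 25.8x number came from the real kernel, not this sketch.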

Also rebuilt juliafrank.net. Claude Sonnet 4.6 handled the mockup and design decisions; I did the CSS and WordPress work by hand. That split turned out to be more satisfying than fully automating it. Having someone walk you through the reasoning means you actually understand what you built. Custom CSS stops feeling intimidating once you are the one making the choices.

Worth Reading

cuTile.jl Brings NVIDIA CUDA Tile-Based Programming to Julia
Tim Besard and Keno Fischer, the people behind CUDA.jl, have wrapped CUDA 13.1’s tile-based kernel abstraction for Julia. The idea is that managing individual threads and memory hierarchies by hand is a level of detail most kernels should not require. I had not come across cuTile.jl before and I am genuinely curious whether it would simplify the kind of kernel work I have been doing. Worth a look if you write CUDA kernels in Julia.

Chipmunk: GPU Kernel Optimisations, Part III
A detailed walkthrough of why naive sparsity does not automatically win on GPUs, and what you have to do to your data layout for it to pay off. The authors get a 9.3x speedup by rethinking how sparse data is packed in memory. I found this after writing up the neighbour list result above and it felt uncomfortably relevant. If the 25x number made you curious about the underlying mechanism, this is the explanation.

How I use Obsidian for academic work
I use Obsidian for my PhD vault and content work, and I am always curious how other people set theirs up, especially when they are researchers rather than productivity enthusiasts. This one is by a PhD researcher in AI: three plugins, typed links, a downloadable template, and a system that has been working for five years. No methodology, no philosophy. Just someone in a similar field showing their actual setup.

Quick Thought

I fed several months of Claude chats about my PhD into Obsidian, then passed the notes into NotebookLM to make flashcards. The flashcards surfaced gaps I had not noticed. Conversations in chats feel productive in the moment, but recall does not let you fake it the same way: when you have to retrieve something from memory, you find out quickly what actually stuck. It has also, somewhat accidentally, been useful for shaping the PhD proposal I need to present in a few weeks.


Until next time, Julia


You’re getting this because you signed up at juliafrank.net. If this is not your thing any more, you can unsubscribe below.

3-20 Brandt Street, Unit 124, Steinbach, MB R5G 1Y2
Unsubscribe · Preferences

