Posts

Showing posts from July, 2025

20. A Sketch of Helpfulness Theory With Equivocal Principals

You may rightly wonder what else I’m up to, if it’s not just blogging and the occasional adventure. The short answer is that I spend most of my time and mental energy taking part in a summer research fellowship in AI safety called PIBBSS, where my research direction has to do with questions like “how does an agent help a principal, even if the agent doesn’t know what the principal wants?” and “how much harder is it for the principal to teach the agent what they want, and how much worse are the outcomes, if the principal doesn’t know for sure what they want?”. The long(ish) answer is this post. I’ll avoid jargon, mathematical notation, and recommendations to read other long papers, but some amount of that will be unavoidable. Here’s the basic setup - called a “Markov decision process”, sometimes with adjectives like “decentralized” or “partially observable” - which forms the ground assumption for the entire subfield my research direction lies in, called “inverse reinforcement learning”:...

19. Notes On Hyperbolic Blue Paint

(Read post 3, “Secret Colors, Impossible Colors”, first, or this will probably not make as much sense.) You know how sometimes you set out to do something, psych yourself up to put in a lot of effort figuring out precisely how to achieve some desired goal or effect, and then the very first thing you try approximately works? Yeah, that’s about what happened when I set out to mix my own hyperbolic blue paint. It turned out to be such a straightforward effect to achieve under fairly permissive lighting conditions that I’ve shown it off to easily dozens of people, and easy enough to make that I led a small group in making some in person at a small event in March 2025. For the sake of posterity, I’ll provide a short recipe here. You will need the following:
- Synthetic ultramarine pigment (~$1/10 g)
- “Blue Lit” phosphorescent pigment, by Stuart Semple (~$20/50 g)
- A paint base, like linseed oil or acrylic base
- (Optional) Kaolin powder
- A small scale
- A mixing vessel, anything from...

18. Positive Feedback is More Efficient Than Negative Feedback - A Geometric Approach

(Epistemic status: a hastily-written recreation of something I remember reading once and can’t find anymore; if you can find it let me know. The original post had some good images. Probably best read after post 11, “Why the First “High Dimension” is Six or Maybe Five”.) Positive reinforcement is just plain more efficient and effective than negative reinforcement, if both are feasible, and I can pretty straightforwardly present a model that argues strongly for it. Consider the following setup: we’re trying to get some tiny simple agent to navigate to a goal area within some simple space. No overly complex obstacles, no particular hazards, just a tiny simple agent capable of approaching or avoiding marked areas and a goal with a strong but extremely short-range attractiveness. We have negative feedback markers, which the agent will strive to avoid, and positive feedback markers, which the agent will try to approach; we can model these both as having some infinite-range repulsive or attra...

17. Schelling Points, Flagpoles, Drip-Trays

Communities exist. This much we can hopefully agree on. Some communities arise naturally, while others are founded for a purpose - anything from a desire for companionship to the need for more than just a few people to participate in some activity to the assembly of a lever to move the world with. One type of community is what I’d term a “flagpole”: a community that can be either intentional or accidental, but is always about something in particular: some cause or trait or practice. It’s generally the only one of its kind in its catchment basin, whatever that basin might be - walking distance, easy driving distance, or even nested within some nebulously defined internet subculture; if it isn’t the only one, there sure aren’t that many of them. It must also make itself prominent, often explicitly advertising to a large potential audience. After all, if a community isn’t growing, it’s shrinking - so goes the aphorism. This makes the flagpole a natural place for people who wave its metaphor...

16. Maybe Big Someday, Definitely Good Now

The world finds itself in peril, on the brink of self-immolating calamity in any of a handful of ways. Bit by bit, geopolitics has been turned into a multipolar tinderbox; the drive for profit at all costs and for ruthless corporate growth is an ever-hungry furnace, with the continued impoverishment of billions; the climate warms, and forests burst into flames as the next zoonotic plague looms. And this is to say nothing of the neglected funeral pyres of nuclear nonproliferation, poor distribution of food and medical goods, and any of half a dozen simmering genocides at some or other great power’s behest, to name just a few ongoing failures. With all these cause areas screaming for resources with which to fight the flames, what are we to do, wishing to do the most good as efficiently as possible, if we hope to douse the rising flames our global civilization sleeps fitfully among? The prospect of near-term AGI has only stoked the flames of our perdition, and threatens to flash into a sudde...

15. “Too Stupid to Work” is Too Stupid to Work

(You might want to refer briefly back to post 4, “Seven-ish Words from My Thought-Language”, for the concept of [vanilla-obvious]ness.) I used to find myself repeatedly running into the problem of neglecting to do something of real value because I thought it might be too obvious or stupid to do. I still sometimes do, mind, I just used to have this problem way worse. The problem is in the heuristic that something can just plain be too stupid to work, which is, itself, much too stupid to work and in fact often fails badly. To that end, let me quickly describe a few things that I think solidly fall into the reference class of “things that feel too dumb to work, but totally work”. First off, write things down if you want to remember them much later on. This can be anything - conversations you have with people, notes to yourself, shower thoughts, events that have happened that you were a part of. If you want to remember anything more than broad strokes in a year’s time, write explicit notes...

14. On Microtonal Music (Extra Bits)

Here are two extra bits to talk about that were out of scope for my previous post about microtonal music, but which I still think are valuable to read. First, some music I recommend, if you want to try listening to microtonal/xenharmonic/untwelvish music. If you like guitar-driven rock, Brendan Byrnes is a surprisingly prolific musician who’s done a lot with sharp-fifth tunings like 22edo and 27edo; I recommend starting with the albums “Realism” and “Holocene Dream”. (“Neutral Paradise” is one of my favorites, and “Micropangaea” is his older and most varied album.) For a better-known option, King Gizzard and the Lizard Wizard has done plenty of music in 24edo, mostly emulating Anatolian, traditional Arabic, and Jewish scales; “Flying Microtonal Banana” is the starring example among albums, and the two-part album “K. G.”/“L. W.” presents another excellent example. For jazz, go looking for the sweetened thirds and flat fifths of 31edo; Hear Between the Lines has published some excellent ...

13. Platonism and Wilderness Guides

Consider the following: math is a part of nature. That is - even were there no humans in the universe, indeed no life at all, mathematics would still exist; every theorem an undergrad from our world would ever encounter in class would still hold, in this lifeless universe. Consider also that mathematical beauty is real, though in a more limited sense: perhaps uniquely human aesthetics are a key part of what it means for a piece of math to be beautiful, and another species that grew up beneath the light of a different star, under different evolutionary conditions, would find our love of elegant proofs which cast light on aspects of reality and our delight in surprising second-order consequences of natural-seeming assumptions every bit as alien as we might find their sense of beauty in math. Nonetheless, it then stands to reason that mathematical beauty is just as much a kind of natural beauty as the vastness of the Grand Canyon, or the power and splendor of Victoria Falls, or the ...

12. The Space of Olfaction is δ-Hyperbolic

(Epistemic status: barely even half-baked - but unique, intriguingly plausible, and anyway no one has any better ideas.) Vision, hearing, the numerous aspects of touch, taste, and smell: of these, smell - or olfaction - is by far the worst-understood, even if we try to tease out the role that olfaction plays in flavor, separating it from the gustation and chemoception that strict-sense taste encompasses. As Convergent Research puts it, “We can’t yet replicate animal olfaction synthetically as a sensing and classification modality. We currently lack a comprehensive model explaining how biological systems decode and classify chemical signals through olfaction. Understanding this process is critical for applications ranging from flavor science to disease diagnostics to understanding and harnessing animal communication.” This past weekend, I briefly attended a “gap mapping” research hackathon organized by YJK; my thanks both to him and to DK who invited me. While I couldn’t hope to build a...

11. Why the First “High Dimension” is Six or Maybe Five

(Epistemic status: morally correct, in the mathematician’s sense; here to give flavor and intuition without too much rigor.) People toss around the concept of “high-dimensional space” a lot without having a good idea of where that starts. Interpretability researchers staring at vectors in hundreds of dimensions are clearly contemplating a high-dimensional space - loss landscapes packed full of saddle points and spheres where any given pair of vectors is almost certainly nearly orthogonal - but do they know what it means to be a high-dimensional space? Surely we can do better for ourselves than Justice Stewart’s infamous maxim - “I know it when I see it”! What do we care about, when we want to tell how many orthogonal axes we need in a space to start calling it high-dimensional? I claim that we should care about looking at unit-radius disks - like, literal filled-in two-dimensional circles - and in particular, checking when, whether, and how pairs of those unit-radius disks overlap when...

10. Fermi Estimates

The world is vast and complicated, and precise measurement of its systems can be hard, expensive, time-consuming, imprecise, or any or all of the four. Nonetheless, we sometimes find ourselves in need of - or even just itching with curiosity for - quick cheap estimates of magnitude or prevalence of some phenomenon that lives in the world. These can be as trivial as the number of piano tuners in New York City or as practical as the number of potential customers walking by a storefront on any given day or even as weighty as the explosive yield of a nuclear weapon. Lacking hard data, we cannot hope for precise answers, but might we work some magic to come up with a decent guess? The answer is yes, and it’s less of a miracle than you might think. A secret, a mystery: all knowledge is one, for the systems of the world are deeply and redundantly enmeshed with each other; out there in the world there is some true object or phenomenon, permanently inaccessible directly to the brain, but nevert...

9. On Investment Casting, Part 2

When we left off, we had sprued the positive form into place and assembled the flask in preparation for the investment - the surrounding with mold material - that gives the method its name. Mix the plaster according to recipes that can be found elsewhere, taking care to weigh out the plaster mix and the water precisely, both in the correct ratio and in amounts suited to the flask volume; mix it thoroughly, and then use a vacuum chamber to degas it as well as possible. Fill the flask thoroughly, avoiding reintroducing air into the wet plaster, leaving voids, or damaging the positive or its attachment to the sprues. Gently but firmly bang the table to ensure that as many trapped air bubbles as possible separate from the positive and the sprues and float to the top of the flask, and fill with leftover plaster if necessary. At this point you will need to wait 2 to 4 days for the plaster to cure preliminarily, until it is firm but damp to the touch, and can wrap the plaster to prevent further moi...

8. On Investment Casting, Part 1

(This one turned out very very long. Turns out carefully describing a workflow for a general audience takes a lot of words! It will thus appear in two parts.) Investment casting is a manufacturing method for shaping metal into a desired form with a high degree of precision. In brief, the method involves starting with a replica of some desired form made out of a sculptable and relatively flammable material like wax or PLA, putting the form in a metal tube and surrounding it with a highly heat-resistant but brittle material that captures form well like plaster or water glass, and putting the metal tube in a furnace to completely destroy the flammable material so as to leave behind negative space. Then, one pours molten metal into a space left in the top of the tube, waits for the metal to cool somewhat and solidify, and finally plunges the whole tube into water to finish the cooling process and destroy the hard outer layer to recover the cast metal. Lost-wax casting is a particularly lon...