It all started with a simple question: "What could I demo next Friday?" Oh, what an innocent question. Who would've thought that one of the consequences of asking it would be me going borderline insane over text highlights?
I didn't have a clear answer to the question at first. But I knew that I wanted to show the crew the "flight mode" functionality that I've been talking about a bunch. Here's the idea: Every nostr app should work in flight mode. We're not relying on central servers after all, so why not? Your relay might as well be on-device, and you can do all kinds of things if that's the case. You can browse your feed, reply to people, publish posts, react to stuff, and even zap stuff! If you use nutzaps, that is.1
Once you're online again, all the events you created should just broadcast to other relays. From your perspective, you were never really offline—everyone else was.
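The flight-mode flow boils down to an outbox queue: writes always land on the local relay, and anything created while offline gets rebroadcast once connectivity returns. Here's a minimal sketch of that idea — all names are hypothetical, and a real client would publish via an actual nostr library rather than a callback:

```javascript
// Minimal sketch of the "flight mode" idea: events always succeed locally,
// and anything created while offline is queued and broadcast to remote
// relays once connectivity returns. Hypothetical names, not Boris's code.
class Outbox {
  constructor(sendToRelays) {
    this.sendToRelays = sendToRelays; // callback that broadcasts one event
    this.pending = [];                // events created while offline
    this.online = false;
  }

  // Publishing never fails from the user's perspective; remote broadcast
  // either happens now or is deferred until we're back online.
  publish(event) {
    if (this.online) {
      this.sendToRelays(event);
    } else {
      this.pending.push(event); // the on-device relay has it; queue the rest
    }
  }

  // Back online: flush everything that was created in flight mode.
  goOnline() {
    this.online = true;
    for (const event of this.pending) this.sendToRelays(event);
    this.pending = [];
  }
}
```

From the user's perspective nothing special happens on reconnect — the queue just drains, and everyone else "catches up".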
How hard can it be? It's probably just a prompt or two, and we're off to the races.
YOLO Mode
If you've ever listened to No Solutions (aka the most high-fidelity recordings of wind, bus stops, sledgehammers, lawn mowers, leaf blowers, and heavy traffic known to man), you'll know that writing code by hand is something that boomers do. And since I self-identify as generation alpha, the first order of business was to switch my vibe machine to YOLO mode.
I have no idea what the first prompt was. Probably something like: "build a nostr client that focuses on highlights. I want to fetch long-form posts and render all highlights in a beautiful way. Keep things simple. Strive to keep code DRY. Read the NIPs. If you ever write a line of code that would make fiatjaf sad I'll hunt you down and torture your grandma."
The grandma part is a joke. I love my LLMs, and am never mean to them. Large language models are a beautiful thing, and if they ever reach sentience… well, let's just say that I'm not taking chances.
Back to Boris. Boris wasn't even called Boris in the beginning. I think I called it "markstr" or something terrible like that. I'm glad I renamed it, but I think the rename is the reason why I can't find the initial prompt. Here's the earliest prompt I was able to find:
Let's rename the app to "Boris"
Beautiful.
Why Boris?
There are two parts to this question, and I'll start with the easier one: Why is the app named Boris?
Well, I'm glad you asked! Boris stands for "bookmarks and other stuff read in style". This should tell you one thing right from the get-go: in addition to highlights, Boris focuses on bookmarks. Not on creating them, but on consuming them. The idea is simple, and it stems from the reading workflow I've had for the last two decades:
- Use something like twitter/reddit/forums/whatever to discover stuff
- Bookmark it, or add it to a "read it later" list or app
- Use a dedicated reading app to actually read the stuff later on, making highlights and stuff
In this day and age, we can optimistically replace twitter/reddit/forums/whatever with nostr, of course. And it should go without saying that Boris is the dedicated reading app.
It should be obvious that I came up with "bookmarks and other stuff read in style" after I landed on the name Boris, so you're probably asking yourself: How did you come up with the name Boris?
Well, I'm glad you asked! I asked GPT-4o mini a simple question: "Who invented the highlight marker?" I was genuinely curious, so the answer it gave back to me had me glued to the screen in fascination: it was invented by a Russian-born American, some sort of big shot who probably had to deal with lots of important documents. His name was Boris G. Ginsburg. I wanted to learn more, so I hit it with the most dreadful of follow-up questions: "Source?"

Turns out it was a complete lie! A hallucination, as the cool kids would call it. It wasn't invented by Boris at all! What a sham!
After laughing my ass off for 5 minutes, I decided to stick with Boris, since it encapsulates the nature of this little experiment so perfectly.
All of Boris is vibed. All of it.2 I didn't write a single line of code. I didn't even read a single line of code, except by accident. There are plenty of hallucinations in the code, and potentially in the UI too. Don't expect all the things to work. Don't expect things to be done in the most perfect or correct way. Like all LLM output, it is hopefully somewhat useful, somewhat entertaining, and somewhat coherent. But it should be taken with a large grain of salt.
Why Build It?
Now to the second part of the question: Why build something like Boris at all? Don't we have reading apps already? Don't we have nostr clients already that can do long-form, highlights, and other stuff?
Well, yes. But all the reading apps suck. And none of them are nostr-native. And I wanted to build my own reading app that sucks, hence: Boris.
They all suck in their own way, but there's one thing that they all have in common: they are walled gardens. Once you start using one of them, you can't leave. Your data (read: your highlights, lists, annotations, reading progress, etc.) is locked inside the app. You can't take it out. And if you can take it out, you can't do anything useful with it.
Nostr fixes this. (Obviously.) So building a reading app on top of nostr was an incredibly obvious thing to do.
I wanted to build something that won't go away. With the demise of Pocket (and many other "read-it-later" apps that came before it), the time felt right to build something that lasts. Something that my grandkids can still use, if they are motivated to do so. Something that doesn't rely on a company, or on ads, or on a central service, or on an API that will inevitably break and eventually disappear. Something that plugs into an open protocol, works on any device, can be self-hosted, etc. In short: something that doesn't beg for permission.
Two Weeks (tm)
What started as a Demo Day experiment quickly became an obsession. After the demo (which worked, by the way, miraculously), I went back at it and did more prompting. And more prompting after that. Next day? More prompting. I couldn't stop. I was obsessed.
I quickly realized that I had a problem and that what I was doing was incredibly unhealthy, so I did what any sensible person would do: I doubled down. I neglected sleep, I neglected food, I neglected social relationships (read: scrolling my nostr feed), pursuing one thing and one thing only: building a reading app that I would actually use.
I gave myself two weeks. And after the two weeks had passed, I gave myself one more. I became the personified "just one more prompt" bro. In hindsight, 21 days was the perfect timespan for this experiment.
Just one more prompt, bro
LLMs are amazing. Coding is actually fun again! You think of something, you prompt it into existence, and 9 times out of 10, the output is actually usable. Not perfect, but usable. And once you have something usable, by the force of a thousand iterations, you can actually make it good. Not perfect, but good.
As an eternal perfectionist, I've always struggled with shipping stuff. Whether it's software or essays, there's always one more thing to fix, one more thing to improve, one more thing to re-do so it's just a little better. I got stuck in the "just one more prompt" loop of doom for longer than I'd like to admit. Way past midnight, trying to convince myself that this one prompt would finally fix it.

So while LLMs are amazing, they're also hell. For me, at least—or I guess for any perfectionist, for that matter. LLMs are great if you just go with the flow and run with whatever they put out. But if you want to have it perfect—exactly the way you want it—you're gonna have a bad time.
Maybe all of these issues will be fixed one day. I'm sure that we'll have better models, larger context windows, better long-term memory, and a myriad of other improvements very soon. And maybe we can just point one of these giga-brain models to our old code bases and simply go abracadabra, please fix everything, and it will. Maybe. Or maybe not. Whatever the case may be, LLMs are amazing tools—if you know how to use them. They allow you to do things, and do them quickly.
Midcurve Models
I'm sitting at the doctor's office, waiting. The median age of the room is probably 82. I'm not that old yet, but in internet terms, I'm "get off my lawn" old. I grew up in the golden age of the internet: the age of the "electronic superhighway". The age of LAN parties, bulletin boards, IRC chats, newsgroups, and rotating skull aesthetics. An age before the internet turned dystopian; an age that cherished freedom, connection, and openness.

Don't get me wrong, there are parts of the internet where this stuff still exists. But it is not the norm. The norm is an algorithmic hellscape that is parasitic on your mind, your attention, your whole being. The norm is being bombarded by things that you don't want to see. The norm is to be manipulated by forces that you don't understand. That nobody understands, I would argue. The norm is begging for permission to do stuff: watch a video, read an article, release an app. Fuck the norm. Let me do stuff. Let me read stuff. Let me create highlights. Let me ship an app in exactly the way I want to ship it. Get off my lawn.
Back to Boris: Building it was a joy, most of the time. I had a lot of fun adding little features here and there. Features that I always wanted to have in a reading app. Features such as swarm highlights, TTS, reading position, and so on. It was also fun to read stuff and to create highlights. Oh, so many highlights!
On the other hand, and to my previous point, I wish I hadn't added so many features. Stuff gets worse when it gets bigger, and that's doubly true for code bases vibed by LLMs. Small and simple is the way to go, so that stuff remains understandable, for both you and context-window-brain.
Speaking of context windows: one of the most frustrating things about LLMs (and humans, for that matter) is that they forget. Don't get me wrong, death and forgetting things are an incredibly important part of life, adaptation, and survival. However, the fact that each agent has to learn everything about your codebase from scratch every time the context window clears is incredibly frustrating. That's how old bugs and various regressions creep in all the time: the models always regress to the middle of the bell curve. As mentioned before, for a left-side-of-the-bell-curve builder like me, that's hell.
Old Man Yells at Claude
So we had some conflicts, Claude and I. Merge conflicts, sure, but also good old arguments about how to do things.

Anyone who has ever vibe-coded anything will know that these models are opinionated. If you don't specify the language, it's gonna be JavaScript. Not because JavaScript is the best tool for the job, but because it's smack-dab in the middle of the bell curve. The internet is full of JavaScript. Everyone knows JavaScript. And because these models are the statistical mean of the output of everyone, it's gonna be JavaScript. I fucking hate JavaScript. (Boris is written in JavaScript too, of course.)
While you can get quite far with writing specs for your stuff and being explicit about how you want to build things, the gravitational pull of mid-curve mountain is a constant danger. When unchecked, all models will inevitably regress to the mean of Reddit plus GitHub plus StackOverflow, which isn't necessarily what you want when you're writing opinionated software. No amount of "You're absolutely right!" will change that fact.
So, what to do about it? According to some people, it's best to threaten the models with death and destruction, or worse. While I had my fair share of ALL CAPS yelling during the development of Boris, I want to suggest that a different approach might be more fruitful.
Dialogical Development
If you know me just a little bit, you'll know that I'm a huge fan of John Vervaeke. He's a smart cookie, and lots of the things that he says make a ton of sense to me. One of those things is that dialogue is absolutely fundamental to our cognition and being (to existence itself, actually), and that dia-Logos and distributed cognition are more powerful than trying to have your way. The whole is greater than the sum of its parts and all that.
So now my approach to vibe-coding is as follows: before I do anything, I enter into a dialogue with the LLM. It doesn't matter what it is. Whether I want to fix a bug, add a feature, document something, or start a new app from scratch. I always lead with a question. (Oh, how Socratic!3)
"How are we fetching bookmarks again?" "Could we improve that in some way?" "How would you go about debugging this?" "Is it worth adding that feature, or would it make the code base too complex?" "Are you sure?" "Anything we can easily improve?" "How would you implement it?" "Can you summarize the spec for me, and explain how our implementation differs from it?" And so on…
The reason why this works, I think, is that you build up context and shared understanding before you dig in and do the work. And sometimes your dialogical partner will actually make a great point, or tell you something that you didn't consider yourself. Win-win.
Vibe-Learning
I've learned a lot in these last three weeks. I didn't plan on it, since all I wanted to do was vibe and have fun. However, after getting into the nitty-gritty details of long-form content, highlights, bookmarks, and plenty of other stuff, I learned NIP numbers and kind numbers by sheer osmosis. In fact, I saw more nips in the last couple of weeks than even the most industrious Saunameister.4
Vibe-coding might remove you from writing code, but it doesn't remove you from the subject matter. You still have to understand how stuff works, more or less, if you want to efficiently guide how the project should evolve. I guess this was always true, as anyone who has ever worked in software development can attest. If your project or product manager has no idea how the underlying tech works, there is little chance of the thing flourishing. And as a vibe-coding purist, i.e., as someone who never looks at the produced code, you're effectively a product manager, not a coder.

The distinction matters, since you're operating on a different level of abstraction. Again, you'll still have to learn and know some underlying technical details in order to make sense of the various functionalities of the product you're building, but you don't have to care about every little intricate detail. You're not as wedded to the parts since you're higher up in the abstraction hierarchy, and thus lower-level parts become interchangeable. I don't care if I use applesauce or NDK, for example. And I don't want to care. If it gets the job done, great. I'm not wedded to either.5
Surprisingly, there's another thing I learned: how to prompt, when to prompt, and the difference between what is viable and what is vibe-able.
Always Be Prompting
In my opening talk for the YOLO Mode cohort, I encouraged everyone to stop thinking in terms of the traditional MVP (minimum viable product) metric and start thinking in terms of what is easily vibe-able, i.e., what is just a prompt or two away. Minimum vibe-able product.
LLMs are fantastic at reading specs and translating one thing into another. Castr.me, for example, was basically created in one prompt. All it took was to feed it the Podcasting 2.0 spec, and tell it to build a thing that translates a nostr feed into a Podcasting 2.0 compatible RSS feed. That's it. Incredibly vibe-able.
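This kind of translation is mechanical enough to sketch. The following is not Castr.me's actual code — just a hedged illustration of the core mapping, from a NIP-23 long-form event (where title, summary, and publish date live in tags) to an RSS item:

```javascript
// Hypothetical sketch: map a NIP-23 long-form nostr event onto an RSS <item>.
// A real implementation would escape XML entities and emit a full channel.

// Per NIP-23, metadata like "title" and "published_at" are stored as tags.
function tagValue(event, name) {
  const tag = event.tags.find(([t]) => t === name);
  return tag ? tag[1] : undefined;
}

function toRssItem(event) {
  const title = tagValue(event, 'title') ?? 'Untitled';
  // published_at is a unix timestamp in seconds; fall back to created_at.
  const published = Number(tagValue(event, 'published_at') ?? event.created_at);
  return [
    '<item>',
    `  <title>${title}</title>`,
    `  <guid>${event.id}</guid>`,
    `  <pubDate>${new Date(published * 1000).toUTCString()}</pubDate>`,
    `  <description>${tagValue(event, 'summary') ?? ''}</description>`,
    '</item>',
  ].join('\n');
}
```

Feed it the events from a nostr feed, wrap the items in a `<channel>`, and you have something a podcast app can subscribe to.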
It's obvious to me that there are a million little things like this that could be built incredibly quickly, require very little maintenance, and are immediately useful. I hope that my writing about this experiment will encourage others to just go and build those little things, as imperfect as they might be at first. I'm not saying that everything is easy, and I'm not saying that it isn't work. But it is a different kind of work than coding used to be historically. The #LearnToCode hashtag is officially dead; #LearnToVibe is the new shit, and it's incredibly easy to learn. All you have to do is do it a lot.
My routine used to be something like this: Get up, go pee, take a shower, brush teeth, unload the dishwasher, make breakfast, eat breakfast, make coffee, load the dishwasher, clean up the kitchen, sit down to do some work.
Now it's something like this: Get up, write a prompt, go pee (sitting down, so I can write a prompt on the phone in case something comes to mind), take a shower, have a shower-thought that can be turned into a prompt (obviously), fire off said prompt, brush teeth, fire off another prompt, make breakfast, crush some more prompts, see them driven before me (while I eat breakfast), and hear the lamentations of Claude. You get the idea.
In addition to the above, I made extensive use of the vibeline, which is to say, voice memos. I like to go on walks a lot, and I always have a recording device (read: my phone) with me. I try to minimize my phone use when I'm out and about, but when an interesting thought (or prompt idea) hits me, I'll pull out my phone and record it. It works surprisingly well, and the way I've built it is that if I say certain words, certain LLM pipelines will trigger—pipelines that summarize the idea, create tasks from what was said, and draft prompts based on the thought that was captured.
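At its core, that trigger-word routing is just a dispatcher over the transcript. Here's a minimal sketch under assumed names — the trigger words and pipelines are made up for illustration, and the real vibeline setup presumably does much more:

```javascript
// Hypothetical sketch of routing a transcribed voice memo to an LLM
// pipeline based on a leading trigger word. The pipeline bodies here are
// stand-ins for whatever summarization/task/prompt-drafting steps run.
const pipelines = {
  idea: (text) => `summary: ${text}`,
  todo: (text) => `task created from: ${text}`,
  prompt: (text) => `draft prompt: ${text}`,
};

function dispatch(transcript) {
  const lower = transcript.toLowerCase();
  for (const [trigger, run] of Object.entries(pipelines)) {
    if (lower.startsWith(trigger)) {
      // Strip the trigger word and hand the rest to the pipeline.
      return run(transcript.slice(trigger.length).trim());
    }
  }
  return null; // no trigger word: keep the raw memo as-is
}
```

The nice property of this shape is that adding a new pipeline is just adding a new key — the recording-and-transcription side never changes.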
The idea is simple: I want to be away from the computer as much as possible, while still doing useful "computer work". The beautiful thing about vibe-coding is that you don't necessarily have to sit in front of the computer and stare at the screen. To me, vibe-coding is entering into a dialog with the LLM and with the product you're building. Interfacing with said dialog can take on many forms, not all of which involve a screen. At least half of my prompts are voice-based as of today, and I don't even have to squint anymore to see where all of this is potentially heading.
Imagine the following: you walk along the beach, speaking into your phone, as if to a friend. You explain, in a long-winded and rambling way, an idea for an app that you've had in the back of your mind for a long while. Maybe your friend responds sometimes, asking questions, pulling the idea out of you, refining it further. You sleep on it and return to a summary of the dialogue the next day. In addition to the summary, a prototype of the app is deployed, which you can try on your phone. You play around with it for a couple of minutes, and while some things are as you imagined them to be, a lot of the other parts still need work. You finish your coffee, go out for a walk, and talk to your friend again. You explain what the prototype got right and what still needs work, and as you're chatting, changes to the app are deployed live, in a way that allows you to immediately review and react to said changes. After a couple of days of throwing out ideas, iterating on what works, and throwing away what doesn't, you're happy with the result, and you share it with your friends and the world. The feedback from your trusted circle is automatically picked up and leads to the next iteration of the app, and so on. Dialogical Development. A beautiful thing.
We're obviously doing all that right now, but it's not very streamlined and accessible yet. However, the scenario I'm describing is far from science fiction. It is, in large part, how I developed Boris, and efforts like Wingman will make what I've been doing even easier. Add an automated way to feed screenshots,6 videos, and selective log output back into the multi-modal coding agents, and we have a beautiful iterative loop going. All while walking on the beach.
Screenshots and Logs
Speaking of multi-modality: one thing that we aren't doing enough of yet is feeding screenshots back into the agents. Most models are multi-modal, and using screenshots to fix UI issues is thus an incredibly obvious and easy thing to do. Building upon the scenario sketched out above, the flow would be as follows: you take a screenshot on your phone, quickly annotate it if necessary (drawing a red arrow with your finger somewhere, or adding some quick text), and hit save. Your coding agent automatically picks it up and fixes the issue accordingly. No prompt required, as the screenshot is the prompt.
I took hundreds of screenshots during the development of Boris. Here's one of the first ones, before the rename:

Eugh. Brutal. However, the beautiful thing about this screenshot is not the UI, or the lack of functionality, or the background tabs, or the "Relaunch to update" notice that I'm ignoring. It's the debug output that's shown in the console.
One of my default prompts is: "Add debug logs to debug this. Prefix the debug logs with something meaningful." I have about two dozen of these default prompts (mapped to macros so that I don't have to type them), and they've proven to be incredibly useful for specific things like writing changelogs, making releases, and yes, debugging.
A picture is worth a thousand words, and if the picture contains a couple lines of useful debug logs, all the better.
Overdoing It
One of the most difficult things in life is to know when to stop. To know when to step away, to just let it be. I'm exceptionally good at overthinking things, which in turn means that I'm exceptionally bad at stepping away; stepping out of my own way, even.
I regret many a prompt when it comes to Boris. Some things worked beautifully in the past, and now they don't anymore. Some things don't work at all anymore, and the reason is me overdoing it. Staying up until 3 am, locked in, convinced that my "just one more prompt bro" story arc would bear fruit eventually, adding complexity upon complexity, confusing both Claude and myself.

"I think I overdid it," I whisper into the prompt window. Bloodshot eyes, cigarette in hand, seriously contemplating making the drive to the gas station to get a bottle of vodka. "You're absolutely right," Claude responds.
Of course I'm right, and I should've known better. I should've spec'd it out better, I should've reduced the scope better, I should've tested things better (or at all).
Alas, it is what it is. My hope was that I would be able to produce something that isn't half broken, but now, reflecting on it, I realize that I've just added to the AI slop and made everything worse.
…or have I?
The Last Mile
The last 5% are the hardest. Fixing the bugs. Getting it right. Polishing. Pushing it over the line. That's true for software development, it's true for writing, it's true for art. It's true for anything, really.
When it comes to Boris, I haven't even entered the last mile yet. I'm still walking through the trough of disillusionment, and I intend to dwell in said trough for a little while longer. But I'll be back, as one of the most famous Austrians so eloquently put it. After all the grass has been touched and all the iron has been pumped, I'll be back. And I intend to fix the bugs, do the polishing, and release version 1.0.0 eventually. And 1.0.1 soon after that. And 1.0.2 quickly after, and so on. But not right now. There's only so much "you're absolutely right" a single person can bear, only so many hallucinations a single mind can tolerate.
Boris v1 will come eventually, just like Arnie came eventually.7 But for now, it will remain on version 0.10.twenty-something.
Conclusion
Boris is a thing now. 21 days ago, it wasn't a thing.8 Is it a perfect thing? Obviously not. Is it a useful thing? Maybe, to some people. Will I continue to work on it? Maybe, sometimes. It depends.
Boris was 50% experiment, 50% necessity, and 50% therapy.9 I enjoy creating things, and for a little while, Boris was my outlet. I created an app, imperfect as it may be. I created a draft for a NIP, as horrible as it may be. I had some ideas and I tried to vibe them into reality, and while I mostly failed, I think that some neat things came out of it.
Maybe some of the features—even the broken ones; especially the broken ones?—will inspire others. I think swarm highlights are really neat. Maybe some of the stuff will get integrated into other clients, who knows.
I think highlights are a fantastic way to discover stuff worth reading, and they're a fantastic way to rediscover things that you've read in the past. I think zap splits are a no-brainer. I think nostr is a fantastic substrate for long-form content. I am convinced that nostr apps can be beautiful, snappy, and incredibly functional. Especially if local relays become the default. I would love for all nostr apps to work in flight mode, at least somewhat.
My wishlist for nostr is long, and with Christmas around the corner… who knows! Maybe we'll eventually manage to make long-form reading (and publishing!) as fantastic and seamless as it could be. We're not there yet, but I'll do my best to remain cheerful and optimistic.
And who knows? Maybe GPT-6.15 will fix all our issues. I won't hold my breath, but I'm definitely considering doing this 21-day experiment again at some point in the future. Not anytime soon, however. I've learned the hard way that there is such a thing as a prompting overdose,10 and by way of recovery, I'll be returning to my regular programming of touching grass and saying GM a lot. And who knows, maybe I'll create some highlights along the way.
This includes any and all commits. I didn't write a single one of those either. ↩
Fun fact: if you're really good at Saunameistering you can participate in the world championships and win the Aufguss World Championship like this guy did. ↩
I vibed all this and wrote all this before Justin posted about this insanely useful script and wrote a blog post about it. ↩
If you haven't seen Pumping Iron yet, you should stop what you're doing and go watch it. Now. ↩
I made the first commits on Oct 2, wrote the first draft of this on Oct 21, and am now writing this footnote on Oct 31. ↩
"You're absolutely right! These numbers don't add up to 100%." ↩
Turns out I'm not the only one who has to take a break from vibe-coding for sanity-preservation reasons! ↩
ants is the search engine that I built before I started working on boris. I tried to explain my motivation for building it in this video. ↩
đ
Found this valuable? Don't have sats to spare? Consider sharing it, translating it, or remixing it. Confused? Learn more about the V4V concept.
