State machines - Why and how to use them in web development.

What is a state machine?

I think Wikipedia does a very good job of defining a state machine.

A finite-state machine (FSM) or finite-state automaton (FSA, plural: automata), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number of states at any given time. The FSM can change from one state to another in response to some inputs; the change from one state to another is called a transition. An FSM is defined by a list of its states, its initial state, and the inputs that trigger each transition.

In software development, a state machine is usually represented by some aggregate data structure: an object in an OOP language, or a hash-map in a functional language like Clojure. A state machine can also be saved to your DB as a row in a table.

This object has fields for the current state and the data it needs to do its job. There is also code associated with this object that defines how it transitions between its states.

An example

Let's look at an example: a user trying to change their primary email address, which is also their username.

The states are:

  • initial => This is the starting state in which the machine is initialized.
  • change-in-progress => The user has asked for the email address to be changed.
  • verification-in-progress => We have sent an email to the old email address, asking the user to confirm the change.
  • verification-done => The user has verified the change.
  • validation-in-progress => We have sent an email to the new email address, asking the user to validate that they can receive emails there.
  • validation-done => The user has validated their new email address.
  • changed => The change has been applied.
  • request-cancelled => The request has been cancelled.

Transitions are:

  • initial => change-in-progress. Initiated when the user requests the change via a web form.
  • change-in-progress => verification-in-progress. We have sent the verification email to the current email address.
  • verification-in-progress => verification-done. The user has verified that they intended to make this change by clicking a link sent to their existing email.
  • verification-done => validation-in-progress. We have sent the validation email to the new email address.
  • validation-in-progress => validation-done. The user has validated the new email address by clicking a link in the email sent to it.
  • validation-done => changed. We have made the change in our DBs, and run any other processing required for this change.
  • *any* => request-cancelled. The request was cancelled by either the user or our systems.

You could also add states for verification or validation failures, and for failures of our system to send an email.

The reason to have states like change-in-progress and verification-done is to make sure we only move to the in-progress states after we have sent the email. A failure in our email sending system should not leave the user in a state where they need an email to proceed while our system thinks the email has already been sent.

There are more states that can be added to make this more robust. I've skipped any states that deal with error conditions (validation failure, etc.). For this hypothetical system, we can transition to request-cancelled, but you might want more granular states to record exact points of failure.
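To make the states and transitions above concrete, here is a minimal plain-JavaScript sketch of the email-change machine as a lookup table. The state names come from the lists above; the event names (REQUEST_CHANGE, VERIFICATION_EMAIL_SENT, and so on) and the shape of the request object are assumptions made up for this illustration.

```javascript
// Safe own-property check, so inherited keys like "toString" never match.
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// state -> { event -> next state }. Event names are hypothetical.
const transitions = {
  "initial": { REQUEST_CHANGE: "change-in-progress" },
  "change-in-progress": { VERIFICATION_EMAIL_SENT: "verification-in-progress" },
  "verification-in-progress": { VERIFIED: "verification-done" },
  "verification-done": { VALIDATION_EMAIL_SENT: "validation-in-progress" },
  "validation-in-progress": { VALIDATED: "validation-done" },
  "validation-done": { CHANGE_APPLIED: "changed" },
  "changed": {},
  "request-cancelled": {},
};

// Returns the next state, or throws if the event is not allowed in this state.
function transition(state, event) {
  if (!hasOwn(transitions, state)) {
    throw new Error(`Unknown state: ${state}`);
  }
  // *any* => request-cancelled, as in the transition list above.
  if (event === "CANCEL") {
    return "request-cancelled";
  }
  if (!hasOwn(transitions[state], event)) {
    throw new Error(`Event ${event} is not allowed in state ${state}`);
  }
  return transitions[state][event];
}

// The "row in a table": current state plus the data the machine needs.
let request = { state: "initial", newEmail: "new@example.com" };
request = { ...request, state: transition(request.state, "REQUEST_CHANGE") };
console.log(request.state); // change-in-progress
```

Note that the VERIFICATION_EMAIL_SENT event would only be fired after the email has actually gone out, which is what keeps a failed send from moving the request into verification-in-progress.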

How do we communicate/document state machines?

While we can describe state machines with written descriptions, it's much easier to use state diagrams. These are the standard way of describing a state machine, and are great at communicating how a state machine functions.

What's the point?

Looking at the example above, you may be thinking: what's the point of using a state machine? It seems like we're needlessly adding a layer of complexity to a simple feature that most web applications built today support happily without a state machine.

Here's a secret. All software development is building state machines.

Computers are themselves FSMs, as is all the software we write on top of them. It's just that we don't normally think of the enormous space of possible states; instead we think in terms of values of variables and what they represent in our software.

Thinking explicitly in terms of FSMs for small parts of our software makes those parts easier to reason about. That is why it's useful to model software as an FSM at smaller scales, in critical modules where we must be sure of how the software will react to different inputs.

A practical example

I think this whole state machine business is a lot easier to explain with a code sample. Xstate is a popular JS library that makes it easy to build state machines. Instead of copying the code here, I'll just link it.

Here's a tutorial from the Xstate site that walks you through building an app that displays posts from a subreddit. Notice how the code is simpler to reason about. You're almost breaking the functionality into its constituent pieces: what to do while the posts are loading, what behavior to expose when the posts are loaded, and how to react if loading fails.
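If you'd like to see the shape of that idea without leaving this page, here is a rough plain-JavaScript sketch of the three situations mentioned above. It deliberately does not use Xstate's API; the state names (loading, loaded, failed), the event names, and both function names are assumptions for illustration only.

```javascript
// Given the current machine value and an event, return the next value.
// Events that make no sense in the current state are ignored.
function postsReducer(current, event) {
  switch (current.state) {
    case "loading":
      if (event.type === "RESOLVE") return { state: "loaded", posts: event.posts };
      if (event.type === "REJECT") return { state: "failed", error: event.error };
      return current;
    case "loaded":
      if (event.type === "REFRESH") return { state: "loading" };
      return current;
    case "failed":
      if (event.type === "RETRY") return { state: "loading" };
      return current;
    default:
      throw new Error(`Unknown state: ${current.state}`);
  }
}

// Each state gets exactly one piece of UI behavior.
function render(current) {
  switch (current.state) {
    case "loading":
      return "Loading...";
    case "loaded":
      return current.posts.join("\n");
    case "failed":
      return `Failed to load posts: ${current.error}`;
    default:
      throw new Error(`Unknown state: ${current.state}`);
  }
}

let machine = { state: "loading" };
machine = postsReducer(machine, { type: "REJECT", error: "network down" });
console.log(render(machine)); // Failed to load posts: network down
```

Because the posts only exist in the loaded state and the error only exists in the failed state, there is no way to end up rendering stale posts next to an error message.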

My knowledge management system

This is a follow-up on my previous notes about Zettelkasten: Thoughts on Zettelkasten and the slip box. Since then, I’ve had a chance to read and think more about the problems I listed out with trying to adapt a Zettelkasten-style slip box for my knowledge management system.

I’ve found a few answers and I’ve come up with a new system that I hope will serve me better.

Why did I struggle with using the Zettelkasten?

The most important learning I’ve had while searching for an answer to this is something that should have been obvious to me from the start.

A Zettelkasten is a system designed to facilitate publishing. It wasn’t meant to be used as a general knowledge management system.

This fact very quickly cleared up why the slip box wasn’t working for me as a knowledge management system.

When you’re using a Zettelkasten system, you “ask” your slip box for questions you should find answers to. This “asking” can be by looking at questions you have open in your notes, or by seeing where you’re forming lumps or groups of notes and expanding on the knowledge already there.

Since the starting point for your reading and research is a bunch of notes already in your slip box, any new notes will be taken with an eye towards linking them to your existing notes. You naturally build up a group of interconnected notes.

This is the reason why a slip box doesn’t need much hierarchy or a well maintained index/table of contents. Most notes you add will be linked to older ones. You build up your graph by adding connected notes to it. You seldom add a completely unconnected note to your slip box.

I didn’t need a system to facilitate publishing. I needed a system to store knowledge. These 2 goals might overlap a bit, but they are quite distinct.

Most of my notes are on completely unrelated topics. I read whatever I find interesting on social media, Hacker News, Indie Hackers, my collection of books on disparate topics, etc. Sometimes I read to understand a topic better by trying to answer questions I had in older notes, in which case I can build a small network of notes, but that’s an infrequent activity for me.

A Zettelkasten is a poor system for holding notes on a wide variety of subjects with only a few notes per subject. It’s difficult to go back to disconnected notes you have written without maintaining some sort of hierarchy. Trying to shoehorn a hierarchy into a Zettelkasten felt foreign, and was frowned upon in most literature I read about the subject.

My new system

With this new information, I can finally get rid of the self-inflicted pain of trying to use the Zettelkasten system to manage my knowledge. Using hierarchical tools makes sense, because my knowledge graph isn't well connected. It's mostly a set of disparate notes.

Some notes form lumps or groups when I become interested in researching something in detail. Most don't. I need a system I can put my knowledge into and get it back out when needed, without relying on linking between notes.

I will use MOCs (Maps of Content) instead of folders. Folders give me everything I need for building a hierarchy, but miss one important feature: there's no way to show relationships between notes in the same folder. In a MOC, notes can sit near each other when they are related, can form a hierarchical relationship by being indented under other notes, etc.

I got introduced to the concept of MOCs by this excellent blog post from Nick Milo. He also has a course, Linking Your Thinking, that talks about building a personal knowledge management system.

I’ve decided to use Logseq as my writing tool. Here’s how my new system will work day to day.

  • I take fleeting or literature notes anywhere. They all come to my dashboard through the queries I have. Mostly this means these notes are in the journal pages.
  • Once a fleeting or literature note is done, it's marked as status:: complete and it disappears from the dashboard.
  • Permanent notes have to be created from one of the MOC pages. This includes MOCs for topics, but also MOCs for courses, books, etc. This allows me to have a browsable list of notes in my system.
  • What do I make permanent notes out of? To answer this, I need to answer a deeper question. What is the primary reason for my writing?
    • To make things clearer to me, to understand deeply. Thus I can make notes of things that I want to make sure I understand. Topics that I don't care about don't need a permanent note.
  • I can add any tags that I think are necessary. Not sure how I can make this more efficient now, but I will add tags for now as another way to discover related notes.
  • When I want to write for publishing, I will use Ulysses as I really like that interface and I can then easily copy to my blogging platform from there. This does mean that I end up writing twice, but I think of my notes in Logseq as a first draft. Rewriting them again before publishing makes the final piece better.

This system is very much an experiment. Once I have been using it for a few months, I should have a better idea of how effective it is in helping me manage my knowledge. I might do a follow-up post then.

Bootstrap with Ruby on Rails 7

If you have a brand new RoR 7 project that you created with the defaults by running rails new <PROJECT>, you can follow the steps below to get Bootstrap 5 installed in your project.

1. Install gems

Add the following to your Gemfile and run bundle install.

gem 'bootstrap', '~> 5.2.0'
gem 'jquery-rails'
gem 'sass-rails' # This may already be present in the file in a commented line, in which case you should uncomment it.

2. Set up Javascript

In your app/javascript/application.js, add the following at the top.

//= require jquery3
//= require popper
//= require bootstrap

3. Load Javascript in your views

In the <head> section of your app/views/layouts/application.html.erb, add this:

<%= javascript_importmap_tags %>

4. Import Bootstrap CSS

Rename the existing app/assets/stylesheets/application.css to app/assets/stylesheets/application.scss and add a line with @import "bootstrap" near the top.

The sass-rails Gem allows processing SCSS files to CSS on the fly. RoR 7 is already set up to make use of it without any additional configuration beyond installing the Gem.

In your HTML the CSS is loaded by the tag <%= stylesheet_link_tag "application", "data-turbo-track": "reload" %> which should already be present in your app/views/layouts/application.html.erb.

Why did I write this?

I’ve been helping a non-tech friend learn programming for the past few months. He’s working through a Ruby on Rails course, and he’s now at the point where the course walks him through building mini apps: a simple web socket based chat, stock trackers, etc.

Unfortunately the course uses Rails 6, and Rails 7 introduced a couple of new things that changed how JS and CSS files are processed and added. My friend has had constant problems getting Bootstrap to work nicely inside the apps he creates.

I’ve tried helping him by hacking away over Zoom, following instructions from a bunch of different sources. It worked sometimes, but the last few times he asked me to help, I couldn’t get Bootstrap working, and I had to ask him to move to the next lesson without Bootstrap. It wasn’t a blocker, but it wasn’t a great experience either.

So today, I spent a few hours poring over the documentation. What always confused me before was the 2 different ways of processing Javascript that RoR 7 has:

  1. Import maps: Working with Javascript in Rails
  2. The asset pipeline

I thought these were 2 different systems and you had to choose one over the other. Unfortunately the official Rails Guides (linked above) don’t clarify this in the guide for either system.

After reading the documentation and experimenting with a local Rails app, I was finally able to understand the basics of these two systems, and how they work together. I’ll describe it below for the next person who faces this confusion.

How import maps and the asset pipeline fit together

Import maps are a way to import Javascript modules directly from the browser. Here’s a nice official (I think) resource about it: https://github.com/WICG/import-maps

Import maps in Rails 7 let you define mappings between the “bare” name you want to use in import React from "react" and the ESM compatible specifier, which must be one of: an absolute path, a relative path, or a URI.

That’s it. Import maps play no part in how the files are pre-processed or loaded. If you use the import map tag in your HTML file, it will output the following code in the HTML:

<script type="importmap" data-turbo-track="reload">{
      "imports": {
        "application": "/assets/application-45b83ea01a8c68b3493391ceecb79f31baf4159ca091fee6fd122bf413d79500.js",
        "@hotwired/turbo-rails": "/assets/turbo.min-e5023178542f05fc063cd1dc5865457259cc01f3fba76a28454060d33de6f429.js",
        "@hotwired/stimulus": "/assets/stimulus.min-b8a9738499c7a8362910cd545375417370d72a9776fb4e766df7671484e2beb7.js",
        "@hotwired/stimulus-loading": "/assets/stimulus-loading-1fc59770fb1654500044afd3f5f6d7d00800e5be36746d55b94a2963a7a228aa.js",
        "controllers/application": "/assets/controllers/application-368d98631bccbf2349e0d4f8269afb3fe9625118341966de054759d96ea86c7e.js",
        "controllers/hello_controller": "/assets/controllers/hello_controller-549135e8e7c683a538c3d6d517339ba470fcfb79d62f738a0a089ba41851a554.js",
        "controllers": "/assets/controllers/index-2db729dddcc5b979110e98de4b6720f83f91a123172e87281d5a58410fc43806.js"
      }
    }
</script>
<link rel="modulepreload" href="/assets/application-45b83ea01a8c68b3493391ceecb79f31baf4159ca091fee6fd122bf413d79500.js">
<link rel="modulepreload" href="/assets/turbo.min-e5023178542f05fc063cd1dc5865457259cc01f3fba76a28454060d33de6f429.js">
<link rel="modulepreload" href="/assets/stimulus.min-b8a9738499c7a8362910cd545375417370d72a9776fb4e766df7671484e2beb7.js">
<link rel="modulepreload" href="/assets/stimulus-loading-1fc59770fb1654500044afd3f5f6d7d00800e5be36746d55b94a2963a7a228aa.js">
<script src="/assets/es-module-shims.min-d89e73202ec09dede55fb74115af9c5f9f2bb965433de1c2446e1faa6dac2470.js" async="async" data-turbo-track="reload"></script>
<script type="module">
  import "application"
</script>

The actual loading of the files is left to the Asset Pipeline, which is why you can use the import map tag in your HTML file while still using the //= require jquery directives in your JS files. The Asset Pipeline also provides the fingerprinting that you see in the filenames above.
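To make the "it's only a mapping" point concrete, here is a simplified JavaScript sketch of the lookup a browser does when it sees a bare name in an import statement. This is an illustration, not the browser's real algorithm: it ignores the "scopes" and trailing-slash prefix features of import maps, the function name resolveSpecifier is made up, and the fingerprints in the paths are shortened placeholders rather than real digests.

```javascript
// A cut-down import map, shaped like the JSON in the script tag above.
// The fingerprints here are placeholders, not real digests.
const importMap = {
  imports: {
    "application": "/assets/application-abc123.js",
    "@hotwired/turbo-rails": "/assets/turbo.min-def456.js",
    "controllers/application": "/assets/controllers/application-789abc.js",
  },
};

// Turn the string used in `import ... from "<specifier>"` into a URL or path.
function resolveSpecifier(specifier, map) {
  // Absolute paths, relative paths and full URLs need no mapping.
  const isPathOrUrl = /^(\.{0,2}\/|[a-z][a-z0-9+.-]*:)/i.test(specifier);
  if (isPathOrUrl) {
    return specifier;
  }
  // Anything else is a "bare" name and must be listed in the map.
  if (Object.prototype.hasOwnProperty.call(map.imports, specifier)) {
    return map.imports[specifier];
  }
  throw new Error(`Bare specifier "${specifier}" is not in the import map`);
}

console.log(resolveSpecifier("application", importMap));
// /assets/application-abc123.js
console.log(resolveSpecifier("./helpers.js", importMap));
// ./helpers.js
```

Nothing in this lookup reads, transforms, or fingerprints a file. The map only says where a name points; producing the file at that path is the Asset Pipeline's job.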

Thoughts on Zettelkasten and the slip box

I had a bunch of thoughts yesterday about the Zettelkasten method and how I could use it effectively to manage my knowledge base. I started the day by dumping my thoughts into Logseq. Here they are.

These are open questions for now. I plan to investigate this further and try out different iterations to see what works for me.

  • I've been in a place before where I used Roam to gather a small number of notes (> 100) but then found all of that to be an unmanageable mess.
  • Issues that I see with this setup
    • With notes spread all over the place how do I find anything to link to? I can't go through 200 notes every time I add a new one.
    • All notes are in the same "directory". Because there is no hierarchy, my notes about productivity are in the same place as my notes about data structures & algorithms. This seems unsustainable.
    • There is 1 benefit I see to this. With everything being in the same place, I can find unexpected connections. Unfortunately, that doesn't work for me because I don't go through all my existing notes every time I add a new one.
    • I am interested in many things: productivity, parenting, Islam, algorithms, data structures, system design, programming languages. Having everything in one place seems to add to the mess.
    • As I understand, Niklas Luhmann researched 1 topic extensively - social science. It would make sense for him to keep all his notes in 1 place.
  • The reason I've heard repeated for not having a hierarchy:
    • It promotes unexpected connections b/w notes
    • A note isn't tied to a single category. It can live in multiple places
  • Is that a good enough reason to let go of the organization benefits of a hierarchical structure though?
    • It's much easier to break down browsing notes to find connections if you can "save your place" in your notes. I can look through the notes in 5 folders today, and go through the 5 others later.
    • Notes about the same thing fit differently in different categories. I can take 1 idea and have it fit differently in my notes on parenting and in my notes on productivity. It's easy to copy and link notes together with our digital systems. I can even symlink the same note to multiple places in the hierarchy.
  • Zettelkasten also has index notes. Folders essentially serve the same purpose.
  • Questions I have
    • Did Luhmann succeed because of his slip box, or in spite of it? Are there other examples of successful writers using such a system?
    • Are slip boxes supposed (who decides?) to hold notes on disparate topics?
    • How do people with large slip boxes navigate? Do they use index cards? If so, can I use folders to serve the same purpose as well as reduce the mental load of browsing through my knowledge base?
      • How did Luhmann manage his slip box with 90K notes? I can't imagine he went through all 90K notes every time he added a new one.
  • I imagine most courses around "building a 2nd brain" or "cultivating your knowledge garden" answer this question. I haven't taken any yet, but I plan to. That knowledge might change my perspective on this.
  • Roam/Logseq vs. Obsidian
    • Logseq is very structured. Everything is a bullet point. This provides a number of benefits to me:
      • Writing is much easier. With a forced structure in place, I can think in outlines and short paragraphs, without getting into the weeds of how to structure my writing.
      • If everything is a block, I can easily reference other blocks inline with my writing.
      • I can write queries that show me subsets of my blocks. For example I have a query to show me all notes with a "status:: incomplete" tag on my dashboard.
    • The feature I like most about Logseq is the daily journal. I can write everything there and have it show up in different places. I was attracted to Roam due to this as well.
      • It's great for gathering data and thinking, but didn't work out for me when consolidating and keeping long term.
    • Obsidian offers free form writing. It's not as great to think in, but I feel that it offers a better experience once the thinking is done and I'm writing the final artifact.
    • Artifacts can be permanent notes, but also blog posts and other long form pieces.
    • Here's what I'm thinking of adopting
      • Temporary notes go in Logseq. These are fleeting notes, literature notes, my daily journals with todo items.
      • Once I've gathered data and thought about it in Logseq, the final artifacts go into Obsidian, where they are neatly categorized in folders for easy browsing. I can then link b/w notes in Obsidian as well, using the folder hierarchy to ease searching.
      • Because doing the same thing that I did in Roam again in Logseq isn't useful. I'll end up repeating the same situation where I get to a few 100 notes and then declare bankruptcy.

How Horo timer educates users

Horo is a simple menu bar timer app for the Mac.

I use it like a pomodoro timer, but without limiting myself to just 25m blocks. It's a very small application, but one that provides a lot of value for my workflow.

I recently noticed the really smart way in which it trains the user in its features, and I believe it has some great lessons for software developers and UX designers.

The challenge of educating users

Educating users about your app's features is the kind of problem that is often overlooked when starting a new project. Indie developers often don't think about it until the app is complete. It's not something that is often seen as an "interesting problem to solve".

Like marketing however, which is often relegated to the "boring business things to do at the end" category, user education can often have a large impact on the success of your application. If you are unable to teach users how to get value out of your application, users will often stop using it – just because they didn't know it could solve their problem. It must be frustrating to hear that users stopped using your application because they thought it didn't have a feature that you spent hours working on.

The usual solutions

  • User guide or manual
  • Walkthrough videos
  • In-product walkthrough or popups (like Clippy from Microsoft Office)

All of these require effort from the user. The user has to spend time:

  • Reading the manual.
  • Watching the videos.
  • Going through the demo.

Most users won't want to invest the time upfront in reading the manual or watching product videos. They'll only search for a solution when they encounter a roadblock. As a side note, having a searchable manual is a must so that users can search for solutions when they want to. Product videos are worse than a manual in my opinion because users can't easily skim through to the stuff they want.

In-product walkthroughs sound like a good idea. The user is in your application and has all the context in front of them, so there's no need to jump between the manual and your app. Anecdotal evidence however suggests that users will usually skip demos, or forget the lessons afterwards. I honestly don't believe in-product walkthroughs solve any problem other than discovery – you can tell users about new or unique features of your application and they might remember to use them later.

The Horo solution

You might have noticed in the screenshot at the beginning the faded text "@9:30am". This is one of the placeholders that you see when you open Horo from the menubar.

The placeholder is shown in the only input field that Horo has. The user has to enter the duration they want the timer to run for. Horo allows "Natural Language" input in the field. From its description on the Mac App Store:

Some examples of Horo’s flexible Natural Language support:

  • “1:30:45” starts a timer for 1 hour, 30 minutes, and 45 seconds
  • “1.5h” starts a timer for 1 hour, 30 minutes
  • “45m” starts a 45 minute timer
  • “1h 15m” becomes an hour and fifteen minutes
  • “60s” will play a sound in a minute
  • Leaving the input blank will start a stopwatch
  • “@3pm” will set a countdown timer to go off at 3pm

While short, this is still a lot of possible formats. I would never be able to remember these just by reading the manual once. Luckily, Horo utilizes the placeholder in an intuitive way to educate users. Every time you open the app, you see a different example of the input formats you can use.

This is the first app I've noticed that educates the user this way. While the list of accepted formats is always available on the app's web page, presenting these formats in the placeholder does 2 things for the user:

  • Automatic discovery. The user intuitively builds a sense that they can use multiple formats for the input, and is shown a different format every time, which educates them on the possibilities.
  • Spaced repetition. By showing the formats repeatedly, the user ends up remembering more than they would just by reading a manual once. Spaced repetition is a well researched method of learning effectively, and I think Horo uses it to its advantage quite well.
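The mechanics behind this are tiny. Here is a minimal JavaScript sketch of a rotating placeholder, assuming the examples are simply cycled in order. I don't know how Horo actually picks which example to show (it may well be random), and the names createPlaceholderRotator and nextPlaceholder are made up for this illustration.

```javascript
// Input formats taken from the App Store description quoted above.
const EXAMPLES = ["1:30:45", "1.5h", "45m", "1h 15m", "60s", "@3pm"];

// Returns a function that hands out the next example on every call,
// wrapping around to the start once the list is exhausted.
function createPlaceholderRotator(examples) {
  if (examples.length === 0) {
    throw new Error("At least one example is required");
  }
  let index = -1;
  return function nextPlaceholder() {
    index = (index + 1) % examples.length;
    return examples[index];
  };
}

const nextPlaceholder = createPlaceholderRotator(EXAMPLES);

// In a web UI, this would run each time the input is shown, for example:
//   input.placeholder = nextPlaceholder();
console.log(nextPlaceholder()); // 1:30:45
console.log(nextPlaceholder()); // 1.5h
```

Cycling in order guarantees the user sees every format before any repeats, which fits the spaced repetition idea; a random pick would be just as easy to build but could show the same hint several times in a row.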

The takeaway

The lesson I learned is to look for these opportunities in the applications I build. A great place to educate users is:

  • Where you can unobtrusively add hints on usage. The placeholder does not get in the way yet delivers the information.
  • Where the user will see the hint multiple times during regular usage of the application.

Finding places in your applications where this strategy can be used might take time and some luck, but if you find these, I believe capitalizing on the opportunity will lead to a much more delightful user experience.

After all, I ended up writing a whole blog post on how a simple timer app uses a placeholder. :)