
What We Learned Building Blocks

The decisions and architectural trade-offs behind the Blocks platform — and what operations teams actually needed from a workflow tool that existing automation products couldn't deliver.

Michal Lupu | March 24, 2026 | 6 min read

Why a Sabbatical, Not a Hackathon

The first version of Blocks didn't start with a pitch deck or a product brief. It started with a problem I kept running into over several years building internal automation tools at a B2B SaaS company in Tel Aviv. The problem wasn't that we lacked automation tools — we had Zapier, we had custom scripts, we had Jira workflows. The problem was that every multi-step process that involved both people and systems required engineering to wire together, and the moment the process changed (which it always did), the wire-up broke and nobody wanted to touch it.

The pattern was specific enough to be frustrating: a sales approval required a human decision, but the data context for that decision was in three different systems. A customer onboarding had automated steps and human steps, but they lived in entirely separate tools with no shared state. An incident triage process had an AI classification step bolted onto the side via a separate script, but it wasn't part of the workflow — it was a preprocessing layer that the workflow didn't know about and whose outputs didn't affect workflow routing.

After years of watching this pattern repeat, I took six months off to build a prototype from first principles. Not a startup idea I was shopping around — just a tool I wanted to use. The sabbatical context mattered: there was no investor pressure to define the market size, no growth target to hit in 18 months, just enough time to understand whether the core problem was solvable with the architecture I had in mind.

What the Prototype Revealed

The first working version of Blocks had three things: a canvas-based workflow editor where you could drag nodes and connect them, a simple task inbox where human review tasks appeared with context from upstream steps, and an LLM routing layer that let you configure a prompt and an output schema for an AI agent step and have its output feed into the next step as a structured variable.

It was rough. The canvas was functional but not polished. The LLM routing layer had no error handling — if the model returned malformed output, the workflow silently failed. The task inbox was a web page, not a Slack integration. But the core mechanism worked: you could design a multi-step process that involved a human decision, an AI classification, and an automated integration action, and run it as a single coherent workflow where all three actors participated in sequence, with the context from each step available to the next.
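To make the silent-failure problem concrete, here is a minimal sketch of what validating an agent step's output against its configured schema could look like. It is not the Blocks implementation: the names `AgentStep`, `parse_agent_output`, and `MalformedOutputError` are hypothetical, and the schema is assumed to be a simple mapping from field name to Python type.

```python
import json
from dataclasses import dataclass


class MalformedOutputError(Exception):
    """Raised when an agent step's output does not match its declared schema."""


@dataclass(frozen=True)
class AgentStep:
    # Hypothetical shape of an AI agent node: a prompt plus an output schema.
    name: str
    prompt: str
    output_schema: dict  # field name -> expected Python type


def parse_agent_output(step: AgentStep, raw: str) -> dict:
    """Turn raw model text into the structured variable passed to the next step."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise MalformedOutputError(f"{step.name}: output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise MalformedOutputError(f"{step.name}: expected a JSON object")
    for field_name, expected in step.output_schema.items():
        if field_name not in data:
            raise MalformedOutputError(f"{step.name}: missing field {field_name!r}")
        if not isinstance(data[field_name], expected):
            raise MalformedOutputError(
                f"{step.name}: field {field_name!r} is not {expected.__name__}"
            )
    # Only declared fields flow downstream; anything extra is dropped.
    return {name: data[name] for name in step.output_schema}


triage = AgentStep(
    name="classify_incident",
    prompt="Classify the incident severity.",
    output_schema={"severity": str, "confidence": float},
)
```

The point of the sketch is that a malformed response raises an explicit, named error that the workflow can route on, instead of failing silently.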

I shared the prototype with five operations leads at Tel Aviv-area startups — people I knew from the local B2B SaaS community who had mentioned variations of the same operational pain. I expected to hear feature requests. What I heard instead was: "This is the thing that doesn't exist. When can I actually use it?"

That reaction was clarifying. It told me the problem was real and specific enough that people recognized the solution immediately. It also told me that the three of the five who asked for early access weren't looking for a fully polished product; they were looking for the mechanism. The workflow canvas, the human inbox, and the LLM routing layer together were complete enough to be recognizable as a platform, not a feature set.

What the Early Users Actually Taught Us

The three early users who came in from that initial group shaped the first production release more than I would have predicted. Not through structured feedback sessions — through actually trying to build their own workflows and running into the edges of what the prototype could do.

The clearest lesson: the human task inbox was underspecified. In the prototype, the task inbox showed the task and the upstream context. What it didn't do was give the reviewer the right decision options for their specific workflow step. One of the early users had a workflow where the reviewer's options were Approve, Reject, and "Request Revision from the submitter" — not a binary approve/reject. The prototype had no way to configure custom action options per step. Building that configuration surface — making the human task node's decision options as configurable as any other node property — was the first feature that went from prototype to production because of early user input, not internal design.
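As an illustration of per-step decision options, the sketch below models a human task node whose actions are ordinary configuration. The class name `HumanTaskNode`, the `next_node` method, and the target node names are assumptions made for this example, not the actual Blocks data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HumanTaskNode:
    # Hypothetical node definition; action label -> name of the next node.
    name: str
    actions: dict = field(
        default_factory=lambda: {"Approve": "done", "Reject": "closed"}
    )

    def next_node(self, action: str) -> str:
        """Return where the workflow goes after the reviewer picks an action."""
        if action not in self.actions:
            allowed = ", ".join(self.actions)
            raise ValueError(f"{self.name}: {action!r} is not one of: {allowed}")
        return self.actions[action]


# The three-option review from the early user's workflow, expressed as config.
contract_review = HumanTaskNode(
    name="contract_review",
    actions={
        "Approve": "send_for_signature",
        "Reject": "notify_submitter",
        "Request Revision from the submitter": "revision_requested",
    },
)
```

A binary approve/reject step is then just the default configuration, and a third option is a data change rather than a code change.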

The second lesson: SLA timers were not optional. Every early user who deployed a workflow with a human approval step ran into the same problem within the first week: tasks sat unanswered, with no automatic escalation. I had thought of SLA timers as a "nice to have" for a later release. The early users made it clear that without SLA timers and configured escalation paths, human task routing wasn't reliable enough to use in a real business process. That feature went into the production release ahead of all the polishing work I had planned.
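A rough sketch of an SLA check with an escalation path follows. Everything here is an assumption for illustration: the `HumanTask` fields, the `check_sla` function, and the policy that each escalation grants the new assignee one more SLA window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class HumanTask:
    # Hypothetical task record for a human review step.
    task_id: str
    assignee: str
    created_at: datetime
    sla: timedelta
    escalation_path: list  # who gets the task next, in order
    escalation_level: int = 0
    completed: bool = False


def check_sla(task: HumanTask, now: datetime) -> Optional[str]:
    """Escalate an overdue task one level; return the new assignee or None."""
    if task.completed:
        return None
    if task.escalation_level >= len(task.escalation_path):
        return None  # path exhausted, nobody left to escalate to
    # Assumed policy: each escalation level gets one more full SLA window.
    deadline = task.created_at + task.sla * (task.escalation_level + 1)
    if now < deadline:
        return None
    task.assignee = task.escalation_path[task.escalation_level]
    task.escalation_level += 1
    return task.assignee


start = datetime(2026, 1, 5, 9, 0, tzinfo=timezone.utc)
task = HumanTask(
    task_id="t-1",
    assignee="reviewer",
    created_at=start,
    sla=timedelta(hours=4),
    escalation_path=["team-lead", "ops-director"],
)
```

A periodic job would call `check_sla` for every open task; calling it repeatedly is safe because a task only escalates once per elapsed window.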

The third lesson was about workflow versioning — specifically, the terrifying experience of editing a workflow definition that had running instances. One early user changed a node configuration on a live workflow and disrupted two instances that were mid-flight. After that incident, proper version management (draft vs. published, new instances on the new version, running instances continue on the old version) became a non-negotiable architectural requirement before any broader launch.
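The version rule described above (edits go to a draft, new instances start on the latest published version, running instances stay on the version they started with) can be sketched as follows. The class and method names are hypothetical, and storage is an in-memory dict purely for illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class WorkflowDefinition:
    # Hypothetical model: one mutable draft plus immutable published snapshots.
    name: str
    draft: dict
    published: dict = field(default_factory=dict)  # version number -> config

    def publish(self) -> int:
        """Freeze the current draft as a new version and return its number."""
        version = len(self.published) + 1
        self.published[version] = copy.deepcopy(self.draft)
        return version

    def start_instance(self) -> "Instance":
        """New instances always pin to the latest published version."""
        if not self.published:
            raise RuntimeError(f"{self.name} has no published version")
        return Instance(definition=self, version=max(self.published))


@dataclass
class Instance:
    definition: WorkflowDefinition
    version: int  # pinned at start, never changed while the instance runs

    def config(self) -> dict:
        return self.definition.published[self.version]


wf = WorkflowDefinition("vendor_approval", draft={"review": {"sla_hours": 24}})
wf.publish()
in_flight = wf.start_instance()

# Editing the draft and publishing again does not touch the running instance.
wf.draft["review"]["sla_hours"] = 4
wf.publish()
fresh = wf.start_instance()
```

The deep copy at publish time is what protects mid-flight instances: a later edit to the draft cannot reach a snapshot that has already been published.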

The Market Question We Kept Getting Wrong

For the first few months after the prototype, we described Blocks as a "workflow automation platform." That positioning attracted interest from the wrong audiences: people looking for a Zapier alternative, people looking for a better BPMN tool, people looking for a no-code app builder. All of those people would look at Blocks and say "interesting, but I already have X for that."

The positioning that resonated — which we arrived at slowly, through many conversations where we watched operations leads' faces change when we got to the human-AI coordination part — was "the platform for workflows that need both human judgment and AI execution." The category is specific: not all workflows, not AI-only workflows, not human-only task management. Workflows that require both, coordinated in the same execution model, with shared state and a common audit trail.

That specificity turned out to be a feature, not a limitation. Operations teams at growing software companies have that kind of workflow — contract review, customer onboarding, security incident response, vendor approval — and they've been solving it badly with combinations of email, separate automation tools, and manual tracking spreadsheets. When we described the specific problem to them, they would stop us mid-sentence: "Yes. That is exactly the problem." That recognition is what made early revenue possible before we had a polished product.

What We'd Do Differently

The honest answer to what I'd do differently: build the error handling and observability layer first, not last. The prototype had almost no error handling. Malformed LLM outputs were silently dropped, integration failures surfaced as stuck workflow instances with no explanation, and once SLA timers were added, their firings weren't logged. We spent a significant fraction of the first production release period retrofitting error handling and an operations dashboard into a codebase that wasn't designed for them from the start.

Building for observability from day one — structured logging at every step transition, explicit error states with recoverable versus non-recoverable classifications, and a real-time dashboard that shows instance state across all running workflows — would have saved several weeks of debugging production issues and would have made the early user conversations more useful. When you can show an operations lead exactly what happened in a workflow that failed, the feedback you get is precise. When the failure is opaque, the feedback is "something broke."
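For a sense of what "structured logging at every step transition" with explicit error classes might look like, here is a small sketch using Python's standard `logging` and `json` modules. The record fields, the `ErrorKind` enum, and the retry policy in `next_state` are all assumptions for the example, not a description of the Blocks codebase.

```python
import json
import logging
from enum import Enum
from typing import Optional

logger = logging.getLogger("workflow")


class ErrorKind(Enum):
    RECOVERABLE = "recoverable"          # e.g. a timeout worth retrying
    NON_RECOVERABLE = "non_recoverable"  # e.g. a missing credential


class StepError(Exception):
    def __init__(self, message: str, kind: ErrorKind):
        super().__init__(message)
        self.kind = kind


def log_transition(instance_id: str, from_step: str, to_step: str,
                   error: Optional[StepError] = None) -> dict:
    """Emit one JSON log line per step transition and return the record."""
    record = {"instance_id": instance_id, "from": from_step, "to": to_step}
    if error is not None:
        record["error"] = str(error)
        record["error_kind"] = error.kind.value
    level = logging.ERROR if error is not None else logging.INFO
    logger.log(level, json.dumps(record))
    return record


def next_state(error: StepError) -> str:
    """Assumed policy: retry recoverable errors, fail the instance otherwise."""
    return "retrying" if error.kind is ErrorKind.RECOVERABLE else "failed"


timeout = StepError("CRM request timed out", ErrorKind.RECOVERABLE)
record = log_transition("inst-42", "update_crm", "retrying", error=timeout)
```

Because every transition produces a machine-readable record, a dashboard can reconstruct exactly what happened to a failed instance instead of reporting that "something broke."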

The sabbatical gave us something that most startup timelines don't: the space to understand the problem before committing to an architecture. That space is the reason Blocks has a persistent state machine at its core rather than a simpler event-driven design that would have been faster to build but would have hit ceilings on the kinds of workflows it could support. Some architectural decisions are hard to change later. Getting the state machine model right at the start was the most consequential decision we made, and we made it because we had time to think through the implications rather than rushing to a demo.
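To show what a persistent state machine means in practice, here is a deliberately small sketch: an explicit transition table, plus serialization so an instance can be stored and resumed after a long human wait. The state names, the `InstanceState` class, and the JSON format are illustrative assumptions, not the Blocks design.

```python
import json
from dataclasses import asdict, dataclass, field

# Hypothetical transition table: current state -> states it may move to.
TRANSITIONS = {
    "pending": {"ai_classifying"},
    "ai_classifying": {"awaiting_human", "failed"},
    "awaiting_human": {"running_integration", "failed"},
    "running_integration": {"completed", "failed"},
    "completed": set(),
    "failed": set(),
}


@dataclass
class InstanceState:
    instance_id: str
    state: str = "pending"
    context: dict = field(default_factory=dict)  # shared across all steps

    def advance(self, new_state: str, **outputs) -> None:
        """Move to new_state if the table allows it, merging step outputs."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.context.update(outputs)

    def dump(self) -> str:
        """Serialize so the instance survives restarts and long waits."""
        return json.dumps(asdict(self))

    @classmethod
    def load(cls, raw: str) -> "InstanceState":
        return cls(**json.loads(raw))


instance = InstanceState("inst-1")
instance.advance("ai_classifying")
instance.advance("awaiting_human", severity="high")
saved = instance.dump()               # in practice this would go to a database
restored = InstanceState.load(saved)  # resumed later, possibly by another process
```

The restored instance carries both its position in the workflow and the context from earlier steps, which is what lets a human decision days later pick up exactly where the AI step left off.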
