TL;DR
- PartyKit gave us a working real-time room in ~150 lines of TypeScript
- Firebase was overkill and persists data you don't want; raw WebSockets put you on the hook for everything
- Free to host: PartyKit on Cloudflare Workers, frontend on GitHub Pages
- Gotchas: ephemeral state, no auth model, full-state broadcasts
The Problem
We had a Salesforce team of five people, five different role titles, and no shared understanding of who was actually responsible for what. The usual solution is a RACI spreadsheet: Responsible, Accountable, Consulted, Informed, one row per activity.
The problem with spreadsheets is that everyone fills theirs out in isolation. You end up with five versions, a meeting to reconcile them, and an hour of defensive justifications before anyone changes anything. What you actually want is everyone filling in the same matrix at the same time, able to see where the disagreements are before the conversation starts.
So we built a tool for that. One URL, no login, everyone joins a room and assigns RACI values live. When two people have different answers for the same cell, the conflict shows up immediately. The conversation starts from the data, not from someone defending a spreadsheet they sent on Tuesday.
The question was how to build it without spending a week on infrastructure.
What We Built
The tool works like this: one person creates a room and gets a six-character code. They share the URL with the rest of the team. No accounts, no email verification, no OAuth flow. Everyone opens the link and types their name.
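For illustration, a six-character room code takes only a few lines to generate. This is a sketch under assumptions rather than the tool's actual generator: the function name `generateRoomCode` and the alphabet (which leaves out look-alike characters such as 0/O and 1/I) are hypothetical.

```typescript
// Hypothetical alphabet: uppercase letters and digits, minus look-alikes (0/O, 1/I).
const ROOM_CODE_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

// Returns a random room code of the given length (six by default).
// Math.random() is acceptable only because the code is a convenience
// identifier, not a secret. The random source is injectable for testing.
function generateRoomCode(length = 6, random: () => number = Math.random): string {
  let code = "";
  for (let i = 0; i < length; i++) {
    const index = Math.floor(random() * ROOM_CODE_ALPHABET.length);
    code += ROOM_CODE_ALPHABET.charAt(index);
  }
  return code;
}
```

With 32 possible characters per position there are 32^6 (about a billion) codes, which is plenty to avoid accidental collisions between workshops but is not a security boundary.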
Inside the room, there is a matrix of activities across five domains: architecture, governance, platform operations, DevOps, and team delivery. Each row is an activity. Each column is a role. You click a cell to assign R, A, C, or I. Every answer you give is broadcast to everyone else in real time.
There is a compare view that shows all participants' answers side by side for any given activity. If three people say the architect is Responsible and one person says Accountable, you can see that before the meeting devolves into a debate about terminology.
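The comparison itself is a small grouping step. The sketch below assumes answers are stored per participant under an `activityId:roleId` key; that shape, and the names `compareCell` and `CellComparison`, are illustrative rather than taken from the tool's source.

```typescript
type Raci = "R" | "A" | "C" | "I";

// Assumed shape: answers[participant]["activityId:roleId"] = "R" | "A" | "C" | "I".
type AnswersByParticipant = Record<string, Record<string, Raci>>;

interface CellComparison {
  votes: Partial<Record<Raci, string[]>>; // which participants chose each value
  conflict: boolean; // true when more than one distinct value was chosen
}

// Groups every participant's answer for one cell by the value they chose.
function compareCell(
  answers: AnswersByParticipant,
  activityId: string,
  roleId: string,
): CellComparison {
  const key = `${activityId}:${roleId}`;
  const votes: Partial<Record<Raci, string[]>> = {};
  for (const [participant, cells] of Object.entries(answers)) {
    const value = cells[key];
    if (value === undefined) continue; // this participant left the cell blank
    const names = votes[value] ?? [];
    names.push(participant);
    votes[value] = names;
  }
  return { votes, conflict: Object.keys(votes).length > 1 };
}
```

For the example in the text, three participants answering R and one answering A for the same cell yields `conflict: true`, with three names under `R` and one under `A`.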
Sessions are ephemeral. When everyone disconnects, the state is gone. That is intentional. A RACI workshop is a point-in-time alignment exercise, not a system of record.
Why Not Firebase?
Firebase was the first thing we looked at. It has real-time database sync, a generous free tier, and good TypeScript support. On paper, it fits.
In practice, it brought more than we needed. Firebase persists everything by default. For a workshop tool that should live for 90 minutes and disappear, that means you are writing cleanup logic before you write anything useful. You also need a Firebase project, anonymous auth configuration, security rules, and an SDK that adds meaningful weight to your bundle. None of that is hard, but it is all ceremony for a tool that is essentially a short-lived shared clipboard.
The bigger issue was the data model. Firestore, Firebase's document database, wants you to think in documents and collections. A RACI matrix is a sparse key-value store: a (participant name, role ID, activity ID) key mapped to an answer. Mapping that onto Firestore in a way that fires the right listeners without over-fetching took more design time than the actual feature.
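To make "sparse key-value store" concrete, here is a minimal in-memory sketch. The class name `RaciStore` and its methods are hypothetical, and the tool's real state may well be shaped differently.

```typescript
type Raci = "R" | "A" | "C" | "I";

// A sparse store: only assigned cells take up an entry; blank cells are absent.
class RaciStore {
  private cells = new Map<string, Raci>();

  // JSON-encoding the tuple avoids the collisions a plain separator could
  // cause if a name or ID happened to contain the separator character.
  private static keyOf(participant: string, roleId: string, activityId: string): string {
    return JSON.stringify([participant, roleId, activityId]);
  }

  set(participant: string, roleId: string, activityId: string, answer: Raci): void {
    this.cells.set(RaciStore.keyOf(participant, roleId, activityId), answer);
  }

  get(participant: string, roleId: string, activityId: string): Raci | undefined {
    return this.cells.get(RaciStore.keyOf(participant, roleId, activityId));
  }

  // Removes an assignment; returns false if the cell was already blank.
  clear(participant: string, roleId: string, activityId: string): boolean {
    return this.cells.delete(RaciStore.keyOf(participant, roleId, activityId));
  }

  get size(): number {
    return this.cells.size;
  }
}
```

Nothing here needs documents, collections, or listeners, which is why the Firestore mapping felt like more design work than the feature deserved.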
If you need persistence, user accounts, or fine-grained security rules, Firebase makes sense. We didn't.
Why Not Raw WebSockets?
We briefly considered raw WebSockets, and even looked at running the workshop in Miro or Figma. Both felt like the wrong trade. Managing a whiteboard template in Miro meant spending workshop prep time fighting the tool instead of preparing the content. Raw WebSockets meant we owned the infrastructure instead of the product.
Raw WebSockets need a server to run on: a Node.js process, deployed somewhere, with a domain and SSL termination. You need to manage room isolation yourself: a map of room codes to sets of connections. You need heartbeats to detect stale connections. You need reconnection handling on the client because WebSocket connections drop.
You also need to decide whether your server is stateful, and if it is, what happens when it restarts. If you ever want more than one server instance, you need sticky sessions or Redis pub/sub.
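To give a feel for the bookkeeping, here is a rough sketch of just the room map and stale-connection sweep. It is not code we shipped: `Conn` is a stand-in for whatever socket type your WebSocket library provides, and `RoomRegistry` and its method names are hypothetical.

```typescript
// Minimal stand-in for a WebSocket connection (assumption, not a real library type).
interface Conn {
  send(data: string): void;
  close(): void;
}

class RoomRegistry {
  // room code -> (connection -> last time we heard from it, in ms)
  private rooms = new Map<string, Map<Conn, number>>();

  join(roomCode: string, conn: Conn, now: number): void {
    let members = this.rooms.get(roomCode);
    if (!members) {
      members = new Map();
      this.rooms.set(roomCode, members);
    }
    members.set(conn, now);
  }

  // Call whenever a pong or any message arrives from the connection.
  touch(roomCode: string, conn: Conn, now: number): void {
    const members = this.rooms.get(roomCode);
    if (members?.has(conn)) members.set(conn, now);
  }

  // Sends to every connection in one room and to no other room.
  broadcast(roomCode: string, data: string): void {
    for (const conn of this.rooms.get(roomCode)?.keys() ?? []) conn.send(data);
  }

  // Closes and removes connections silent for longer than timeoutMs,
  // drops rooms that end up empty, and returns how many were removed.
  sweep(now: number, timeoutMs: number): number {
    let removed = 0;
    for (const [code, members] of this.rooms) {
      for (const [conn, lastSeen] of members) {
        if (now - lastSeen > timeoutMs) {
          conn.close();
          members.delete(conn);
          removed++;
        }
      }
      if (members.size === 0) this.rooms.delete(code);
    }
    return removed;
  }

  roomCount(): number {
    return this.rooms.size;
  }
}
```

This still leaves out the ping timer, client-side reconnection, the HTTP upgrade, and anything to do with running more than one server instance.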
None of this is conceptually difficult. But it adds up. A realistic estimate for a raw WebSocket implementation that is production-solid is three to four days of work, plus an ongoing hosting cost and something to monitor. For an internal workshop tool used a few times a month, that trade-off does not make sense.
How PartyKit Solved It
PartyKit is built on Cloudflare Workers. The model is simple: each room is a Durable Object with its own isolated state. You write a TypeScript class. PartyKit handles everything else.
The entire backend is one class with three methods:
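import type * as Party from "partykit/server";

// RoomState and DEFAULT_ROLES are defined elsewhere in the project.
export default class RACIServer implements Party.Server {
  state: RoomState = {
    participants: {},
    customActivities: [],
    deletedActivityIds: [],
    roles: [...DEFAULT_ROLES],
    creatorName: "",
  };

  // PartyKit passes in the room this instance serves; broadcast() relies on it.
  constructor(readonly room: Party.Room) {}

  // Send a newly connected client a full snapshot of the current state.
  onConnect(conn: Party.Connection) {
    conn.send(JSON.stringify({
      type: "sync",
      participants: this.state.participants,
      customActivities: this.state.customActivities,
      deletedActivityIds: this.state.deletedActivityIds,
      roles: this.state.roles,
      creatorName: this.state.creatorName,
    }));
  }

  // Apply one client event to the room state, then push the result to everyone.
  onMessage(message: string, sender: Party.Connection) {
    const msg = JSON.parse(message);
    // handle join, answer, add_activity, remove_role, etc.
    this.broadcast();
  }

  // Send the full room state to every connected client.
  broadcast() {
    this.room.broadcast(JSON.stringify({
      type: "sync",
      ...this.state,
    }));
  }
}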
onConnect sends the new participant a full snapshot of current state. onMessage handles every event: joining, submitting an answer, adding a custom activity, removing a role. After any mutation, broadcast() sends the full state to everyone in the room.
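The message handling elided in onMessage can be pictured as a function from the current state and one message to the next state. The sketch below is an assumption-heavy illustration, not the tool's source: the message payload fields, the string element types in `RoomState`, the `activityId:roleId` cell key, and the rule that the first person to join becomes the creator are all hypothetical. Only the message type names come from the comment in the class.

```typescript
type Raci = "R" | "A" | "C" | "I";

// Field names follow the server class; the element types are assumptions.
interface RoomState {
  participants: Record<string, Record<string, Raci>>; // name -> cell key -> answer
  customActivities: string[];
  deletedActivityIds: string[];
  roles: string[];
  creatorName: string;
}

// Hypothetical payloads for four of the message types named in onMessage.
type ClientMessage =
  | { type: "join"; name: string }
  | { type: "answer"; name: string; activityId: string; roleId: string; value: Raci }
  | { type: "add_activity"; activity: string }
  | { type: "remove_role"; roleId: string };

// Returns the next state without mutating the one passed in.
function applyMessage(state: RoomState, msg: ClientMessage): RoomState {
  switch (msg.type) {
    case "join":
      return {
        ...state,
        creatorName: state.creatorName || msg.name, // assumption: first joiner is creator
        participants: { ...state.participants, [msg.name]: state.participants[msg.name] ?? {} },
      };
    case "answer": {
      const key = `${msg.activityId}:${msg.roleId}`;
      const previous = state.participants[msg.name] ?? {};
      return {
        ...state,
        participants: { ...state.participants, [msg.name]: { ...previous, [key]: msg.value } },
      };
    }
    case "add_activity":
      return state.customActivities.includes(msg.activity)
        ? state
        : { ...state, customActivities: [...state.customActivities, msg.activity] };
    case "remove_role":
      return { ...state, roles: state.roles.filter((role) => role !== msg.roleId) };
    default:
      return state; // ignore message types we do not recognise
  }
}
```

In a server shaped like the class above, onMessage would assign the result to `this.state` and then call `broadcast()`. Keeping the transition as a plain function also makes it easy to test without a running room.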
Room isolation and connection tracking are both handled. There is no registration step, no project setup beyond a partykit.json file, and deployment is one command:
npx partykit deploy
This gives you a URL like raci-workshop.yourusername.partykit.dev. The frontend is a static HTML file on GitHub Pages. Total hosting cost: $0.
The config file tells PartyKit where to find the server:
{
  "name": "raci-workshop",
  "main": "party/index.ts"
}
Two lines. We moved on to building the UI.
What It Actually Saved Us
The server is 150 lines of TypeScript. That includes all message handling, role management, custom activities, and ephemeral state. There is no reconnection logic because PartyKit handles connection lifecycle. There is no cleanup job because the state disappears when the room empties. There is no SSL configuration, no load balancer, no deployment pipeline.
Against the Firebase path, we avoided: project setup, auth config, security rules, SDK integration, data schema design, and cleanup logic for expired sessions. A realistic estimate for a solid Firebase implementation of this feature set is 12 to 16 hours. The PartyKit version took a weekend, most of which was spent on the frontend matrix UI.
Against raw WebSockets, we avoided: server setup, hosting, SSL, room management, heartbeat logic, reconnection handling, and ongoing infrastructure cost. That is closer to 30 to 40 hours of work and an indefinite monthly cost for something that runs a few times a month.
There is no server to go down on a Saturday morning before a workshop.
The Gotchas
State is ephemeral by design, but that bites you in unexpected ways. We learned this the hard way across a couple of sessions, and hit three separate problems. People would forget to export their answers before leaving, which meant starting from scratch next time. Custom activities that someone had added during the session were not captured in the exported JSON at all. And the export only contained the answers of the person who triggered it, not everyone in the room.
We fixed all three: the export now captures every participant's answers plus any custom activities added during the session, so anyone can export on behalf of the whole room and the next session can pick up exactly where it left off. If we were to take this further, we would add explicit import support and probably rename the button to Save, so it is obvious what it does and nobody has to remember to export before they leave.
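A room-wide export along those lines might look like the sketch below. The file format (`RoomExport`, its `version` field) and the function names are hypothetical, as are the string element types; `parseExport` only gestures at what import support could validate.

```typescript
type Raci = "R" | "A" | "C" | "I";

// Subset of the room state that is worth saving (element types are assumptions).
interface SavedState {
  participants: Record<string, Record<string, Raci>>; // every participant, not just the exporter
  customActivities: string[];
  deletedActivityIds: string[];
  roles: string[];
}

interface RoomExport extends SavedState {
  version: 1;
  exportedAt: string; // ISO 8601 timestamp
}

// Serialises the whole room, so it does not matter who clicks the button.
function buildExport(state: SavedState, exportedAt: Date = new Date()): string {
  const snapshot: RoomExport = {
    version: 1,
    exportedAt: exportedAt.toISOString(),
    participants: state.participants,
    customActivities: state.customActivities,
    deletedActivityIds: state.deletedActivityIds,
    roles: state.roles,
  };
  return JSON.stringify(snapshot, null, 2);
}

// Minimal validation for a future import path; throws on anything unexpected.
function parseExport(json: string): RoomExport {
  const data = JSON.parse(json);
  const looksValid =
    data !== null &&
    typeof data === "object" &&
    data.version === 1 &&
    typeof data.participants === "object" &&
    data.participants !== null &&
    Array.isArray(data.customActivities) &&
    Array.isArray(data.roles);
  if (!looksValid) throw new Error("Unrecognised export file");
  return data as RoomExport;
}
```

Versioning the file from the start means a later change to the state shape can be detected on import instead of silently producing a half-filled matrix.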
There is no auth model. Room access is controlled by a six-character code. Anyone who gets the URL can join. For internal team tools this is fine. For anything involving sensitive data, you would need to layer your own auth on top, which partially undercuts the simplicity.
Every change broadcasts full state. When one participant updates a single cell, the server sends the entire room state to every connected client. At workshop scale (5 to 20 people), this is imperceptible. At 200 concurrent participants with a large matrix, it would become a bandwidth and latency problem. If you are building something at that scale, you would want differential updates instead.
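As a rough idea of what differential updates could look like, the sketch below computes per-cell patches between two versions of the answers and applies them on the receiving side. We did not build this; the `CellPatch` shape and both function names are hypothetical, and the sketch only covers answers, not roles or custom activities.

```typescript
type Raci = "R" | "A" | "C" | "I";
type Answers = Record<string, Record<string, Raci>>; // participant -> cell key -> answer

// One changed cell; value null means the cell was cleared.
interface CellPatch {
  participant: string;
  key: string;
  value: Raci | null;
}

// Lists only the cells whose value differs between prev and next.
function diffAnswers(prev: Answers, next: Answers): CellPatch[] {
  const patches: CellPatch[] = [];
  const names = new Set([...Object.keys(prev), ...Object.keys(next)]);
  for (const participant of names) {
    const before = prev[participant] ?? {};
    const after = next[participant] ?? {};
    const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
    for (const key of keys) {
      if (before[key] !== after[key]) {
        patches.push({ participant, key, value: after[key] ?? null });
      }
    }
  }
  return patches;
}

// Applies patches to a copy of base; a participant whose cells are all
// cleared is left with an empty entry rather than removed.
function applyPatches(base: Answers, patches: CellPatch[]): Answers {
  const result: Answers = {};
  for (const [participant, cells] of Object.entries(base)) {
    result[participant] = { ...cells };
  }
  for (const patch of patches) {
    const cells = result[patch.participant] ?? {};
    if (patch.value === null) delete cells[patch.key];
    else cells[patch.key] = patch.value;
    result[patch.participant] = cells;
  }
  return result;
}
```

In practice the server would rarely need to diff at all: an answer message already identifies the one cell that changed, so it could be relayed as a single patch, with the full sync kept for new connections.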
The server must be TypeScript or JavaScript. There is no Python or Go runtime option. If your team's backend expertise is not in JS/TS, that is a genuine constraint.
Cloudflare Workers constraints apply. CPU time per invocation is limited. Memory per instance is capped. For a message-passing server this never matters, but if you are tempted to do heavy computation in onMessage, that is the wrong place for it.
Where Else This Fits
The pattern generalises to any tool where a group needs to record their individual stance on a shared set of items and compare answers in real time. For Salesforce teams specifically, there are a few obvious candidates.
Sprint planning. Each team member estimates story points independently. When everyone has answered, the estimates appear side by side. The outliers drive the conversation instead of the loudest voice in the room.
Org health reviews. During an incident, multiple leads are looking at different parts of the org. A shared room where each person marks their area's status gives the incident lead a live picture without a status update chain.
Dependency mapping. When squads from different teams need to agree on who owns a shared component, a live matrix is faster than a shared doc that no one updates in time.
Release readiness checks. Each stream lead marks their go/no-go against a shared checklist. The release manager sees the full picture before the call, not during it.
Acceptance criteria coverage. Product, dev, and QA each mark which criteria they believe are covered. Gaps surface before the sprint review, not during it.
None of these need a database. They just need everyone looking at the same thing at the same time.
Would We Use It Again?
Yes, for the right problem. PartyKit is the right call when you need real-time state sync for a small group, you do not need persistence, and you want to spend your time on the product rather than the infrastructure. The free tier covers internal tooling comfortably.
It is not the right call if you need server-side auth, persistent state, or compute-heavy processing on the server. And if you are building something for hundreds of concurrent users, you will want to think carefully about the broadcast model before you commit to it.
For a workshop tool used by 5 to 20 people a few times a month, it was the correct choice. The sessions we ran had no infrastructure issues. The problems we hit were product problems: people forgetting to export, custom tasks not making it into the saved state, one person's export not capturing everyone's answers. Those are solvable. A server going down mid-session is a different kind of problem entirely, and we never had to deal with it.