Inside Atlassian

Creating with Rovo: How We Built a Collaborative AI Canvas

Rovo is our hero AI solution at the core of our platform. Through the Teamwork Graph, it can search across internal and third‑party knowledge, surfacing the right information from tools like Confluence, Jira, and connected apps. It can also interact with and create objects across the ecosystem – from Jira issues to Confluence pages.

What it didn’t have was a fully featured creation canvas: a collaborative space where people and AI can co-create, iterate on, and refine rich content together in real time. That gap was the catalyst for Creating content with Rovo.

The core insight behind creating with Rovo was simple: Creation should be a collaboration between the user and the AI, not a handoff from one to the other.

This post goes behind the scenes of content creation with Rovo and covers the core principles behind the experience, the technical implementation for each content type, and how we evaluate and monitor quality.

https://atlassianblog.wpengine.com/wp-content/uploads/2026/04/create-with-rovo-rev3-edited-v4-3.mp4

Core principles

We didn’t want to build a standalone AI writing tool siloed inside a single product.

Creating with Rovo has an entry point in Confluence, but it’s natively built on Rovo and accessible across every Rovo surface. Creation should feel like a natural extension of the chat experience: you ask a question, co‑author a page, iterate on a whiteboard, and refine a database – all in the same conversation, with the same AI that understands your context.

That led to a few core design principles:

Technical implementation

At a high level, Rovo’s backend is built around a top‑level orchestrator agent with access to a set of specialized skills – cross‑product search (including third‑party products), Teamwork Graph search, a Jira retrieval agent, and more. The orchestrator invokes these skills based on the user’s intent.
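As a rough mental model of this pattern – not Atlassian's actual API, all names below are hypothetical – the orchestrator amounts to routing a classified intent to a registered skill:

```typescript
// Hypothetical sketch of an orchestrator-with-skills pattern: a top-level
// agent routes a user request to one of several registered skills.

interface Skill {
  name: string;
  // Returns true if this skill can handle the classified intent.
  canHandle(intent: string): boolean;
  invoke(request: string): string;
}

class Orchestrator {
  private skills: Skill[] = [];

  register(skill: Skill): void {
    this.skills.push(skill);
  }

  // Dispatch to the first skill that claims the intent; in the real
  // system this decision is made by an LLM, not by string matching.
  handle(intent: string, request: string): string {
    const skill = this.skills.find((s) => s.canHandle(intent));
    if (!skill) return `No skill registered for intent: ${intent}`;
    return skill.invoke(request);
  }
}

const orchestrator = new Orchestrator();
orchestrator.register({
  name: "confluence-creation",
  canHandle: (intent) => intent === "create-page",
  invoke: (req) => `drafting page for: ${req}`,
});
```

In practice the routing decision is itself model-driven, so "canHandle" is really a prompt-level concern; the registry shape is what carries over.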

To support creating content with Rovo, we introduced a new Confluence creation and editing skill – a purpose‑built agent responsible for:

This agent is not isolated. It inherits Rovo’s full context: conversation history, connected knowledge sources, and relevant work items. A user can ask Rovo to:

“Create a project plan based on our Q3 goals,”

and the agent will pull the right context from the Teamwork Graph before generating the document.

Producing Confluence content with an LLM

Confluence content types have very different underlying representations and constraints:

To maximize quality, we needed to find the right model + output format for each content type. That meant running evaluations across different LLMs and formats, and ultimately landing on different approaches for each. This testing process yielded unexpected discoveries about current frontier LLM capabilities.

Pages / Live Docs

For pages, we needed to:

For page creation, we have the LLM produce ADF (nested JSON) directly, then pass it through a proprietary ADF repair library. This library:
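The repair library itself is proprietary and its exact fixes are not listed here, but as a hedged sketch of the idea (node types and rules below are illustrative), a repair pass walks the nested JSON tree and salvages what it can rather than rejecting the whole document:

```typescript
// Illustrative sketch only: LLMs occasionally emit ADF nodes that
// violate the schema, so a repair pass prunes invalid nodes and keeps
// the rest of the document intact. The allow-list here is a toy subset.

type AdfNode = {
  type: string;
  text?: string;
  content?: AdfNode[];
  [key: string]: unknown;
};

const KNOWN_TYPES = new Set([
  "doc", "paragraph", "text", "heading", "bulletList", "listItem",
]);

function repair(node: AdfNode): AdfNode | null {
  // Drop nodes of unknown type instead of failing the whole document.
  if (!KNOWN_TYPES.has(node.type)) return null;
  // Text nodes must carry a non-empty string.
  if (node.type === "text") {
    return typeof node.text === "string" && node.text.length > 0 ? node : null;
  }
  const children = (node.content ?? [])
    .map(repair)
    .filter((c): c is AdfNode => c !== null);
  return { ...node, content: children };
}
```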

For page edits, we defined a small set of editor‑style commands that the LLM can call to manipulate the ADF – for example:

Within each tool call, the LLM produces ADF as the value. Editing runs as an agentic loop with reflection: think of it as a coding agent, but for Atlassian documents.
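The actual command set isn't enumerated above; as a hypothetical illustration of the shape (command names and document structure are assumptions), such editor-style commands could be modeled as a small tagged union applied in a batch:

```typescript
// Hypothetical editor-style commands for manipulating an ADF document's
// top-level content. The real command set is not public; this shows the
// general "LLM emits commands, runtime applies them" pattern only.

type AdfNode = { type: string; text?: string; content?: AdfNode[] };

type EditCommand =
  | { op: "replace_node"; index: number; value: AdfNode }
  | { op: "insert_after"; index: number; value: AdfNode }
  | { op: "delete_node"; index: number };

// Apply a batch of commands in order; each command's value is itself ADF.
function applyCommands(doc: AdfNode, commands: EditCommand[]): AdfNode {
  const content = [...(doc.content ?? [])];
  for (const cmd of commands) {
    switch (cmd.op) {
      case "replace_node":
        content[cmd.index] = cmd.value;
        break;
      case "insert_after":
        content.splice(cmd.index + 1, 0, cmd.value);
        break;
      case "delete_node":
        content.splice(cmd.index, 1);
        break;
    }
  }
  return { ...doc, content };
}
```

In the agentic loop described above, the model would inspect the document after each batch, reflect, and issue further commands until the edit is complete.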

https://atlassianblog.wpengine.com/wp-content/uploads/2026/04/edit-demo-2.mp4

Whiteboards

Whiteboards add another layer of complexity: spatial layout and visual semantics.

We evaluated a number of output formats, including:

We created an extensive dataset for content-generation evals and had both humans and LLMs (judging from output screenshots) rate the results. The judgments centered on visual quality, connectors and grouping, and layout quality. The ability to parse and stream the data also influenced the decision. Ultimately, SVG was the clear winner.

LLMs are trained on vast amounts of data. We found that SVG drew the firmest parallel to how people think about infinite‑canvas boards (shapes, text, positions) and to what LLMs can reliably produce and understand.

For whiteboards we:

<svg viewBox="0 0 1200 800" xmlns="http://www.w3.org/2000/svg">
  <rect id="background" x="0" y="0" width="1200" height="800" fill="#ffffff"/>
  
  <text id="haiku-title" x="600" y="250" text-anchor="middle" font-size="32" font-weight="bold">
    <tspan x="600" dy="0">Haiku</tspan>
  </text>
  
  <text id="haiku-text" x="600" y="350" text-anchor="middle" font-size="24" font-style="italic">
    <tspan x="600" dy="0">Cherry blossoms fall</tspan>
    <tspan x="600" dy="40">Soft petals dance on the breeze</tspan>
    <tspan x="600" dy="40">Spring whispers goodbye</tspan>
  </text>
  
  <line id="decoration-line" x1="400" y1="520" x2="800" y2="520" stroke="#dfd8fd"/>

  <text id="haiku-form" x="600" y="580" text-anchor="middle" font-size="14">
    <tspan x="600" dy="0">5 - 7 - 5 syllables</tspan>
  </text>
</svg>

Ingesting the LLM’s output chunks in real time is handled by a streaming SVG parser working in tandem with constraint-solving algorithms that enforce containment and resolve layout overlaps. This lets users watch their whiteboards being assembled in real time.

https://atlassianblog.wpengine.com/wp-content/uploads/2026/04/whiteboard_streaming_small-2.mov

For editing, we:

We also introduced a special todo_list tool so that the LLM lays out its plan before making changes. This simple pattern significantly improved quality for complex, multi‑step edits.

<todo_list>
1. Make all stickies red
2. Move all stickies to the left
3. Create more red stickies below
</todo_list>

Example output for editing – TODO lists come first, always.
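Downstream code only needs a small extractor to read the plan back out of the tagged block. A sketch, inferring the format from the example above:

```typescript
// Extract the numbered plan from a <todo_list> block in model output.
// Format inferred from the example above; not the production parser.

function parseTodoList(output: string): string[] {
  const match = output.match(/<todo_list>([\s\S]*?)<\/todo_list>/);
  if (!match) return [];
  return match[1]
    .split("\n")
    .map((line) => line.replace(/^\s*\d+\.\s*/, "").trim())
    .filter((line) => line.length > 0);
}
```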

Databases

Databases are built by the LLM as three separate CSVs (schema, views, data) wrapped in XML tags. This keeps schema, presentation, and data clearly separated and parseable as they stream in.

For creation, the model always produces three CSV sections wrapped in XML tags:
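The exact tag names and column schemas are not shown here, but as a purely hypothetical illustration of the shape, such a payload and a section extractor might look like:

```typescript
// Hypothetical illustration of a three-CSV-in-XML database payload.
// Tag names and columns are invented for this sketch; the real format
// is not public. The extractor pulls each section out by tag.

const examplePayload = `
<schema>
name,type
Title,text
Status,select
</schema>
<views>
name,kind
All items,table
</views>
<data>
Title,Status
Launch plan,In progress
</data>
`;

function extractSection(payload: string, tag: string): string {
  const match = payload.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`));
  return match ? match[1].trim() : "";
}

const schema = extractSection(examplePayload, "schema");
const data = extractSection(examplePayload, "data");
```

Because each section is independently delimited, the client can start building the table structure from the schema section before the data section has finished streaming.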

For edits, the model receives the database in this representation plus the user’s selection, then outputs two CSVs:

  1. Metadata changes – schema, views, filters, sorts
  2. Data changes – add/edit/delete rows

Each row in these CSVs represents a single, declarative change that downstream code can parse and apply reliably.

https://atlassianblog.wpengine.com/wp-content/uploads/2026/04/databae-edit-2.mp4

Streaming generated content to the client

Determining what the LLM should produce was only half the problem. The other half was delivering that output to the frontend – incrementally, in real time, and across multiple rendering surfaces.

Rovo Chat / Content iFrame communication in Rovo Canvas

The Canvas uses the same shared Rovo Chat platform component that powers chat on every other Rovo surface.

Rather than building a simplified preview, we render content with the same components used by the main Confluence experience, embedded in an iframe. That means the canvas has full feature parity with the content objects in Confluence.

To support this, we defined a new streaming API contract for LLM actions:

This contract has to work on every Rovo‑supported surface, so we use the same commands for creation and editing within the Rovo Canvas as for editing content in Confluence directly with Rovo.

Rovo Chat editing regular content in Confluence

From within the iFrame, the user can select text and ask for edits, which need to be executed through the Rovo Chat component outside of the iFrame. It was therefore clear we needed a bidirectional communication channel between chat and content objects:

The result is the Rovo Bridge API – a library that lets distinct applications communicate with each other using local function calls, while abstracting away the underlying transport.

Under the hood:
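The internals are elided here, but one plausible shape for such a bridge – sketched over an abstract message port rather than real window objects, with all names hypothetical – is an RPC layer that matches responses to callers by request id:

```typescript
// Hypothetical sketch of a function-call bridge between a host page
// (chat) and an embedded iframe (content). Calls are serialized into
// messages; responses are matched back by id. In a browser, Port would
// wrap window.postMessage / the message event.

type BridgeMessage = {
  id: number;
  kind: "request" | "response";
  method?: string;
  args?: unknown[];
  result?: unknown;
};

interface Port {
  send(msg: BridgeMessage): void;
  onMessage(handler: (msg: BridgeMessage) => void): void;
}

class Bridge {
  private nextId = 0;
  private pending = new Map<number, (result: unknown) => void>();
  private handlers = new Map<string, (...args: unknown[]) => unknown>();

  constructor(private port: Port) {
    port.onMessage((msg) => {
      if (msg.kind === "request" && msg.method) {
        const fn = this.handlers.get(msg.method);
        const result = fn ? fn(...(msg.args ?? [])) : undefined;
        port.send({ id: msg.id, kind: "response", result });
      } else if (msg.kind === "response") {
        this.pending.get(msg.id)?.(msg.result);
        this.pending.delete(msg.id);
      }
    });
  }

  // Expose a local function so the other side can call it by name.
  expose(method: string, fn: (...args: unknown[]) => unknown): void {
    this.handlers.set(method, fn);
  }

  // Call a function exposed on the other side of the bridge.
  call(method: string, ...args: unknown[]): Promise<unknown> {
    const id = this.nextId++;
    return new Promise((resolve) => {
      this.pending.set(id, resolve);
      this.port.send({ id, kind: "request", method, args });
    });
  }
}
```

Abstracting the transport behind a port interface is what lets the same call sites work regardless of where chat and content happen to be rendered.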

Evaluations and monitoring

Offline evals were run daily using comprehensive datasets across all content types, leveraging a novel screenshot-based evaluation approach with LLM judges that assessed not just task completion, but visual quality, tone, and knowledge accuracy – consistently baselined against human feedback.
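One concrete piece of that baselining can be sketched as a pure function: given per-example scores from the LLM judge and from humans, compute per-dimension disagreement so judge drift is visible. The dimension names follow the post; everything else is illustrative.

```typescript
// Illustrative sketch: mean absolute disagreement between LLM-judge
// scores and human ratings, per evaluation dimension. Not the actual
// eval harness; shapes and scales are assumptions.

type Scores = { visual: number; tone: number; accuracy: number };

function meanAbsDisagreement(judge: Scores[], human: Scores[]): Scores {
  const n = Math.min(judge.length, human.length);
  const sum: Scores = { visual: 0, tone: 0, accuracy: 0 };
  for (let i = 0; i < n; i++) {
    sum.visual += Math.abs(judge[i].visual - human[i].visual);
    sum.tone += Math.abs(judge[i].tone - human[i].tone);
    sum.accuracy += Math.abs(judge[i].accuracy - human[i].accuracy);
  }
  return {
    visual: sum.visual / n,
    tone: sum.tone / n,
    accuracy: sum.accuracy / n,
  };
}
```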

Online evals tracked success rate and task completion metrics on live customer and internal traffic, providing a real-world signal decoupled from frozen datasets.

For real-time reliability and quality monitoring, the team set up automated health checks that validated agentic flows, ensuring the right sub‑agents, tools, and content actions were invoked correctly, not just that a single high‑level API call succeeded. This is on top of extensive reliability monitoring dashboards and SLOs, with detectors for both sudden error spikes and gradual degradations.

Conclusion

Building for AI requires a shift in mindset – the industry is moving so fast that the code, prompts, and models will continue to evolve rapidly. But a robust eval suite, extensive metrics for online experimentation, and comprehensive reliability monitoring are what allow rapid iteration with confidence.

Creating content with Rovo is a new foundational experience that will serve as the building block for upcoming Confluence AI features. This is only the beginning.
