Updated by Hagen Mahnke 7 months ago
The import from and export to markdown is considered lossy, but we don't know what is lost in either direction.
Converting from BlockNote JSON to Y.doc binary should be lossless, but converting from Y.doc to BlockNote JSON will lose the editing history.
To check what will be lost and have data to confirm whether it's acceptable, build a proof of concept and run it with test data.
## Tasks
1. Build proof of concept for export from Y.doc to markdown
2. Build proof of concept for import from markdown to Y.doc
3. Build test data
   1. Export test data into markdown
   2. Import the results of the previous step into Y.doc
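One way to turn the round trip into reviewable data is to diff the block JSON before the export against the block JSON after the re-import. The following is a minimal sketch under assumptions: `PocBlock` is a reduced, hypothetical block shape (the real BlockNote block type carries more fields), and `reportLoss` is an illustrative helper, not part of any library.

```typescript
// Hypothetical, reduced block shape for the proof of concept; the real
// BlockNote block type carries more fields (id, content, ...).
interface PocBlock {
  type: string;
  props: Record<string, string | number | boolean>;
  children: PocBlock[];
}

// Walks two block trees in parallel and records every difference as a
// readable line, so a test run yields a list of what the round trip lost.
function reportLoss(before: PocBlock[], after: PocBlock[], path = "root"): string[] {
  const losses: string[] = [];
  if (before.length !== after.length) {
    losses.push(`${path}: block count ${before.length} -> ${after.length}`);
  }
  const shared = Math.min(before.length, after.length);
  for (let i = 0; i < shared; i++) {
    const here = `${path}[${i}]`;
    const a = before[i];
    const b = after[i];
    if (a.type !== b.type) {
      losses.push(`${here}: type ${a.type} -> ${b.type}`);
    }
    // Only props present before the export are checked; props that the
    // import adds are not a loss.
    for (const key of Object.keys(a.props)) {
      if (a.props[key] !== b.props[key]) {
        losses.push(`${here}: prop ${key} ${String(a.props[key])} -> ${String(b.props[key])}`);
      }
    }
    losses.push(...reportLoss(a.children, b.children, `${here}.children`));
  }
  return losses;
}
```

An empty result for a test document means the round trip preserved everything this reduced shape can express; it says nothing about fields the shape leaves out.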
# LEARNINGS
## Overview
Learnings about converting between BlockNote/Yjs documents and Markdown for API-based workflows, as well as relevant insights from TipTap's markdown implementation.
## BlockNote: Markdown Conversion for API
### Supported Operations
#### READ Document
* **Status**: Will work
* **Process**: Convert from Y.doc to markdown and return
* **Limitations**: Custom blocks must provide their own conversion method
* **Example**: See [serverside-conversion experiment](https://github.com/opf/op-blocknote-extensions/tree/experiments/experiments/serverside-conversion)
#### CREATE Document
* **Status**: Will work
* **Process**:
  1. Receive markdown from API
  2. Convert markdown to BlockNote JSON
  3. Create new Y.doc from JSON
  4. Store in database
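The four steps above can be sketched as a small pipeline. This is only an outline under assumptions: `CreateDeps`, its three hooks and `createDocument` are hypothetical names, the hooks stand in for whatever converter and storage layer the server uses, and rejecting empty input is a design choice of this sketch rather than a requirement from the experiment.

```typescript
// Hypothetical converter and storage hooks; the names and signatures are
// placeholders, and real converters may well be asynchronous.
interface CreateDeps<B> {
  markdownToBlocks(markdown: string): B[];             // step 2
  blocksToYDocState(blocks: B[]): Uint8Array;          // step 3: binary Y.doc state
  store(documentId: string, state: Uint8Array): void;  // step 4
}

// Runs steps 2 to 4 for markdown that was received from the API (step 1).
function createDocument<B>(
  documentId: string,
  markdown: string,
  deps: CreateDeps<B>,
): Uint8Array {
  const blocks = deps.markdownToBlocks(markdown);
  if (blocks.length === 0) {
    throw new Error("Markdown produced no blocks; refusing to store an empty document");
  }
  const state = deps.blocksToYDocState(blocks);
  deps.store(documentId, state);
  return state;
}
```

Keeping the converters behind an interface means the same function can be exercised with fakes in tests and with the real conversion code on the server.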
#### UPDATE Document (with merging)
* **Status**: Does not work
* **Process**:
  1. Receive markdown from API
  2. Convert markdown to BlockNote JSON
  3. Create new Y.doc from JSON
  4. Merge into existing Y.doc from database
* **Issue**: Merging into existing Y.doc produces inconsistent results
* **Current State**:
  * No straightforward solution available
  * BlockNote maintainer (Yousef) suggests a diffing algorithm could solve this and is confident they can do it
  * Requires deep BlockNote and Yjs expertise to implement
* **Examples**:
  * [Server-side conversion experiment](https://github.com/opf/op-blocknote-extensions/tree/experiments/experiments/serverside-conversion)
  * [Frontend-only experiment (no Yjs)](https://github.com/opf/op-blocknote-extensions/tree/experiments/experiments/markdown-conversion)
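To make the diffing idea more concrete, here is a rough sketch of what a block-level diff could look like. It computes keep/delete/insert operations so that unchanged blocks are retained instead of recreated. This is a generic longest-common-subsequence diff and explicitly not the algorithm the BlockNote maintainer has in mind; `diffBlocks`, `DiffOp` and the `fingerprint` callback are hypothetical names. Applying the operations to a Y.doc is the hard part and is not covered here.

```typescript
type DiffOp<T> =
  | { kind: "keep"; block: T }
  | { kind: "delete"; block: T }
  | { kind: "insert"; block: T };

// Compares two block lists via a caller-supplied fingerprint (for example
// type plus serialized content) and returns the edit script old -> new.
function diffBlocks<T>(
  oldBlocks: T[],
  newBlocks: T[],
  fingerprint: (block: T) => string,
): DiffOp<T>[] {
  const a = oldBlocks.map(fingerprint);
  const b = newBlocks.map(fingerprint);

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs: number[][] = [];
  for (let i = 0; i <= a.length; i++) {
    lcs.push(new Array<number>(b.length + 1).fill(0));
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table from the start and emit one operation per step.
  const ops: DiffOp<T>[] = [];
  let oi = 0;
  let ni = 0;
  while (oi < a.length && ni < b.length) {
    if (a[oi] === b[ni]) {
      ops.push({ kind: "keep", block: oldBlocks[oi] });
      oi++;
      ni++;
    } else if (lcs[oi + 1][ni] >= lcs[oi][ni + 1]) {
      ops.push({ kind: "delete", block: oldBlocks[oi] });
      oi++;
    } else {
      ops.push({ kind: "insert", block: newBlocks[ni] });
      ni++;
    }
  }
  while (oi < a.length) ops.push({ kind: "delete", block: oldBlocks[oi++] });
  while (ni < b.length) ops.push({ kind: "insert", block: newBlocks[ni++] });
  return ops;
}
```

A changed block shows up as a delete followed by an insert, so edits inside a block would still need a finer-grained diff.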
### Markdown Limitations
Multiple BlockNote features are **not supported** in markdown:
* Colors
* Complex indentation
* Custom block types (without additional work)
**Proposed Solution**: Inform users that certain features (colors, complex indentation) are not available in API-based workflows due to markdown conversion limitations.
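To inform users about a specific document, the API could scan it for the affected features before converting it. The sketch below rests on assumptions: it takes colors to be stored in `textColor` and `backgroundColor` block props with `"default"` meaning unset, and treats children under any non-list block as complex indentation. `ApiBlock`, `LIST_TYPES` and `findMarkdownWarnings` are hypothetical names. Inline styles and custom block types are not checked.

```typescript
// Reduced, hypothetical view of a block; only the fields the check needs.
interface ApiBlock {
  type: string;
  props?: Record<string, unknown>;
  children?: ApiBlock[];
}

// Assumption: these are the list item block types whose nesting survives.
const LIST_TYPES = ["bulletListItem", "numberedListItem", "checkListItem"];
// Assumption: block-level colors live in these props, "default" = unset.
const COLOR_PROPS = ["textColor", "backgroundColor"];

// Returns the sorted names of features that markdown cannot represent.
function findMarkdownWarnings(blocks: ApiBlock[]): string[] {
  const found: Record<string, true> = {};
  const visit = (block: ApiBlock): void => {
    for (const prop of COLOR_PROPS) {
      const value = block.props?.[prop];
      if (value !== undefined && value !== "default") {
        found["colors"] = true;
      }
    }
    const children = block.children ?? [];
    if (children.length > 0 && LIST_TYPES.indexOf(block.type) === -1) {
      found["complex indentation"] = true;
    }
    children.forEach(visit);
  };
  blocks.forEach(visit);
  return Object.keys(found).sort();
}
```

The result could be returned alongside the markdown, for example as a warnings field in the API response.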
### Preserving Y.doc History
Due to the merge issues, we explored preserving the Y.doc history when completely overwriting the content with the body of the update request.
#### Server-Side Approach (ServerBlockNoteEditor)
* **Status**: ❌ Not Working
* **Issue**: Paste functionality breaks deep inside the dependencies, presumably because they depend on a browser environment
* **Conclusion**: Not easily achievable on the server
#### Frontend Approach (BlockNoteEditor)
Because it's not working on the server, I tested it on the frontend to verify that it could work in principle.
* **Status**: ⚠️ Partially Tested
* **Implementation**:
  * Use `editor.replaceBlocks(editor.document, [])` to clear current content
  * Use `editor.pasteMarkdown()` to paste new content
* **Note**: History preservation was not verified, as I was unable to get Yjs or BlockNote to properly access the history or changes.
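The two calls from the implementation notes can be wrapped in one helper. This is a minimal sketch: `ReplaceableEditor` is a hypothetical structural type that lists only the members used above, so the helper can be tried against a fake editor, and `overwriteWithMarkdown` is an illustrative name. The `await` is there in case `pasteMarkdown` returns a promise in the BlockNote version in use, which is an assumption. The sketch does not verify history preservation.

```typescript
// Structural view of the editor: only the members used in the notes above.
interface ReplaceableEditor<B> {
  document: B[];
  replaceBlocks(blocksToRemove: B[], blocksToInsert: B[]): unknown;
  pasteMarkdown(markdown: string): unknown;
}

// Clears the current content, then pastes the new markdown into the same
// editor instance instead of creating a fresh document.
async function overwriteWithMarkdown<B>(
  editor: ReplaceableEditor<B>,
  markdown: string,
): Promise<void> {
  editor.replaceBlocks(editor.document, []);
  // Awaiting is harmless if pasteMarkdown turns out to be synchronous.
  await editor.pasteMarkdown(markdown);
}
```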
## TipTap: Markdown Conversion
### Overview
TipTap provides bi-directional conversion between markdown and TipTap JSON, with extensions for custom blocks.
### Limitations
Not all features can be translated to markdown (e.g., colors), similar to BlockNote.
### Custom Blocks Extension
* **Syntax**: Uses `:::` as marker for custom extensions
* **Example**: See [TipTap Markdown Full Demo](https://github.com/ueberdosis/tiptap/tree/6cdba33235224e3fcbb19f7e2a604e5118eb8d49/demos/src/Markdown/Full/React)
* Test with color extension to verify features that don't convert to Markdown
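To illustrate what a `:::` marker makes possible, here is a small sketch that splits markdown into plain segments and custom containers. The exact syntax is an assumption (an opening line `:::name` and a closing line `:::`), and `tokenizeContainers` and `MdToken` are hypothetical names. This is not TipTap's tokenizer.

```typescript
type MdToken =
  | { type: "markdown"; raw: string }
  | { type: "custom"; name: string; body: string };

// Splits the input line by line: ":::name" ... ":::" becomes one custom
// token, everything else is collected into plain markdown tokens.
function tokenizeContainers(markdown: string): MdToken[] {
  const tokens: MdToken[] = [];
  const plain: string[] = [];
  const flushPlain = (): void => {
    if (plain.length > 0) {
      tokens.push({ type: "markdown", raw: plain.join("\n") });
      plain.length = 0;
    }
  };

  const lines = markdown.split("\n");
  let i = 0;
  while (i < lines.length) {
    const open = /^:::\s*([A-Za-z][\w-]*)\s*$/.exec(lines[i]);
    // Only treat the line as an opener if a closing ":::" follows later;
    // otherwise it stays ordinary markdown.
    const close = open ? lines.indexOf(":::", i + 1) : -1;
    if (open && close !== -1) {
      flushPlain();
      tokens.push({
        type: "custom",
        name: open[1],
        body: lines.slice(i + 1, close).join("\n"),
      });
      i = close + 1;
    } else {
      plain.push(lines[i]);
      i++;
    }
  }
  flushPlain();
  return tokens;
}
```

The plain segments would then go to the standard markdown lexer, while each custom token is handed to the extension registered under its name.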
### Conversion Architecture
#### Markdown → TipTap JSON
```text
Markdown String
↓
Custom Tokenizers (identify custom syntax)
↓
Standard MarkedJS Lexer
↓
Markdown Tokens
↓
Extension Parse Handlers
↓
TipTap JSON
```
#### TipTap JSON → Markdown
```text
TipTap JSON
↓
Extension Render Handlers
↓
Markdown String
```
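The render direction can be pictured as a lookup of one handler per node type. The sketch below is not TipTap's implementation: `DocNode` mirrors the general `type`/`attrs`/`content`/`text` shape of TipTap JSON, while `renderHandlers`, `renderMarkdown` and the `callout` node are hypothetical.

```typescript
interface DocNode {
  type: string;
  attrs?: Record<string, string | number>;
  text?: string;
  content?: DocNode[];
}

// A handler renders one node and gets a helper to render its children.
type RenderChildren = (node: DocNode, separator: string) => string;
type RenderHandler = (node: DocNode, children: RenderChildren) => string;

// One render handler per node type; an extension would add its own entry.
const renderHandlers: Record<string, RenderHandler> = {
  doc: (node, children) => children(node, "\n\n"),
  paragraph: (node, children) => children(node, ""),
  heading: (node, children) =>
    "#".repeat(Number(node.attrs?.level ?? 1)) + " " + children(node, ""),
  text: (node) => node.text ?? "",
  // Hypothetical custom block, serialized with the ":::" marker.
  callout: (node, children) => `:::callout\n${children(node, "\n\n")}\n:::`,
};

function renderMarkdown(node: DocNode): string {
  const handler = renderHandlers[node.type];
  if (!handler) {
    // Surfacing unknown types makes lossy conversions visible early.
    throw new Error(`No render handler for node type "${node.type}"`);
  }
  return handler(node, (parent, separator) =>
    (parent.content ?? []).map((child) => renderMarkdown(child)).join(separator));
}
```

Throwing on an unknown node type is a choice of this sketch; silently skipping the node would hide exactly the kind of loss these notes try to identify.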
### Documentation
Full documentation available at: [https://tiptap.dev/docs/editor/markdown](https://tiptap.dev/docs/editor/markdown)
## Miscellaneous
### Export to markdown via HTML
BlockNote's export to markdown works by exporting to HTML first and then converting the HTML to markdown. Information is already lost in the HTML step; see the comment on the `createExternalHTMLExporter` function in [externalHTMLExporter](https://github.com/TypeCellOS/BlockNote/blob/main/packages/core/src/api/exporters/html/externalHTMLExporter.ts#L31):
```text
// Used to export BlockNote blocks and ProseMirror nodes to HTML for use outside
// the editor. Blocks are exported using the `toExternalHTML` method in their
// `blockSpec`, or `toInternalHTML` if `toExternalHTML` is not defined.
//
// The HTML created by this serializer is different to what's rendered by the
// editor to the DOM. This also means that data is likely to be lost when
// converting back to original blocks. The differences in the output HTML are:
// 1. It doesn't include the `blockGroup` and `blockContainer` wrappers meaning
// that nesting is not preserved for non-list-item blocks.
// 2. `li` items in the output HTML are wrapped in `ul` or `ol` elements.
// 3. While nesting for list items is preserved, other types of blocks nested
// inside a list are un-nested and a new list is created after them.
// 4. The HTML is wrapped in a single `div` element.
```
## Key Takeaways
1. **BlockNote UPDATE operations** have unresolved merge issues; merging should be omitted for now in favor of 'last-write-wins'
2. **Markdown conversion** is lossy for visual features (colors, complex formatting)
3. **Y.doc history preservation** during content replacement is difficult to verify/implement
4. **TipTap's architecture** provides a reference for handling custom blocks and extensions in markdown
5. **User communication** is important regarding API workflow limitations
## Next Steps / Open Questions
* [ ] Investigate diffing algorithm approach for BlockNote UPDATE merge operations
* [ ] Verify Y.doc history preservation when using `pasteMarkdown()` on frontend
* [ ] Determine if TipTap's architecture patterns can be applied to BlockNote
* [ ] Document user-facing limitations of API-based markdown workflows