
Updated by Andreas Pfohl 5 months ago

# Data architecture for hierarchical attributes

## Context and Problem Statement

What is the data architecture for serving a hierarchy of tags with associated metadata to an OpenProject custom field implementation?

## Decision Drivers

* The data architecture needs to structure tags in a hierarchical way (like a tree), where each tag has associated metadata.

* The structure can change at any point in time.

* Changes to the structure need to be recorded throughout its lifetime.

* The data architecture must support filtering based on given tags.

* When the hierarchical structure changes, it must be possible to update pointers to it (the custom field).

* When the hierarchical structure changes, it must be possible to let pointers point to "older" versions of the structure.

* Changes to the structure must be auditable.

## Considered Options

* Single Table

* ltree in PostgreSQL

* Real graph database

* Event Sourcing

## Decision Outcome

Chosen option: "{title of option 1}", because {justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force {force} | … | comes out best (see below)}.

### Consequences

* Good, because {positive consequence, e.g., improvement of one or more desired qualities, …}

* Bad, because {negative consequence, e.g., compromising one or more desired qualities, …}

* …

### Confirmation

{Describe how the implementation of/compliance with the ADR is confirmed. E.g., by a review or an ArchUnit test. Although we classify this element as optional, it is included in most ADRs.}

## Pros and Cons of the Options

### Single Table

`id` | `name` | `short` | `parent_id` | (`child_ids`)

Using a single table to hold the hierarchical structures.

* Good, because the implementation is simple (work packages and projects already do this)

* Good, because speed is not a big concern

* Bad, because keeping historical hierarchies is very hard (maybe by copying whole parts of the table, or with temporal extensions: [https://wiki.postgresql.org/wiki/Temporal_Extensions](https://wiki.postgresql.org/wiki/Temporal_Extensions))
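
As a rough illustration of the single-table option, the sketch below stores a small tree as an adjacency list (`parent_id` pointing at the parent row) and reads a subtree with a recursive query. It is only a sketch under assumptions: the table name `hierarchy_items`, the sample rows, and the helper `descendants` are hypothetical, and SQLite (from the Python standard library) stands in for PostgreSQL so the example is self-contained.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the table name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE hierarchy_items (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        short TEXT,
        parent_id INTEGER REFERENCES hierarchy_items(id)
    )
    """
)
# Hypothetical sample data: one root with two children and one grandchild.
conn.executemany(
    "INSERT INTO hierarchy_items (id, name, short, parent_id) VALUES (?, ?, ?, ?)",
    [
        (1, "root", None, None),
        (2, "Hardware", "HW", 1),
        (3, "Software", "SW", 1),
        (4, "Laptop", "LT", 2),
    ],
)


def descendants(conn, item_id):
    """Return the names of all items below item_id, using a recursive CTE."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id, name) AS (
            SELECT id, name FROM hierarchy_items WHERE parent_id = ?
            UNION ALL
            SELECT h.id, h.name
            FROM hierarchy_items h
            JOIN subtree s ON h.parent_id = s.id
        )
        SELECT name FROM subtree ORDER BY id
        """,
        (item_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(descendants(conn, 1))
print(descendants(conn, 2))
```

Note that the table only ever holds the current shape of the tree, which is why historical hierarchies are the hard part of this option.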

### ltree in PostgreSQL

`ltree` is a PostgreSQL extension that provides tooling for querying hierarchical structures: [https://www.postgresql.org/docs/current/ltree.html](https://www.postgresql.org/docs/current/ltree.html)

`root.parent.child.*`

* Good, because the query language is already there

* Good, because speed is not a concern

* Bad, because metadata like `short` needs to be encoded into the labels

* Bad, because there is no historic data by default
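
To make the label-path idea concrete without a running PostgreSQL instance, the following sketch emulates a "descendant or equal" check on dot-separated paths in plain Python. It does not use `ltree` itself; the function names and sample paths are hypothetical, and the comparison only loosely mirrors what `ltree` offers through its operators and query patterns.

```python
def is_descendant_or_self(path: str, ancestor: str) -> bool:
    """Check whether `path` lies at or below `ancestor`, comparing whole labels."""
    path_labels = path.split(".")
    ancestor_labels = ancestor.split(".")
    # Compare label by label so that "root.parental" is not treated
    # as a descendant of "root.parent".
    return path_labels[: len(ancestor_labels)] == ancestor_labels


def subtree(paths, ancestor):
    """Filter a list of label paths down to the subtree rooted at `ancestor`."""
    return [p for p in paths if is_descendant_or_self(p, ancestor)]


# Hypothetical sample paths.
paths = [
    "root",
    "root.parent",
    "root.parent.child",
    "root.parent.child.leaf",
    "root.parental",
]

print(subtree(paths, "root.parent.child"))
print(subtree(paths, "root.parent"))
```

Because a path consists only of labels, any metadata such as `short` would have to be packed into a label or kept in separate columns next to the path.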

### Real graph database

Using a real graph database would give us most of the flexibility needed: querying, metadata

* Good, because it fits the tree-as-graph representation naturally

* Good, because of performance

* Bad, because we would need another running database just for this

* Bad, because there is no historic data by default (maybe with snapshots)

### Event sourced structure

With Event Sourcing we wouldn't store complete trees in a table but rather record events that describe the changes made to a tree.

In PostgreSQL we would have a table with a structure like: `id` | `tree_id` | `event_type` | `sequence_number` | `timestamp` | `data`.

From that table we could recreate any historical tree at any point in time. To speed things up, we would need to introduce certain read models.

* Good, because it's the most flexible concept

* Good, because it has historic data built in by default

* Neutral, because performance might be a concern, but it can be mitigated with the use of read and write models

* Bad, because it's very complex to implement
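
The replay idea can be sketched as follows: events carry the columns listed above, and a tree is rebuilt by applying them in `sequence_number` order, optionally stopping early to get a historical version. This is a minimal sketch under assumptions: the event type names (`node_added`, `node_renamed`, `node_moved`, `node_removed`), the shape of `data`, and the function `replay` are all hypothetical and not part of any decided design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    id: int
    tree_id: int
    event_type: str  # hypothetical names, see replay()
    sequence_number: int
    timestamp: str
    data: dict


def replay(events, tree_id, up_to_sequence=None):
    """Rebuild {node_id: {"name", "short", "parent_id"}} for one tree.

    Passing up_to_sequence yields the tree as it was after that event.
    """
    nodes = {}
    relevant = sorted(
        (e for e in events if e.tree_id == tree_id),
        key=lambda e: e.sequence_number,
    )
    for event in relevant:
        if up_to_sequence is not None and event.sequence_number > up_to_sequence:
            break
        d = event.data
        if event.event_type == "node_added":
            nodes[d["node_id"]] = {
                "name": d["name"],
                "short": d.get("short"),
                "parent_id": d.get("parent_id"),
            }
        elif event.event_type == "node_renamed":
            nodes[d["node_id"]]["name"] = d["name"]
        elif event.event_type == "node_moved":
            nodes[d["node_id"]]["parent_id"] = d["parent_id"]
        elif event.event_type == "node_removed":
            # Remove the node together with everything below it.
            doomed = {d["node_id"]}
            while True:
                children = {
                    nid for nid, n in nodes.items() if n["parent_id"] in doomed
                } - doomed
                if not children:
                    break
                doomed |= children
            for nid in doomed:
                nodes.pop(nid, None)
    return nodes


# Hypothetical event log for a single tree.
events = [
    Event(1, 1, "node_added", 1, "2024-01-01T10:00:00Z", {"node_id": 1, "name": "root"}),
    Event(2, 1, "node_added", 2, "2024-01-01T10:01:00Z",
          {"node_id": 2, "name": "Hardware", "short": "HW", "parent_id": 1}),
    Event(3, 1, "node_added", 3, "2024-01-01T10:02:00Z",
          {"node_id": 3, "name": "Laptop", "short": "LT", "parent_id": 2}),
    Event(4, 1, "node_renamed", 4, "2024-02-01T09:00:00Z", {"node_id": 3, "name": "Notebook"}),
    Event(5, 1, "node_removed", 5, "2024-03-01T09:00:00Z", {"node_id": 2}),
]

print(replay(events, tree_id=1, up_to_sequence=3))  # historical version
print(replay(events, tree_id=1))  # current version
```

A custom field that should keep pointing at an "older" version could, in this sketch, store the `sequence_number` it was assigned against. A read model would then be a cached result of `replay` so that the log is not scanned on every query.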

## More Information

{You might want to provide additional evidence/confidence for the decision outcome here and/or document the team agreement on the decision and/or define when/how this decision should be realized and if/when it should be re-visited. Links to other decisions and resources might appear here as well.}
