Content
Updated by Sebastian Fiedlschuster over 1 year ago
Pull Request: [https://github.com/opf/openproject/pull/11243](https://github.com/opf/openproject/pull/11243)
In the context of the [baseline-comparison feature](https://community.openproject.org/projects/openproject/work_packages/26448), this pull request adds convenience methods to retrieve historic data of journable objects like work packages.
As the convenience methods introduced in this pull request are not used, this pull request does not change behaviour of the application, yet.
### Retrieve historic state of one journable
```ruby
WorkPackage.find(1).at_timestamp(1.year.ago)
```
This historic state of the work package is still a `WorkPackage` object, but it contains data from the `work_package_journals` table. It knows that it represents historic data and, therefore, cannot be modified.
```ruby
work_package = WorkPackage.find(1)
historic_work_package = work_package.at_timestamp(1.year.ago)
historic_work_package.id # => 1 (same as work_package.id)
historic_work_package.historic? # => true
historic_work_package.save # raises `ActiveRecord::ReadOnlyRecord`
```
### Rollback to historic state of one journable
```ruby
WorkPackage.find(1).at_timestamp(1.year.ago).rollback!
```
This updates the work-package record with the attributes from the historic record, which does not remove journal entries, but, in fact, creates a new journal entry.
```ruby
WorkPackage.find(1).update! WorkPackage.find(1).at_timestamp(1.year.ago).attributes
```
### Retrieve historic state of multiple journables
```ruby
WorkPackage.at_timestamp(1.year.ago)
```
This returns an `ActiveRecord::Relation` that collects `WorkPackage` objects, which contain data from the `work_package_journals` table. These objects know that they represent historic data and, therefore, cannot be modified.
### Filter on historic data
```ruby
WorkPackage.at_timestamp(1.year.ago).where(assigned_to_id: 1)
# or, equivalently:
WorkPackage.where(assigned_to_id: 1).at_timestamp(1.year.ago)
```
This returns an `ActiveRecord::Relation` that represents all work packages that have been assigned to `User.find(1)` one year ago, regardless of whom they are assigned now.
### Plucking on historic data
```ruby
WorkPackage.at_timestamp(1.year.ago).where(assigned_to_id: 1).pluck :id
```
This returns the IDs of all `WorkPackage` records that have been assigned to `User.find(1)` one year ago.
These IDs are the IDs of the work-package records, not the work-package-journal records. Therefore, when fetching the work-package records from the database using these IDs, this will fetch the records in their current state, not the historic state.
```ruby
work_package_ids = WorkPackage.at_timestamp(1.year.ago).where(assigned_to_id: 1).pluck :id
work_packages = WorkPackage.where(id: work_package_ids)
work_packages.first.historic? # => false
```
### Sub queries
```ruby
sub_query = WorkPackage.at_timestamp(1.year.ago).where(assigned_to_id: 1)
WorkPackage.where(id: sub_query)
```
This returns the IDs of all `WorkPackage` records that have been assigned to `User.find(1)` one year ago. As this is using a sub query, which is an `ActiveRecord::Relation`, this only hits the database once.
### Advances queries
```ruby
WorkPackage.at_timestamp(1.year.ago).maximum(:estimated_hours)
```
This finds largest `estimated_hours` considering all work packages in their states one year ago.
```ruby
WorkPackage.at_timestamp(1.year.ago).group(:assigned_to_id).count # => {1 => 15, ...}
```
This counts how many work packages have been assigned to each user one year ago.
```ruby
WorkPackage.at_timestamp(1.year.ago).group(:assigned_to_id).having('estimated_hours > 3').count # => {1 => 12, ...}
```
This counts how many work packages have been assigned to each user one year ago, but only considers work packages with hour estimates over 3 hours.
### Cavests
#### Filtering on associated tables
When adding where conditions on associated tables, one might expect to also filter on historic data.
```ruby
# Does not work:
WorkPackage.at_timestamp(1.year.ago).joins(:time_entries).where(time_entries: {hours: 1.0})
```
Because `WorkPackage.at_timestamp` only searches the journal of the work packages, not the journal of the time entries, this does **not** return work packages that had time entries of 1 hour one year ago, but instead returns the state of the work packages of one year ago that do now have time entries of 1 hour.
#### Manual selects
Watch out when using the `id` column in manually specified parts of the query.
The wrapper introduced with `at_timestamp` internally modifies the active-record relation to query the work-package-journals table rather than the work-packages table. So, for example, when running
```ruby
WorkPackage.at_timestamp(1.day.ago).select("work_pacakges.id")
# raises `ActiveRecord::StatementInvalid: PG::UndefinedTable`
```
you would get an error because the `work_packges` table is not know to that query.
As a workaround, you may use the column `journables.id`, which is induced by the wrapper:
```ruby
WorkPackage.at_timestamp(1.day.ago).select("journables.id")
```
### Baseline comparison
Consider this pseudo API call to `GET /api/v3/work_packages` with the following parameters:
- `author_id: 1` specifies the author user of the work packges
- `offset: 2` specifies the page number of the paginated result
- `sortBy: [["status", "asc"]]` specifies the sort order to the result
- `timestamps: ["2021-09-04T13:21:14.452+00:00", "2022-09-04T13:21:14.452+00:00"]` specifies the timestamps at which historic data should be searched
The result should contain
- the work packages that match the filter `author_id` at **any** of the given `timestamps`
- the work-package data at **all** given `timestamps` in order for the user interface to show what has changed comparing the data at those `timestamps`.
Using the convenience methods introduced by this pull request, this is still a several-steps process:
```ruby
# given: params[:author_id], params[:offset], params[:sort_by], params[:timestamps]
# Apply filters
work_packages_relation = WorkPackage \
.where(author_id: params[:author_id])
# Get IDs of work packages matching the filters at any of the given timestamps.
work_packages_for_all_timestamps = params[:timestamps].collect do |timestamp|
work_packages_relation.at_timestamp(timestamp)
end.reduce(:or)
# Order the result
ordered_work_packages_relation = WorkPackage \
.where(id: work_packages_for_all_timestamps) \
.order(params[:sort_by])
# Paginate the result
paginated_work_packages_relation = \
ordered_work_packages_relation \
.paginate(page: params[:offset])
# Of all the matching work packages on that page, retrieve the historic attributes
# at all of the given timestamps.
paginated_work_packages_by_timestamp = {}
params[:timestamps].each do |timestamp|
paginated_work_packages_by_timestamp[timestamp] = WorkPackage \
.where(id: paginated_work_packages_relation) \
.at_timestamp(timestamp)
end
```
### See also
- **Epic**: https://community.openproject.org/projects/openproject/work_packages/26448
- https://github.com/fiedl/openproject/issues/3
In the context of the [baseline-comparison feature](https://community.openproject.org/projects/openproject/work_packages/26448), this pull request adds convenience methods to retrieve historic data of journable objects like work packages.
As the convenience methods introduced in this pull request are not used, this pull request does not change behaviour of the application, yet.
### Retrieve historic state of one journable
```ruby
WorkPackage.find(1).at_timestamp(1.year.ago)
```
This historic state of the work package is still a `WorkPackage` object, but it contains data from the `work_package_journals` table. It knows that it represents historic data and, therefore, cannot be modified.
```ruby
work_package = WorkPackage.find(1)
historic_work_package = work_package.at_timestamp(1.year.ago)
historic_work_package.id # => 1 (same as work_package.id)
historic_work_package.historic? # => true
historic_work_package.save # raises `ActiveRecord::ReadOnlyRecord`
```
### Rollback to historic state of one journable
```ruby
WorkPackage.find(1).at_timestamp(1.year.ago).rollback!
```
This updates the work-package record with the attributes from the historic record, which does not remove journal entries, but, in fact, creates a new journal entry.
```ruby
WorkPackage.find(1).update! WorkPackage.find(1).at_timestamp(1.year.ago).attributes
```
### Retrieve historic state of multiple journables
```ruby
WorkPackage.at_timestamp(1.year.ago)
```
This returns an `ActiveRecord::Relation` that collects `WorkPackage` objects, which contain data from the `work_package_journals` table. These objects know that they represent historic data and, therefore, cannot be modified.
### Filter on historic data
```ruby
WorkPackage.at_timestamp(1.year.ago).where(assigned_to_id: 1)
# or, equivalently:
WorkPackage.where(assigned_to_id: 1).at_timestamp(1.year.ago)
```
This returns an `ActiveRecord::Relation` that represents all work packages that have been assigned to `User.find(1)` one year ago, regardless of whom they are assigned now.
### Plucking on historic data
```ruby
WorkPackage.at_timestamp(1.year.ago).where(assigned_to_id: 1).pluck :id
```
This returns the IDs of all `WorkPackage` records that have been assigned to `User.find(1)` one year ago.
These IDs are the IDs of the work-package records, not the work-package-journal records. Therefore, when fetching the work-package records from the database using these IDs, this will fetch the records in their current state, not the historic state.
```ruby
work_package_ids = WorkPackage.at_timestamp(1.year.ago).where(assigned_to_id: 1).pluck :id
work_packages = WorkPackage.where(id: work_package_ids)
work_packages.first.historic? # => false
```
### Sub queries
```ruby
sub_query = WorkPackage.at_timestamp(1.year.ago).where(assigned_to_id: 1)
WorkPackage.where(id: sub_query)
```
This returns the IDs of all `WorkPackage` records that have been assigned to `User.find(1)` one year ago. As this is using a sub query, which is an `ActiveRecord::Relation`, this only hits the database once.
### Advances queries
```ruby
WorkPackage.at_timestamp(1.year.ago).maximum(:estimated_hours)
```
This finds largest `estimated_hours` considering all work packages in their states one year ago.
```ruby
WorkPackage.at_timestamp(1.year.ago).group(:assigned_to_id).count # => {1 => 15, ...}
```
This counts how many work packages have been assigned to each user one year ago.
```ruby
WorkPackage.at_timestamp(1.year.ago).group(:assigned_to_id).having('estimated_hours > 3').count # => {1 => 12, ...}
```
This counts how many work packages have been assigned to each user one year ago, but only considers work packages with hour estimates over 3 hours.
### Cavests
#### Filtering on associated tables
When adding where conditions on associated tables, one might expect to also filter on historic data.
```ruby
# Does not work:
WorkPackage.at_timestamp(1.year.ago).joins(:time_entries).where(time_entries: {hours: 1.0})
```
Because `WorkPackage.at_timestamp` only searches the journal of the work packages, not the journal of the time entries, this does **not** return work packages that had time entries of 1 hour one year ago, but instead returns the state of the work packages of one year ago that do now have time entries of 1 hour.
#### Manual selects
Watch out when using the `id` column in manually specified parts of the query.
The wrapper introduced with `at_timestamp` internally modifies the active-record relation to query the work-package-journals table rather than the work-packages table. So, for example, when running
```ruby
WorkPackage.at_timestamp(1.day.ago).select("work_pacakges.id")
# raises `ActiveRecord::StatementInvalid: PG::UndefinedTable`
```
you would get an error because the `work_packges` table is not know to that query.
As a workaround, you may use the column `journables.id`, which is induced by the wrapper:
```ruby
WorkPackage.at_timestamp(1.day.ago).select("journables.id")
```
### Baseline comparison
Consider this pseudo API call to `GET /api/v3/work_packages` with the following parameters:
- `author_id: 1` specifies the author user of the work packges
- `offset: 2` specifies the page number of the paginated result
- `sortBy: [["status", "asc"]]` specifies the sort order to the result
- `timestamps: ["2021-09-04T13:21:14.452+00:00", "2022-09-04T13:21:14.452+00:00"]` specifies the timestamps at which historic data should be searched
The result should contain
- the work packages that match the filter `author_id` at **any** of the given `timestamps`
- the work-package data at **all** given `timestamps` in order for the user interface to show what has changed comparing the data at those `timestamps`.
Using the convenience methods introduced by this pull request, this is still a several-steps process:
```ruby
# given: params[:author_id], params[:offset], params[:sort_by], params[:timestamps]
# Apply filters
work_packages_relation = WorkPackage \
.where(author_id: params[:author_id])
# Get IDs of work packages matching the filters at any of the given timestamps.
work_packages_for_all_timestamps = params[:timestamps].collect do |timestamp|
work_packages_relation.at_timestamp(timestamp)
end.reduce(:or)
# Order the result
ordered_work_packages_relation = WorkPackage \
.where(id: work_packages_for_all_timestamps) \
.order(params[:sort_by])
# Paginate the result
paginated_work_packages_relation = \
ordered_work_packages_relation \
.paginate(page: params[:offset])
# Of all the matching work packages on that page, retrieve the historic attributes
# at all of the given timestamps.
paginated_work_packages_by_timestamp = {}
params[:timestamps].each do |timestamp|
paginated_work_packages_by_timestamp[timestamp] = WorkPackage \
.where(id: paginated_work_packages_relation) \
.at_timestamp(timestamp)
end
```
### See also
- **Epic**: https://community.openproject.org/projects/openproject/work_packages/26448
- https://github.com/fiedl/openproject/issues/3