Updated by Markus Kahl almost 3 years ago
## Requirements
An ideal replacement would also fulfil the following requirements.
**Multi queue workers with priorities**
Scenario:
* 3 workers
    * 1 large worker that has a lot of RAM and storage
    * 2 small workers that only get little RAM
* the 2 small workers shall be used for small tasks such as sending emails
* the 1 large worker shall be used for expensive tasks such as work package exports or backups
    * the large worker should also handle the cheap jobs as a fallback if there are no expensive jobs available
* the small workers should not attempt the expensive jobs
* info: delayed\_job does not support this
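The scenario above can be modelled in a few lines of plain Ruby. This is only a sketch of the desired behaviour, not the API of any of the candidate libraries: `Worker`, `WORKERS`, `next_job_for` and the queue names `expensive` and `cheap` are hypothetical names chosen for illustration. Each worker lists the queues it may serve in priority order, so the large worker falls back to cheap jobs while the small workers never see expensive ones.

```ruby
# Hypothetical model: a worker serves the listed queues in priority order.
Worker = Struct.new(:name, :queues)

WORKERS = [
  Worker.new('big', %w[expensive cheap]), # prefers expensive, falls back to cheap
  Worker.new('small-1', %w[cheap]),       # never attempts expensive jobs
  Worker.new('small-2', %w[cheap])
].freeze

# Returns the next job for the worker, or nil if none of its queues has work.
# `pending` maps a queue name to an array of jobs in FIFO order.
def next_job_for(worker, pending)
  worker.queues.each do |queue|
    job = pending.fetch(queue, []).shift
    return job if job
  end
  nil
end

pending = { 'expensive' => ['backup'], 'cheap' => ['mail-1', 'mail-2'] }
big, small = WORKERS[0], WORKERS[1]

next_job_for(small, pending) # => "mail-1"
next_job_for(big, pending)   # => "backup"
next_job_for(big, pending)   # => "mail-2" (fallback, no expensive jobs left)
next_job_for(small, pending) # => nil
```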
**Bonus: avoid orphaned jobs**
With delayed\_job it can happen that a worker is killed, either during a deployment or due to exceeding the allowed memory usage. In that case the job becomes orphaned and will not be picked up again even after a restart.
An ideal replacement would be able to handle this scenario in some way.
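One possible way to handle this, sketched in plain Ruby, is to let workers record a heartbeat on the jobs they hold and to release jobs whose heartbeat is too old. This is an assumption about how such a mechanism could look, not a description of how any of the candidates implement it; `Job`, `LOCK_TIMEOUT` and `release_orphans` are hypothetical names.

```ruby
# Hypothetical job record: which worker holds it and when that worker last reported in.
Job = Struct.new(:id, :locked_by, :heartbeat_at)

# Assumed timeout in seconds; a real value depends on the longest expected job.
LOCK_TIMEOUT = 300

# Unlocks jobs whose worker has not sent a heartbeat within LOCK_TIMEOUT,
# so that another worker can pick them up. Returns the released jobs.
def release_orphans(jobs, now: Time.now)
  orphans = jobs.select do |job|
    job.locked_by && now - job.heartbeat_at > LOCK_TIMEOUT
  end
  orphans.each do |job|
    job.locked_by = nil
    job.heartbeat_at = nil
  end
end
```

Running such a check on worker start-up or on a schedule would make a job that was locked by a killed worker available again after the timeout.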
## Candidates
<img class="op-uc-image op-uc-image_inline" style="width:888px;" src="https://community.openproject.org/api/v3/attachments/59146/content">
1. Popularity.
2. Rails integration.
3. Complexity.
4. Maintainability.
5. Save resources/Make more scalable.
6. Have a fancy dashboard :)
* DelayedJob
    1. Not actively developed for years.
    2. +/-
    3.
    4. We have a couple of modifications that have to be maintained.
        1. Cron job scheduling.
        2. Job statuses.
    5. Process-based, so it needs more RAM to scale.
    6. Not present
* Sidekiq
    1. Popular and actively developed.
    2. Some parts of the Sidekiq API are not supported by ActiveJob.
    3. Complexity will increase significantly.
        1. Redis as a new dependency to take care of.
        2. Backup functionality will become more complex.
            1. saas-openproject
            2. openproject
    4. Maintainability will decrease as complexity increases :)
    5. \+ because of multithreading
    6. Present
* GoodJob
    1. Quite popular and actively developed.
    2. Positioned as fully compatible with ActiveJob.
    3. Complexity will at least stay on the same level.
        1. Backup functionality stays the same, because jobs are stored inside PostgreSQL.
    4. Maintainability will increase.
        1. Cron functionality should become simpler, because the library is responsible for scheduling and controlling recurring jobs. Our job is only to provide a schedule, so some of our code will be removed.
        2. Job status stays the same for now.
    5. \+ because of multithreading
    6. Present
### Prototype scripts for backing up part of the keys from Redis:
* Backup
```ruby
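require 'benchmark'
require 'redis-client'

redis = RedisClient.new(host: 'redis')

# Collects the jobs of one tenant (schema) from every `queue:*` list.
# KEYS blocks Redis while it scans, which is acceptable for a prototype only.
backup_script = <<~LUA
  local result = {}
  result['apartment'] = ARGV[1]
  result['queues'] = {}

  local queues = redis.call('KEYS', 'queue:*')
  for _, queue in ipairs(queues) do
    result['queues'][queue] = {}
    local jobs_encoded = redis.call('LRANGE', queue, 0, -1)
    for _, job_encoded in ipairs(jobs_encoded) do
      local job_decoded = cjson.decode(job_encoded)
      -- The tenant is expected in the first job argument.
      if job_decoded['args'][1]['schema_name'] == ARGV[1] then
        table.insert(result['queues'][queue], job_decoded)
      end
    end
  end

  -- Pack the whole backup into a single MessagePack string.
  return cmsgpack.pack(result)
LUA

# The global keeps the backup available to the restore script.
puts Benchmark.measure {
  $backup_string = redis.call('EVAL', backup_script, 0, 'public')
}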
```
* Restore
```ruby
require 'benchmark'
require 'redis-client'

redis = RedisClient.new(host: 'redis', timeout: 10)

restore_script = <<~LUA
  local backup_string = ARGV[1]
  local backup_decoded = cmsgpack.unpack(backup_string)
  local apartment = backup_decoded['apartment']

  -- Remove the tenant's current jobs from every queue.
  local queues = redis.call('KEYS', 'queue:*')
  for _, queue in ipairs(queues) do
    local jobs_encoded = redis.call('LRANGE', queue, 0, -1)
    for _, job_encoded in ipairs(jobs_encoded) do
      local job_decoded = cjson.decode(job_encoded)
      if job_decoded['args'][1]['schema_name'] == apartment then
        redis.call('LREM', queue, 1, job_encoded)
      end
    end
  end

  -- Re-insert the backed-up jobs. RPUSH keeps the order in which LRANGE
  -- returned them during the backup; LPUSH would reverse it.
  for queue, jobs in pairs(backup_decoded['queues']) do
    for _, job in ipairs(jobs) do
      redis.call('RPUSH', queue, cjson.encode(job))
    end
  end

  return true
LUA

# Expects $backup_string to have been set by the backup script.
puts Benchmark.measure { redis.call('EVAL', restore_script, 0, $backup_string) }
```
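Both scripts select jobs by the `schema_name` found in the first job argument. The same filter can be sketched in plain Ruby, which may help when checking the selection logic without a Redis server. It assumes the payload shape used by the scripts above; `jobs_for_tenant` is a hypothetical helper name. Note that Lua's `args[1]` corresponds to `args[0]` in Ruby.

```ruby
require 'json'

# Returns, per queue, the decoded jobs whose first argument carries the given
# schema_name. `queues` maps a queue key to an array of JSON-encoded jobs.
def jobs_for_tenant(queues, schema_name)
  queues.transform_values do |encoded_jobs|
    encoded_jobs
      .map { |raw| JSON.parse(raw) }
      .select do |job|
        first_arg = Array(job['args']).first
        # Jobs without a hash as first argument are skipped instead of raising.
        first_arg.is_a?(Hash) && first_arg['schema_name'] == schema_name
      end
  end
end
```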