15692: Introduce background jobs (#16927)

* Introduce reusable BackgroundJob framework

A new abstract class can be used to implement job function classes. It
handles the necessary logic for starting and stopping jobs, including
exception handling and rescheduling of recurring jobs.

This commit also includes the migration of data source jobs to the new
framework.

* Restore using import_string for jobs

Using the 'import_string()' utility from Django allows the job script
class to be simplified, as module imports no longer need to avoid loops.
This should make it easier to queue and maintain jobs.

* Use SyncDataSourceJob for management command

Instead of maintaining two separate job execution logics, the same job
is now used for both background and interactive execution.

* Implement BackgroundJob for running scripts

The independent implementations of interactive and background script
execution have been merged into a single BackgroundJob implementation.

* Fix documentation of model features

* Ensure consitent code style

* Introduce reusable ScheduledJob

A new abstract class can be used to implement job function classes that
specialize in scheduling. These use the same logic as regular
BackgroundJobs, but ensure that they are only scheduled once at any given
time.

* Introduce reusable SystemJob

A new abstract class can be used to implement job function classes that
specialize in system background tasks (e.g. synchronization or
housekeeping). In addition to the features of the BackgroundJob and
ScheduledJob classes, these implement additional logic to not need to be
bound to an existing NetBox object and to setup job schedules on plugin
load instead of an interactive request.

* Add documentation for jobs framework

* Revert "Use SyncDataSourceJob for management"

This partially reverts commit db591d4. The 'run_now' parameter of
'enqueue()' remains, as its being used by following commits.

* Merge enqueued status into JobStatusChoices

* Fix logger for ScriptJob

* Remove job name for scripts

Because scripts are already linked through the Job Instance field, the
name is displayed twice. Removing this reduces redundancy and opens up
the possibility of simplifying the BackgroundJob framework in future
commits.

* Merge ScheduledJob into BackgroundJob

Instead of using separate classes, the logic of ScheduledJob is now
merged into the generic BackgroundJob class. This allows reusing the
same logic, but dynamically deciding whether to enqueue the same job
once or multiple times.

* Add name attribute for BackgroundJob

Instead of defining individual names on enqueue, BackgroundJob classes
can now set a job name in their meta class. This is equivalent to other
Django classes and NetBox scripts.

* Drop enqueue_sync_job() method from DataSource

* Import ScriptJob directly

* Relax requirement for Jobs to reference a specific object

* Rename 'run_now' arg on Job.enqueue() to 'immediate'

* Fix queue lookup in Job enqueue

* Collapse SystemJob into BackgroundJob

* Remove legacy JobResultStatusChoices

ChoiceSet was moved to core in 40572b5.

* Use queue 'low' for system jobs by default

System jobs usually perform low-priority background tasks and therefore
can use a different queue than 'default', which is used for regular jobs
related to specific objects.

* Add test cases for BackgroundJob handling

* Fix enqueue interval jobs

As the job's name is set by enqueue(), it must not be passed in handle()
to avoid duplicate kwargs with the same name.

* Honor schedule_at for job's enqueue_once

Not only can a job's interval change, but so can the time at which it is
scheduled to run. If a specific scheduled time is set, it will also be
checked against the current job schedule. If there are any changes, the
job is rescheduled with the new time.

* Switch BackgroundJob to regular methods

Instead of using a class method for run(), a regular method is used for
this purpose. This gives the possibility to add more convenience methods
in the future, e.g. for interacting with the job object or for logging,
as implemented for scripts.

* Fix background tasks documentation

* Test enqueue in combination with enqueue_once

* Rename background jobs to tasks (to differentiate from RQ)

* Touch up docs

* Revert "Use queue 'low' for system jobs by default"

This reverts commit b17b2050df.

* Remove system background job

This commit reverts commits 4880d81 and 0b15ecf. Using the database
'connection_created' signal for job registration feels a little wrong at
this point, as it would trigger registration very often. However, the
background job framework is prepared for this use case and can be used
by plugins once the auto-registration of jobs is solved.

* Fix runscript management command

Defining names for background jobs was disabled with fb75389. The
preceeding changes in 257976d did forget the management command.

* Use regular imports for ScriptJob

* Rename BackgroundJob to JobRunner

---------

Co-authored-by: Jeremy Stretch <jstretch@netboxlabs.com>
This commit is contained in:
Alexander Haase
2024-07-30 19:31:21 +02:00
committed by GitHub
parent dde84b47da
commit d3a3a6ba46
23 changed files with 597 additions and 314 deletions

View File

@@ -1,19 +1,14 @@
import json
import logging
import sys
import traceback
import uuid
from django.contrib.auth import get_user_model
from django.core.management.base import BaseCommand, CommandError
from django.db import transaction
from django.utils.module_loading import import_string
from core.choices import JobStatusChoices
from core.models import Job
from extras.jobs import ScriptJob
from extras.scripts import get_module_and_script
from extras.signals import clear_events
from netbox.context_managers import event_tracking
from utilities.exceptions import AbortTransaction
from utilities.request import NetBoxFakeRequest
@@ -33,44 +28,6 @@ class Command(BaseCommand):
parser.add_argument('script', help="Script to run")
def handle(self, *args, **options):
def _run_script():
"""
Core script execution task. We capture this within a subfunction to allow for conditionally wrapping it with
the event_tracking context manager (which is bypassed if commit == False).
"""
try:
try:
with transaction.atomic():
script.output = script.run(data=data, commit=commit)
if not commit:
raise AbortTransaction()
except AbortTransaction:
script.log_info("Database changes have been reverted automatically.")
clear_events.send(request)
job.data = script.get_job_data()
job.terminate()
except Exception as e:
stacktrace = traceback.format_exc()
script.log_failure(
f"An exception occurred: `{type(e).__name__}: {e}`\n```\n{stacktrace}\n```"
)
script.log_info("Database changes have been reverted due to error.")
logger.error(f"Exception raised during script execution: {e}")
clear_events.send(request)
job.data = script.get_job_data()
job.terminate(status=JobStatusChoices.STATUS_ERRORED, error=repr(e))
# Print any test method results
for test_name, attrs in job.data['tests'].items():
self.stdout.write(
"\t{}: {} success, {} info, {} warning, {} failure".format(
test_name, attrs['success'], attrs['info'], attrs['warning'], attrs['failure']
)
)
logger.info(f"Script completed in {job.duration}")
User = get_user_model()
# Params
@@ -84,8 +41,8 @@ class Command(BaseCommand):
data = {}
module_name, script_name = script.split('.', 1)
module, script = get_module_and_script(module_name, script_name)
script = script.python_class
module, script_obj = get_module_and_script(module_name, script_name)
script = script_obj.python_class
# Take user from command line if provided and exists, other
if options['user']:
@@ -120,40 +77,29 @@ class Command(BaseCommand):
# Initialize the script form
script = script()
form = script.as_form(data, None)
# Create the job
job = Job.objects.create(
object=module,
name=script.class_name,
user=User.objects.filter(is_superuser=True).order_by('pk')[0],
job_id=uuid.uuid4()
)
request = NetBoxFakeRequest({
'META': {},
'POST': data,
'GET': {},
'FILES': {},
'user': user,
'path': '',
'id': job.job_id
})
if form.is_valid():
job.status = JobStatusChoices.STATUS_RUNNING
job.save()
logger.info(f"Running script (commit={commit})")
script.request = request
# Execute the script. If commit is True, wrap it with the event_tracking context manager to ensure we process
# change logging, webhooks, etc.
with event_tracking(request):
_run_script()
else:
if not form.is_valid():
logger.error('Data is not valid:')
for field, errors in form.errors.get_json_data().items():
for error in errors:
logger.error(f'\t{field}: {error.get("message")}')
job.status = JobStatusChoices.STATUS_ERRORED
job.save()
raise CommandError()
# Execute the script.
job = ScriptJob.enqueue(
instance=script_obj,
user=user,
immediate=True,
data=data,
request=NetBoxFakeRequest({
'META': {},
'POST': data,
'GET': {},
'FILES': {},
'user': user,
'path': '',
'id': uuid.uuid4()
}),
commit=commit,
)
logger.info(f"Script completed in {job.duration}")