Celery Observed Events

Observed on 2026-03-12 in the local plywatch lab stack.

Setup

  • Broker: redis://redis:6379/0

  • Result backend: redis://redis:6379/1

  • Monitor API: http://127.0.0.1:8080

  • Producer API: http://127.0.0.1:8090

  • Worker task events: enabled

Scenarios Emitted

  • success

  • failure

  • retry

  • parallel

  • callbacks enabled for success, failure, retry

Observed Event Types

  • worker-heartbeat

  • task-sent

  • task-received

  • task-started

  • task-succeeded

  • task-failed

  • task-retried

Current sample counts from the retained raw buffer:

  • worker-heartbeat: 460

  • task-sent: 10

  • task-received: 10

  • task-started: 10

  • task-succeeded: 6

  • task-failed: 2

  • task-retried: 2

Important Findings

  • Callback execution appears as independent Celery tasks:

    • loom.callback.RecordSuccessCallback

    • loom.callback_error.RecordFailureCallback

  • Job execution appears with names like:

    • loom.job.HelloSuccessJob

    • loom.job.HelloFailureJob

    • loom.job.HelloRetryJob

  • task-sent and task-received include the task name.

  • task-started, task-succeeded, task-failed, and task-retried do not reliably include the task name.

  • For monitoring, task identity must therefore be correlated by uuid, using earlier task-sent or task-received events.

  • Retry attempts reuse the same task uuid.

  • Retry scheduling is visible in task-sent via eta.

  • Callback tasks inherit root_id and parent_id from the source job, which is useful for timeline grouping.

Useful Fields By Event Type

worker-heartbeat

Useful fields:

  • hostname

  • pid

  • clock

  • freq

  • active

  • processed

  • loadavg

  • sw_ident

  • sw_ver

  • sw_sys

Example:

{"captured_at":"2026-03-12T12:41:16.877476+00:00","event_type":"worker-heartbeat","uuid":null,"hostname":"celery@b3d5810e88a8","payload":{"hostname":"celery@b3d5810e88a8","utcoffset":0,"pid":1,"clock":1357,"freq":5,"sw_ident":"py-celery","sw_ver":"5.6.2","sw_sys":"Linux","timestamp":1773319276.8766975,"type":"worker-heartbeat","local_received":1773319276.877462}}

task-sent

Useful fields:

  • uuid

  • name

  • root_id

  • parent_id

  • queue

  • routing_key

  • retries

  • eta

  • kwargs

Example:

{"captured_at":"2026-03-12T12:38:55.212243+00:00","event_type":"task-sent","uuid":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","hostname":"gen10@b3d5810e88a8","payload":{"hostname":"gen10@b3d5810e88a8","utcoffset":0,"pid":10,"clock":8663,"uuid":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","root_id":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","parent_id":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","name":"loom.job.HelloRetryJob","args":"()","kwargs":"{'payload': {'scenario': 'retry', 'count': 1, 'with_callbacks': True, 'delay_seconds': 2, 'message': 'second retry'}, 'params': None, 'trace_id': None}","retries":2,"eta":"2026-03-12T12:38:57.209763+00:00","expires":null,"queue":"default","exchange":"","routing_key":"default","timestamp":1773319135.2114224,"type":"task-sent","local_received":1773319135.2122357}}

task-received

Useful fields:

  • uuid

  • name

  • root_id

  • parent_id

  • retries

  • eta

  • kwargs

Example:

{"captured_at":"2026-03-12T12:38:55.213189+00:00","event_type":"task-received","uuid":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","hostname":"celery@b3d5810e88a8","payload":{"hostname":"celery@b3d5810e88a8","utcoffset":0,"pid":1,"clock":997,"uuid":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","name":"loom.job.HelloRetryJob","args":"()","kwargs":"{'payload': {'scenario': 'retry', 'count': 1, 'with_callbacks': True, 'delay_seconds': 2, 'message': 'second retry'}, 'params': None, 'trace_id': None}","root_id":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","parent_id":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","retries":2,"eta":"2026-03-12T12:38:57.209763+00:00","expires":null,"timestamp":1773319135.2123497,"type":"task-received","local_received":1773319135.2131822}}

task-started

Useful fields:

  • uuid

  • hostname

  • timestamp

Missing in this sample:

  • task name

  • queue

Example:

{"captured_at":"2026-03-12T12:38:57.220540+00:00","event_type":"task-started","uuid":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","hostname":"celery@b3d5810e88a8","payload":{"hostname":"celery@b3d5810e88a8","utcoffset":0,"pid":1,"clock":1004,"uuid":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","timestamp":1773319137.219105,"type":"task-started","local_received":1773319137.2205176}}

task-succeeded

Useful fields:

  • uuid

  • result

  • runtime

Missing in this sample:

  • task name

Example:

{"captured_at":"2026-03-12T12:38:54.219508+00:00","event_type":"task-succeeded","uuid":"f49a9e8a-47a6-4276-aa51-fcaf18354021","hostname":"celery@b3d5810e88a8","payload":{"hostname":"celery@b3d5810e88a8","utcoffset":0,"pid":1,"clock":995,"uuid":"f49a9e8a-47a6-4276-aa51-fcaf18354021","result":"{'scenario': 'success', 'message': 'second parallel #0'}","runtime":0.0008559160050936043,"timestamp":1773319134.2193074,"type":"task-succeeded","local_received":1773319134.219507}}

task-failed

Useful fields:

  • uuid

  • exception

  • traceback

Missing in this sample:

  • task name

Example:

{"captured_at":"2026-03-12T12:38:54.206817+00:00","event_type":"task-failed","uuid":"08aecfe3-2dc1-4b57-866f-7a1fc8e92b9d","hostname":"celery@b3d5810e88a8","payload":{"hostname":"celery@b3d5810e88a8","utcoffset":0,"pid":1,"clock":984,"uuid":"08aecfe3-2dc1-4b57-866f-7a1fc8e92b9d","exception":"RuntimeError('lab failure: second failure')","traceback":"Traceback (most recent call last):\n  File \"/usr/local/lib/python3.11/site-packages/celery/app/trace.py\", line 479, in trace_task\n    R = retval = fun(*args, **kwargs)\n                 ^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/celery/app/trace.py\", line 779, in __protected_call__\n    return self.run(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/loom/celery/runner.py\", line 224, in _job_task\n    result = _run_corout","timestamp":1773319134.2061815,"type":"task-failed","local_received":1773319134.206814}}

task-retried

Useful fields:

  • uuid

  • exception

  • traceback

Missing in this sample:

  • task name

  • next ETA

Example:

{"captured_at":"2026-03-12T12:38:55.220849+00:00","event_type":"task-retried","uuid":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","hostname":"celery@b3d5810e88a8","payload":{"hostname":"celery@b3d5810e88a8","utcoffset":0,"pid":1,"clock":998,"uuid":"9a21b4d7-ed57-47ab-b9f8-d3a9197c64ee","exception":"RuntimeError('lab retry: second retry')","traceback":"Traceback (most recent call last):\n  File \"/usr/local/lib/python3.11/site-packages/loom/celery/runner.py\", line 224, in _job_task\n    result = _run_coroutine(\n             ^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/loom/celery/runner.py\", line 104, in _run_coroutine\n    return WorkerEventLoop.run(coro, timeout=timeout)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/loom/celery/event_loop.py\", line 127, in run\n    return ","timestamp":1773319135.2200992,"type":"task-retried","local_received":1773319135.220843}}

Suggested First Data Model Implications

  • Keep one append-only raw event stream for debugging.

  • Build a TaskIndex keyed by uuid.

  • Hydrate stable fields like name, queue, root_id, parent_id from task-sent and task-received.

  • Hydrate lifecycle timestamps from:

    • task-sent

    • task-received

    • task-started

    • terminal events: task-succeeded, task-failed

  • Track retry count from repeated task-retried for the same uuid.

  • Treat callbacks as first-class tasks, but expose a flag derived from name prefix:

    • loom.callback.

    • loom.callback_error.