add command to introspect job performance #15403

Open
wants to merge 1 commit into
base: devel

kdelee (Member) commented Jul 26, 2024

This command outputs JSON with information about past job event processing delays, how long jobs spent in pending, and information about currently pending jobs.

To aid in investigations, the raw queries are supplied alongside data.
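
The idea of returning each raw query next to its rows can be sketched as follows. This is a minimal illustration, not the code in this PR: it uses an in-memory SQLite database from the Python standard library instead of the AWX PostgreSQL database, and the `jobs` table, the `run_report` helper, and the `jobs_by_pending_seconds` label are all hypothetical names chosen for the example.

```python
import json
import sqlite3


def run_report(conn, label, query):
    """Run one raw query and bundle its rows with the query text.

    `label` is the (hypothetical) key under which the rows are stored.
    """
    cursor = conn.execute(query)
    # cursor.description holds one tuple per result column; item 0 is its name.
    columns = [col[0] for col in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    return {"count": len(rows), label: rows, "query": query}


def demo():
    # Hypothetical stand-in table; the real command queries AWX tables.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER, name TEXT, pending_seconds REAL)")
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?)",
        [(1, "a", 2.5), (2, "b", 40.0), (3, "c", 0.0)],
    )
    query = (
        "SELECT id AS job_id, name AS job_name, pending_seconds "
        "FROM jobs WHERE pending_seconds > 0 "
        "ORDER BY pending_seconds DESC LIMIT 10;"
    )
    return [run_report(conn, "jobs_by_pending_seconds", query)]


if __name__ == "__main__":
    # sort_keys is an assumption; it happens to match the key order in the sample output.
    print(json.dumps(demo(), indent=4, sort_keys=True))
```

Keeping the query string in the same object as its rows means a reader of the report can re-run or adapt the exact SQL that produced each section without consulting the source.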

SUMMARY

Add a management command useful for investigating the current and historical state of job event processing and job scheduling behavior.

ISSUE TYPE
  • Bug, Docs Fix or other nominal change
COMPONENT NAME
  • Other
AWX VERSION
devel
ADDITIONAL INFORMATION
[
    {
        "count": 10,
        "jobs_by_MAX_event_processing": [
            {
                "MAX_job_event_processing_delay": "0:00:10.261534",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 13:38:10.397878+00:00",
                "job_finished_time": "2024-07-24 13:38:25.863449+00:00",
                "job_id": 584,
                "job_name": "JobTemplate - PutTrack\ufffd"
            },
            {
                "MAX_job_event_processing_delay": "0:00:10.149612",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:36:16.361685+00:00",
                "job_finished_time": "2024-07-18 18:36:31.266472+00:00",
                "job_id": 50,
                "job_name": "JobTemplate - ManSweet\ufffd"
            },
            {
                "MAX_job_event_processing_delay": "0:00:10.005463",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:36:15.971977+00:00",
                "job_finished_time": "2024-07-18 18:36:30.640647+00:00",
                "job_id": 49,
                "job_name": "JobTemplate - ManSweet\ufffd"
            },
            {
                "MAX_job_event_processing_delay": "0:00:09.976070",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:18.180791+00:00",
                "job_finished_time": "2024-07-18 18:35:33.364536+00:00",
                "job_id": 39,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "MAX_job_event_processing_delay": "0:00:09.902972",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:17.765722+00:00",
                "job_finished_time": "2024-07-18 18:35:32.677648+00:00",
                "job_id": 38,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "MAX_job_event_processing_delay": "0:00:09.777613",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:18.561147+00:00",
                "job_finished_time": "2024-07-18 18:35:34.271050+00:00",
                "job_id": 40,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "MAX_job_event_processing_delay": "0:00:09.499218",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:21.496982+00:00",
                "job_finished_time": "2024-07-18 18:35:35.702630+00:00",
                "job_id": 46,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "MAX_job_event_processing_delay": "0:00:09.491587",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:36:15.554096+00:00",
                "job_finished_time": "2024-07-18 18:36:28.969169+00:00",
                "job_id": 48,
                "job_name": "JobTemplate - ManSweet\ufffd"
            },
            {
                "MAX_job_event_processing_delay": "0:00:09.450412",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:19.406193+00:00",
                "job_finished_time": "2024-07-18 18:35:35.016695+00:00",
                "job_id": 42,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "MAX_job_event_processing_delay": "0:00:09.349018",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:18.971760+00:00",
                "job_finished_time": "2024-07-18 18:35:34.669970+00:00",
                "job_id": 41,
                "job_name": "JobTemplate - BadStaff\ufffd"
            }
        ],
        "query": "SELECT job_id, MAX(A.modified - A.created) as job_event_processing_delay_MAX, B.name, B.created, B.finished, B.controller_node, B.execution_node FROM main_jobevent A RIGHT JOIN ( SELECT id, created, name, finished, controller_node, execution_node FROM main_unifiedjob WHERE created > NOW() - INTERVAL '10 days' AND created IS NOT null AND finished IS NOT null AND id IS NOT null AND name IS NOT null ) B ON A.job_id=B.id WHERE A.job_id is not null GROUP BY job_id, B.name, B.created, B.finished, B.controller_node, B.execution_node ORDER BY job_event_processing_delay_MAX DESC LIMIT 10;"
    },
    {
        "count": 10,
        "jobs_by_MIN_event_processing": [
            {
                "MIN_job_event_processing_delay": "0:00:02.347398",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 19:27:31.179743+00:00",
                "job_finished_time": "2024-07-24 19:28:17.417983+00:00",
                "job_id": 1057,
                "job_name": "allow simultaneous: true project no update"
            },
            {
                "MIN_job_event_processing_delay": "0:00:02.053510",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 19:27:43.549823+00:00",
                "job_finished_time": "2024-07-24 19:28:29.096724+00:00",
                "job_id": 1153,
                "job_name": "allow simultaneous: true project no update"
            },
            {
                "MIN_job_event_processing_delay": "0:00:01.741671",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 19:27:44.720354+00:00",
                "job_finished_time": "2024-07-24 19:28:32.481682+00:00",
                "job_id": 1173,
                "job_name": "allow simultaneous: true project no update"
            },
            {
                "MIN_job_event_processing_delay": "0:00:01.655537",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 19:27:03.674955+00:00",
                "job_finished_time": "2024-07-24 19:28:30.644748+00:00",
                "job_id": 997,
                "job_name": "allow simultaneous: true project no update"
            },
            {
                "MIN_job_event_processing_delay": "0:00:01.543285",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 19:27:42.992418+00:00",
                "job_finished_time": "2024-07-24 19:28:15.587033+00:00",
                "job_id": 1144,
                "job_name": "allow simultaneous: true project no update"
            },
            {
                "MIN_job_event_processing_delay": "0:00:01.445362",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 19:26:52.848029+00:00",
                "job_finished_time": "2024-07-24 19:27:44.237333+00:00",
                "job_id": 910,
                "job_name": "allow simultaneous: true project no update"
            },
            {
                "MIN_job_event_processing_delay": "0:00:01.349812",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 19:27:41.975834+00:00",
                "job_finished_time": "2024-07-24 19:29:10.050953+00:00",
                "job_id": 1127,
                "job_name": "allow simultaneous: true project no update"
            },
            {
                "MIN_job_event_processing_delay": "0:00:01.338480",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-23 21:18:59.449823+00:00",
                "job_finished_time": "2024-07-24 12:29:22.224266+00:00",
                "job_id": 302,
                "job_name": "Demo Job Template"
            },
            {
                "MIN_job_event_processing_delay": "0:00:01.329901",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 19:27:43.683954+00:00",
                "job_finished_time": "2024-07-24 19:28:42.163584+00:00",
                "job_id": 1155,
                "job_name": "allow simultaneous: true project no update"
            },
            {
                "MIN_job_event_processing_delay": "0:00:01.284131",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 12:56:40.870287+00:00",
                "job_finished_time": "2024-07-24 13:01:46.498390+00:00",
                "job_id": 513,
                "job_name": "Demo Job Template @ 08:43:32"
            }
        ],
        "query": "SELECT job_id, MIN(A.modified - A.created) as job_event_processing_delay_MIN, B.name, B.created, B.finished, B.controller_node, B.execution_node FROM main_jobevent A RIGHT JOIN ( SELECT id, created, name, finished, controller_node, execution_node FROM main_unifiedjob WHERE created > NOW() - INTERVAL '10 days' AND created IS NOT null AND finished IS NOT null AND id IS NOT null AND name IS NOT null ) B ON A.job_id=B.id WHERE A.job_id is not null GROUP BY job_id, B.name, B.created, B.finished, B.controller_node, B.execution_node ORDER BY job_event_processing_delay_MIN DESC LIMIT 10;"
    },
    {
        "count": 10,
        "jobs_by_AVG_event_processing": [
            {
                "AVG_job_event_processing_delay": "0:00:05.577698",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:19.886428+00:00",
                "job_finished_time": "2024-07-18 18:35:35.202703+00:00",
                "job_id": 43,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "AVG_job_event_processing_delay": "0:00:05.573398",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:18.971760+00:00",
                "job_finished_time": "2024-07-18 18:35:34.669970+00:00",
                "job_id": 41,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "AVG_job_event_processing_delay": "0:00:05.556131",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:19.406193+00:00",
                "job_finished_time": "2024-07-18 18:35:35.016695+00:00",
                "job_id": 42,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "AVG_job_event_processing_delay": "0:00:05.485858",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:18.561147+00:00",
                "job_finished_time": "2024-07-18 18:35:34.271050+00:00",
                "job_id": 40,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "AVG_job_event_processing_delay": "0:00:05.473728",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 19:26:08.985438+00:00",
                "job_finished_time": "2024-07-24 19:26:33.982025+00:00",
                "job_id": 881,
                "job_name": "allow simultaneous: true project no update"
            },
            {
                "AVG_job_event_processing_delay": "0:00:05.410182",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:20.396231+00:00",
                "job_finished_time": "2024-07-18 18:35:35.585298+00:00",
                "job_id": 44,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "AVG_job_event_processing_delay": "0:00:05.383752",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:35:20.966478+00:00",
                "job_finished_time": "2024-07-18 18:35:35.680533+00:00",
                "job_id": 45,
                "job_name": "JobTemplate - BadStaff\ufffd"
            },
            {
                "AVG_job_event_processing_delay": "0:00:05.306816",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 19:27:41.721681+00:00",
                "job_finished_time": "2024-07-24 19:29:09.724317+00:00",
                "job_id": 1123,
                "job_name": "allow simultaneous: true project no update"
            },
            {
                "AVG_job_event_processing_delay": "0:00:05.281719",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-24 13:38:12.440072+00:00",
                "job_finished_time": "2024-07-24 13:38:26.856656+00:00",
                "job_id": 588,
                "job_name": "JobTemplate - PutTrack\ufffd"
            },
            {
                "AVG_job_event_processing_delay": "0:00:05.269382",
                "controller_node": "awx-1",
                "execution_node": "awx-1",
                "job_created_time": "2024-07-18 18:36:17.209133+00:00",
                "job_finished_time": "2024-07-18 18:36:32.623870+00:00",
                "job_id": 52,
                "job_name": "JobTemplate - ManSweet\ufffd"
            }
        ],
        "query": "SELECT job_id, AVG(A.modified - A.created) as job_event_processing_delay_AVG, B.name, B.created, B.finished, B.controller_node, B.execution_node FROM main_jobevent A RIGHT JOIN ( SELECT id, created, name, finished, controller_node, execution_node FROM main_unifiedjob WHERE created > NOW() - INTERVAL '10 days' AND created IS NOT null AND finished IS NOT null AND id IS NOT null AND name IS NOT null ) B ON A.job_id=B.id WHERE A.job_id is not null GROUP BY job_id, B.name, B.created, B.finished, B.controller_node, B.execution_node ORDER BY job_event_processing_delay_AVG DESC LIMIT 10;"
    },
    {
        "completed_or_started_jobs_by_pending_duration": [
            {
                "job_created": "2024-07-23 21:18:59.579761+00:00",
                "job_id": 303,
                "job_name": "Demo Job Template",
                "pending_duration": "15:10:40.739042",
                "unified_job_template_id": 7
            },
            {
                "job_created": "2024-07-23 21:18:59.449823+00:00",
                "job_id": 302,
                "job_name": "Demo Job Template",
                "pending_duration": "15:10:20.839018",
                "unified_job_template_id": 7
            },
            {
                "job_created": "2024-07-23 21:18:59.356816+00:00",
                "job_id": 301,
                "job_name": "Demo Job Template",
                "pending_duration": "15:10:00.938920",
                "unified_job_template_id": 7
            },
            {
                "job_created": "2024-07-23 21:18:59.272908+00:00",
                "job_id": 300,
                "job_name": "Demo Job Template",
                "pending_duration": "15:09:41.024923",
                "unified_job_template_id": 7
            },
            {
                "job_created": "2024-07-23 21:18:59.185890+00:00",
                "job_id": 299,
                "job_name": "Demo Job Template",
                "pending_duration": "15:09:21.088701",
                "unified_job_template_id": 7
            },
            {
                "job_created": "2024-07-23 21:18:59.094831+00:00",
                "job_id": 298,
                "job_name": "Demo Job Template",
                "pending_duration": "15:09:01.209673",
                "unified_job_template_id": 7
            },
            {
                "job_created": "2024-07-23 21:18:59.009107+00:00",
                "job_id": 297,
                "job_name": "Demo Job Template",
                "pending_duration": "15:08:41.299657",
                "unified_job_template_id": 7
            },
            {
                "job_created": "2024-07-23 21:18:58.923858+00:00",
                "job_id": 296,
                "job_name": "Demo Job Template",
                "pending_duration": "15:08:21.363387",
                "unified_job_template_id": 7
            },
            {
                "job_created": "2024-07-23 21:18:58.828263+00:00",
                "job_id": 295,
                "job_name": "Demo Job Template",
                "pending_duration": "15:08:01.452935",
                "unified_job_template_id": 7
            },
            {
                "job_created": "2024-07-23 21:18:58.734428+00:00",
                "job_id": 294,
                "job_name": "Demo Job Template",
                "pending_duration": "15:07:41.574722",
                "unified_job_template_id": 7
            }
        ],
        "count": 10,
        "query": " SELECT name, id AS job_id, unified_job_template_id, created, started - created AS pending_duration FROM main_unifiedjob WHERE finished IS NOT null AND started IS NOT null AND cancel_flag IS NOT true AND created > NOW() - INTERVAL '10 days' AND started - created > INTERVAL '0 seconds' ORDER BY pending_duration DESC LIMIT 10;"
    },
    {
        "count": 10,
        "query": " SELECT date_trunc('hour', created) as day_and_hour, COUNT(created) as count_jobs_pending_greater_than_10_min FROM main_unifiedjob WHERE started IS NOT NULL AND started - created > INTERVAL '10 minutes' AND created > NOW() - INTERVAL '10 days' GROUP BY date_trunc('hour', created) ORDER BY count_jobs_pending_greater_than_10_min DESC LIMIT 10;",
        "times_of_day_pending_more_than_10": [
            {
                "count_jobs_pending_more_than_10_min": 94,
                "day_and_hour": "2024-07-26 13:00:00+00:00"
            },
            {
                "count_jobs_pending_more_than_10_min": 72,
                "day_and_hour": "2024-07-23 15:00:00+00:00"
            },
            {
                "count_jobs_pending_more_than_10_min": 48,
                "day_and_hour": "2024-07-24 18:00:00+00:00"
            },
            {
                "count_jobs_pending_more_than_10_min": 38,
                "day_and_hour": "2024-07-26 14:00:00+00:00"
            },
            {
                "count_jobs_pending_more_than_10_min": 26,
                "day_and_hour": "2024-07-23 21:00:00+00:00"
            },
            {
                "count_jobs_pending_more_than_10_min": 23,
                "day_and_hour": "2024-07-23 14:00:00+00:00"
            },
            {
                "count_jobs_pending_more_than_10_min": 20,
                "day_and_hour": "2024-07-24 12:00:00+00:00"
            },
            {
                "count_jobs_pending_more_than_10_min": 17,
                "day_and_hour": "2024-07-26 15:00:00+00:00"
            },
            {
                "count_jobs_pending_more_than_10_min": 15,
                "day_and_hour": "2024-07-23 16:00:00+00:00"
            },
            {
                "count_jobs_pending_more_than_10_min": 11,
                "day_and_hour": "2024-07-24 19:00:00+00:00"
            }
        ]
    }
]
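
For anyone post-processing this output, here is a rough sketch of reading the duration strings back into `datetime.timedelta` values and filtering jobs by pending time. It assumes the durations follow the form shown above (`H:MM:SS.ffffff`, optionally prefixed with `N days, `); `parse_duration`, `rows_for`, and `jobs_pending_longer_than` are hypothetical helper names, and `SAMPLE` is a trimmed copy of two rows from the output above.

```python
import json
import re
from datetime import timedelta

# Assumed format: "H:MM:SS[.ffffff]" with an optional "N day(s), " prefix.
_DURATION = re.compile(
    r"^(?:(?P<days>-?\d+) days?, )?"
    r"(?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2})(?:\.(?P<us>\d{1,6}))?$"
)


def parse_duration(text):
    """Convert a duration string such as '0:00:10.261534' to a timedelta."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    # Right-pad the fractional part so '.5' means 500000 microseconds.
    micro = (match["us"] or "").ljust(6, "0")
    return timedelta(
        days=int(match["days"] or 0),
        hours=int(match["h"]),
        minutes=int(match["m"]),
        seconds=int(match["s"]),
        microseconds=int(micro),
    )


def rows_for(report, label):
    """Return the row list stored under `label` in whichever section has it."""
    for section in report:
        if label in section:
            return section[label]
    return []


def jobs_pending_longer_than(report, threshold):
    """List (job_id, pending_duration) pairs that exceed `threshold`."""
    rows = rows_for(report, "completed_or_started_jobs_by_pending_duration")
    parsed = [(row["job_id"], parse_duration(row["pending_duration"])) for row in rows]
    return [(job_id, pending) for job_id, pending in parsed if pending > threshold]


# Two rows copied from the sample output; other keys are omitted for brevity.
SAMPLE = json.loads("""
[{"count": 2,
  "completed_or_started_jobs_by_pending_duration": [
    {"job_id": 303, "pending_duration": "15:10:40.739042"},
    {"job_id": 302, "pending_duration": "15:10:20.839018"}]}]
""")

if __name__ == "__main__":
    limit = timedelta(hours=15, minutes=10, seconds=30)
    for job_id, pending in jobs_pending_longer_than(SAMPLE, limit):
        print(job_id, pending)
```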

This command outputs JSON with information about past job event
processing delays, how long jobs spent in pending, and which times of day
saw the most pending jobs.

To aid in investigations, the raw queries are supplied alongside data.

sonarcloud bot commented Jul 31, 2024
