-DIS THREAD

Alert! Alert! Alert! Something is wrong with the systems. There is a huge spike and the production jobs are choking on DB2 CPU. We are unable to find out what's going on. Well, these were my words a long time ago on a routine system-monitoring day.


A job that used to take 20 minutes to complete is still running, and it's been more than an hour now. Yes, that's enough for me to raise an alarm as a DBA. While monitoring the system (though this is not my cup of tea), I noticed a considerable increase in DB2 CPU time for application programs on CA-Insight. Huge elapsed and I/O wait times followed the show. A nightmare on a weekend for me.


But this nightmare did not last long; it turned into a lesson. The subsystems (test and production) reside on a single LPAR. One of the test modules running in the TEST subsystem hit a loop and consumed DB2 CPU. But hey, it's the TEST system, so why would it impact production?


Guess what: the CPU MIPS are shared between the TEST and PRODUCTION environments. OK, this is not good, and this information was new to me. For a small account with most of the tests run during "non-business hours", this should not hurt much. It so happened that the developer neither noticed the loop nor cancelled his job, and the program crept into business hours, causing massive delays.


How was this program traced?


DB2 COMMANDS -


-DIS THREAD(*) lists each thread's user, program, and resource claim information, and identifies the thread by a token ID. You can hunt down the problem by cross-verifying with CA-Insight for the DB2 CPU consumption. Issue a cancel thread command (DBA authority required) against the agent token and save the day!
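As a rough sketch, the sequence looks something like the two commands below. Here "token" is only a placeholder for the value shown in the TOKEN column of the display output for the looping thread; the real value depends on your system, and the command prefix may differ from "-" on your subsystem.

```
-DIS THREAD(*)
-CANCEL THREAD(token)
```

It is worth issuing -DIS THREAD(*) again after the cancel to confirm that the thread is really gone.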

