We rely on creating automated processes to keep on top of tasks. DBAs have a lot of tasks to perform: backups, performance tuning, data movement, system monitoring, and of course, avoiding being noticed. Every day, there are many steps to perform to maintain the database infrastructure, including: checking physical structures, re-indexing tables where needed, backing up the databases, checking those backups, running the ETL, and preparing the daily reports and yes, all of these processes have to complete before you can call it a day, and probably before many others have started that same day.
Some of these tasks are just naturally complicated on their own. Other tasks become complicated because the database architecture is excessively rigid, and we often discover during “production testing” that certain processes need to be changed because the written requirements barely resembled the actual customer requirements. Then, with no time to change that rigid structure, we are forced to heap layer upon layer of code onto the problematic processes. Instead of a slight table change and a new index, we end up with 4 new ETL processes, 20 temp tables, 30 extra queries, and 1000 lines of SQL code. Report writers then need to build reports and make magical numbers appear from those toxic data structures that are overly complex and probably filled with inconsistent data. What starts out as a collection of fairly simple tasks turns into a Goldbergian nightmare of daily processes that are likely to cause your dinner to be interrupted by the smartphone doing the vibration dance that signifies trouble at the mill.
So what to do? Well, if it is at all possible, simplify the problem by either going into the code and refactoring the complex code to simple, or taking all of the processes and simplifying them into small, independent, easily-tested steps. The former approach usually requires an agreement on changing underlying structures that requires countless mind-numbing meetings; while the latter can generally be done to any complex process without the same frustration or anger, though it will still leave you with lots of steps to complete, the ability to test each step independently will definitely increase the quality of the overall process (and with each step reporting status back, finding an actual problem within the process will be definitely less unpleasant.)
We all know the principle behind simplifying a sequence of processes because we learned it in math classes in our early years of attending school, starting with elementary school. In my 4 years (ok, 9 years) of undergraduate work, I remember pretty much one thing from my many math classes that I apply daily to my career as a data architect, data programmer, and as an occasional indentured DBA: “show your work”. This process of showing your work was my first lesson in simplification. Each step in the process was in fact, far simpler than the entire process. When you were working an equation that took both sides of 4 sheets of paper, showing your work was important because the teacher could see every step, judge it, and mark it accordingly. So often I would make an error in the first few lines of a problem which meant that the rest of the work was actually moving me closer to a very wrong answer, no matter how correct the math was in the subsequent steps. Yet, when I got my grade back, I would sometimes be pleasantly surprised. I passed, yet missed every problem on the test. But why? While I got the fact that 1+1=2 wrong in every problem, the teacher could see that I was using the right process.
In a computer process, the process is very similar. We take complex processes, show our work by storing intermediate values, and test each step independently. When a process has 100 steps, each step becomes a simple step that is tested and verified, such that there will be 100 places where data is stored, validated, and can be checked off as complete. If you get step 1 of 100 wrong, you can fix it and be confident (that if you did your job of testing the other steps better than the one you had to repair,) that the rest of the process works. If you have 100 steps, and store the state of the process exactly once, the resulting testable chunk of code will be far more complex and finding the error will require checking all 100 steps as one, and usually it would be easier to find a specific needle in a stack of similarly shaped needles.
The goal is to strive for simplicity either in the solution, or at least by simplifying every process down to as many, independent, testable, simple tasks as possible. For the tasks that really can’t be done completely independently, minimally take those tasks and break them down into simpler steps that can be tested independently. Like working out division problems longhand, have each step of the larger problem verified and tested.