I've been tasked with investigating whether a combination of NoSQL (in this case possibly Neo4j) and SQL Server
is a viable solution to a performance problem we've been having.
Part of my technical analysis is the maintainability and stability of the platform.
That means I need to find a way to properly document and maintain an application that is seldom used (the NoSQL side), in combination with an application we have specific and strict guidelines for (the SQL Server side).
My main concern is that when I review similar setups from the past, one-shot applications tend to turn into black boxes, despite having originally been created in-house.
Those that aren't black boxes tend to fall on the shoulders of a single, unfortunate person who ends up spending most of their working day maintaining some antiquated Access-database technology that nobody wants to touch but is apparently mission critical.
How do you, as the lucky sysadmin tasked with the setup and documentation of this platform, make sure that your monstrosity of an awesome setup stands the test of time?
How do people generally handle one-shot smaller applications in larger corporations, hopefully without permanently shackling their own name to the application?
My first essential building block is a disaster recovery plan for the awesome setup in question, i.e. a step-by-step recipe for rebuilding it from scratch and restoring its data from the last backup. This includes a list of required components and where to get them, the complete configuration, and the procedure for a full restore of the application data. Writing and testing such a plan will uncover many of the most gaping holes in the operational concept, such as data that should be included in the backup but isn't, forgotten dependencies, or software of which nobody knows where it came from.
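It can help to keep the restore recipe executable rather than purely descriptive. Here's a minimal sketch, assuming a Neo4j plus SQL Server combination like the one described in the question; the paths, database names, server name and exact tool invocations (a Neo4j 4.x style `neo4j-admin load` and a plain T-SQL `RESTORE DATABASE` via `sqlcmd`) are placeholders and will differ per version and environment:

```python
#!/usr/bin/env python3
"""Restore sketch for a Neo4j + SQL Server setup.

All paths, database names and tool flags below are assumptions for
illustration; adjust them to your actual versions and environment.
"""
import subprocess

# Hypothetical locations of the most recent backups.
NEO4J_DUMP = "/backups/neo4j/graph.dump"
MSSQL_BACKUP = r"D:\backups\appdb.bak"

def restore_neo4j() -> None:
    # Neo4j 4.x offline restore from a dump file; the service must be
    # stopped first. Flag names differ in other Neo4j versions.
    subprocess.run(
        ["neo4j-admin", "load", f"--from={NEO4J_DUMP}",
         "--database=neo4j", "--force"],
        check=True,
    )

def restore_sqlserver() -> None:
    # Plain T-SQL restore via sqlcmd; server name, authentication and
    # file layout are environment-specific.
    tsql = (
        "RESTORE DATABASE [AppDb] "
        f"FROM DISK = N'{MSSQL_BACKUP}' WITH REPLACE"
    )
    subprocess.run(["sqlcmd", "-S", "localhost", "-Q", tsql], check=True)

if __name__ == "__main__":
    restore_neo4j()
    restore_sqlserver()
    print("Restore finished - run the application smoke tests next.")
```

Even if the authoritative procedure lives in the written plan, keeping a script like this next to it and running it during recovery exercises keeps the recipe honest.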
The second essential building block is an operations manual covering the tasks that occur during regular operation. This needs to be tested, too, ideally by having an administrator who is unfamiliar with the awesome setup run it from that manual alone for a relevant period of time, with the regular administrator in the background taking notes and being available in case of emergencies.
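Some of those routine tasks can also be backed by small scripts referenced from the manual, so they stay reproducible regardless of who runs them. A minimal sketch of a daily check, again assuming the setup above; the backup location, age threshold and disk limit are made-up example values:

```python
#!/usr/bin/env python3
"""Daily routine check referenced from the operations manual.

Backup path, maximum age and disk threshold are illustrative values.
"""
import os
import shutil
import time

BACKUP_FILE = "/backups/neo4j/graph.dump"   # hypothetical backup target
MAX_BACKUP_AGE_H = 26                        # expect a nightly backup
MIN_FREE_GB = 20

def backup_is_fresh() -> bool:
    # Compare the backup file's modification time against the threshold.
    age_hours = (time.time() - os.path.getmtime(BACKUP_FILE)) / 3600
    return age_hours <= MAX_BACKUP_AGE_H

def enough_disk_space(path: str = "/") -> bool:
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= MIN_FREE_GB

if __name__ == "__main__":
    problems = []
    if not backup_is_fresh():
        problems.append("backup is older than expected")
    if not enough_disk_space():
        problems.append("free disk space below threshold")
    if problems:
        raise SystemExit("Check FAILED: " + "; ".join(problems))
    print("Routine check OK")
```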
If the awesome setup is mission critical, then its disaster recovery plan will of course be included in the regular emergency exercises mandated by your ISMS. The regular admin's annual vacation is a natural opportunity for re-testing the operations manual. So both are periodically verified to be up to date.
Of course all this falls down if you don't succeed in getting the awesome setup classified as mission critical to begin with.
Generally speaking, I divide my documentation into several stages. The first stage is the development documentation, which is chronological and records all the steps taken to create the application. I start with a clear mission statement, followed by a list of requirements. Every step of the development is then added as it is completed. I've found that video-capturing my sessions is a great memory aid. Personally, I use a ticket system (in my case RT) for this stage of the documentation.
The second stage comes after the application has been tested and released. Here I document the installation (required software, packages, environment): everything needed to recreate the application. The last stage is to document the maintenance procedures; only those that are particularly complicated get written up.
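One way to keep that installation documentation from drifting is to generate part of it. A minimal sketch that collects installed tool versions into a wiki-ready snippet; the list of tools queried (python3, java, sqlcmd, neo4j) and the MediaWiki-style markup are assumptions for illustration:

```python
#!/usr/bin/env python3
"""Dump tool versions into a wiki-ready snippet for the installation docs.

The list of tools and the output markup are illustrative; extend them
with whatever your application actually depends on.
"""
import subprocess

# Hypothetical list of dependencies worth recording.
TOOLS = {
    "Python": ["python3", "--version"],
    "Java": ["java", "-version"],
    "sqlcmd": ["sqlcmd", "-?"],
    "Neo4j": ["neo4j", "version"],
}

def tool_version(cmd: list[str]) -> str:
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
        # Some tools print their version to stderr (e.g. java -version).
        text = (out.stdout or out.stderr).strip()
        return text.splitlines()[0] if text else "no output"
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return "not available"

if __name__ == "__main__":
    lines = ["== Installed components =="]
    for name, cmd in TOOLS.items():
        lines.append(f"* {name}: {tool_version(cmd)}")
    print("\n".join(lines))  # paste or push this into the wiki page
```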
Finally, I publish the second and last stages in a wiki. It seems like a lot of work, but when documenting becomes part of your working routine, you don't really notice it anymore. And of course, the chances of being tied to a development are drastically reduced.
Hope this was helpful.