Hunting For Clues
Learn about database and RAID configurations, application server configurations, and Java thread dumps to find the root cause of the airline incident.
Checking database and RAID configurations
In the morning, fortified with quarts of coffee, I dug into the database cluster and RAID configurations. I was looking for common problems with clusters: not enough heartbeats, heartbeats going through switches that carry production traffic, servers set to use physical IP addresses instead of the virtual address, bad dependencies among managed packages, and so on. At that time, I didn’t carry a checklist. These were just problems that I’d seen more than once or heard about through the grapevine. I found nothing wrong. The engineering team had done a great job with the database cluster. In fact, some of the scripts appeared to be taken directly from Veritas’s own training materials.
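The cluster problems listed above lend themselves to a written checklist, and a checklist can be automated. What follows is only a minimal sketch in Python of what that might look like; it is not the tooling used during the incident. The `ClusterConfig` fields, the `audit` function, the two-heartbeat minimum, and the reading of "bad dependencies" as "depends on a package that is not defined" are all assumptions made for illustration, and none of it corresponds to a real Veritas API or file format.

```python
from dataclasses import dataclass


@dataclass
class ClusterConfig:
    """Hypothetical, hand-built summary of a cluster's settings."""
    heartbeat_switches: list      # switch carrying each heartbeat link
    production_switches: set      # switches that carry production traffic
    client_target_address: str    # address the servers are set to use
    virtual_address: str          # the cluster's virtual address
    package_dependencies: dict    # managed package -> packages it depends on


def audit(cfg, min_heartbeats=2):
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = []

    # Not enough heartbeats (the minimum of 2 is an assumed threshold).
    if len(cfg.heartbeat_switches) < min_heartbeats:
        findings.append(
            f"only {len(cfg.heartbeat_switches)} heartbeat link(s), "
            f"expected at least {min_heartbeats}"
        )

    # Heartbeats going through switches that carry production traffic.
    shared = sorted(set(cfg.heartbeat_switches) & cfg.production_switches)
    if shared:
        findings.append(f"heartbeats share production switches: {shared}")

    # Servers set to use a physical address instead of the virtual address.
    if cfg.client_target_address != cfg.virtual_address:
        findings.append(
            f"servers use {cfg.client_target_address}, "
            f"not the virtual address {cfg.virtual_address}"
        )

    # Bad dependencies, interpreted here as references to undefined packages.
    for package, deps in cfg.package_dependencies.items():
        for dep in deps:
            if dep not in cfg.package_dependencies:
                findings.append(f"package {package!r} depends on undefined {dep!r}")

    return findings


if __name__ == "__main__":
    # Example values are invented for this sketch.
    example = ClusterConfig(
        heartbeat_switches=["hb-switch-a", "prod-switch-1"],
        production_switches={"prod-switch-1"},
        client_target_address="10.0.0.11",
        virtual_address="10.0.0.100",
        package_dependencies={"database": ["storage"], "storage": []},
    )
    for finding in audit(example):
        print("FINDING:", finding)
```

A clean run returns an empty list, which matches what I found by hand that morning: nothing wrong. Real cluster software would have to be queried for these values first, and that part is deliberately left out of the sketch.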
Checking application server configurations
Next, it was time to move on ...