OBIEE cluster controller failover in action
The production cluster is 2x BI Server and 2x Presentation Services, with an F5 BIG-IP load balancer in front.
Symptoms
Users started reporting slow login times to BI. Our monitoring tool (OpenView) reported that “BIServer01 may be down. Failed to contact it using ping.” BIServer01 could not be reached by ping or ssh from the Windows network.
Diagnostics
nqsserver and nqsclustercontroller on BIServer01 were both logging this error repeatedly:
[nQSError: 12002] Socket communication error at call=send: (Number=9) Bad file number
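
To get a feel for how often an error like this is repeating, it can help to tally the nQSError codes in a log. The following is a minimal sketch in Python, not part of the original diagnosis: the function name count_nqs_errors is made up for illustration, and the only assumption about the log format is that each error carries a bracketed "[nQSError: NNNNN]" tag like the line above.

```python
import re
from collections import Counter

# Matches the bracketed error tag, e.g. "[nQSError: 12002]",
# and captures the numeric code.
NQS_ERROR = re.compile(r"\[nQSError:\s*(\d+)\]")


def count_nqs_errors(lines):
    """Return a Counter mapping each nQSError code (as a string) to its count.

    `lines` is any iterable of log lines, for example an open file object.
    Hypothetical helper, for illustration only.
    """
    counts = Counter()
    for line in lines:
        for code in NQS_ERROR.findall(line):
            counts[code] += 1
    return counts


if __name__ == "__main__":
    # Sample input built from the error quoted above, plus an unrelated line.
    sample = [
        "[nQSError: 12002] Socket communication error at call=send: (Number=9) Bad file number",
        "[nQSError: 12002] Socket communication error at call=send: (Number=9) Bad file number",
        "some unrelated log line",
    ]
    for code, n in count_nqs_errors(sample).most_common():
        print(f"nQSError {code}: {n} occurrence(s)")
```

To run it against a real log, pass an open file object to count_nqs_errors instead of the sample list; the log location depends on the installation and is not given here.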