Scripts to monitor database is up

From NazimWIKI
Jump to navigation Jump to search

Scenario is I do not have access to Oracle Enterprise Manager or any other monitoring utility and I need to make sure the database is up. Simply checking pmon is not enough, so logging in and querying the database is in my opinion the only proper way to ensure it is running.


The solution in a nutshell for me is to cron a query that runs every 5 minutes and writes a timestamp to a logfile. Another script is then run every 8 minutes to check the contents of the logfile and alert if the last successful timestamp was more than 10 minutes ago.

Script 1 - Log on to Database and write the Sysdate to a File

############################################################################
#!/bin/ksh
#
# Description : This script logs into a database and writes a
#               timestamp to a file
#               An accompanying script will then check to see
#               if the database was successfully connected to
#
#               13/12/2017
#               nm
#
#
#Usage: Set the Database Environment on with the right database
############################################################################

# Set Database Environment
. /usr/local/env/

LOG=/oracle/logs/${ORACLE_SID}_database_upcheck.log

sqlplus -s "/ as sysdba"<<DATA | tee -a $LOG
set pagesize 0
set heading off
set underline off
set verify off
set termout off
set trimspool on
set feedback off
select to_char(sysdate,'YYYY-MM-DD HH24:MI') from dual;
exit;
DATA


Script 2 - Check the above logfile, Alert if last successful query of timestamp was more than 10 minutes ago and clean up the first 20 lines of the logfile to keep it clean

###########################################################################
#!/bin/ksh
#
# Description : This script checks the datestamp entry
#               of the last two lines of a file and alerts
#               if the last value exceed 10 minutes
#
#
#               13/12/2017
#               nm
#
#Usage: Set environment variable for the Database
############################################################################

# Set environment for database
. /usr/local/env/

LOG=/oracle/logs/${ORACLE_SID}_database_upcheck.log
LOGALERT=/oracle/logs/${ORACLE_SID}_dbalert.log

MAIL_LIST="email@company.com"


sd=`date +"%Y-%m-%d %H:%M"`

dt1=`sed 'x;$!d' $LOG`
dt2=`sed '$!d' $LOG`

echo -e "$sd Is the Current sysdate\n" > $LOGALERT
echo -e "$dt1 Was the second last successful connection to the database\n" >> $LOGALERT
echo -e "$dt2 Was the last successful connection to the database\n" >> $LOGALERT
echo -e "====================================================================\n" >> $LOGALERT

sdEpoch=`date -d "$sd" +%s`
dt1Epoch=`date -d "$dt1" +%s`
dt2Epoch=`date -d "$dt2" +%s`

#echo $sdEpoch
#echo $dt1Epoch
#echo $dt2Epoch

td1=$(($sdEpoch-$dt1Epoch))
td2=$(($sdEpoch-$dt2Epoch))

echo -e "Compared to sysdate the second last successful connection to the database was $td1 Seconds ago\n" >> $LOGALERT
echo -e "Compared to sysdate the last successful connection to the database was $td2 Seconds ago\n" >> $LOGALERT
echo -e "====================================================================\n"  >> $LOGALERT


if [ $td2 -gt 600 ]; then

# echo $td2
 mail -a /oracle/logs/${ORACLE_SID}_dbalert.log  -s "${ORACLE_SID} DB ALERT: DB Down" $MAIL_LIST < $LOGALERT

fi



# Clean up Log file
numlines=`wc -l $LOG | awk '{print $1}'`

#echo $numlines

if [ $numlines -gt 22 ]; then

  echo "cleaning log file"
  sed -i '1,20d' $LOG

fi