EEE Monitoring


Created: 28 September 2007. Last Reviewed: 27 January 2009



Overview

The intention of the EEE monitoring is to ensure that all transactions occurring in stores are correctly processed in EEE with no errors, and also to provide an alert of problems that may have occurred. EEE monitoring is an ongoing function that should be performed each day to ensure any problems that occur are not compounded by a lack of immediate action.


If you are performing your own EEE Monitoring, it is essential that these procedures be followed to ensure your EEE environment contains accurate data. Failure to perform the monitoring procedures could leave you in a situation where your EEE data is not correct or a hardware/software failure occurs and you are unable to recover your data!


All the examples shown in this document relate to SQL Server 2005.


The alerts and weekly reports can be automatically emailed to one or more email addresses, and it is recommended that a minimum of 2 email addresses are alerted to provide coverage in the event of staff being absent.


Alert email addresses need to be configured in both EEE and SQL Server.


EEE


The email addresses can be configured from within EEE from the Utilities->Company Preferences menu as shown below.


EEE Monitoring Figure1.JPG


SQL Server


For each EEE database in your SQL Server, you will have an EEE operator, as shown on the screen below.


EEE Monitoring Figure2.JPG


To change the email address for the EEE Operators, highlight the operator (in the example above it is EEE_Operator), and then click on Properties. You will see a screen similar to the one below.


EEE Monitoring Figure3.JPG


You can change email addresses by adjusting the E-Mail name field as shown above. Email addresses should be separated by a semicolon (;).


The main components providing the EEE Monitoring function are:-


  • Weekly reports. These reports provide various statistics on the EEE environment.
  • EEE Error logs. Whenever more than 10 errors are logged in the EEE errorlog, an alert email will be generated which will include a file detailing all the errors that have occurred. These should be attended to immediately.
  • EEE Data Loader. Various circumstances can cause the EEE dataloader to suspend processing, and when this occurs an email alert is generated.
  • Disk space Alerts. EEE can be configured to generate email alerts when the free space on nominated disk drives falls below a specified level.
  • Unacknowledged Data threshold. When data has been exported to stores and EEE hasn’t received an acknowledgement for that data within 7 days, an alert email is generated.

Data Loader errors

Probably the most important area of EEE Monitoring is checking the EEE Error log for any records that have been rejected by EEE. Failure to monitor the EEE Error Log and fix any rejected records will lead to inaccurate data in EEE, resulting in incorrect sales information, etc.


It is most important that the EEE Error Log is monitored on a regular basis.


The EEE Web application is used to access the EEE Error Logs. They can be found from Utilities->Data Interchange Management->Data Loader Error Log and also Utilities->Data Interchange Management->Data Loader Activity Log, as shown below.


EEE Monitoring Figure4.JPG

Data Loader Activity Log

The Data Loader Activity Log shows the details of every file processed by Data Loader. The defaults for this screen are shown below. You can also filter the activity log so that only files containing errors are shown.

EEE Monitoring Figure5.JPG


If a record has a Y in the “Contains errors” column, you can drill through to the error log records associated with that activity record.


EEE Monitoring Figure6.JPG


Data Loader Error Log

There are many types of data loader errors that can appear in the error log. It is not possible to discuss resolutions for all error types, so the most common errors have been addressed below.


It is extremely important that errorlog records are investigated ASAP. If the problem can’t be resolved, it should be referred to Magenta Retail Support (support@magentaretail.com.au) for assistance in resolving the problem.


Timeout Expired


The example below shows a number of timeout errors. These can occur when the SQL Server is very busy with other tasks, such as running queries, and the timeout period for a transaction is exceeded. The transaction is then aborted and the details of the record being inserted are added to the EEE errorlog table. To fix these types of errors you simply need to reprocess the records.


  • Click on the 1st record to be re-processed. The record ID number will be populated in the From ID and To Id fields towards the bottom of the screen. In the example below, the ID number is 26539.
  • Replace the number in the To ID box with the record ID number of the last record to be re-processed. In the example below, type 26553
  • Click on the REPROCESS button in the bottom right corner of the screen. A file containing all these records will be created in the Reprocess sub-folder of the data loader import folder (something like d:\eeedata\import\reprocess).
  • The records written to the reprocess file will then be flagged in the errorlog table as re-processed.

EEE Monitoring Figure7.JPG


  • After the reprocess button is clicked, and the errorlog records updated, the above records will be removed from the screen as shown below.

EEE Monitoring Figure8.JPG


Transaction Deadlocked


Deadlocked transactions occur when 2 or more SQL processes attempt to update the same record at the same time. One of the updates will be selected as the deadlocked record, and the record will be written to the errorlog.


Deadlocked transactions should be reprocessed in the same way as the timed-out records above.


EEE Monitoring Figure9.JPG


Foreign Key Constraint errors


Foreign Key constraint errors occur when a record contains key values such as product code, customer code, branch code, etc, that do not exist in EEE. Under normal circumstances, these shouldn’t occur so further investigation is required. Some instances that may cause these errors include:-


  • Product code created in EEE and then deleted after it had been exported to stores. The store performs a transaction with that product code before it receives the deletion.
  • Customer code is missing. Customer was created in Enabler, but a communications failure prevented the file containing the new customer being correctly sent to Head Office. The customer then makes a purchase in the store and that transaction is sent to Head Office causing a Foreign Key error as the customer is not in EEE.

The screen below is an example of a number of Foreign Key Constraint error records.


EEE Monitoring Figure10.JPG


If you click on an error record so it is highlighted, and then click on the DETAIL button, you are given further information on the error as shown below.


EEE Monitoring Figure11.JPG


In the example above, you can see that the record hasn’t been processed because of a conflict in the productcode column of the Product table. This means that the product code in this record (93281355156943) is not in the product table.

This may have occurred because the product code was created in store instead of in EEE or the code may have been created in EEE and then deleted, and the deletion hasn’t been sent to the store as yet.


This record could be processed by replacing the invalid product code with the correct product code.


  • Press ESC to leave the above screen, and then click on EDIT.
  • You are then shown the original record, with the area of the screen where you can make changes highlighted, as shown below.
  • Replace the product code with the correct code and then click Resubmit.
  • The record is then re-processed and removed from the error log screen.

EEE Monitoring Figure12.JPG


Primary Key Constraint errors


Primary Key constraint errors occur because the data that is being inserted violates a primary key setting for the table. Primary key violations stop duplicate data being processed by EEE, and also prevent duplicate items such as duplicate gift voucher numbers being entered.


In the example below, a gift voucher transaction has generated a Primary Key violation.

EEE Monitoring Figure13.JPG


Clicking on the Detail button gives the information about the error as shown below. This indicates that EEE cannot insert a duplicate key (voucher number) in the Voucher table.

EEE Monitoring Figure14.JPG

The above situation can occur if Enabler is configured to issue manual gift voucher numbers and the same number is issued twice. A common scenario for this is the store redeems part of a gift voucher, and issues a gift voucher as change, but they reissue the same gift voucher number.


The Primary Key violation example below relates to duplicate transactions being rejected by EEE. If you look at the Error Record column on the right hand side, you will see the transactions occurred on 20/02/2007, but the Date column indicates that these errors were generated on 5/03/2007. These types of primary key errors generally indicate that something unusual, such as database corruption, has occurred in Enabler on the store computer.


Primary keys prevent the same record being processed more than once. As an example, on the Saleline table the primary key is made up of the fields saledate, saleline, reference2, and kititemnumber. As Enabler can’t generate duplicate combinations of these fields in one transaction, a record whose combination of these fields already exists in EEE is a duplicate. EEE will not process the duplicate record and will instead insert it in the errorlog table.
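The duplicate check described above can be pictured with a short Python sketch. This is an illustration only: EEE enforces the rule inside SQL Server through the primary key on the Saleline table, and the function name and sample values below are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Key fields taken from the Saleline example in the text.
KEY_FIELDS = ("saledate", "saleline", "reference2", "kititemnumber")


def split_duplicates(existing_keys: Set[tuple],
                     incoming: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Return (accepted, rejected) records based on the composite key.

    A record is rejected when its combination of key fields has already
    been seen, which mirrors a primary key violation.
    """
    seen = set(existing_keys)
    accepted, rejected = [], []
    for record in incoming:
        key = tuple(record[field] for field in KEY_FIELDS)
        if key in seen:
            rejected.append(record)  # in EEE this would go to the errorlog
        else:
            seen.add(key)
            accepted.append(record)
    return accepted, rejected


# Hypothetical sample values, not real EEE data.
already_loaded = {("20070220", 1, "00000400014881", 0)}
resent = [
    {"saledate": "20070220", "saleline": 1,
     "reference2": "00000400014881", "kititemnumber": 0},
    {"saledate": "20070220", "saleline": 2,
     "reference2": "00000400014881", "kititemnumber": 0},
]
accepted, rejected = split_duplicates(already_loaded, resent)
print(len(accepted), "accepted,", len(rejected), "rejected")
```

Only the first resent record is rejected, because its four key values match a record that has already been loaded.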


EEE Monitoring Figure15.JPG


With the above error records, you should first attempt to send acknowledgements back to the store which will hopefully stop these records being resent by Enabler. If the same records appear in the errorlog a few days later, you will need to contact Magenta Retail Support (support@magentaretail.com.au) for further assistance.

To acknowledge these records, do the following.


  • Click on the 1st record to be acknowledged. The record ID number will be populated in the From ID and To Id fields towards the bottom of the screen.
  • Replace the number in the To ID box with the record ID number of the last record to be acknowledged. Use the arrows to move between screens if there is more than one screen of errors.
  • Click on the ACKNOWLEDGE button. Acknowledgement files are then created for sending back to the stores. Acknowledgement records, when processed by Enabler, indicate that the record has been processed by EEE and so doesn’t need to be sent again by Enabler. After these records have been acknowledged, you need to flag them as ignored so they don’t appear in the errorlog.
  • Click on the 1st record to be ignored. The record ID number will be populated in the From ID and To Id fields towards the bottom of the screen.
  • Replace the number in the To ID box with the record ID number of the last record to be ignored. Use the arrows to move between screens if there is more than one screen of errors.
  • Click on the IGNORE button. The records will all be flagged as ignored and not shown on the errorlog screen.

Weekly emails

Each week, a scheduled task is run that produces a number of reports. These reports are emailed to the configured email addresses. The subject of these emails is “SQL Server Report”, and they will usually contain a zip file called WeeklyReportFiles.zip. Contained in this zip file will be:-


  • dbcc_out.txt
  • dbmaint.txt
  • errorlog.txt
  • mag_erlg.dat
  • sqlbackup.txt

The text of the email will look like:-


The email should contain the

drive sizes,
database sizes and
the latest errorlog.


companyname

--------------------------------------------------

Your Company - New Zealand


(1 rows affected)


The contents of each file are explained below.


dbcc_out.txt

This document is usually very large and contains information about the indexdefrag performed on all table indexes in EEE. This report is for informational purposes only.


NBSQLSVR usp_dbm_dbINDEXDEFRAG Style 2007-09-03 01:00:25.037

2 0 0 Executing DBCC INDEXDEFRAG (eee,Style.idxModDate,2) - fragmentation currently 57%, Avg Page Density currently 0% 

NBSQLSVR usp_dbm_dbINDEXDEFRAG Style 2007-09-03 01:00:25.037

Pages Scanned Pages Moved Pages Removed Executing DBCC INDEXDEFRAG (eee,Style.idxModDate,2) - fragmentation currently 57%, Avg Page Density currently 0% 

NBSQLSVR usp_dbm_dbINDEXDEFRAG Style 2007-09-03 01:00:25.037

301 159 135 Executing DBCC INDEXDEFRAG (eee,Style.idxModDate,2) - fragmentation currently 57%, Avg Page Density currently 0% 

NBSQLSVR usp_dbm_dbINDEXDEFRAG Style 2007-09-03 01:00:25.037

Pages Scanned Pages Moved Pages Removed Executing DBCC INDEXDEFRAG (eee,Style.idxModDate,2) - fragmentation currently 57%, Avg Page Density currently 0% 

NBSQLSVR usp_dbm_dbINDEXDEFRAG Style 2007-09-03 01:00:25.037

Pages Scanned Pages Moved Pages Removed Executing DBCC INDEXDEFRAG (eee,Style.idxModDate,2) - fragmentation currently 57%, Avg Page Density currently 0% 

NBSQLSVR usp_dbm_dbINDEXDEFRAG Style 2007-09-03 01:00:25.037

-------------------- -------------------- -------------------- Executing DBCC INDEXDEFRAG (eee,Style.idxModDate,2) - fragmentation currently 57%, Avg Page Density currently 0% 



dbmaint.txt

This report provides information about the free space on all drives on the SQL Server, information about the file that each database/log uses, together with growth statistics, and also the total size and free space in each database and transaction log. This document is extremely useful for monitoring the changing size of your database and log files, and should be examined each month.

drive MB free

----- -----------

C 63036

D 2968

E 376335


(3 rows affected)

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

Name Fileid Filename Filegroup Size Maxsize Growth Usage

------------------------------ ----------- --------------------------------------------------------------------------- ---------- ------------- ------------- ---------- ----------

bedb_dat 1 C:\Program Files\Symantec\Backup Exec\Data\BEDB_dat.mdf PRIMARY 9536 KB Unlimited 80 KB data only

bedb_log 2 C:\Program Files\Symantec\Backup Exec\Data\BEDB_Log.ldf NULL 1024 KB Unlimited 10% log only

EEEData1 1 e:\eeedata\hongkong\data\EEEData1.mdf PRIMARY 5120000 KB Unlimited 512000 KB data only

EEEData1 1 e:\eeedata\NewZealand\data\EEEData1.mdf PRIMARY 5120000 KB Unlimited 512000 KB data only

EEEData1 1 e:\eeedata\production\data\EEEData1.mdf PRIMARY 10238976 KB Unlimited 512000 KB data only

EEEData1 1 e:\eeedata\test\data\EEEData1.mdf PRIMARY 3072000 KB Unlimited 512000 KB data only

EEELog1 2 e:\eeedata\test\data\EEELog1.ldf NULL 1024000 KB 2147483648 KB 512000 KB log only

EEELog1 2 e:\eeedata\production\data\EEELog1.ldf NULL 5120000 KB 2147483648 KB 512000 KB log only

EEELog1 2 e:\eeedata\NewZealand\data\EEELog1.ldf NULL 1024000 KB 2147483648 KB 102400 KB log only

EEELog1 2 e:\eeedata\hongkong\data\EEELog1.ldf NULL 1126400 KB 2147483648 KB 102400 KB log only

master 1 E:\sqldata\MSSQL.1\MSSQL\DATA\master.mdf PRIMARY 4096 KB Unlimited 10% data only

mastlog 2 E:\sqldata\MSSQL.1\MSSQL\DATA\mastlog.ldf NULL 1280 KB Unlimited 10% log only

modeldev 1 E:\sqldata\MSSQL.1\MSSQL\DATA\model.mdf PRIMARY 2240 KB Unlimited 1024 KB data only

modellog 2 E:\sqldata\MSSQL.1\MSSQL\DATA\modellog.ldf NULL 26816 KB Unlimited 10% log only

MSDBData 1 E:\sqldata\MSSQL.1\MSSQL\DATA\MSDBData.mdf PRIMARY 83072 KB Unlimited 10% data only

MSDBLog 2 E:\sqldata\MSSQL.1\MSSQL\DATA\MSDBLog.ldf NULL 35712 KB 2147483648 KB 10% log only

ReportServer 1 E:\sqldata\MSSQL.1\MSSQL\DATA\ReportServer.mdf PRIMARY 69824 KB Unlimited 1024 KB data only

ReportServer_log 2 E:\sqldata\MSSQL.1\MSSQL\DATA\ReportServer_log.LDF NULL 2624 KB 2147483648 KB 10% log only

ReportServerTempDB 1 E:\sqldata\MSSQL.1\MSSQL\DATA\ReportServerTempDB.mdf PRIMARY 11456 KB Unlimited 1024 KB data only

ReportServerTempDB_log 2 E:\sqldata\MSSQL.1\MSSQL\DATA\ReportServerTempDB_log.LDF NULL 3520 KB 2147483648 KB 10% log only

tempdev 1 E:\sqldata\MSSQL.1\MSSQL\DATA\tempdb.mdf PRIMARY 38208 KB Unlimited 10% data only

templog 2 E:\sqldata\MSSQL.1\MSSQL\DATA\templog.ldf NULL 15040 KB Unlimited 10% log only


LNm Tot Mb Used Mb Free MbData %Log DB Typ EntryDt

------------------------- --------------- --------------- ---------------- ------------------------- ----- -----------------------

ReportServerTempDB 11.1875 1.6250 9.5625 ReportServerTempDB Data 2007-09-03 05:00:01.020

ReportServer 68.1875 68.0625 .1250 ReportServer Data 2007-09-03 05:00:01.020

EEEData1 5000.0000 15.4375 4984.5625 NewZealand Data 2007-09-03 05:00:01.020

MSDBData 81.1250 74.9375 6.1875 msdb Data 2007-09-03 05:00:01.020

master 4.0000 2.9375 1.0625 master Data 2007-09-03 05:00:01.003

EEEData1 5000.0000 225.8125 4774.1875 HongKong Data 2007-09-03 05:00:01.020

EEEData1 3000.0000 207.7500 2792.2500 eeetest Data 2007-09-03 05:00:01.020

EEEData1 9999.0000 381.7500 9617.2500 eee Data 2007-09-03 05:00:01.020

bedb_dat 9.3125 9.1875 .1250 BEDB Data 2007-09-03 05:00:01.020

ReportServerTempDB 3.4297 1.0366 69.7751 ReportServerTempDB Log 2007-09-03 05:00:01.003

ReportServer 2.5547 .9551 62.6147 ReportServer Log 2007-09-03 05:00:01.003

NewZealand 999.9922 30.7939 96.9206 NewZealand Log 2007-09-03 05:00:01.003

msdb 34.8672 7.0820 79.6886 msdb Log 2007-09-03 05:00:01.003

master 1.2422 .6094 50.9434 master Log 2007-09-03 05:00:01.003

HongKong 1099.9922 53.8052 95.1086 HongKong Log 2007-09-03 05:00:01.003

eeetest 999.9922 505.6001 49.4396 eeetest Log 2007-09-03 05:00:01.003

eee 4999.9922 27.6509 99.4470 eee Log 2007-09-03 05:00:01.003

BEDB .9922 .5776 41.7815 BEDB Log 2007-09-03 05:00:01.003
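When reviewing dbmaint.txt each month, a small script can help spot files that are running out of room. The sketch below is an assumption-laden illustration, not part of EEE: it assumes each size row has eight whitespace-separated tokens as in the sample above, and it infers from the sample figures that the third number is free MB on Data rows and percent free on Log rows. The function names and the 10% threshold are hypothetical.

```python
from typing import List, NamedTuple


class SizeRow(NamedTuple):
    name: str
    total_mb: float
    used_mb: float
    free: float      # free MB on Data rows, percent free on Log rows (inferred)
    database: str
    kind: str        # "Data" or "Log"


def parse_size_rows(text: str) -> List[SizeRow]:
    """Pick out the database/log size rows and ignore everything else."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Expected tokens: name, total, used, free, database, type, date, time
        if len(parts) != 8 or parts[5] not in ("Data", "Log"):
            continue
        try:
            total, used, free = (float(p) for p in parts[1:4])
        except ValueError:
            continue
        rows.append(SizeRow(parts[0], total, used, free, parts[4], parts[5]))
    return rows


def percent_free(row: SizeRow) -> float:
    """Normalise both row types to a percentage of free space."""
    if row.kind == "Log":
        return row.free
    return 100.0 * row.free / row.total_mb if row.total_mb else 0.0


def nearly_full(rows: List[SizeRow], min_percent_free: float = 10.0) -> List[SizeRow]:
    """Return rows whose free space is below the (hypothetical) threshold."""
    return [r for r in rows if percent_free(r) < min_percent_free]


SAMPLE = """\
ReportServer 68.1875 68.0625 .1250 ReportServer Data 2007-09-03 05:00:01.020
EEEData1 9999.0000 381.7500 9617.2500 eee Data 2007-09-03 05:00:01.020
eee 4999.9922 27.6509 99.4470 eee Log 2007-09-03 05:00:01.003
"""

for row in nearly_full(parse_size_rows(SAMPLE)):
    print(row.database, row.kind, "is nearly full")
```

A data file that shows as nearly full is not necessarily a problem if the file is allowed to grow, so treat the result as a prompt to look at the Growth and Maxsize columns rather than as an error.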


errorlog.txt

The errorlog.txt contains the latest SQL Server log. Each week when the weekly reports are run, a new SQL Server error log is started and the previous log is included in this email.


mag_erlg.dat

This document contains the outstanding error log records for the past week, that is, since the last EEE Errorlog alert was generated. Records in this file should be attended to immediately. Refer to the section on “Data Loader Errors” for more information.


Id !ErrDate !ErrSource !ErrDescription !ErrData1 !ErrData2 !ErrRecord !ErrReprocessCount!EditedRecord !ErrIgnored!ActivityNumber!AddDate

-----------!-----------------------!------------------------------------------------------------!--------------------------------------

26180!2007-06-27 23:13:39.000!Microsoft OLE DB Provider for SQL Server !Timeout expired !IDispatch error #3121 !GL !"09098481|00|20070627|5932","GL",19,"00",04,"20070627","11:20:58","6688","00000400014881",2,1,24.3,48.6,"9319868102841","8924B00","0009",10,"",1,"","N",0 ! 0!NULL ! 0! 47453!2007-06-27 23:13:40.233

26181!2007-06-27 23:14:24.000!Microsoft OLE DB Provider for SQL Server !Timeout expired !IDispatch error #3121 !GL !"09098483|00|20070627|5914","GL",20,"00",04,"20070627","11:20:58","6688","00000400014881",1,1,24.3,24.3,"9319868102858","8924B00","0010",10,"",1,"","N",0 ! 0!NULL ! 0! 47453!2007-06-27 23:14:24.293

26182!2007-06-27 23:15:55.000!Microsoft OLE DB Provider for SQL Server !Timeout expired !IDispatch error #3121 !GL !"09098597|00|20070627|5710","GL",1,"00",04,"20070627","13:07:02","6688","00000400014882",1,1,70,70,"93281355163283","6971B00","0011",10,"",1,"","N",0 ! 0!NULL ! 0! 47453!2007-06-27 23:15:55.077
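Because mag_erlg.dat can contain many rows, it can help to summarise it before working through the error log screen. The following Python sketch assumes the file is "!"-delimited with the twelve columns shown in the header above, and that "!" never appears inside a field. The function names are hypothetical and this is not an EEE utility.

```python
from collections import Counter
from typing import Dict, List

# Column names copied from the header line of the sample file.
FIELDS = ["Id", "ErrDate", "ErrSource", "ErrDescription", "ErrData1",
          "ErrData2", "ErrRecord", "ErrReprocessCount", "EditedRecord",
          "ErrIgnored", "ActivityNumber", "AddDate"]


def parse_errorlog(text: str) -> List[Dict[str, str]]:
    """Turn each data row into a dict; header and separator rows are skipped."""
    records = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("!")]
        # Data rows have 12 fields and start with a numeric Id.
        if len(parts) != len(FIELDS) or not parts[0].isdigit():
            continue
        records.append(dict(zip(FIELDS, parts)))
    return records


def count_by_description(records: List[Dict[str, str]]) -> Counter:
    """Count how many rows share each error description."""
    return Counter(r["ErrDescription"] for r in records)


SAMPLE = '''\
Id !ErrDate !ErrSource !ErrDescription !ErrData1 !ErrData2 !ErrRecord !ErrReprocessCount!EditedRecord !ErrIgnored!ActivityNumber!AddDate
26180!2007-06-27 23:13:39.000!Microsoft OLE DB Provider for SQL Server !Timeout expired !IDispatch error #3121 !GL !"09098481|00|20070627|5932","GL",19,"00",04,"20070627","11:20:58","6688","00000400014881",2,1,24.3,48.6,"9319868102841","8924B00","0009",10,"",1,"","N",0 ! 0!NULL ! 0! 47453!2007-06-27 23:13:40.233
26181!2007-06-27 23:14:24.000!Microsoft OLE DB Provider for SQL Server !Timeout expired !IDispatch error #3121 !GL !"09098483|00|20070627|5914","GL",20,"00",04,"20070627","11:20:58","6688","00000400014881",1,1,24.3,24.3,"9319868102858","8924B00","0010",10,"",1,"","N",0 ! 0!NULL ! 0! 47453!2007-06-27 23:14:24.293
'''

records = parse_errorlog(SAMPLE)
for description, count in count_by_description(records).most_common():
    print(count, description)
```

A summary dominated by "Timeout expired" suggests the records simply need reprocessing, while foreign key or primary key errors call for the investigation described in the Data Loader Error Log section.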


sqlbackup.txt

This report provides a snapshot of the last database backup date for every database on your SQL Server. This report should be examined each week to make sure that your databases are being backed up regularly.


Databases and backups on server NBSQLSVR


Database Backup Date Comment

---------------------------------------- ------------- -------------------------------

BEDB Sep 3 2007 Backup is current within a day

eee Sep 2 2007 Backup is current within a day

eee Sep 2 2007 Backup is current within a day

eeetest Sep 2 2007 Backup is current within a day

HongKong Sep 2 2007 Backup is current within a day

master Sep 3 2007 Backup is current within a day

model Sep 3 2007 Backup is current within a day

msdb Sep 3 2007 Backup is current within a day

NewZealand Sep 3 2007 Backup is current within a day

ReportServer Aug 31 2007 Backup is current within a week

ReportServerTempDB Aug 31 2007 Backup is current within a week
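The weekly check of sqlbackup.txt can be partly automated. The sketch below assumes each row starts with the database name followed by a date in the "Sep 3 2007" style shown above, and that English month abbreviations are in use. The function names, the report date and the one-day threshold are assumptions for illustration.

```python
from datetime import date, datetime
from typing import List, Tuple


def parse_backup_rows(text: str) -> List[Tuple[str, date]]:
    """Return (database, last backup date) pairs from the report text."""
    backups = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue
        try:
            # Tokens 1 to 3 hold the date, for example "Sep 3 2007".
            backed_up = datetime.strptime(" ".join(parts[1:4]), "%b %d %Y").date()
        except ValueError:
            continue  # header, separator or title line
        backups.append((parts[0], backed_up))
    return backups


def stale_backups(backups: List[Tuple[str, date]], report_date: date,
                  max_age_days: int = 1) -> List[Tuple[str, date]]:
    """Return databases whose last backup is older than max_age_days."""
    return [(db, d) for db, d in backups
            if (report_date - d).days > max_age_days]


SAMPLE = """\
Database Backup Date Comment
eee Sep 2 2007 Backup is current within a day
master Sep 3 2007 Backup is current within a day
ReportServer Aug 31 2007 Backup is current within a week
"""

for db, last in stale_backups(parse_backup_rows(SAMPLE), date(2007, 9, 3)):
    print(db, "was last backed up on", last.isoformat())
```

With a report date of 3 September 2007, only ReportServer is listed, which matches the "current within a week" comment in the sample.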


Alert emails

Alert emails are intended to assist you with monitoring your EEE environment. Do not rely on them as the only means of notifying you of problems. Various conditions external to EEE can prevent alert emails being delivered.

There are many circumstances that will cause EEE to generate an alert email. The most common have been explained in the section below. If you receive any other alerts, please advise Magenta Retail Support (support@magentaretail.com.au) immediately so you can be advised of what action to take.

Disk Space Alerts

EEE has a scheduled task where you can configure alerts to be received when the disk space on nominated drives falls below a set threshold. This task is called Magenta DBM Monitor Disk Space and is scheduled to run via SQL Agent. By default, this job runs every hour, so if you receive a disk space alert email, the emails will continue every hour until the free disk space is above the specified threshold.

This scheduled task will look similar to the script below, with one line for each disk being monitored.


exec usp_dbm_diskspace_monitor @drivename = c, @spacelimit = 5000

exec usp_dbm_diskspace_monitor @drivename = d, @spacelimit = 15000

exec usp_dbm_diskspace_monitor @drivename = e, @spacelimit = 25000


When an alert email is generated, the email will have the subject of:-


SQL Server Job System: 'EEE_Magenta DBM Monitor Disk Space' completed on \\MACHINE

where \\MACHINE is the name of the computer the alert is for.


The email will contain information similar to:-


JOB RUN:'EEE_Magenta DBM Monitor Disk Space' was run on 6/09/2007 at 12:28:30 PM

DURATION:0 hours, 0 minutes, 0 seconds

STATUS: Failed

MESSAGES:The job failed. The Job was invoked by User MACHINE\Trevor. The last step to run was step 1 (step1).
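If alert emails are filed or forwarded automatically, the body shown above is simple enough to split into fields. This Python sketch assumes every field sits on its own line in the form "NAME:value" with an upper-case name, as in the sample. The function names are hypothetical.

```python
from typing import Dict


def parse_job_email(body: str) -> Dict[str, str]:
    """Split a job notification body into a dict keyed by field name."""
    fields = {}
    for line in body.splitlines():
        key, sep, value = line.partition(":")
        # Keep only lines such as "STATUS: Failed" whose label is upper case.
        if sep and key.strip().isupper():
            fields[key.strip()] = value.strip()
    return fields


def job_failed(fields: Dict[str, str]) -> bool:
    """True when the STATUS field reports a failure."""
    return fields.get("STATUS", "").lower() == "failed"


SAMPLE = r"""JOB RUN:'EEE_Magenta DBM Monitor Disk Space' was run on 6/09/2007 at 12:28:30 PM

DURATION:0 hours, 0 minutes, 0 seconds

STATUS: Failed

MESSAGES:The job failed. The Job was invoked by User MACHINE\Trevor. The last step to run was step 1 (step1).
"""

fields = parse_job_email(SAMPLE)
if job_failed(fields):
    print("Failed job:", fields["JOB RUN"])
```

The email only says that the job failed. The reason for the failure still has to be read from the job history in SQL Server Management Studio.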


To get further information on the failure, you need to go to Microsoft SQL Server Management Studio, and under SQL Agent, expand Jobs as shown below.


EEE Monitoring Figure16.JPG


Right click on the Magenta DBM Monitor Disk Space job, and then select “View History”. You will see a screen similar to below.


EEE Monitoring Figure17.JPG


To get further information on the failed job, click on the + beside the job with the EEE Monitoring Figure18.JPG beside it. Click on the next line which will also have a EEE Monitoring Figure18.JPG on it, and you will see a screen similar to below.


EEE Monitoring Figure19.JPG

The screen above shows that the free space on the C: drive (12634MB) is less than the specified limit of 13000MB. You need to free up space on the C: drive until the available space is greater than the specified limit.
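To work out how much space has to be recovered, subtract the available space from the limit. The Python sketch below does that arithmetic and also shows one way to read the current free space with the standard library. It is a rough aid only: the function names are hypothetical and the usp_dbm_diskspace_monitor procedure may measure free space differently.

```python
import shutil


def free_space_mb(path: str) -> int:
    """Free space on the drive holding 'path', in whole megabytes."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def shortfall_mb(free_mb: int, limit_mb: int) -> int:
    """How many MB must be freed to get back to the limit (0 if none)."""
    return max(0, limit_mb - free_mb)


# Figures from the example above: 12634MB free against a 13000MB limit.
print(shortfall_mb(12634, 13000), "MB must be freed")
```

For the example above the shortfall is 366MB. On the server itself, a call such as free_space_mb("C:\\") could supply the first argument.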


Data Loader Alerts

Whenever data loader encounters a situation that causes 50 errors within 5 seconds, a Data Loader alert email is generated with the subject of “Data Loader Error Count Exceeded”. This email will contain the following text.


Data Loader has exceeded 50 server errors within 5 seconds and it is suspending operations. Please review the Windows Application Event Log and take whatever steps are necessary to resolve the problem, then stop and restart the EEE Data Loader Service.


Data Loader will not process any data for any company/database until this problem has been resolved. Data flowing in from stores will be queued until Data Loader is restarted.


It is very important to realize that when you receive this email, if you take no action, then data loader will not process any further transactions into EEE!
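The 50-errors-in-5-seconds rule can be pictured as a sliding window over error timestamps. The Python sketch below is an illustration of that idea only. It is not the Data Loader implementation, the class name is hypothetical, and it assumes the limit is reached on the 50th error inside the window.

```python
from collections import deque


class ErrorRateMonitor:
    """Track error times and report when too many fall inside the window."""

    def __init__(self, max_errors: int = 50, window_seconds: float = 5.0):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record_error(self, timestamp: float) -> bool:
        """Record one error; return True when processing should be suspended."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Forget errors that happened before the start of the window.
        while self._timestamps and self._timestamps[0] < cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.max_errors


monitor = ErrorRateMonitor()
for i in range(50):
    suspended = monitor.record_error(i * 0.1)  # one error every 0.1 seconds
print("Suspend processing:", suspended)
```

A burst of 50 errors in under 5 seconds trips the monitor, while the same number of errors spread over a longer period does not. This is why a persistent fault, such as a database that cannot be reached, tends to trigger the alert.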


Usually, when the above alert is received, the first action you should take is to open Services in Administrative Tools, highlight the EEE Data Loader service and restart it. In the majority of cases this will fix the problem.


If data loader immediately generates another alert, then further investigation is usually required. To find the details of what is causing Data Loader to stop, go to the Windows Application Event Log where you will see a number of error messages relating to EEE Data Loader.


There can be many causes of these errors, so if you are unsure of what action to take, this should be escalated to Magenta Retail Support (support@magentaretail.com.au) as a matter of urgency.


EEE Error Log Alerts

When data loader inserts 10 or more records in the EEE errorlog table, an alert email is sent as a warning that this has occurred. The subject of the email will be:-


SQL Server Report for EEE Errorlog for Your Company


and the email will have text of:-


Threshold for Errorlog reached. This email should contain the

the latest errorlog.

There have been 17 errors from limit of 10.

companyname

--------------------------------------------------

Your Company


(1 rows affected)


This email will also contain an attachment called “mag_erlg.dat” which is a text file containing all the errors. These errors need to be immediately investigated to determine the cause. The problem may be as simple as timeouts where you need to re-process the records, or there may be more serious issues. For further information on the types of errors you may find, see the section titled “Data Loader Errors” earlier in this document.


EEE Export Maintenance Threshold Alerts

EEE Exporter is the program that extracts data from your EEE database and creates an isl-file.dat to be imported at each store. The EEE Exporter is usually configured to export both new and unacknowledged data. Unacknowledged data is resent if an acknowledgement hasn’t been received from the store within a predefined time period.


When exporter extracts each record to be exported to stores, a record is created in the exportmaintenance table in EEE. When an isl-file.dat is processed in each store, an acknowledgement record is generated for each record processed, and this acknowledgement record then removes the corresponding record from the exportmaintenance table.


Exporting unacknowledged records involves resending all records from exportmaintenance that are outside the resend time period.


When records have been in exportmaintenance for more than 7 days, an export maintenance threshold email alert is generated.


This email will have a subject of:-


SQL Server Alert System: 'UnAcknowledged Data Exceeded Threshold' occurred on \\MACHINE


And will contain text of:-


DATE/TIME:30/08/2007 8:00:00 AM


DESCRIPTION:Error: 60005 Severity: 16 State: 1 Branch 203 has 41 unacknowledged maintenance records > 7 days old.


COMMENT:Please Investigate.


JOB RUN:(None)


In addition to the above information, all export maintenance threshold alerts are also written to the SQL log as shown below.


EEE Monitoring Figure20.JPG
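When several branches raise this alert, it can be useful to pull the branch and record count out of each DESCRIPTION line. The Python sketch below assumes the wording matches the sample alert above. The function name is hypothetical and the pattern would need adjusting if the message text differs on your system.

```python
import re
from typing import Dict, Optional, Union

# Pattern based on the sample DESCRIPTION line in the alert email.
ALERT_PATTERN = re.compile(
    r"Branch\s+(?P<branch>\S+)\s+has\s+(?P<count>\d+)\s+"
    r"unacknowledged maintenance records\s*>\s*(?P<days>\d+)\s+days old"
)


def parse_unacknowledged_alert(description: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return the branch, record count and age limit, or None if no match."""
    match = ALERT_PATTERN.search(description)
    if match is None:
        return None
    return {
        "branch": match["branch"],
        "count": int(match["count"]),
        "days": int(match["days"]),
    }


SAMPLE = ("Error: 60005 Severity: 16 State: 1 Branch 203 has 41 "
          "unacknowledged maintenance records > 7 days old.")
print(parse_unacknowledged_alert(SAMPLE))
```

Collecting these values over a few days shows whether one branch is falling further behind or whether many branches are affected at once, which helps narrow down the possible causes.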


When these alerts occur, you need to determine why these alerts are being generated. Possible causes include:-


  • Isl-file.zip (or isl-file.dat) is still in the stores export directory. Possible communications failure.
  • Enabler is configured to only process import maintenance as a scheduled task, and Enabler is being shut down at night, preventing the Enabler scheduled tasks from running.
  • Isl-file.zip files are not configured to append the date to the name and are overwriting existing files in the store. Use EEE Enterprise Configurator to turn on the option to “Append date to export filenames” as shown below.

EEE Monitoring Figure21.JPG


  • Error occurring during import of host maintenance preventing acknowledgement records being created. Investigate cause of error.
  • Acknowledgement records being created by Enabler but not being sent to EEE. Possible communications problem or Enabler configuration issue.

While these errors persist, stores are most likely not receiving product maintenance, so these need to be investigated ASAP. If the reason cannot be determined, please forward details to Magenta Retail Support (support@magentaretail.com.au) for further investigation.


Database Backup Alerts

EEE has scheduled tasks which are used to perform database and transaction log backups. The backups can be configured to run on a 2 day cycle or a 7 day cycle. When one of these backups fails, an alert email is generated with a subject similar to:-


SQL Server Job System: 'eeehk_MWFSDBBackup' completed on \\DELL-TT


And containing text similar to what is shown below.


JOB RUN:'eeehk_MWFSDBBackup' was run on 6/09/2007 at 4:37:23 PM

DURATION:0 hours, 0 minutes, 3 seconds

STATUS: Failed

MESSAGES:The job failed. The Job was invoked by User DELL-TT\Trevor. The last step to run was step 1 (Step1_RunBackup).


The above information indicates that the backup has failed, but to find out why you need to look at the scheduled task. Open Microsoft SQL Server Management Studio and, under SQL Agent, expand Jobs as shown below.


EEE Monitoring Figure22.JPG


Right click on the failed database backup job, and then select “View History”. You will see a screen similar to below.


EEE Monitoring Figure23.JPG


To get further information on the failed job, click on the + beside the job with the EEE Monitoring Figure18.JPG beside it. Click on the next line which will also have a EEE Monitoring Figure18.JPG on it, and you will see a screen similar to below.


EEE Monitoring Figure24.JPG


The message shown above will then assist you in determining why the backup failed. If you are backing up to a network location, a network error can result in a corrupt backup device. The easiest fix for a corrupt backup device is to delete the file MWFSDB.BAK or TTSDB.BAK, as these devices get re-created when the job is next run. If you have deleted a backup device, you should run another backup ASAP.


Log Backup Alerts

EEE has scheduled tasks which are used to perform database and transaction log backups. The backups can be configured to run on a 2 day cycle or a 7 day cycle. When one of these backups fails, an alert email is generated with a subject similar to:-


SQL Server Job System: 'eeehk_MWFSLogBackup' completed on \\DELL-TT


And containing text similar to what is shown below.


JOB RUN:'eeehk_MWFSLogBackup' was run on 6/09/2007 at 4:52:02 PM

DURATION:0 hours, 0 minutes, 0 seconds

STATUS: Failed

MESSAGES:The job failed. The Job was invoked by User DELL-TT\Trevor. The last step to run was step 1 (Step1_RunBackup).


The procedure for investigating a log backup failure is the same as for a database backup failure detailed earlier.
