Preparing the Data
Learn how to prepare your data for anomaly detection.
Data in server logs
Web servers such as Nginx, Apache, and IIS record details about every request they handle in access logs. The data in these logs can be instrumental in identifying anomalies.
We will analyze the logs of a web application, so the data we are most interested in is the timestamp and the status code of every response from the server. To illustrate the type of insight we can draw from this data:
- A sudden increase in the 500 status code: You may have a problem with the server. Did you push a new version? Is there an external service you are using that started failing in unexpected ways?
- A sudden increase in the 400 status code: You may have a problem with the client. Did you change some validation logic and forget to update the client? Did you make a change and forget to handle backward compatibility?
- A sudden increase in the 404 status code: You may have an SEO problem. Did you move some pages and forget to set up redirects? Is there some script kiddie running a scan on your site?
- A sudden increase in the 200 status code: You either have some significant legitimate traffic coming in or are under a DoS attack. Either way, you probably want to check where it is coming from.
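To make the idea of "a sudden increase" concrete, the timestamp and status code first have to be pulled out of each log line and counted over time. The following is a minimal sketch in Python, not the lesson's own method. It assumes log lines in the common/combined format that Nginx and Apache use by default (IIS writes a different format by default and would need its own pattern). The regular expression, the function names `parse_line` and `count_statuses_per_minute`, and the sample lines are all illustrative assumptions.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout (common/combined log format):
#   host ident user [timestamp] "request" status size ...
LOG_PATTERN = re.compile(
    r'\[(?P<timestamp>[^\]]+)\] "[^"]*" (?P<status>\d{3}) '
)


def parse_line(line):
    """Return (timestamp, status_code) for one log line, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    # Timestamps in this format look like 10/Oct/2023:13:55:36 +0000.
    timestamp = datetime.strptime(match.group("timestamp"), "%d/%b/%Y:%H:%M:%S %z")
    return timestamp, int(match.group("status"))


def count_statuses_per_minute(lines):
    """Count responses per (minute, status_code) pair, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        timestamp, status = parsed
        minute = timestamp.replace(second=0, microsecond=0)
        counts[(minute, status)] += 1
    return counts


# Hypothetical sample lines, made up for illustration only.
SAMPLE_LINES = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.8 - - [10/Oct/2023:13:55:40 +0000] "GET /about.html HTTP/1.1" 200 1045',
    '203.0.113.9 - - [10/Oct/2023:13:55:59 +0000] "POST /api/orders HTTP/1.1" 500 512',
    '203.0.113.9 - - [10/Oct/2023:13:56:02 +0000] "POST /api/orders HTTP/1.1" 500 512',
    '203.0.113.5 - - [10/Oct/2023:13:56:10 +0000] "GET /old-page HTTP/1.1" 404 153',
    'this line is not a valid access log entry',
]

if __name__ == "__main__":
    for (minute, status), total in sorted(count_statuses_per_minute(SAMPLE_LINES).items()):
        print(minute.isoformat(), status, total)
```

Counting per minute and per status code yields one small time series for each status code, which is the shape of data the insights above rely on. The one-minute bucket is an arbitrary choice for this sketch; a coarser or finer interval may suit a particular application better.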