AI/Machine learning reference project

As part of one of our projects, we created an artificial intelligence module for a log management system.
The module collects data from many devices, such as servers, routers, and workstations, and processes a large volume of data, on the order of a few terabytes. Some of the data requires indexing character strings for analysis.

The module offers the following capabilities:

- predicting the use of hardware resources, e.g. memory and disk space;

- data clustering to detect anomalies.
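
The clustering item above can be illustrated with a minimal sketch. It is not the project's Spark implementation: it is a plain-Python k-means that learns cluster centres from baseline readings and flags new readings that lie far from every centre. The feature pairs (CPU %, memory %), the function names `kmeans` and `find_anomalies`, and the distance threshold are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct baseline points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def find_anomalies(observations, centroids, threshold):
    """Return observations farther than `threshold` from every centroid."""
    return [
        p for p in observations
        if min(math.dist(p, c) for c in centroids) > threshold
    ]


# Hypothetical baseline readings: (CPU %, memory %) for two groups of hosts.
baseline = [(10, 20), (12, 22), (11, 19), (9, 21),
            (80, 90), (82, 88), (79, 91), (81, 89)]
centroids = kmeans(baseline, k=2)

# Score new readings against the learned centres (threshold is an assumption).
new_readings = [(11, 21), (45, 50)]
flagged = find_anomalies(new_readings, centroids, threshold=10.0)
```

Fitting on baseline data and scoring new readings separately keeps an outlier from becoming its own cluster centre. In a Spark deployment the same idea would more likely be expressed with the library's own k-means rather than hand-written loops.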

Technology:
Apache Spark

AI Algorithms:
Multilayer Perceptron, Linear Regression, Random Forest Regression, Simple/Geometric Moving Average, Linear Regression Trend, Trend Line (with threshold point), k-means (unsupervised learning).
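
As an illustration of the trend-based entries in this list, the sketch below fits a least-squares line to evenly spaced usage samples and estimates how many sampling steps remain before the line reaches a threshold. Reading "Trend Line (with threshold point)" this way is an assumption, as are the function names `fit_trend` and `steps_until_threshold` and the sample disk-usage figures; it is plain Python, not the project's Spark code.

```python
def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    variance = sum((t - mean_t) ** 2 for t in range(n))
    slope = covariance / variance
    return slope, mean_v - slope * mean_t


def steps_until_threshold(values, threshold):
    """Steps after the last sample until the fitted line reaches `threshold`.

    Returns None when the trend is flat or decreasing, and 0.0 when the
    fitted line has already reached the threshold.
    """
    slope, intercept = fit_trend(values)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope  # time index where line hits threshold
    return max(0.0, crossing - (len(values) - 1))


# Hypothetical daily disk usage in percent, with an alert threshold of 90 %.
disk_usage = [50, 52, 54, 56, 58]
days_left = steps_until_threshold(disk_usage, threshold=90)
```

A straight-line fit is the simplest of the listed options; a moving average smooths noise but does not extrapolate, which is why a fitted line is used here for the threshold estimate.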

Data sources:
Elasticsearch, relational databases, flat files