Architecture Overview - Machine Learning for Telecommunication

Architecture Overview

Deploying this solution builds the following environment in the AWS Cloud.

Figure 1: Machine Learning for Telecommunication architecture on AWS

The AWS CloudFormation template deploys an Amazon Simple Storage Service (Amazon S3) bucket that contains a synthetic IP Data Record (IPDR) dataset in either Abstract Syntax Notation One (ASN.1) format or call detail record (CDR) format. The template also deploys an AWS Glue job that converts the dataset from CSV to compressed Parquet format, and an Amazon SageMaker instance with machine learning (ML) Jupyter notebooks. The notebooks include instructions that guide the user through constructing additional data features for future models.
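As a rough illustration of what constructing additional data features can look like, the following sketch parses a few CDR-style rows from CSV and derives extra fields for each record. The column names (`caller`, `start_time`, `duration_s`, `bytes`), the sample rows, and the derived features are hypothetical and chosen only for illustration; they are not taken from the solution's dataset or notebooks.

```python
import csv
import io
from datetime import datetime

# Hypothetical CDR-style sample; the real dataset's schema may differ.
SAMPLE_CSV = """caller,start_time,duration_s,bytes
A100,2019-03-01T02:15:00,60,120000
A100,2019-03-01T14:30:00,300,900000
B200,2019-03-02T23:05:00,0,0
"""


def add_features(row):
    """Return a copy of one CSV row with derived, model-friendly features."""
    start = datetime.fromisoformat(row["start_time"])
    duration = int(row["duration_s"])
    volume = int(row["bytes"])
    return {
        **row,
        "hour_of_day": start.hour,
        # Assumed definition of "night": before 06:00 or from 22:00 onward.
        "is_night": start.hour < 6 or start.hour >= 22,
        # Guard against zero-length sessions when computing throughput.
        "bytes_per_second": volume / duration if duration > 0 else 0.0,
    }


def load_features(csv_text):
    """Parse CSV text and attach derived features to every record."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [add_features(row) for row in reader]


features = load_features(SAMPLE_CSV)
```

Each derived field is computed from a single record, so the same function could be applied row by row to a larger file. Which features are actually useful depends on the data and the model being built.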

When the workflow is triggered in the Jupyter web interface, the solution ingests data from the Amazon S3 bucket into the SageMaker cluster, which enables the user to run the Jupyter notebooks on the dataset. Amazon S3 Select reads the compressed Parquet data produced by the AWS Glue job. The notebooks preprocess the data, extract features, and divide the data into training and testing datasets. ML algorithms process the training dataset to develop a model that identifies anomalies and predicts future anomalies. The solution then tests the model’s predictions against the testing dataset, identifies false positives and false negatives, and retrains to fine-tune the model.
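The split, train, and test steps described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the solution's notebooks. The data is synthetic, the function names are hypothetical, and a simple median and median-absolute-deviation (MAD) threshold stands in for the ML algorithms the solution actually uses.

```python
import random
import statistics


def make_records(n=200):
    """Build synthetic (value, is_anomaly) pairs; every 20th record is an anomaly."""
    records = []
    for i in range(n):
        if i % 20 == 0:
            records.append((100.0, True))
        else:
            records.append((10.0 + (i % 7) - 3, False))
    return records


def train_test_split(records, train_fraction=0.8, seed=42):
    """Shuffle reproducibly, then divide into training and testing datasets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit(train):
    """'Train' a stand-in model: a robust center and scale (median and MAD)."""
    values = [value for value, _ in train]
    center = statistics.median(values)
    mad = statistics.median([abs(v - center) for v in values])
    return center, 1.4826 * mad  # 1.4826 scales MAD to a std-dev estimate


def predict(model, value, threshold=6.0):
    """Flag a value as anomalous when its robust z-score exceeds the threshold."""
    center, scale = model
    return abs(value - center) / scale > threshold


def evaluate(model, test):
    """Count true/false positives and negatives on the testing dataset."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for value, is_anomaly in test:
        flagged = predict(model, value)
        if flagged:
            counts["tp" if is_anomaly else "fp"] += 1
        else:
            counts["fn" if is_anomaly else "tn"] += 1
    return counts


train, test = train_test_split(make_records())
model = fit(train)
counts = evaluate(model, test)
```

On real data, nonzero `fp` or `fn` counts are the signal to adjust features or parameters (here, the hypothetical `threshold`) and fit again, which loosely corresponds to the retraining step.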