Understanding the Basics of AI at Edge

Edge applications are popular today because of the advanced features they offer, and keeping data local can be safer than sending it to the cloud. AI has drawn wide interest because of the capabilities it provides.
"The edge", in simple terms, means local processing, and it is used wherever low latency is required.
Low latency describes a computer network that is optimized to process a very high volume of messages with minimal delay.

These networks provide real-time access to rapidly changing data.

AI at the edge is important because:
1. It reduces network impact, since data does not have to travel to the cloud
2. It satisfies latency constraints, because inference happens locally
3. It is more secure than the cloud: personal data stays on the device rather than being shared with a remote server
4. Models can be optimized for local inference


Some common applications of AI at the edge:
1. Self-driving cars
2. Personal fitness-tracker watches
3. Surgical robots
4. Voice assistants such as Alexa and Google Home
We can build various edge applications, such as a People Counter app.

The pipeline behind this app is:

1. Convert the model to IR format
2. Use it with the Inference Engine
3. Process the output to gather relevant statistics
4. Send the statistics to a server
5. Analyse the performance
6. Analyse further use cases for the server

The edge application is then built using this pipeline.
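As a rough illustration, the steps above can be sketched as a skeleton in Python. Every function name below is a placeholder for this sketch, not a real OpenVINO or MQTT API:

```python
# Illustrative skeleton of the People Counter pipeline.
# All function names here are placeholders, not real library calls.

def load_ir_model(xml_path, bin_path):
    """Steps 1-2: stand-in for loading an IR model into the Inference Engine."""
    return {"xml": xml_path, "bin": bin_path}

def run_inference(model, frame):
    """Step 2: stand-in for inference -> list of detected person boxes."""
    return [(10, 20, 50, 80)]  # one dummy bounding box (xmin, ymin, xmax, ymax)

def gather_statistics(detections):
    """Step 3: derive relevant statistics from the raw detections."""
    return {"count": len(detections)}

def send_to_server(stats, topic="person/stats"):
    """Step 4: a real app would publish this over MQTT to a server."""
    return f"published {stats} on {topic}"

model = load_ir_model("person-detection.xml", "person-detection.bin")
stats = gather_statistics(run_inference(model, frame=None))
print(send_to_server(stats))
```

Steps 5 and 6 (analysing performance and further server-side use cases) happen outside this loop, once statistics accumulate on the server.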
Types of Computer Vision Models
This article discusses the various types of computer vision models and the applications based on them.

Different types of Computer Vision Models

1. Classification: determines the "class" of an input image. It may be a binary Yes/No decision or cover anywhere from a handful to thousands of classes.
2. Detection: finds objects and their locations in the input image, returning X and Y coordinates (bounding boxes) for each object.
Models such as car detection in traffic fall under this category.
3. Segmentation: classifies each and every pixel of the image, so even the smallest details of the input can be examined.

Segmentation can distinguish a single class from the background, or handle tens of classes. It is also useful in post-processing, where very small segmented areas can be removed or smoothed out.

It comes in two types:
1. Semantic segmentation: all objects of the same class are treated as a single region.
2. Instance segmentation: each object of a class is kept separate, even from other objects of the same class.
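The difference between the two can be illustrated with a toy one-dimensional "mask" in Python (the labels are made up purely for illustration):

```python
# Toy 6-pixel "image" containing two cars against a background.

# Semantic segmentation: every car pixel gets the same class label.
semantic_mask = ["car", "car", "bg", "bg", "car", "car"]

# Instance segmentation: each car keeps its own identity.
instance_mask = ["car_1", "car_1", "bg", "bg", "car_2", "car_2"]

# The semantic view cannot tell the two cars apart...
print(len({p for p in semantic_mask if p != "bg"}))   # 1 distinct label
# ...while the instance view can.
print(len({p for p in instance_mask if p != "bg"}))   # 2 distinct labels
```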

Some well-known architectures based on these models are:

1. SSD is an object detection network that combines classification with object detection through the use of default bounding boxes at different layers and network levels.
2. ResNet uses residual layers to skip over sections of layers, avoiding the vanishing-gradient problem in deep neural networks.
3. MobileNet uses layers such as 1×1 convolutions to cut down on computational complexity and network size, leading to fast inference without a substantial decrease in accuracy.
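A common back-of-envelope calculation shows why MobileNet-style depthwise-separable convolutions (a depthwise k×k step followed by a pointwise 1×1 convolution) are so much cheaper than a standard convolution; the channel sizes below are example values:

```python
# Rough parameter-count comparison for one convolutional layer.
k, c_in, c_out = 3, 64, 128  # kernel size, input channels, output channels

standard = k * k * c_in * c_out                  # standard 3x3 convolution
separable = k * k * c_in + 1 * 1 * c_in * c_out  # depthwise 3x3 + pointwise 1x1

print(standard)                           # 73728 parameters
print(separable)                          # 8768 parameters
print(round(standard / separable, 1))     # ~8.4x fewer parameters
```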

Model Optimizer in AI and its types
This article covers the Model Optimizer and its optimization techniques.
The Model Optimizer converts models from multiple frameworks to IR for the Inference Engine, shrinking and speeding up the model in the process.

These optimizations are a trade-off: lowering the precision, for example, can cause some loss of accuracy.

There are three optimization techniques:
1. Quantization: reduces precision, for example from FP32 to FP16 or INT8.
It can cause some loss of accuracy, but leads to smaller and much faster models.
2. Freezing: primarily used with TensorFlow models; it converts variables to constants and strips out operations needed only for training. (In a training context, "freezing" can also mean fixing individual layers so that only a subset is fine-tuned.)
3. Fusion: as the name suggests, this technique combines multiple layer operations into one. It is particularly useful for parallel computation, such as GPU inference.

For example, batch normalization, activation, and convolution layers can be combined into a single operation that produces one combined output.

Some of the supported frameworks and their original developers are as follows:
1. Caffe - UC Berkeley
2. MXNet - Apache Software Foundation
3. TensorFlow - Google
4. ONNX - Facebook and Microsoft

Important Terms related to Artificial Intelligence in Edge Applications

This is a review of the important terms used in edge applications with reference to artificial intelligence.
Model Optimizer
It is a command-line tool used for converting a model from one of the supported frameworks to an Intermediate Representation (IR), including certain performance optimizations, that is compatible with the Inference Engine.
Optimization Techniques
Optimization techniques adjust the original trained model in order to either reduce the size of or increase the speed of a model in performing inference.

The various optimization techniques preferred are:
1. Quantization
2. Freezing
3. Fusion

Quantization
It is used to reduce the precision of weights and biases (to lower-precision floating point values or integers), thereby reducing compute time and size with some (often minimal) loss of accuracy. More generally, quantization constrains an input from a large set of values to a discrete set.
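A minimal sketch of symmetric INT8 quantization, mapping the largest weight magnitude to 127. This is illustrative only; it is not the exact scheme any particular toolkit uses:

```python
# Symmetric INT8 quantization sketch on a handful of made-up weights.
weights = [0.50, -1.25, 0.03, 2.00]

scale = max(abs(w) for w in weights) / 127   # largest magnitude maps to 127
quantized = [round(w / scale) for w in weights]
dequantized = [q * scale for q in quantized]

print(quantized)    # small 8-bit integers instead of 32-bit floats
print([round(w - d, 4) for w, d in zip(weights, dequantized)])  # rounding error
```

The stored model keeps only the integers and the single `scale` factor, which is where the size reduction comes from.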


Freezing
In TensorFlow this removes metadata that is only needed for training, as well as converting variables to constants. The term is also used in training neural networks, where it often refers to freezing layers themselves in order to fine-tune only a subset of layers.


Fusion
The process of combining certain operations together into one operation, thereby needing less computational overhead. For example, a batch normalization layer, activation layer, and convolutional layer could be combined into a single operation. This can be particularly useful for GPU inference, where the separate operations may occur on separate GPU kernels, while a fused operation occurs on one kernel, thereby incurring less overhead in switching from one kernel to the next.
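The batch-norm case can be checked numerically. The sketch below folds a batch-norm layer into a toy scalar "convolution" (all numbers are made up) and verifies that the fused path matches the separate one:

```python
import math

# Fold batch norm into a (scalar, toy) convolution weight - the idea behind fusion.
w, b = 2.0, 0.5                       # conv weight and bias
gamma, beta = 1.5, 0.1                # BN scale and shift
mean, var, eps = 0.4, 0.25, 1e-5      # BN running statistics

std = math.sqrt(var + eps)
w_fused = w * gamma / std                   # folded weight
b_fused = (b - mean) * gamma / std + beta   # folded bias

x = 3.0
separate = (w * x + b - mean) * gamma / std + beta  # conv, then BN, as two ops
fused = w_fused * x + b_fused                       # one fused op
print(abs(separate - fused) < 1e-9)                 # both paths agree
```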

Supported Frameworks

The OpenVINO Toolkit currently supports models from five frameworks (which themselves may support additional model formats): Caffe, TensorFlow, MXNet, ONNX, and Kaldi.

Caffe stands for “Convolutional Architecture for Fast Feature Embedding” and is used in deep learning. It was originally built at UC Berkeley.

TensorFlow is an open-source deep learning library originally built at Google. As an Easter egg for anyone who has read this far into the glossary, this was also your instructor’s first deep learning framework they learned, back in 2016 (pre-V1!).

Apache MXNet is an open-source deep learning library built by Apache Software Foundation.

The “Open Neural Network Exchange” (ONNX) framework is an open-source deep learning library originally built by Facebook and Microsoft. PyTorch and Apple ML models can be converted to ONNX models.

While still open-source like the other supported frameworks, Kaldi is mostly focused around speech recognition data, with the others being more generalized frameworks.

Intermediate Representation
A set of files converted from one of the supported frameworks, or available as one of the Pre-Trained Models. This has been optimized for inference through the Inference Engine, and may be at one of several different precision levels.

Made of two files:
· .xml - Describes the network topology or architecture
· .bin - Contains the weights and biases in a binary file
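As a rough sketch, the topology file can be inspected with a standard XML parser. The snippet below uses a heavily simplified stand-in for an IR .xml; the real schema produced by the Model Optimizer contains many more attributes:

```python
import xml.etree.ElementTree as ET

# A simplified stand-in for an IR .xml topology file (illustrative only).
ir_xml = """
<net name="toy_model" version="10">
  <layers>
    <layer id="0" name="input" type="Parameter"/>
    <layer id="1" name="conv1" type="Convolution"/>
    <layer id="2" name="output" type="Result"/>
  </layers>
</net>
"""

net = ET.fromstring(ir_xml)
print(net.get("name"))                             # network name: toy_model
print([l.get("type") for l in net.iter("layer")])  # layer types in the topology
```

The matching .bin file is pure binary weight data, so it carries no structure of its own; the .xml tells the Inference Engine how to interpret it.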

Supported Layers
These are the layers that can be converted directly from supported framework layers to Intermediate Representation layers through the Model Optimizer. While nearly every layer you will ever use in the supported frameworks is supported, there is sometimes a need for handling Custom Layers.

Custom Layers

Custom layers are those outside the list of known, supported layers, and are typically needed only rarely. Handling custom layers for use with the Model Optimizer depends somewhat on the framework: beyond registering the custom layer as an extension, you have to follow instructions specific to that framework. Once registered, the extension can be used whenever needed.

OpenCV: An open-source library supported by Intel

OpenCV is an open-source library for image processing and computer vision techniques. It is built on a highly optimized C++ back-end, and it can also be used from Java and Python. It is widely used in edge applications, offering built-in computer vision techniques for handling image processing.

OpenCV is used for:
1. Capturing and reading frames/images from a video stream
2. Resizing an image as per requirements
3. Converting an image from one colour space to another
4. Various computer vision techniques, such as Canny edge detection, which extracts edges from an image. It also supports various extensions, enabling functions such as face detection.

Some of the useful OpenCV functions are as follows:
1. cv2.VideoCapture(): reads or processes frames from a video
2. cv2.resize(): resizes an image to meet particular requirements or constraints
3. cv2.cvtColor(): converts between colour formats; OpenCV reads frames in BGR, so a conversion (for example to RGB) is needed whenever a model expects a different format
4. cv2.rectangle(): draws bounding boxes onto an output image
5. cv2.imwrite(): saves a particular image to disk
Terms to remember while deploying an edge app

OpenCV
A computer vision (CV) library filled with many different computer vision functions and other useful image and video processing and handling capabilities.

MQTT
A publisher-subscriber protocol often used for IoT (Internet of Things) devices due to its lightweight nature. The “paho-mqtt” library is a common way of working with MQTT in Python.

Publish-Subscribe Architecture
A messaging architecture made up of publishers that send messages to some central broker, without knowing of the subscribers themselves. These messages can be posted on some given “topic”, which the subscribers can then listen to without having to know the publisher itself, just the “topic”.

Publisher
In a publish-subscribe architecture, the entity that is sending data to a broker on a certain “topic”.

Subscriber
In a publish-subscribe architecture, the entity that is listening to data on a certain “topic” from a broker.

Topic
In a publish-subscribe architecture, data is published to a given topic, and subscribers to that topic can then receive that data.
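The architecture can be illustrated with a toy in-memory broker in Python; a real IoT app would use MQTT (for example via the paho-mqtt library) rather than this sketch:

```python
# A toy in-memory broker illustrating publish-subscribe (not MQTT itself).
class Broker:
    def __init__(self):
        self.subscribers = {}          # topic -> list of subscriber callbacks

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # The publisher only knows the broker and the topic,
        # never the subscribers themselves.
        for callback in self.subscribers.get(topic, []):
            callback(message)

broker = Broker()
received = []
broker.subscribe("person/stats", received.append)
broker.publish("person/stats", {"count": 3})
broker.publish("other/topic", {"count": 99})   # no subscriber, silently dropped
print(received)   # [{'count': 3}]
```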

FFmpeg
Software that can help convert or stream audio and video. In the course, the related ffserver software is used to stream to a web server, which can then be queried by a Node server for viewing in a web browser.

Flask
A Python framework that is useful for web development and another potential option for video streaming to a web browser.

Node Server
A web server built with Node.js that can handle HTTP requests and/or serve up a webpage for viewing in a browser.

To understand Node.js, consider a social media site such as Instagram or Facebook. We often see posts based on our recent searches, clicks, and overall usage, and Node at the back end is responsible for this. Node allows JavaScript to run outside of the browser, gather the relevant posts on the server, and send them to us as required.

MLSA 2020 || Student || Connect with me at: www.drillitdown.com ||
