Projects | iVision

Deep Object Removal on Satellite Images

Accurately removing objects from a scene without leaving traces is a difficult task, and doing so with no background information is even more challenging. Existing methods based on convolutional neural networks (CNNs) provide satisfactory solutions but have been observed to be ineffective at propagating information across distant spatial positions in an image, which results in blurred and distorted reconstruction of the missing areas. In the present study, contextual attention-based approaches help reconstruct the background by using a contextual loss to find the best-matching texture for each patch. Our goal is to apply the best-suited technique to satellite images so that the selected information is concealed without visible traces.
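The patch-matching idea can be illustrated with a minimal sketch. It assumes, as is common in contextual attention for inpainting, that a patch inside the hole is compared with every background patch by cosine similarity and that a softmax over the scaled scores gives the attention weights. The function names, the flattened-list patch representation and the `scale` value are hypothetical and are not taken from the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two flattened patches (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def attend_to_background(hole_patch, background_patches, scale=10.0):
    """Return (weights, reconstruction) for one patch inside the hole.

    weights: softmax over scaled cosine similarities, one per background patch.
    reconstruction: attention-weighted sum of the background patches.
    The scale factor is an assumed value that sharpens the softmax.
    """
    scores = [scale * cosine_similarity(hole_patch, bg) for bg in background_patches]
    max_score = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    length = len(background_patches[0])
    reconstruction = [
        sum(w * bg[i] for w, bg in zip(weights, background_patches))
        for i in range(length)
    ]
    return weights, reconstruction
```

In a trained network the comparison would be made on learned feature maps rather than on raw pixel values, and the weights would be computed for all hole patches at once.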

Military Target Detection and Classification in Satellite Images

Target detection in satellite images is a challenging problem due to the varying size, orientation and background of the target objects. In this project, we employ modern deep learning models, including the Region-based Convolutional Neural Network (R-CNN) and the YOLO v3 object detector, for detection and recognition of military targets in a scene. Military targets include aircraft, armored vehicles and oil tanks. The deep learning-based object detectors provide the location and centroid of each target in satellite images and tracking results for moving targets in videos. Synthetic aperture radar (SAR) images are pre-filtered to reduce noise and thereby improve recognition. Encouraging experimental results are obtained on benchmark datasets, which show that deep learning provides robust and efficient solutions for learning representations from massive satellite imagery.
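As a small illustration of how a centroid can be derived from a detector's output, the sketch below assumes each detection is an axis-aligned bounding box with a class label and a confidence score. The `Detection` class, its field names and the 0.5 score threshold are hypothetical and do not reflect the project's actual data format.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detector output: class label, confidence and box corners in pixels."""
    label: str
    score: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def centroid(self):
        """Centre of the bounding box as (x, y)."""
        return ((self.x_min + self.x_max) / 2.0, (self.y_min + self.y_max) / 2.0)


def target_centroids(detections, min_score=0.5):
    """Keep detections at or above min_score and return (label, centroid) pairs."""
    return [(d.label, d.centroid()) for d in detections if d.score >= min_score]
```

For example, a box spanning (10, 20) to (50, 60) has its centroid at (30.0, 40.0); detections below the threshold are dropped before the centroids are reported.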

Saliency based Visualization of Hyperspectral Images

The problem of visualizing hyperspectral images on tri-stimulus displays arises from the fact that they contain hundreds of spectral bands, while commonly used display devices support only three bands/channels, namely blue, green and red. Therefore, for visualization a hyperspectral image has to be reduced to three bands. The main challenge in this band reduction is to retain and display the maximum information available in the hyperspectral image. The human visual system focuses attention on certain regions in images called “salient regions”. Therefore, to provide a comprehensive representation of hyperspectral data on tri-stimulus displays, we propose a weighted fusion of saliency maps and hyperspectral bands. The efficacy of the proposed algorithm has been demonstrated by tests on both urban and countryside images from the AVIRIS and ROSIS sensors.
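One way a saliency-weighted fusion could work is sketched below. It is an assumption-laden illustration, not the proposed algorithm: the bands are split into three contiguous groups, and each display channel is the per-pixel average of its group weighted by one saliency map per band. The grouping rule, the function names and the fallback to a plain mean where saliency is zero are all hypothetical choices.

```python
def fuse_group(bands, saliency_maps):
    """Per-pixel saliency-weighted average of one group of bands.

    bands and saliency_maps are equally long lists of 2-D lists (rows x cols).
    Where all saliency weights are zero, fall back to the plain mean.
    """
    rows, cols = len(bands[0]), len(bands[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            weights = [s[r][c] for s in saliency_maps]
            total = sum(weights)
            if total == 0.0:
                fused[r][c] = sum(b[r][c] for b in bands) / len(bands)
            else:
                fused[r][c] = sum(w * b[r][c] for w, b in zip(weights, bands)) / total
    return fused


def fuse_to_three_channels(bands, saliency_maps):
    """Reduce N >= 3 bands to three channels using contiguous band groups."""
    n = len(bands)
    if n < 3 or n != len(saliency_maps):
        raise ValueError("need at least three bands and one saliency map per band")
    bounds = [0, n // 3, 2 * n // 3, n]
    return [
        fuse_group(bands[bounds[i]:bounds[i + 1]], saliency_maps[bounds[i]:bounds[i + 1]])
        for i in range(3)
    ]
```

The three returned images would then be mapped to the display channels; how the saliency maps themselves are computed is outside the scope of this sketch.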

Artificial Intelligence-Based Video Analytics for Traffic Management (AiVAM)

Traffic regulation faces enormous challenges due to rapid growth in the number of vehicles. Detecting the vehicle type is an important part of traffic surveillance, as it helps with traffic monitoring as well as congestion handling. Growth in the automobile industry has resulted in a wide variety of vehicles on the roads, including private vehicles, public transport and vehicles for many other purposes. Traffic surveillance cameras capture videos from various angles and orientations, under varying lighting and visibility conditions. Therefore, machine learning-based models for vehicle classification need to be trained on a diverse dataset for robust performance. This research mainly focuses on the development and deployment of a deep learning-based system to analyze vehicular traffic. The model uses a vehicle's apparent features to detect, classify and count all the vehicles in a live traffic feed. The multi-class (around 55 classes) model is able to classify vehicle make, model and year of manufacture.

Machine Learning based Classification of RADAR and LTE Signals in 5GHz Band

The Citizens Broadband Radio Service (CBRS) has been adopted by the FCC in the USA for shared wireless broadband in the 3550-3700 MHz band. It utilizes a Spectrum Access System (SAS), a cloud-based, highly automated database and controller that coordinates frequency usage in the CBRS spectrum. This band is currently used by naval radars and satellite ground stations. For efficient utilization of this band by mobile network operators (MNOs), we propose a CNN-based classifier that can distinguish between radar and LTE signals. Real signals captured with USRPs are used to analyze different interference scenarios.

Leaf Water Content Estimation using Infrared Spectra

Accurate and timely measurement of Leaf Water Content (LWC) is crucial for estimating vegetation health and productivity, planning irrigation, forecasting drought and predicting woodland fire. This study is mainly focused on retrieval of LWC from the mid-infrared (MIR) and thermal infrared (TIR) range (2.50-14.0 μm) of the electromagnetic spectrum using regression. The automatic representation learning property of deep neural networks is utilized to select spectral wavebands with high predictive performance, resulting in a high adjusted R² and a low RMSE. The experiments show high accuracy, with an adjusted R² of 0.93 and an RMSE of 3.1%. The study also demonstrated that MIR is more sensitive to variation in LWC than TIR and that the combined use of both spectra enhances the predictive performance in retrieval of LWC. The findings of this study will allow future space missions to position wavebands at sensitive regions for characterizing vegetation stresses.
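For reference, the two reported metrics can be computed as in the sketch below, which uses the standard definitions of RMSE and adjusted R². The function names and the `num_predictors` parameter (the number of input wavebands used by the regression) are illustrative and not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def adjusted_r2(y_true, y_pred, num_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n = len(y_true)
    if n - num_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors plus one")
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("y_true has zero variance")
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - num_predictors - 1)
```

The adjustment penalizes models that use many predictors relative to the number of samples, which is why it is a natural companion to waveband selection.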

Ink Analysis for Forgery Detection in Hyperspectral Document Images

Ink analysis allows for determination of ink age, detection of forgery and identification of the pen or writer in document images. The spectral information of inks in hyperspectral document images provides valuable information about the underlying material and thus helps discriminate inks based on their unique spectral signatures even if they have the same color. In this project, the spectral responses of ink pixels are extracted from a hyperspectral document image, reshaped to a CNN-friendly image format and fed to a CNN for classification. The proposed method effectively identifies different ink types present in unbalanced proportions in a hyperspectral document image for forgery detection and achieves very high accuracy on the UWA Writing Ink Hyperspectral Images (WIHSI) database. This research opens a new window for research on automated forgery detection in hyperspectral document images using deep learning.
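The reshaping step might look like the following sketch. The exact layout used in the project is not described, so this assumes one simple option: the per-pixel spectral vector is padded to the next perfect square and laid out row by row as a small square image. The function name and the zero padding value are hypothetical.

```python
import math


def spectrum_to_image(spectrum, pad_value=0.0):
    """Reshape a 1-D spectral response into a square 2-D list, row-major.

    The vector is padded with pad_value up to the next perfect square.
    """
    n = len(spectrum)
    if n == 0:
        raise ValueError("spectrum must not be empty")
    side = math.isqrt(n - 1) + 1  # smallest side with side * side >= n
    padded = list(spectrum) + [pad_value] * (side * side - n)
    return [padded[r * side:(r + 1) * side] for r in range(side)]
```

A five-band response, for instance, becomes a 3 x 3 image whose last four entries are padding; each ink pixel then yields one such image as CNN input.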

Ligature Classification for Urdu Nastaleeq Text Recognition

Nastaleeq text recognition is one of the most challenging problems in Urdu document analysis and recognition. The cursive nature of Urdu text makes the segmentation of characters very difficult; however, segmentation-free, ligature-based techniques are more promising. This study focuses on classification and recognition of Urdu ligatures using a Convolutional Neural Network (CNN). The proposed system has been trained on our self-generated dataset comprising 18,000 Urdu ligatures divided into 98 classes. The CNN achieved very high classification rates compared to traditional methods.

Age Invariant and Disguise Invariant Facial Recognition

The facial appearance of a person changes considerably over time, which induces significant intra-class variations and makes face recognition a very challenging task. Most face recognition studies that have addressed the ageing problem in the past have employed complex models and handcrafted features with strong parametric assumptions. In this work, we propose a novel deep learning framework that extracts age-invariant and generalized features from facial images of the subjects. The proposed model, trained on facial images from a small part (20-30%) of the subjects' lifespan, correctly identifies them throughout their lifespan. A variety of pre-trained Convolutional Neural Networks (CNNs) are compared in terms of accuracy, time complexity and computational complexity to select the most suitable network for Age Invariant Face Recognition (AIFR). Extensive experiments are carried out on the popular and challenging Face and Gesture Recognition Network (FG-NET) Ageing Dataset. The proposed method achieves promising results and outperforms the state-of-the-art AIFR models by achieving an accuracy of 99%, which demonstrates the effectiveness of deep learning in facial ageing research.

Breast Cancer Detection and Tumor Growth Rate Estimation

Breast cancer is the most common type of cancer among women around the globe. Timely diagnosis is essential for successful treatment. For this purpose, machine learning is used to classify images into four categories: normal, benign, in situ carcinoma, and invasive carcinoma. The ICIAR 2018 grand challenge on BreAst Cancer Histology (BACH) dataset is used in this project. A first, “patch-wise” network acts as an auto-encoder that extracts the most salient features of image patches, while a second, “image-wise” network performs classification of the whole image. Pretrained convolutional neural networks including AlexNet, GoogLeNet, and ResNet are fine-tuned for image classification at multiple cellular and nuclei configurations. Promising classification results have been observed, with ResNet achieving the highest accuracy of 85%. This work is being extended to increase its efficiency and reduce human dependency. The design can also be used for automation of other medical image analysis methods.

Classification of Graphomotor Impressions for Automated Neuropsychological Screening Tests

Graphomotor impressions are a product of complex cognitive, perceptual and motor skills and are widely used as psychometric tools for the diagnosis of a variety of neuropsychological disorders. Apparent deformations in these responses are quantified as errors and are used as indicators of various conditions. In contrast to conventional assessment methods, where manual analysis of impressions is carried out by trained clinicians, an automated scoring system faces several challenges. Prior to analysis, such computerized systems need to extract and recognize the individual shapes drawn by subjects on a sheet of paper as an important pre-processing step. The aim of this study is to employ deep learning for automated and accurate recognition of the visual structures of interest produced by subjects. Experiments on figures of the Bender Gestalt Test (BGT), a screening test for visuo-spatial and visuo-constructive disorders, produced by 120 subjects, demonstrate that deep feature representation brings significant improvements over classical approaches. The study is being extended to discriminate coherent visual structures between produced figures and expected prototypes.