R2-D2: ColoR-inspired Convolutional NeuRal Network (CNN)-based AndroiD Malware Detections

TonTon H.-D. Huang*, and Hung-Yu Kao**

Leopard Mobile (Cheetah Mobile Taiwan Agency)*
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan* **
TonTon (at) TWMAN.ORG*

Android Color images: Malware / Benign

Real Case study


The materials of our research and experiment dataset: Train / Test


IEEE BigData 2018Seattle, WA, USA, Dec 10-13, 2018 | 2018/12/11

The influence of Deep Learning on image identification and natural language processing has attracted enormous attention globally. The convolution neural network that can learn without prior extraction of features fits well in response to the rapid iteration of Android malware. The traditional solution for detecting Android malware requires continuous learning through pre-extracted features to maintain high performance of identifying the malware. In order to reduce the manpower of feature engineering prior to the condition of not to extract pre-selected features, we have developed a coloR-inspired convolutional neuRal networks (CNN)-based AndroiD malware Detection (R2-D2) system. The system can convert the bytecode of classes.dex from Android archive file to RGB color code and store it as a color image with the fixed size. The color image is input to the convolutional neural network for automatic feature extraction and training. The data was collected from Jan. 2017 to Aug 2017. During the period of time, we have collected approximately 2 million of benign and malicious Android apps for our experiments with the help from our research partner Leopard Mobile Inc. Our experiment results demonstrate that the proposed system has accurate security analysis on contracts. Furthermore, we keep our research results and experiment materials on http://R2D2.TWMAN.ORG.

OWASP AppSec USA 2017, Orlando, Florida, September 19-22, 2017 | 2017/09/22, 10:30am~11:15am (GMT -4)

Machine Learning (ML) has found it particularly useful in malware detection. However, as the malware evolves very fast, the stability of the feature extracted from malware serves as a critical issue in malware detection. The recent success of deep learning in image recognition, natural language processing, and machine translation indicates a potential solution for stabilizing the malware detection effectiveness. We present a color-inspired convolutional neural network-based Android malware detection, R2-D2, which can detect malware without extracting pre-selected features (e.g., the control-flow of op-code, classes, methods of functions and the timing they are invoked etc.) from Android apps. In particular, we develop a color representation for translating Android apps into RGB color code and transform them to a fixed-sized encoded image. After that, the encoded image is fed to convolutional neural network for automatic feature extraction and learning, reducing the expert’s intervention. We have run our system over 800k malware samples and 800k benign samples through our back-end (60 million monthly active users and 10k new malware samples per day), showing that R2-D2 can effectively detect the malware. Furthermore, we will keep our research results on http://R2D2.TWMAN.ORG if there any update.

RuxCon 2017, Melbourne, Australia, October 21-22, 2017 | 2017/10/22, 15:00~16:00 (GMT +11)

Ransomware such as WannaCrypt and Petya have caused significant financial loss and even have endangered human life (e.g., ransomware attack on UK hospitals). Ransomware on the desktop has gained much attention from academic and industry. However, we see that the number of ransomware on Android phones remains steady increasing, but gains much less attention. As Android has been the most popular smartphone OS and a substantial number of credentials are kept only in smartphones, the data loss incurs serious inconvenience and loss. Here, we present our deep learning-based ransomware detection system, coloR-inspired convolutional neuRal network-based androiD ransomware Detection (R2D2). R2D2 was originally developed to sweep the malware, but we found it particularly useful in detecting ransomware. A unique feature is its end-to-end training, without human intervention. Such an end-to-end training points out a direction that we no longer need tedious search for roust ransomware features for detection. Most importantly, based on R2D2, we develop techniques to encode ransomware as so-called ransomware image, such that the ransomware from the same family exhibit the same pattern and even non-experts can detect and even determine the ransomware family with their the naked eye.

http://www.cmcm.com/blog/tw/security/2017-05-31/1062.html

https://arxiv.org/abs/1705.04448



Hunting the Ethereum Smart Contract: Color-inspired Inspection of Potential Attacks

TonTon H.-D. Huang*, and Hung-Yu Kao**

Leopard Mobile (Cheetah Mobile Taiwan Agency)*
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan* **
TonTon (at) TWMAN.ORG*

AI Village, Defcon 26, Augest 9-12, 2018, Las Vegas, Nevada | 2018/08/11 16:20 (GMT -7)

Blockchain and Cryptocurrencies are gaining unprecedented popularity and understanding. Meanwhile, Ethereum is gaining a significant popularity in the blockchain community, mainly due to the fact that it is designed in a way that enables developers to write decentralized applications (Dapps) and smart contract. This new paradigm of applications opens the door to many possibilities and opportunities. However, the security of Ethereum smart contracts has not received much attention; several Ethereum smart contracts malfunctioning have recently been reported. Unlike many previous works that have applied static and dynamic analyses to find bugs in smart contracts, we do not attempt to define and extract any features; instead we focus on reducing the expert’s labor costs. We first present a new in-depth analysis of potential attacks methodology and then translate the bytecode of solidity into RGB color code. After that, we transform them to a fixed-sized encoded imag e. Finally, the encoded image is fed to convolutional neural network (CNN) for automatic feature extraction and learning, detecting security flaw of Ethereum smart contract.



This work would not have been possible without the valuable dataset offered by Cheetah Mobile and Security Master (a.k.a CM Security, an Android application). 

Special thanks to Dr. Chia-Mu Yu for his support on this research.