Build a Learning Model to Detect IoT Malware

Cancelled. Posted 2 years ago. Paid on delivery.

Task 3: CDMC2021 IoT Malware Detection

Based on the control flow graphs (CFGs) generated by the static-analysis tool Radare2, and on labels indicating whether the samples are malware, participants are required to perform an IoT malware detection task: predicting whether the samples in the test set are malware or not. The dataset consists of 54,829 samples, generated by the following procedure: (1) malicious and benign Linux programs in ELF format were collected from various sources; (2) each program was fed to Radare2 to extract its CFG information; and (3) the JSON output from Radare2, which can be interpreted as a list of directed-graph components, was reformulated as a single line in a text file. Please see the “File Format” section for more detail.

The labels of the ELF files (1: malware, 0: benign) were determined by state-of-the-art anti-virus engines.

List of Files

The [login to view URL] file contains feature information of 16,521 files in the training set.

The [login to view URL] file contains label information of 16,521 files in the training set.

The [login to view URL] file contains information of 38,550 files in the testing set.
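A minimal sketch of how the two training files might be paired up before any modelling. The posting does not describe the layout of the label file, so this assumes one 0/1 label per line, in the same order as the feature lines; `pair_samples` is a hypothetical helper, and the sample contents are made up.

```python
def pair_samples(feature_lines, label_lines):
    """Pair each feature line with its 0/1 label, checking that the files line up."""
    # Assumption: one sample per feature line and one label per label line.
    features = [ln.rstrip("\n") for ln in feature_lines if ln.strip()]
    labels = [int(ln.strip()) for ln in label_lines if ln.strip()]
    if len(features) != len(labels):
        raise ValueError(f"{len(features)} feature lines but {len(labels)} labels")
    bad = set(labels) - {0, 1}
    if bad:
        raise ValueError(f"unexpected label values: {sorted(bad)}")
    return list(zip(features, labels))

# In-memory stand-ins for the two training files (contents are made up).
feature_lines = [
    '"sym.main" "sym.puts"\n',
    '"sym.main" "sym.fork";"sym.scan" "sym.connect"\n',
]
label_lines = ["0\n", "1\n"]

samples = pair_samples(feature_lines, label_lines)
print(len(samples), "samples;", sum(lbl for _, lbl in samples), "labelled malware")
```

With the real files, two open file handles can be passed in place of the lists; if the one-label-per-line assumption holds, the result should contain 16,521 pairs.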

File Format

Steps to formulate the features.

Radare2 outputs its analysis result for an ELF sample program as a JSON object that looks like the following.

[{"name": "sym.__uClibc_main", "imports": ["[login to view URL]", "sym.__GI_memcpy", "sym._dl_aux_init", "sym.__uClibc_init"]}, {"name": "sym._fp_out_narrow", "imports": ["sym.__GI_strlen", "sym._charpad", "sym.__stdio_fwrite"]}, …]

Then, each node in the list is represented as a list of function calls, with the “name” field placed first, followed by the function calls in the “imports” field. The components in the list are separated by whitespace. The JSON object above becomes the following list of nodes.

Node 1: "sym.__uClibc_main" "[login to view URL]" "sym.__GI_memcpy" "sym._dl_aux_init" "sym.__uClibc_init"

Node 2: "sym._fp_out_narrow" "sym.__GI_strlen" "sym._charpad" "sym.__stdio_fwrite"

Nodes 3~: …
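The JSON-to-node step described above can be sketched as follows. This is an illustration only, not the organisers' conversion script: `json_to_nodes` is a hypothetical name, the sample JSON is a shortened stand-in for real Radare2 output, and stripping whitespace around symbol names is an assumption.

```python
import json

def json_to_nodes(radare2_json: str) -> list[list[str]]:
    """Turn the JSON list into nodes of the form [name, import1, import2, ...]."""
    nodes = []
    for entry in json.loads(radare2_json):
        # The "name" field goes first, followed by the entries of "imports".
        tokens = [entry["name"], *entry.get("imports", [])]
        # Assumption: stray whitespace around a symbol name is not meaningful.
        nodes.append([tok.strip() for tok in tokens])
    return nodes

sample = """[
  {"name": "sym.__uClibc_main",
   "imports": ["sym.__GI_memcpy", "sym._dl_aux_init", "sym.__uClibc_init"]},
  {"name": "sym._fp_out_narrow",
   "imports": ["sym.__GI_strlen ", "sym._charpad", "sym.__stdio_fwrite"]}
]"""

nodes = json_to_nodes(sample)
for i, node in enumerate(nodes, start=1):
    print(f"Node {i}:", " ".join(f'"{tok}"' for tok in node))
```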

All nodes in the JSON list are sequentially joined by semicolons to form a single line in a .data file. Now, each line in the .data file corresponds to a single file in the dataset.

Line 1: "sym.__uClibc_main" "[login to view URL]" "sym.__GI_memcpy" "sym._dl_aux_init" "sym.__uClibc_init";"sym._fp_out_narrow" "sym.__GI_strlen" "sym._charpad" "sym.__stdio_fwrite";…
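To make the line format concrete, here is a hedged sketch that writes a list of nodes as one line and parses it back. The function names are hypothetical. Reading each node as edges from the "name" function to every function it imports is one plausible interpretation of "directed-graph components", not something the task statement spells out. The sketch also assumes that symbol names contain neither double quotes nor semicolons.

```python
import re

# Every token in a .data line is wrapped in double quotes.
TOKEN = re.compile(r'"([^"]*)"')

def nodes_to_line(nodes):
    """Quote each token, join tokens with spaces and nodes with semicolons."""
    return ";".join(" ".join(f'"{tok}"' for tok in node) for node in nodes)

def line_to_nodes(line):
    """Split one .data line back into nodes (lists of symbol names)."""
    nodes = []
    for chunk in line.strip().split(";"):
        tokens = TOKEN.findall(chunk)
        if tokens:  # skip empty chunks or anything without quoted tokens
            nodes.append(tokens)
    return nodes

def nodes_to_edges(nodes):
    """Assumed reading: the first token calls each of the remaining tokens."""
    return [(node[0], callee) for node in nodes for callee in node[1:]]

nodes = [
    ["sym.__uClibc_main", "sym.__GI_memcpy", "sym._dl_aux_init"],
    ["sym._fp_out_narrow", "sym.__GI_strlen", "sym._charpad"],
]
line = nodes_to_line(nodes)
edges = nodes_to_edges(line_to_nodes(line))
print(line)
print(edges)
```

The resulting edge list can be fed to a graph library or used directly to derive features such as node counts and out-degrees.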

Task

Participants are required to predict the labels of the test samples based on the information provided in the task.
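As a starting point, the prediction task could be approached with a simple bag-of-symbols baseline. The sketch below uses a multinomial naive Bayes classifier written with the standard library only. This is a technique chosen here for illustration, not one prescribed by the task; it ignores the graph structure entirely, and the training lines and symbol names are made-up toy data.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r'"([^"]*)"')

def tokenize(line):
    """Return the quoted symbol names in one .data line, ignoring graph structure."""
    return TOKEN.findall(line)

class NaiveBayes:
    """Multinomial naive Bayes over symbol names with add-one smoothing."""

    def fit(self, lines, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = Counter(labels)
        for line, label in zip(lines, labels):
            self.counts[label].update(tokenize(line))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def predict(self, line):
        total_docs = sum(self.docs.values())
        scores = {}
        for label in (0, 1):
            denom = sum(self.counts[label].values()) + len(self.vocab)
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            for tok in tokenize(line):
                if tok in self.vocab:  # skip symbols never seen in training
                    score += math.log((self.counts[label][tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)

# Toy training data (hypothetical symbol names; 1: malware, 0: benign).
train_lines = [
    '"sym.main" "sym.connect" "sym.fork";"sym.attack" "sym.send"',
    '"sym.main" "sym.printf";"sym.usage" "sym.puts"',
    '"sym.main" "sym.fork" "sym.kill";"sym.scan" "sym.connect"',
    '"sym.main" "sym.fopen" "sym.printf"',
]
train_labels = [1, 0, 1, 0]

model = NaiveBayes().fit(train_lines, train_labels)
pred_mal = model.predict('"sym.main" "sym.connect" "sym.send"')
pred_ben = model.predict('"sym.main" "sym.puts" "sym.printf"')
print(pred_mal, pred_ben)
```

A stronger submission would likely add graph-derived features (for example node and edge counts) or a sequence or graph model, and would hold out part of the training set for validation.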

Machine Learning (ML), Deep Learning, Natural Language, Data Processing

Project ID: #31566368

About the project

4 bids, remote project, active 2 years ago

4 freelancers are bidding an average of ₹5975 for this job

sajjadtaghvaeifr

Hi, I hope you are doing fine. I have almost 10 years of experience in machine learning algorithms. I can implement various types of artificial intelligence algorithms, including yours, with Matlab, Python, etc. I hav…

₹20000 INR in 7 days
(8 reviews)
4.1
mramkukadiya

Hello madam, I am a professional data entry operator, photographer, and writer. I need a job. My typing speed is 50 wpm with 95% accuracy, so please hire me.

₹1050 INR in 7 days
(0 reviews)
0.0
AbhiRam121

Answer small sums sh dollars than mams hang mama can msgs Einstein nieces junctions section money minded Mintra mines

₹600 INR in 5 days
(0 reviews)
0.0
DotcomIoTIndia

I'm an IoT geek with 10+ years of experience in embedded product development, from scratch to market. I have the following skillset to sharpen your vision of your requirement: • IoT Architecture for the Smart C…

₹2250 INR in 30 days
(0 reviews)
0.0