Malware Analysis and Detection

Malware Analysis and Detection
#

Unit II: Account & Data Security
#

Lecture 12: Understanding and Analyzing Malicious Software
#

Course: Cyber Security (4353204) | Semester V | Diploma ICT | Author: Milav Dabgar

Understanding Malware
#

What is Malware?
#

Malware (Malicious Software) is any software intentionally designed to cause damage to computers, servers, clients, or computer networks.

Malware Characteristics
#

  • Malicious intent - Designed to harm
  • Unauthorized access - Operates without permission
  • Data theft or destruction capabilities
  • System disruption functionality
  • Self-replication or propagation

Malware Statistics (2024)
#

  • 5.6 billion malware attacks annually
  • 560,000 new malware samples daily
  • Ransomware attacks every 11 seconds
  • $6 trillion global cybercrime cost
  • 95% of attacks target Windows systems

Types of Malware
#

Primary Categories
#

graph TB
    A[Malware] --> B[Virus]
    A --> C[Worm]
    A --> D[Trojan]
    A --> E[Ransomware]
    A --> F[Spyware]
    A --> G[Adware]
    A --> H[Rootkit]
    A --> I[Botnet]
    
    style A fill:#ff6b6b
    style B fill:#ffd93d
    style C fill:#6bcf7f
    style D fill:#4dabf7
    style E fill:#f783ac

Malware Classification
#

  • By propagation method
  • By payload type
  • By target platform
  • By concealment technique
  • By infection vector

Evolution Timeline
#

  • 1970s: First computer worms
  • 1980s: Boot sector viruses
  • 1990s: Macro viruses
  • 2000s: Internet worms
  • 2010s: Advanced persistent threats
  • 2020s: AI-powered malware

Malware Types in Detail
#

Virus
#

Virus Characteristics
#

  • Requires host program to execute
  • Self-replicating code
  • Infects other files/programs
  • Activated by user action

Virus Types
#

File Infector Viruses:
  - Executable files (.exe, .com)
  - Dynamic Link Libraries (.dll)
  - Script files (.bat, .cmd)

Boot Sector Viruses:
  - Master Boot Record (MBR)
  - Boot sector infection
  - System startup infection

Macro Viruses:
  - Office documents
  - Email attachments
  - Template infection

Worm
#

Worm Characteristics
#

  • Self-contained program
  • Network propagation
  • No host program required
  • Automatic spreading

Famous Worms
#

  • Morris Worm (1988) - First internet worm
  • ILOVEYOU (2000) - Email worm
  • Code Red (2001) - IIS web server worm
  • Stuxnet (2010) - Industrial control systems
  • WannaCry (2017) - Ransomware worm

Trojan Horse
#

Trojan Characteristics
#

  • Disguised as legitimate software
  • Social engineering component
  • No self-replication
  • User installation required

Trojan Categories
#

Remote Access Trojans (RAT):
  - System remote control
  - Data theft capabilities
  - Keylogger functionality
  
Banking Trojans:
  - Financial data theft
  - Transaction manipulation
  - Credential harvesting
  
Downloader Trojans:
  - Additional payload delivery
  - Multi-stage infection
  - Evasion techniques

Ransomware
#

Ransomware Operation
#

  1. Infection - Initial compromise
  2. Encryption - File encryption
  3. Notification - Ransom demand
  4. Payment - Cryptocurrency demanded
  5. Recovery - Files potentially restored

Ransomware Economics
#

  • Average ransom: $812,000
  • Downtime cost: $1.85 million
  • Recovery time: 287 days average
  • Payment rate: 70% of victims pay

Malware Analysis Fundamentals
#

Analysis Types
#

Static Analysis
#

  • No code execution
  • File structure examination
  • String analysis
  • Signature identification
  • Metadata extraction

Dynamic Analysis
#

  • Controlled execution
  • Behavior monitoring
  • System interaction observation
  • Network traffic analysis
  • Registry changes tracking

Hybrid Analysis
#

  • Combines both approaches
  • Automated sandboxing
  • Comprehensive results
  • Better evasion detection

Analysis Tools
#

Static Analysis Tools
#

Disassemblers:
  - IDA Pro: Advanced disassembly
  - Ghidra: NSA open-source tool
  - Radare2: Reverse engineering
  - x64dbg: Windows debugger (used mainly for dynamic analysis)

File Analysis:
  - PE Explorer: PE file analysis
  - Hex Workshop: Hex editor
  - Strings: Text extraction
  - File: File type identification

Dynamic Analysis Tools
#

Sandboxes:
  - Cuckoo Sandbox: Open source
  - Joe Sandbox: Commercial
  - Any.run: Cloud-based
  - Hybrid Analysis: Online

Monitoring Tools:
  - Process Monitor: System activity
  - Wireshark: Network traffic
  - RegShot: Registry changes
  - API Monitor: API calls

Online Analysis Services
#

  • VirusTotal - Multi-engine scanning
  • Hybrid Analysis - Automated sandbox
  • Joe Sandbox - Professional analysis
  • Any.run - Interactive analysis

Static Analysis Techniques
#

File Structure Analysis
#

PE Header Inspection
#

# Using the pefile Python library (pip install pefile)
import pefile

pe = pefile.PE('malware.exe')

# Basic information
print(f"Architecture: {hex(pe.FILE_HEADER.Machine)}")  # e.g. 0x14c = x86, 0x8664 = x64
print(f"Timestamp: {pe.FILE_HEADER.TimeDateStamp}")    # Unix epoch seconds; can be forged
print(f"Entry Point: {hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint)}")

# Sections
for section in pe.sections:
    # Section names are NUL-padded to 8 bytes, so strip the padding
    name = section.Name.rstrip(b'\x00').decode(errors='replace')
    print(f"Section: {name}")
    print(f"Virtual Size: {hex(section.Misc_VirtualSize)}")
    print(f"Characteristics: {hex(section.Characteristics)}")
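To show where these header fields live in the file, here is a minimal sketch that decodes them by hand with only the standard library. It is an illustration, not a replacement for pefile: the helper name `parse_pe_header` and the `MACHINE_TYPES` table are made up for this example, the table lists only three common machine values, and the demo runs on a synthetic header rather than a real sample.

```python
import struct
from datetime import datetime, timezone

# Partial lookup table (assumption: only three common values are listed)
MACHINE_TYPES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}

def parse_pe_header(data):
    """Return basic COFF header fields from raw PE bytes (hypothetical helper)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ (DOS) signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The COFF file header follows the 4-byte signature
    machine, num_sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {
        "machine": MACHINE_TYPES.get(machine, hex(machine)),
        "sections": num_sections,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }

# Build a tiny synthetic header so the example runs without a real sample
header = bytearray(0x40 + 24)
header[:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0x40)
header[0x40:0x44] = b"PE\x00\x00"
struct.pack_into("<HHI", header, 0x44, 0x8664, 3, 0)

info = parse_pe_header(bytes(header))
print(info["machine"], info["sections"], info["compiled"].year)
```

Compile timestamps are easy to forge, so treat the decoded date as a hint rather than evidence.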

String Extraction
#

# Extract readable strings (shell)
strings malware.exe | grep -E "(http|ftp|\.exe|\.dll|registry)"

# Python implementation
import re

def extract_strings(filename, min_length=4):
    with open(filename, 'rb') as f:
        data = f.read()
    
    # Runs of printable ASCII (space through tilde), like the strings tool reports
    ascii_pattern = b'[ -~]{%d,}' % min_length
    strings = re.findall(ascii_pattern, data)
    
    return [s.decode('ascii') for s in strings]
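Windows binaries often store text as UTF-16LE ("wide") strings, which an ASCII-only scan misses. The sketch below is one possible extension of the idea above; the function name `extract_wide_strings` and the sample bytes are assumptions made for illustration.

```python
import re

def extract_wide_strings(data, min_length=4):
    """Find UTF-16LE strings: printable ASCII characters each followed by a NUL byte."""
    pattern = rb"(?:[ -~]\x00){%d,}" % min_length
    return [m.decode("utf-16-le") for m in re.findall(pattern, data)]

# Synthetic buffer: binary noise around one wide string, plus a too-short fragment
sample = b"\x00\x01" + "kernel32.dll".encode("utf-16-le") + b"\xff\xfe\x00" + b"ab\x00"
wide = extract_wide_strings(sample)
print(wide)
```

Running both the ASCII and the wide scan over a sample gives a more complete picture of embedded file names, URLs and API names.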

Cryptographic Analysis
#

Entropy Calculation
#

import math
from collections import Counter

def calculate_entropy(data):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    
    # Count byte frequencies
    byte_counts = Counter(data)
    data_len = len(data)
    
    # Calculate entropy
    entropy = 0.0
    for count in byte_counts.values():
        probability = count / data_len
        entropy -= probability * math.log2(probability)
    
    return entropy

# High entropy (>7.5) may indicate encryption/packing
with open('malware.exe', 'rb') as f:
    file_data = f.read()

entropy = calculate_entropy(file_data)
if entropy > 7.5:
    print("File may be packed or encrypted")
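A single whole-file figure can hide a small packed region inside an otherwise ordinary file. As a rough sketch, entropy can also be computed per block; the names `shannon_entropy` and `block_entropy`, the 256-byte block size and the 7.0 cut-off are arbitrary choices for this illustration, and the input is synthetic.

```python
import math
from collections import Counter

def shannon_entropy(data):
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(data).values())

def block_entropy(data, block_size=256):
    """Entropy of each consecutive block (block_size is an arbitrary choice)."""
    return [shannon_entropy(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

# Synthetic "file": repetitive bytes, then every byte value once per block,
# standing in for a packed or encrypted region
sample = b"A" * 512 + bytes(range(256)) * 2
for index, value in enumerate(block_entropy(sample)):
    flag = "high" if value > 7.0 else "low"
    print(f"block {index}: {value:.2f} ({flag})")
```

Blocks that jump from low to high entropy are worth a closer look in a hex editor or disassembler.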

Hash Analysis
#

import hashlib

def generate_hashes(filename):
    hashes = {}
    
    with open(filename, 'rb') as f:
        data = f.read()
    
    hashes['md5'] = hashlib.md5(data).hexdigest()
    hashes['sha1'] = hashlib.sha1(data).hexdigest()
    hashes['sha256'] = hashlib.sha256(data).hexdigest()
    
    return hashes

# Check against threat intelligence databases
hashes = generate_hashes('suspicious_file.exe')
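Reading a whole sample into memory is fine for small files; for large ones, hashing in chunks is a common alternative. The sketch below also shows the simplest form of a hash lookup. `KNOWN_BAD` is a hypothetical local blocklist built from demo bytes, not a real feed, and the function names are made up for this example.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large samples need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical local blocklist; a real one would come from a threat-intelligence feed
KNOWN_BAD = {hashlib.sha256(b"pretend this is malware").hexdigest()}

def is_known_bad(path):
    """True when the file's SHA-256 appears in the local blocklist."""
    return sha256_of_file(path) in KNOWN_BAD

# Demo with a throwaway file containing the same demo bytes
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend this is malware")
try:
    result = is_known_bad(tmp.name)
finally:
    os.remove(tmp.name)
print(result)
```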

YARA Rules
#

rule Suspicious_Strings {
    meta:
        description = "Detects suspicious strings"
        author = "Security Analyst"
        
    strings:
        $s1 = "CreateRemoteThread"
        $s2 = "WriteProcessMemory"
        $s3 = "VirtualAllocEx"
        $s4 = "SetWindowsHookEx"
        
    condition:
        2 of ($s*)
}
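To make the `2 of ($s*)` condition concrete, the following sketch mimics it in plain Python. It does not use the YARA engine and covers only this one condition; `matches_rule` and its parameters are hypothetical names, and the input bytes are made up.

```python
SUSPICIOUS_APIS = [b"CreateRemoteThread", b"WriteProcessMemory",
                   b"VirtualAllocEx", b"SetWindowsHookEx"]

def matches_rule(data, patterns=SUSPICIOUS_APIS, threshold=2):
    """Mimic `2 of ($s*)`: true when at least `threshold` distinct patterns occur."""
    hits = [p.decode() for p in patterns if p in data]
    return len(hits) >= threshold, hits

# Synthetic buffer containing two of the four strings
matched, hits = matches_rule(b"...VirtualAllocEx...WriteProcessMemory...")
print(matched, hits)
```

Requiring two or more strings reduces false positives, since a single one of these API names appears in plenty of legitimate software.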

Dynamic Analysis Techniques
#

Sandbox Analysis
#

Sandbox Setup
#

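# Cuckoo Sandbox 2.x installation (requires Python 2.7; the 2.x line is no longer maintained)
pip install cuckoo

# Initialize Cuckoo (creates the working directory and default configuration)
cuckoo init

# Register an analysis VM: name and IP are positional arguments
# (flags as in Cuckoo 2.x; confirm with `cuckoo machine --help`)
cuckoo machine --add windows7 192.168.1.100 \
  --platform windows \
  --snapshot clean

# Submit malware sample with a full memory dump
cuckoo submit --memory malware.exe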

Behavioral Indicators
#

File System Activity:
  - File creation/modification
  - Directory traversal
  - Temporary file usage
  - System file modification

Registry Activity:
  - Startup entries
  - Configuration changes
  - Persistence mechanisms
  - Service installations

Network Activity:
  - DNS queries
  - HTTP/HTTPS requests
  - P2P communications
  - Command & control traffic
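File system indicators such as the ones listed above can be collected with a simple before/after comparison. This is a minimal sketch, assuming the sample is run between the two snapshots; the function names are hypothetical, and the demo uses a temporary directory with simulated changes instead of a real sandbox.

```python
import hashlib
import os
import tempfile

def snapshot(root):
    """Map each file path under root to the SHA-256 of its contents."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as f:
                state[path] = hashlib.sha256(f.read()).hexdigest()
    return state

def diff_snapshots(before, after):
    """Return (created, modified, deleted) path lists, each sorted."""
    created = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    modified = sorted(p for p in before.keys() & after.keys() if before[p] != after[p])
    return created, modified, deleted

# Demo in a temporary directory standing in for the sandbox file system
with tempfile.TemporaryDirectory() as root:
    keep = os.path.join(root, "notes.txt")
    with open(keep, "w") as f:
        f.write("original")
    before = snapshot(root)
    with open(keep, "w") as f:
        f.write("changed")            # simulated modification
    dropped = os.path.join(root, "dropper.tmp")
    with open(dropped, "w") as f:
        f.write("payload")            # simulated new file
    created, modified, deleted = diff_snapshots(before, snapshot(root))

print(len(created), len(modified), len(deleted))
```

The same before/after idea underlies tools such as RegShot, applied there to the registry.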

Runtime Monitoring
#

Process Monitoring
#

import psutil
import time

def monitor_processes(wait_seconds=5):
    initial_processes = set(p.pid for p in psutil.process_iter())
    
    time.sleep(wait_seconds)  # Wait for malware execution
    
    current_processes = set(p.pid for p in psutil.process_iter())
    new_processes = current_processes - initial_processes
    
    for pid in new_processes:
        try:
            process = psutil.Process(pid)
            print(f"New Process: {process.name()} (PID: {pid})")
            print(f"Command Line: {process.cmdline()}")
            print(f"Parent PID: {process.ppid()}")
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            # Process already exited, or we lack permission to inspect it
            continue

Network Analysis
#

# Monitor network connections
netstat -an | grep ESTABLISHED

# Capture traffic with tcpdump
tcpdump -i eth0 -w malware_traffic.pcap

# Wireshark analysis
wireshark malware_traffic.pcap

API Call Monitoring
#

# Using API Monitor or similar tools
api_calls = [
    "CreateFile",
    "WriteFile", 
    "RegSetValue",
    "InternetConnect",
    "CreateProcess",
    "VirtualAlloc"
]

# Monitor these API calls during execution
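Once calls like these have been logged, grouping them by behaviour makes the log easier to read. The sketch below is one way to do that; the category mapping and the sample log are assumptions for illustration, not output from any particular monitoring tool.

```python
from collections import Counter

# Assumed mapping from API name to behaviour category (illustrative, not exhaustive)
API_CATEGORIES = {
    "CreateFile": "file", "WriteFile": "file",
    "RegSetValue": "registry",
    "InternetConnect": "network",
    "CreateProcess": "process",
    "VirtualAlloc": "memory",
}

def summarize_api_log(calls):
    """Count logged API calls per category; names not in the mapping count as 'other'."""
    return Counter(API_CATEGORIES.get(name, "other") for name in calls)

# Hypothetical log, e.g. exported from an API-monitoring tool
log = ["CreateFile", "WriteFile", "WriteFile", "RegSetValue", "Sleep", "InternetConnect"]
summary = summarize_api_log(log)
print(dict(summary))
```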

Malware Detection Techniques
#

Signature-Based Detection
#

Pattern Matching
#

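class SignatureDetector:
    def __init__(self):
        # Illustrative placeholder byte patterns, not real malware signatures.
        # 'virus_a' is the start of an ordinary MZ (DOS) header and 'worm_c' is a
        # common x86 function prologue, so both would match many benign programs.
        self.signatures = {
            'virus_a': b'\x4d\x5a\x90\x00\x03\x00\x00\x00',
            'trojan_b': b'\xff\xd0\x68\x00\x40\x00\x00',
            'worm_c': b'\x55\x8b\xec\x83\xec\x10\x56\x57'
        }
    
    def scan_file(self, filepath):
        with open(filepath, 'rb') as f:
            content = f.read()
        
        threats = []
        for name, signature in self.signatures.items():
            if signature in content:
                threats.append(name)
        
        return threats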

Hash-Based Detection
#

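import hashlib
import requests

def check_virus_total(file_hash, api_key='YOUR_API_KEY'):
    # VirusTotal API v3 file report (the v2 /vtapi/v2/file/report endpoint is deprecated)
    url = f"https://www.virustotal.com/api/v3/files/{file_hash}"
    headers = {'x-apikey': api_key}
    
    response = requests.get(url, headers=headers, timeout=30)
    if response.status_code == 404:
        return None  # Hash not known to VirusTotal
    response.raise_for_status()
    return response.json()

# Calculate file hash
with open('suspicious_file.exe', 'rb') as f:
    file_hash = hashlib.sha256(f.read()).hexdigest()

result = check_virus_total(file_hash)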

Heuristic Detection
#

Behavioral Heuristics
#

Suspicious Behaviors:
  File Operations:
    - Mass file encryption
    - System file modification
    - Hidden file creation
    
  Network Operations:
    - Unknown server connections
    - Data exfiltration patterns
    - Botnet communication
    
  System Operations:
    - Registry persistence
    - Service installation
    - Process injection
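As an example of turning one of these behaviours into a check, the sketch below flags a burst of file writes or renames, a rough stand-in for "mass file encryption". The event format, the function name, and the window and threshold values are all assumptions chosen for illustration, not tuned defaults.

```python
def detect_mass_modification(events, window_seconds=10, threshold=5):
    """Flag a burst of write/rename events: at least `threshold` within any window.

    events: list of (timestamp_seconds, action, path) tuples.
    """
    times = sorted(t for t, action, _path in events if action in ("write", "rename"))
    start = 0
    for end, current in enumerate(times):
        # Shrink the window from the left until it spans at most window_seconds
        while current - times[start] > window_seconds:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False

# Hypothetical event streams: a rapid burst versus one write per minute
burst = [(i * 0.5, "write", f"doc{i}.docx.locked") for i in range(8)]
normal = [(i * 60, "write", f"report{i}.txt") for i in range(8)]
print(detect_mass_modification(burst), detect_mass_modification(normal))
```

A real heuristic would combine several such signals (for example write bursts plus high-entropy output) to keep backup tools and installers from triggering it.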

Machine Learning Detection
#

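import os
import re
import pefile
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Feature extraction for ML (reuses calculate_entropy from the entropy slide)
def extract_features(path):
    pe = pefile.PE(path)
    with open(path, 'rb') as f:
        data = f.read()
    # The import directory is absent in some files, so fall back to an empty list
    imports = getattr(pe, 'DIRECTORY_ENTRY_IMPORT', [])
    features = {
        'file_size': os.path.getsize(path),
        'num_sections': len(pe.sections),
        'num_imports': sum(len(entry.imports) for entry in imports),
        'has_debug_info': int(hasattr(pe, 'DIRECTORY_ENTRY_DEBUG')),
        'entropy': calculate_entropy(data),
        'num_strings': len(re.findall(rb'[ -~]{4,}', data))
    }
    return features

# Train classifier
# training_features / training_labels: a labeled dataset you supply
# (one row of the features above per sample; label 1 = malware, 0 = benign)
clf = RandomForestClassifier(n_estimators=100)
clf.fit(training_features, training_labels)

# Predict malware
file_features = pd.DataFrame([extract_features('suspicious_file.exe')])
prediction = clf.predict(file_features)
confidence = clf.predict_proba(file_features)  # per-class probabilities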

Anomaly Detection
#

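from sklearn.svm import OneClassSVM

# Train on benign samples only
# benign_features: feature rows extracted from known-clean files (you supply these)
detector = OneClassSVM(nu=0.1)  # nu bounds the fraction of training samples treated as outliers
detector.fit(benign_features)

# Detect anomalies (potential malware)
anomaly_score = detector.decision_function([test_features])  # negative = more anomalous
is_anomaly = detector.predict([test_features])[0] == -1      # predict() returns -1 for outliers, +1 for inliers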

Advanced Malware Evasion
#

Evasion Techniques
#

Anti-Analysis Methods
#

Anti-Static Analysis:
  - Code obfuscation
  - String encryption
  - Control flow flattening
  - Dead code insertion

Anti-Dynamic Analysis:
  - Virtual machine detection
  - Debugger detection
  - Sandbox evasion
  - Time-delayed execution

Anti-Disassembly:
  - Opaque predicates
  - Junk code insertion
  - Self-modifying code
  - Code encryption
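String encryption in real samples is often no stronger than a single-byte XOR, which an analyst can undo by trying every key. The sketch below illustrates that idea only; the keyword list, function names and the encoded demo string are assumptions, and stronger schemes need different tooling.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)

def brute_force_xor(data, keywords=(b"http", b".exe", b".dll")):
    """Try all single-byte keys; return (key, plaintext) pairs containing a keyword."""
    results = []
    for key in range(1, 256):
        candidate = xor_bytes(data, key)
        if any(word in candidate for word in keywords):
            results.append((key, candidate))
    return results

# Hypothetical obfuscated string, encoded here with key 0x5A for the demo
hidden = xor_bytes(b"http://example.test/payload.exe", 0x5A)
recovered = brute_force_xor(hidden)
for key, plaintext in recovered:
    print(hex(key), plaintext)
```

Because the encoded bytes contain no readable text, a plain strings scan finds nothing here; the brute-force pass recovers the URL.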

VM Detection Code
#

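; Check for VMware (backdoor I/O port)
mov eax, 564D5868h    ; VMware magic value "VMXh"
mov ebx, 00000000h
mov ecx, 0000000Ah    ; Command 0Ah: get VMware version
mov edx, 5658h        ; "VX" port
in  eax, dx           ; Faults in user mode on real hardware, so wrap in an exception handler
cmp ebx, 564D5868h    ; VMware echoes the magic value in EBX
je  vm_detected

; Check for VirtualBox (CPUID hypervisor vendor leaf)
mov eax, 40000000h    ; Select the hypervisor vendor leaf before calling CPUID
cpuid
cmp ebx, 786F4256h    ; "VBox", first four bytes of the vendor string "VBoxVBoxVBox"
je  vm_detected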

Counter-Evasion Techniques
#

Analysis Environment Hardening
#

# Modify VM characteristics
vmware-vmx -s "mainMem.hideHypervisorFromGuest = TRUE"
vmware-vmx -s "isolation.tools.getPtrLocation.disable = TRUE"
vmware-vmx -s "isolation.tools.setPtrLocation.disable = TRUE"

# Change MAC address prefix
ifconfig eth0 hw ether 00:11:22:33:44:55

# Modify registry entries (Windows)
reg add "HKLM\HARDWARE\DESCRIPTION\System" /v SystemBiosVersion /t REG_SZ /d "Custom BIOS"

Behavioral Forcing
#

# Force malware execution
import time
import threading

def create_user_activity():
    """Simulate user activity to trigger malware"""
    while True:
        # Simulate mouse movement
        # Simulate keyboard input
        # Create temporary files
        time.sleep(1)

# Start activity simulation
activity_thread = threading.Thread(target=create_user_activity)
activity_thread.daemon = True
activity_thread.start()

Advanced Monitoring
#

Kernel-Level Monitoring:
  - System call interception
  - Memory access monitoring
  - Hardware performance counters
  - Hypervisor-based analysis

Multi-Path Analysis:
  - Multiple execution paths
  - Different input values
  - Various system configurations
  - Time-delayed triggers

Threat Intelligence Integration
#

Threat Intelligence Sources
#

Commercial Sources
#

Threat Intelligence Platforms:
  - CrowdStrike: Falcon Intelligence
  - FireEye: Mandiant Threat Intelligence
  - Recorded Future: Real-time intelligence
  - ThreatConnect: Collaborative platform

Malware Databases:
  - VirusTotal: Multi-engine scanning
  - Hybrid Analysis: Automated sandbox
  - Malware Bazaar: Sample sharing
  - URLVoid: URL reputation

Open Source Intelligence
#

# MISP integration
from pymisp import PyMISP

misp = PyMISP('https://misp.local', 'API_KEY')

# Search for IoCs
results = misp.search(
    eventinfo="malware campaign",
    type_attribute="sha256",
    to_ids=True
)

# Extract indicators
iocs = []
for event in results:
    for attribute in event['Attribute']:
        if attribute['type'] == 'sha256':
            iocs.append(attribute['value'])

Intelligence Automation
#

Automated Analysis Pipeline
#

class MalwareAnalysisPipeline:
    def __init__(self):
        self.static_analyzers = []
        self.dynamic_analyzers = []
        self.threat_intel = ThreatIntelligence()
    
    def analyze_sample(self, sample_path):
        results = {
            'static': {},
            'dynamic': {},
            'intelligence': {}
        }
        
        # Static analysis
        results['static'] = self.run_static_analysis(sample_path)
        
        # Dynamic analysis
        results['dynamic'] = self.run_dynamic_analysis(sample_path)
        
        # Threat intelligence lookup
        results['intelligence'] = self.threat_intel.lookup(
            results['static']['hashes']
        )
        
        # Generate report
        return self.generate_report(results)
    
    def generate_report(self, results):
        report = {
            'verdict': self.calculate_verdict(results),
            'confidence': self.calculate_confidence(results),
            'details': results
        }
        return report

Intelligence Scoring
#

def calculate_threat_score(indicators):
    score = 0
    weights = {
        'known_malware_hash': 100,
        'suspicious_behavior': 50,
        'malicious_domain': 75,
        'packer_detected': 25,
        'anti_analysis': 40
    }
    
    for indicator, present in indicators.items():
        if present and indicator in weights:
            score += weights[indicator]
    
    return min(score, 100)  # Cap at 100

Practical Exercise: Malware Analysis Lab
#

Hands-on Activity (35 minutes)
#

Lab Setup and Safety
#

IMPORTANT SAFETY NOTICE:

  • Use isolated virtual machines only
  • No real malware on production systems
  • Use harmless test files or simulators
  • Implement proper network isolation

Phase 1: Static Analysis (15 minutes)
#

Sample File: Download a test PE file or use a harmless executable

Tasks:

  1. File Analysis:

    • Calculate MD5, SHA1, SHA256 hashes
    • Check file size and type
    • Examine PE header information
  2. String Analysis:

    • Extract readable strings
    • Look for URLs, IP addresses, file paths
    • Identify suspicious API calls
  3. Online Lookup:

    • Check hashes on VirusTotal
    • Analyze reputation scores
    • Review community comments
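For the string analysis task, a few regular expressions can pull candidate indicators out of the extracted strings. This is a rough sketch: the patterns are deliberately simple and will miss or over-match some cases, and the function name and sample strings are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://[^\s\"'<>]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
WIN_PATH_RE = re.compile(r"[A-Za-z]:\\[^\s\"'<>|*?]+")

def extract_iocs(strings):
    """Pull candidate URLs, IPv4 addresses and Windows paths from extracted strings."""
    iocs = {"urls": set(), "ips": set(), "paths": set()}
    for text in strings:
        iocs["urls"].update(URL_RE.findall(text))
        for ip in IPV4_RE.findall(text):
            # Keep only dotted quads whose octets are all in range
            if all(int(octet) <= 255 for octet in ip.split(".")):
                iocs["ips"].add(ip)
        iocs["paths"].update(WIN_PATH_RE.findall(text))
    return iocs

# Hypothetical output of a strings run
found = extract_iocs([
    "connect to http://example.test/gate.php",
    "fallback 203.0.113.7 port 8080",
    r"C:\Users\Public\update.exe",
    "version 999.1.2.3",
])
print(found)
```

Version numbers often look like IP addresses, so review the matches by hand before adding them to a report.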

Tools to Use:

# Command line tools
file sample.exe
strings sample.exe | head -20
md5sum sample.exe
sha256sum sample.exe

# Python analysis
python3 pe_analyzer.py sample.exe

Phase 2: Behavioral Analysis (15 minutes)
#

Simulation Tasks:

  1. Process Monitoring:

    • Monitor running processes before/after
    • Track new processes created
    • Check process relationships
  2. File System Monitoring:

    • Watch for file creation/modification
    • Monitor directory changes
    • Check temporary file usage
  3. Network Monitoring:

    • Monitor network connections
    • Capture DNS queries
    • Watch for unusual traffic patterns

Phase 3: Detection Rule Creation (5 minutes)
#

Create YARA Rule:

rule Your_Detection_Rule {
    meta:
        author = "Your Name"
        description = "Detects your analyzed sample"
        
    strings:
        $s1 = "suspicious_string_1"
        $s2 = "suspicious_string_2"
        
    condition:
        any of ($s*)
}

Deliverables:

  • Analysis report with findings
  • Hash values and file properties
  • String analysis results
  • Behavioral observations
  • Custom YARA detection rule

Questions & Discussion
#

Discussion Points:
#

  • What challenges did you face during the analysis lab?
  • How effective are signature-based vs. behavioral detection methods?
  • What role does threat intelligence play in malware analysis?

Lab Review
#

Share your analysis findings and detection rules


Thank You!
#

Next Lecture: Virus Protection Mechanisms
#

Building Effective Antivirus and Anti-malware Systems
#

Cyber Security (4353204) - Lecture 12 Complete

Know your enemy: Analyze to protect!
