Friday, September 5, 2014

Sniffing BitTorrent DHT Traffic

Introduction

I've recently been playing with some of the protocols that power BitTorrent, purely for my own education. While digging into the Distributed Hash Table (DHT), I decided to whip up a quick packet sniffer to decode the queries and responses. It gives quick insight into how your client interacts with the nodes around it.

The Source

#!/usr/bin/env python
"""
Sniff a specific port for Bit Torrent DHT traffic and print
requests/responses in human readable form.

Reference: http://www.bittorrent.org/beps/bep_0005.html
"""

from pcapy import open_live
from bencode import bdecode
from socket import inet_aton, inet_ntoa
import dpkt
import sys

# Defaults to 51413 (transmission's default port)
filter_port = 51413

# Callback function for parsing packets
def parse_udp(hdr, data):
    global filter_port
    try:
        eth = dpkt.ethernet.Ethernet(data)
    except Exception:
        return
    if eth.type != dpkt.ethernet.ETH_TYPE_IP:
        return
    ip = eth.data
    if ip.p == dpkt.ip.IP_PROTO_UDP and filter_port in (ip.data.dport, ip.data.sport):
        payload = ip.data.data
    else:
        return
    # Print plain text bencoded request.
    try:
        data = bdecode(payload)
        print "%s:%d -> %s:%d (%d bytes): %s\n" % (inet_ntoa(ip.src), ip.data.sport,
            inet_ntoa(ip.dst), ip.data.dport, len(payload), data)
    except Exception:
        return

def main(argv):
    global filter_port
    if len(argv) == 1:
        try:
            filter_port = int(argv[0])
        except ValueError:
            print "Invalid port number"
            sys.exit(1)
    print "[+] Starting sniffer"
    pcap_obj = open_live("eth0", 65536, False, True)
    try:
        pcap_obj.loop(-1, parse_udp)
    except KeyboardInterrupt:
        print "[!] Exiting"
        sys.exit(0)

if __name__ == '__main__':
    main(sys.argv[1:])

Joining the Swarm

The code is available on GitHub. The default monitoring port is 51413 (Transmission's default). If your client listens on a different port, consult its documentation or use lsof to find it, then pass that port as the script's only argument.

$ lsof -i | grep UDP
transmiss   999 debian-transmission   12u  IPv4 16474843      0t0  UDP *:51413
$ sudo python dht_sniff.py 51413
127.0.0.1:51413 -> 127.0.0.1:6969 (94 bytes): {'a': {'id': '\xab/Da\xcd\x7f\xbcI\xef[E\\\x88m6\xae\xab\xbd<\xd6', 'target': "\x12\x34\\'\xab5\xfbGj\x96M\x15\xce\xad\x91@\xb9' E"}, 'q': 'find_node', 't': 'fn\x00\x00', 'y': 'q'}
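The raw dictionary in the sample output is hard to scan because node IDs and transaction IDs are arbitrary bytes. As a rough sketch, and not part of dht_sniff.py, the decoded message could be condensed into one line with hex-encoded IDs. The function name summarize_krpc is hypothetical; the field meanings ('y' for message type, 'q' for the method, 'a' for arguments, 'r' for return values, 'e' for errors, 't' for the transaction ID) follow BEP 5. The sketch assumes keys and values are byte strings, as they are in the output above.

```python
from binascii import hexlify


def summarize_krpc(msg):
    """Return a one-line, printable summary of a bdecoded KRPC message.

    Hypothetical helper: `msg` is the dict produced by bdecoding a DHT
    packet, with byte-string keys and values.
    """
    kind = msg.get(b'y')
    txn = hexlify(msg.get(b't', b'')).decode('ascii')
    if kind == b'q':
        # Query: 'q' names the method, 'a' holds its arguments.
        args = msg.get(b'a', {})
        sender = hexlify(args.get(b'id', b'')).decode('ascii')
        method = msg.get(b'q', b'?').decode('ascii', 'replace')
        return 'query %s from node %s (txn %s)' % (method, sender, txn)
    if kind == b'r':
        # Response: 'r' holds the return values, including the responder's id.
        ret = msg.get(b'r', {})
        sender = hexlify(ret.get(b'id', b'')).decode('ascii')
        return 'response from node %s (txn %s)' % (sender, txn)
    if kind == b'e':
        # Error: 'e' is a list of [code, message].
        code, text = msg.get(b'e', [0, b''])[:2]
        return 'error %d: %s (txn %s)' % (code, text.decode('ascii', 'replace'), txn)
    return 'unknown message type %r' % (kind,)
```

In the sniffer, this could be called on the result of bdecode(payload) and printed in place of the raw dictionary.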

Going Beyond

I haven't implemented it yet, but decoding the node lists returned by find_node and get_peers is relatively straightforward. Doing so would give an even more in-depth look at how your client and the nodes around it communicate. Refer to BEP 5 (linked in the script's docstring) for how node lists are constructed and returned.
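As a minimal sketch of that decoding step, assuming the compact formats described in BEP 5: each entry in a "nodes" string is 26 bytes (a 20-byte node ID, a 4-byte IPv4 address, and a 2-byte big-endian port), and each entry in a get_peers "values" list is 6 bytes (address plus port). The function names decode_nodes and decode_peers are hypothetical and not part of dht_sniff.py; the sketch handles IPv4 only.

```python
import socket
import struct
from binascii import hexlify


def decode_nodes(blob):
    """Split a compact 'nodes' byte string into (node_id_hex, ip, port) tuples."""
    nodes = []
    # Walk the string in 26-byte steps; any trailing partial entry is ignored.
    for off in range(0, len(blob) - len(blob) % 26, 26):
        node_id, packed_ip, port = struct.unpack_from('!20s4sH', blob, off)
        nodes.append((hexlify(node_id).decode('ascii'),
                      socket.inet_ntoa(packed_ip), port))
    return nodes


def decode_peers(values):
    """Decode a get_peers 'values' list into (ip, port) tuples."""
    peers = []
    for entry in values:
        if len(entry) != 6:
            # Skip anything that is not a 6-byte compact IPv4 peer.
            continue
        packed_ip, port = struct.unpack('!4sH', entry)
        peers.append((socket.inet_ntoa(packed_ip), port))
    return peers
```

For a decoded response, the inputs would be the 'nodes' and 'values' entries of the 'r' dictionary, when present.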