It's been said that the definition of marketing is convincing people that you have what they want. So I suppose that explains all of this silly chatter about Deep Packet Inspection. It certainly sounds impressive. The problem is, it's completely backwards in terms of what's necessary for effective bandwidth management.
What is Deep Packet Inspection?
The term Deep Packet Inspection, or DPI, simply means that packets are analyzed to try to figure out what protocol, application, or function they are related to. Vendors use fancy terms like "classification", or the foreboding "Deep Packet Inspection", to make it sound more fantastic. The terms are a bit misleading, since much of the time this "deep inspection" is just looking at a byte or two in the packet header. The more complicated identifications may include tracking connection state and decoding multiple levels of protocol headers. This might be necessary to determine, for example, which specific user is logged into a server, and to track that user's connections.

When is DPI useful? The Protocol Method of Bandwidth Management
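To make the preceding description concrete before weighing its usefulness: much of what gets sold as "deep" inspection amounts to matching a byte or two of the payload against known signatures. Here is a minimal sketch of that idea; the signatures are real, well-known markers, but this is an illustration, not a production classifier.

```python
# Toy illustration of DPI-style classification: match the first payload
# bytes against known protocol signatures.
SIGNATURES = [
    (b"\x16\x03", "TLS handshake"),          # TLS record: handshake type, version 3.x
    (b"GET ", "HTTP"),
    (b"POST", "HTTP"),
    (b"BitTorrent protocol", "BitTorrent"),  # follows a 1-byte length prefix (0x13)
    (b"SSH-", "SSH"),                        # SSH version banner
]

def classify(payload: bytes) -> str:
    """Guess the application protocol from the first payload bytes."""
    for magic, name in SIGNATURES:
        # Check at offset 0, and at offset 1 for the BitTorrent length prefix.
        if payload.startswith(magic) or payload[1:].startswith(magic):
            return name
    return "unknown"  # encrypted or scrambled traffic lands here
```

Note the last line: anything encrypted, tunneled, or deliberately scrambled falls straight through to "unknown", which is exactly the weakness discussed below.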
I have to admit, it's pretty cool to see graphs of all of the protocols on a busy network. It makes you feel like you have a view of the world; a handle on what you're doing. The concept is that by identifying hundreds of application streams, you can supposedly micro-manage a network and specify precisely how much bandwidth each application gets. But the truth is, virtually no one can model how a large network should operate at a micro level. The dynamics change continuously. I've been designing products for over 20 years and I can't do it, so how can you expect a network technician with a marginal understanding of network protocols to do it? By trying to make something that is very complex appear simple, you get yourself into trouble. For example, customers frequently ask me how they can prioritize HTTP to make browsing better. This is a dangerous and counterproductive way of thinking. If you think you can do that without creating a lot of other problems, then you really don't understand the big picture.

Why the Big Vendors Endorse DPI
Here is the big trick. There are two main reasons why the well-financed vendors endorse DPI as the method to use for bandwidth management:
- Complexity: In order to code your application to support hundreds of applications, and to track changes in all of those applications, you have to have resources. You need people to do the work, you need to buy many of the applications so you have them in your lab, and you need access to large client networks to test applications that can't easily be run in a lab. This protects these companies against smaller, more cost-efficient vendors coming in and competing directly.
- Revenue: Once you've committed to per-protocol bandwidth management, you're dependent on keeping up to date. This means endless upgrades and expensive support contracts. DPI is like cholesterol drugs: what a great idea, to get generally healthy people to buy drugs for the rest of their lives to combat a theoretical problem. The same goes for DPI. Once you've made a $40K investment in a system designed to manage by DPI, you're pretty much stuck buying endless upgrades.
Why the Protocol Method Fails
The protocol method fails because it doesn't account for the one component of bandwidth management that matters most: volume. The reason that P2P protocols are considered abusive is that they are automated. What most people don't understand is that much of the traffic generated by P2P applications is HTTP and ICMP traffic: directory contents are exchanged with HTTP, and servers are discovered with ICMP. The traffic is abusive not because of file downloads; it's abusive because the application is automated, generating a volume equivalent to hundreds of users. A protocol method that defines HTTP as a good protocol will not work as expected, because these applications increase the volume of HTTP to the point where you either have congestion, or you have to limit users who are innocently surfing the web. The protocol method is a losing battle that fails to solve the problem of network congestion.

DPI is Easily Defeated
The biggest problem with DPI is that it's easily defeated. The first way to defeat it is to make your protocols complicated and change them regularly; the P2P people do this with fervor. A way to absolutely defeat it is with encryption. How can you inspect a packet when you can't determine its contents? The truth is, you can't. You don't even have to use encryption; you can just scramble your headers or use variable codes. Packet shapers on high-speed networks don't have the CPU capacity to try to decrypt thousands of packets per second. And you don't have to be an evil genius to defeat DPI; it can happen accidentally. For example, IPsec traffic can't be managed with DPI or the protocol method. P2P applications can easily launch encrypted tunnels to defeat any control attempt by upstream packet shapers.

It Doesn't Work on HTTPS Connections
Perhaps you've noticed that more and more sites are forcing HTTPS access? Google and Facebook are the big ones. The truth is, you can't decode protocols inside an HTTPS connection. You can guess based on the site, but all you have to do is run whatever you're doing over HTTPS and DPI fails. As more and more sites force HTTPS, less and less of your Deep Packet Inspection strategy will work.

Fairness is Per User, NOT Per Protocol
Most ISPs and universities are interested in providing fair access to bandwidth for their customers and users. The way to provide per-user fairness is to manage by user. The beauty of per-user management is that you don't care what they're doing. You don't have to know about every protocol ever conceived. And you don't have to restrict access to some protocols altogether, since any customer running abusive protocols will only consume their own bandwidth. You don't need to upgrade every time something changes, and you don't need to buy expensive support.

Per-user controls also can't be defeated. Since you're controlling by address or range of addresses, tunneling, encryption, and header scrambling cannot be used to get around your controls. The customer/user has no choice but to use their assigned address, so you can always identify their traffic, and manage its volume as a single, simple, easily manageable entity.

An added issue is that DPI consumes CPU resources. When using per-user controls, you can manage a lot more traffic, and you don't have to worry about CPU resources being consumed. This means that heavily utilized gigabit networks can be managed with a single system.
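The per-user approach described above can be sketched as a token bucket keyed by source address. This is a minimal illustration, not a production shaper: the class name, rates, and the decision to drop rather than queue are all assumptions made for the sketch. The key point is that nothing here ever looks at a payload, so encryption, tunneling, and header scrambling change nothing.

```python
import time
from collections import defaultdict

class PerUserLimiter:
    """Token-bucket limiter keyed by source IP address.

    Every user (address) gets the same rate regardless of which
    protocols they run: volume is managed, not content.
    """

    def __init__(self, rate_bytes_per_sec, burst_bytes):
        self.rate = rate_bytes_per_sec
        self.burst = burst_bytes
        # Every address starts with a full bucket of tokens (bytes).
        self.tokens = defaultdict(lambda: burst_bytes)
        self.last_seen = {}

    def allow(self, src_ip, packet_len, now=None):
        """Return True if this packet fits in src_ip's token bucket."""
        now = time.monotonic() if now is None else now
        elapsed = now - self.last_seen.get(src_ip, now)
        self.last_seen[src_ip] = now
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens[src_ip] = min(self.burst,
                                  self.tokens[src_ip] + elapsed * self.rate)
        if self.tokens[src_ip] >= packet_len:
            self.tokens[src_ip] -= packet_len
            return True
        return False  # over quota: drop (a real shaper would likely queue)
```

One user exhausting their bucket has no effect on anyone else's, which is the whole argument: fairness falls out of the addressing, with no protocol table to maintain and no signature updates to buy.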