Sunday, May 6, 2012

Do VLANs Unduly Complicate the Network?

Simplifying the IT infrastructure is among the top priorities of IT executives. The various surveys reporting that CIOs are beginning to embrace Software as a Service (SaaS) models suggest that simplifying the infrastructure is a key objective. The more complicated the IT infrastructure becomes, the more expensive it is to maintain, and the more person-hours it takes to fulfill change requests.

As an IT manager, simplifying my IT support structure is also one of my key goals. One of the most common discussion points on this subject is how many VLANs a branch office requires, or whether VLANs are needed in the branch office network design at all.

My answer is always 'yes'. Surprisingly, I often find myself defending that position.

Simplicity is a function of necessity. To simplify means to design the IT infrastructure to fulfill only the required service level, and to make it easy to scale the infrastructure to the increasing demands of the near future. Implementing more than the required service level and business need unduly complicates the IT infrastructure. It is a step away from simplification.

There are no hard and fast rules on how many VLANs a network can have and still qualify as 'simple'. Some take that to mean VLANs unduly complicate the network. I believe quite the opposite: VLANs actually help simplify the IT infrastructure and ensure scalability for future expansion.

The three main reasons that necessitate the creation of VLANs are listed below. You will notice that even in the branch office, we will find ourselves utilizing VLANs for any one of these three:

    1. to avoid mac-address-table and arp broadcast storms;
    2. to simplify network security;
    3. to simplify Quality of Service (QoS) implementations;

In a practical sense, a Local Area Network (LAN) is a group of computers whose IP addresses are within the same IP block. For example, if your computer has an IP address of 192.168.1.10, with a subnet mask of 255.255.255.0, then all computers from 192.168.1.1 to 192.168.1.254 belong to your LAN. That is, they belong to a single switching group, and communicate with each other by referencing each computer's unique network interface card (NIC) hardware address (its mac-address), which the switch associates with the switchport the NIC is connected to. This mapping of NIC mac-addresses to the switch's switchports is kept by the switch in a simple database called a mac-address-table. Since the switch does not forward based on IP addresses, the computers need a way to match an IP address with the corresponding NIC mac-address, which is the only information relevant to the switch. To do this, every computer in the LAN (and every router or Layer 3 device) maintains its own database called the arp (address resolution protocol) table.
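As a quick check of the address range in that example, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `same_lan` is only for illustration.

```python
import ipaddress

# Address and mask taken from the example in the text.
iface = ipaddress.ip_interface("192.168.1.10/255.255.255.0")
lan = iface.network  # the IP block this computer belongs to

# Usable host addresses (network and broadcast addresses excluded).
hosts = list(lan.hosts())
first_host, last_host = hosts[0], hosts[-1]


def same_lan(other_ip: str) -> bool:
    """Return True if other_ip falls inside the same IP block."""
    return ipaddress.ip_address(other_ip) in lan


print(lan, first_host, last_host, len(hosts))
print(same_lan("192.168.1.254"), same_lan("192.168.2.10"))
```

A computer makes the same comparison before sending anything: if the destination is inside its own block it delivers across the switch, otherwise it hands the packet to its default gateway.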

This brings me to the first reason why VLANs are configured:

1. to avoid mac-address-table and arp broadcast storms


This is one of the main reasons why routers and VLANs exist, and why it is not practical to build the internet using only switches.

As discussed, computers in a local area network communicate with each other only via their mac-addresses, and the switch delivers their traffic using a map it maintains called a mac-address-table. To do this, there should be a way for a computer to match an IP address with a mac-address. They do this via an ARP (Address Resolution Protocol) request. Let us assume a computer 'A' wants to ping another computer 'B' with an IP address of 172.30.1.105/24. Seeing they are on the same subnet (Computer 'A' has an IP address of 172.30.1.102/24), Computer 'A' immediately knows it can reach Computer 'B' directly across the switch, without going through the router. To do that, Computer 'A' needs the mac-address of Computer 'B'.

Here's what happens:

Computer 'A' first checks his own arp table. Finding nothing for 172.30.1.105, he composes a broadcast message: "Hey everyone, who has IP address 172.30.1.105? Please tell 172.30.1.102."

The switch does not answer this himself. He notes which switchport Computer 'A' is plugged into (adding that mac-address to his mac-address-table), then floods the ARP request out of every other switchport in the LAN. Every computer receives the broadcast and has to process it, but only Computer 'B' answers. Two things then happen:

       (1) Computer 'B' sends an ARP response straight back to Computer 'A': "That's me! The mac-address for IP address 172.30.1.105 is 00-1e-e5-69-1f-55"; then,

       (2) the switch sees the response pass through and completes his mac-address-table with the switchport of Computer 'B'. He does not broadcast this table to nearby switches; each switch learns mac-addresses on its own from the traffic it sees, and floods any frame whose destination it has not learned yet.

Computer 'A' saves the answer in his arp table and sends over packets addressed to the mac-address of Computer 'B'. The switch forwards them to the right switchport using the mac-address-table he completed in the previous step.
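The exchange can be sketched as a toy simulation of how ARP and a learning switch normally behave. This is a simplified model, not real switch firmware: the class names, port numbers, the third computer 'C', and the mac-addresses of 'A' and 'C' are all made up for illustration. Only the IP addresses of 'A' and 'B' and the mac-address of 'B' come from the text.

```python
from dataclasses import dataclass, field

BROADCAST = "ff-ff-ff-ff-ff-ff"


@dataclass
class Host:
    name: str
    ip: str
    mac: str
    arp_table: dict = field(default_factory=dict)  # IP -> mac-address
    seen_broadcasts: int = 0  # broadcasts this host had to process


class Switch:
    def __init__(self):
        self.ports = {}      # port number -> Host
        self.mac_table = {}  # mac-address -> port number

    def attach(self, port, host):
        self.ports[port] = host

    def forward(self, in_port, src_mac, dst_mac):
        """Learn the sender's port, then return the ports the frame leaves on."""
        self.mac_table[src_mac] = in_port
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]  # known unicast: one port
        return [p for p in self.ports if p != in_port]  # flood


def resolve(switch, in_port, target_ip):
    """ARP lookup by the host on in_port; returns the target's mac or None."""
    sender = switch.ports[in_port]
    if target_ip in sender.arp_table:
        return sender.arp_table[target_ip]  # cached, no broadcast needed
    # ARP request: a broadcast that every other host must process.
    for port in switch.forward(in_port, sender.mac, BROADCAST):
        host = switch.ports[port]
        host.seen_broadcasts += 1
        if host.ip == target_ip:
            # ARP reply: unicast from the target back to the sender.
            switch.forward(port, host.mac, sender.mac)
            sender.arp_table[target_ip] = host.mac
    return sender.arp_table.get(target_ip)


sw = Switch()
a = Host("A", "172.30.1.102", "00-1e-e5-69-1f-02")  # mac is hypothetical
b = Host("B", "172.30.1.105", "00-1e-e5-69-1f-55")
c = Host("C", "172.30.1.110", "00-1e-e5-69-1f-10")  # hypothetical bystander
for port, host in enumerate((a, b, c), start=1):
    sw.attach(port, host)

first = resolve(sw, 1, "172.30.1.105")   # triggers a broadcast
second = resolve(sw, 1, "172.30.1.105")  # answered from A's arp table
print(first, sw.mac_table, c.seen_broadcasts)
```

Note that 'C' has nothing to do with the conversation yet still has to process the request. That bystander cost is what grows as the LAN grows.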

The picture below summarizes how it works



Entries in the arp table do not last forever. Once an entry times out, or a computer is handed a different IP address when its DHCP lease expires, the stale entry is dropped and the entire request-broadcast-response cycle is repeated.

To reduce the broadcast traffic, we can either: (a) configure both the arp and mac-address-table expiration (Cisco refers to them as timeouts or aging time) so that entries are relearned less often; or, (b) separate the network into different segments via VLANs. For very large networks, we will need to do both.

If it still is not obvious how VLANs play an instrumental role in controlling the broadcasts, then imagine we have 200 nodes on DHCP, connected through multiple switches and access points over LAN and wifi. Whenever one user moves his/her laptop from one area of the office to another, the computer will typically broadcast again (DHCP and ARP requests), and every switch has to relearn which switchport that mac-address now sits behind. So imagine what it will do to our network if 50 of those users move for whatever reason. Each broadcast is delivered to all 200 nodes, so we have 200 x 50 = 10,000 broadcast frames to process, plus whatever the switches flood while their mac-address-tables catch up. Then, of course, we have another set of broadcasts when they move back to their permanently assigned workstations. That is still quite small, but note that we also have other broadcast and multicast traffic going on in the network from IGMP, servers, clusters, etc. The problem grows much faster than the node count, because every node we add both sends its own broadcasts and has to receive everyone else's.

VLANs communicate with each other only via the router (or a Layer 3 switch), because each VLAN is a separate LAN and a separate broadcast domain. So if we split the network into two VLANs, a broadcast storm is contained in only one VLAN, while the other is unaffected. Therefore, as the network grows, it becomes necessary to segment the network into VLANs to keep broadcasts in check. I suggest at least one VLAN per group of 200 nodes (depending on usage and the degree of security required).
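To put rough numbers on the effect of splitting, here is a back-of-the-envelope sketch. It assumes one broadcast per laptop move and treats every node in a VLAN as a receiver of each broadcast sent in that VLAN. The even 100/100 split and the 25/25 split of the moves are hypothetical figures chosen only for illustration.

```python
def broadcast_deliveries(nodes_per_vlan, broadcasts_per_vlan):
    """Total broadcast frames that nodes must process, summed over all VLANs.

    Rough model: a broadcast sent inside a VLAN is processed by every node
    in that VLAN and by nobody outside it.
    """
    return sum(n * b for n, b in zip(nodes_per_vlan, broadcasts_per_vlan))


# One flat LAN: 200 nodes, 50 broadcasts.
flat = broadcast_deliveries([200], [50])

# Same office split into two VLANs of 100 nodes, 25 broadcasts in each.
split = broadcast_deliveries([100, 100], [25, 25])

print(flat, split)
```

Under these assumptions the split halves the total, and each individual node processes 25 broadcasts instead of 50. A storm in one VLAN also never reaches the nodes in the other.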

2. to simplify network security


Suppose that in our company, the Marketing Department is allowed to use social networking sites (twitter, facebook, google+, etc.), but the rest of the organization is not. How do we do it? We can either: (1) configure URL filtering by AD user account or group (assuming our UTM appliance is equipped for that); or, (2) do it via VLANs, filtering by the source VLAN or subnet (which practically any UTM or transparent proxy appliance can implement). Judging just by how many components are needed to configure each, the latter (configuring VLANs) is clearly the simpler.

There are also other issues. By default, VLAN 1 is the native VLAN on Cisco trunks, so untagged traffic between switches travels on VLAN 1, and every switchport starts out as a member of it. Thus when VLAN 1 is compromised, far more of the network is exposed than should be. What I do is move the native VLAN to another VLAN, and simply make VLAN 1 a 'parking vlan' (in other words, a VLAN assigned to unused switchports).

And here is my favorite VLAN security implementation: I simply deny all inter-VLAN access (except to the servers). That way, no one from one VLAN can sniff traffic in another. This is important when some techie staff tries to sneak into executives' skype conversations or files (yes, you can read the skype chat history of another person on the network provided you have access to their computer, check out how in my other blog here). This also serves as a defense when one VLAN is compromised (malware, DoS attacks, etc.).
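The deny-all-except-servers rule can be sketched as a small policy check. This is only a model of the decision the router's access list would make, not a real ACL syntax. The VLAN numbers, subnets, and function names are hypothetical.

```python
import ipaddress

# Hypothetical VLAN plan: one subnet per VLAN.
VLANS = {
    10: ipaddress.ip_network("172.30.10.0/24"),  # staff
    20: ipaddress.ip_network("172.30.20.0/24"),  # executives
    99: ipaddress.ip_network("172.30.99.0/24"),  # servers
}
SERVER_VLAN = 99


def vlan_of(ip):
    """Return the VLAN id whose subnet contains ip, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for vlan_id, subnet in VLANS.items():
        if addr in subnet:
            return vlan_id
    return None


def is_permitted(src_ip, dst_ip):
    """Deny inter-VLAN traffic unless one side is the server VLAN."""
    src, dst = vlan_of(src_ip), vlan_of(dst_ip)
    if src is None or dst is None:
        return False  # unknown addresses are denied by default
    if src == dst:
        return True   # same VLAN: switched locally, never hits the router
    # Simplification: server replies are allowed by permitting both directions.
    return SERVER_VLAN in (src, dst)


print(is_permitted("172.30.10.5", "172.30.20.7"))   # staff -> executives
print(is_permitted("172.30.10.5", "172.30.99.10"))  # staff -> servers
```

With a default-deny rule like this, adding a new VLAN requires no extra entries to keep it isolated from the others; only access to the servers has to be stated.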

3. to simplify Quality of Service (QoS) implementations


In our company, we have normal traffic, and VoIP. The traditional way of doing QoS is by configuring DSCP diffserv tagging on every switchport (every switchport, not just where the VoIP ports are), then doing prioritization and bandwidth shaping on the router interface. But if the only traffic we need to tag for priority is VoIP traffic, then it's simpler done via VLANs. We simply: (1) put all VoIP and similar traffic in one VLAN; (2) disallow inter-VLAN communication; then, (3) prioritize traffic on a per VLAN basis.
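Step (3) can be pictured as a strict-priority queue keyed on the VLAN a packet arrived on. This is a minimal sketch of the idea, not how any particular router implements QoS. The VLAN numbers, packet labels, and class name are hypothetical.

```python
import heapq
from itertools import count

VOICE_VLAN = 30                  # hypothetical VoIP VLAN
VLAN_PRIORITY = {VOICE_VLAN: 0}  # lower number is sent first
DEFAULT_PRIORITY = 1             # every other VLAN


class VlanPriorityQueue:
    """Strict-priority queue: voice VLAN first, arrival order within a class."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def enqueue(self, vlan_id, packet):
        priority = VLAN_PRIORITY.get(vlan_id, DEFAULT_PRIORITY)
        heapq.heappush(self._heap, (priority, next(self._arrival), vlan_id, packet))

    def dequeue(self):
        _, _, vlan_id, packet = heapq.heappop(self._heap)
        return vlan_id, packet

    def __len__(self):
        return len(self._heap)


queue = VlanPriorityQueue()
for vlan_id, packet in [(10, "web-1"), (30, "rtp-1"), (10, "web-2"), (30, "rtp-2")]:
    queue.enqueue(vlan_id, packet)

sent = [queue.dequeue()[1] for _ in range(len(queue))]
print(sent)
```

The classification is a single lookup on the VLAN number, which is the point of the approach: nothing has to be tagged per switchport.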

Therefore, even in small branch offices with fewer than 100 network elements, if we have any one of the reasons stated above, we will need to apply VLANs. I doubt there is any type of enterprise-level network, even at the branch level, that would not benefit from implementing VLANs. I sincerely would love to hear your opinion on the matter, so if you have one, please post a comment and let's start discussing.

4 comments:

  1. Hi Jeff, a basic network engineering concept very well demystified.

  2. The usual feedback comes from two camps; (1) first camp attributes additional network structures to adding equipment hence CAPEX or even OPEX may increase, (2) misconception by uninformed infrastructure teams who think that the additional structures would induce technical complications.

  3. From my point of view, VLANs offer three things (which map to your thesis):

    1. Organizational clarity in the network structure, thereby maximizing investments in the switching and routing network
    2. Increased network security
    3. Number 1 and 2 produce network scalability, and flexibility then leads to productive capacity

    To summarize, VLANs were designed to simplify the network, both on the onset and as your network grows moving forward in time.

  4. Hi Steve. Thanks. Well said. VLANs, when done properly, actually simplify network management. I do agree.
