This article collects some best practices I gathered from other people’s advice and my own experience. I focus mainly on the connectivity and stability of the network, less on security. I might write a separate post later about security on network hardware.
When you are in a small office with one or two switches, you probably don’t wonder how those things are configured. Usually you just connect the switches and you have connectivity. However, as your network and the number of devices on it grow, it gets more and more complex to keep the network stable. And these days companies rely on data and network availability.
Proper configuration of your network hardware is key to keeping your network stable and available. In this article I’m not going into the architecture of the network; I will discuss the configuration of individual switches or virtual chassis (stacks). That said, when configuring a switch you always have to take into account where it will operate (its position in the network). I’m trying to keep it as general as possible, although I usually work with Juniper hardware.
Make a design
I only thought of this while writing the article, so I added it afterwards. It may be obvious, but it is also easily forgotten.
Make a design. Even if you only have a few switches, it’s a good idea to make a design. Try to draw how and where you will place your switches, which (network) devices are connected and which vlans you need. Also try to come up with hostnames and IP addresses for your devices.
A thought-out design makes installation much easier and will save you time in the end. You can reuse the design as documentation and as an aid for troubleshooting.
Here is a small example of how you could draw it in Visio. There are many tools available, and you could also just draw it on a piece of paper.
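If you also keep the design as plain data next to the drawing, a few lines of Python can catch simple mistakes before you start installing. This is only a minimal sketch: the hostnames, the addresses and the management subnet are made up for the illustration, and the check only looks for duplicates and for management addresses outside the planned subnet.

```python
import ipaddress
from collections import Counter

# Hypothetical inventory: hostnames, addresses and the subnet are invented.
MGMT_SUBNET = ipaddress.ip_network("10.0.99.0/24")
SWITCHES = [
    {"hostname": "sw-floor1-01", "mgmt_ip": "10.0.99.11"},
    {"hostname": "sw-floor1-02", "mgmt_ip": "10.0.99.12"},
    {"hostname": "sw-floor2-01", "mgmt_ip": "10.0.99.12"},  # duplicate on purpose
    {"hostname": "sw-dc-01", "mgmt_ip": "10.0.98.5"},       # wrong subnet on purpose
]


def check_design(switches, mgmt_subnet):
    """Return a list of human-readable problems found in the design."""
    problems = []
    # Hostnames and management addresses must each be unique.
    for field in ("hostname", "mgmt_ip"):
        counts = Counter(sw[field] for sw in switches)
        for value, count in counts.items():
            if count > 1:
                problems.append(f"duplicate {field}: {value}")
    # Every management address must fall inside the management subnet.
    for sw in switches:
        if ipaddress.ip_address(sw["mgmt_ip"]) not in mgmt_subnet:
            problems.append(
                f"{sw['hostname']}: {sw['mgmt_ip']} is outside {mgmt_subnet}"
            )
    return problems


for problem in check_design(SWITCHES, MGMT_SUBNET):
    print(problem)
```

The same data can later feed your documentation or monitoring, so the design and the real network are less likely to drift apart.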
Management IP / VLAN
This is the basic configuration of a switch. I mention it because some people just take a switch out of the box, connect it and never look at it again. You can usually configure a switch via a console cable, or by browsing to either a preconfigured IP address or one that the switch learned through DHCP. When you configure a smart (or managed) switch for the first time, it will ask some basic questions such as IP address, netmask and default gateway. It might also ask for a management VLAN.
A lot of switches can obtain their management address via DHCP. This is usually meant for automatic upgrades or configuration. I would recommend against using this setting in a production environment (in staging it’s fine). If you ever have connectivity issues and your switches cannot get their IP address from the DHCP server, you also lose management access at the moment you need it most, which only makes the problem worse.
Make your switch reachable over the network. Use SSH instead of telnet for security reasons, and if the switch has a web interface, use https instead of http. You can configure SNMP for monitoring, and SNMP traps to react to events that happen on your switch. What is also very important, but often forgotten, is log collection. A Linux syslog server is easily set up. Just make your switch forward its logs to this server. It can help when your switch is rebooting and would otherwise lose its logs because of that.
Also configure the time settings. Set the right timezone and configure NTP servers (AD servers can act as NTP servers). Correct timestamps come in very handy when you troubleshoot issues later on.
Try to set up a good IP and VLAN plan. I will write another article on that, since it can get complex quite fast.
A good start is simply to categorize all types of equipment and separate their traffic into VLANs. Example: users, printers, VoIP, cameras, APs, …
Limiting the size of each broadcast domain reduces the broadcast traffic every device has to process, which helps network performance. It also contains problems within that domain. It’s always a good idea to create a management VLAN for in-band management as well. You can protect that VLAN with a firewall in your network.
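To make the idea of "one VLAN per category" a bit more concrete, here is a rough sketch that carves one subnet per category out of a site range using Python’s standard ipaddress module. The supernet, the VLAN numbering (steps of 10), the /24 size and the choice of the first host as gateway are all assumptions for the example, not a recommendation.

```python
import ipaddress

# Hypothetical plan: supernet, VLAN IDs and category names are illustrative only.
SITE_SUPERNET = ipaddress.ip_network("10.10.0.0/16")
CATEGORIES = ["management", "users", "printers", "voip", "cameras", "access-points"]
FIRST_VLAN_ID = 10


def build_vlan_plan(supernet, categories, first_vlan_id, prefixlen=24):
    """Assign one VLAN ID and one subnet of the given size to each category."""
    subnets = supernet.subnets(new_prefix=prefixlen)  # generator of child subnets
    plan = []
    for offset, (name, subnet) in enumerate(zip(categories, subnets)):
        plan.append({
            "vlan_id": first_vlan_id + offset * 10,
            "name": name,
            "subnet": subnet,
            # Assumption: the gateway is the first host address of the subnet.
            "gateway": subnet.network_address + 1,
        })
    return plan


for entry in build_vlan_plan(SITE_SUPERNET, CATEGORIES, FIRST_VLAN_ID):
    print(entry["vlan_id"], entry["name"], entry["subnet"], entry["gateway"])
```

Because every subnet is taken from the same generator, the ranges cannot overlap, which is one of the mistakes a hand-written plan tends to pick up over time.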
Loop prevention / protection
Loop prevention and protection are essential in a network, especially as its size increases. We use Ethernet almost everywhere, and Ethernet has no built-in loop protection: frames carry no hop limit. If you create a loop on the network, broadcasts keep circulating until the network is crippled or devices cannot handle the load anymore.
Spanning Tree Protocol (STP, RSTP)
Every network course has to talk about spanning tree somewhere. I’m not going to discuss all the details here. We use spanning tree mainly for loop prevention and detection. It can also be used for redundancy, but for that I prefer LACP, which I discuss further in the article.
I want to configure STP or RSTP on all my edge ports. Edge ports are ports that “edge” devices connect to: computers, Wi-Fi, printers, cameras, IP phones, any end device, and especially every interface that is accessible to users. All edge ports will send out BPDU packets. These BPDUs contain information about the switch they originate from. They are actually meant for communicating with other switches to establish a spanning tree. In our case, however, we just want to send them out, so that when there is a loop on the network we receive our own BPDUs. The switch then knows there is a loop and will keep the link inactive.
Edge port is also a specific configuration that tells the switch the port is not expected to connect to another switch, only to end devices. For Cisco I think this is called PortFast. The port also becomes available faster, because it doesn’t first have to find out whether it is connected to another bridge.
BPDU block on Edge
On top of the edge configuration you should also configure BPDU block on edge. With this setting, a port that receives a BPDU is automatically disabled (shut down). This prevents a loop and prevents other network devices from being connected to that port.
Usually if someone connects something on your network that triggers the BPDU block on edge mechanism, they are doing something wrong.
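The behaviour described in this section can be illustrated with a small toy model. It is not how any vendor implements spanning tree; the class names, the bridge IDs and the port names are invented, and the only rule modelled is "any BPDU received on an edge port disables that port".

```python
from dataclasses import dataclass, field


@dataclass
class EdgePort:
    name: str
    state: str = "forwarding"
    reason: str = ""


@dataclass
class ToySwitch:
    """Toy model of a switch with BPDU block on edge enabled on every edge port."""
    bridge_id: str
    ports: dict = field(default_factory=dict)

    def add_edge_port(self, name):
        self.ports[name] = EdgePort(name)

    def receive_bpdu(self, port_name, sender_bridge_id):
        """Any BPDU arriving on an edge port disables that port."""
        port = self.ports[port_name]
        port.state = "disabled"
        if sender_bridge_id == self.bridge_id:
            # Our own BPDU came back: two of our ports are looped together.
            port.reason = "loop: received our own BPDU"
        else:
            # Somebody plugged another switch into a user-facing port.
            port.reason = f"unexpected switch {sender_bridge_id} on edge port"
        return port


# Hypothetical bridge IDs and port names.
switch = ToySwitch(bridge_id="8000.aabbccddeeff")
for name in ("ge-0/0/1", "ge-0/0/2", "ge-0/0/3"):
    switch.add_edge_port(name)

switch.receive_bpdu("ge-0/0/1", "8000.aabbccddeeff")  # looped patch cable
switch.receive_bpdu("ge-0/0/2", "8000.112233445566")  # rogue desk switch

for port in switch.ports.values():
    print(port.name, port.state, port.reason)
```

In both cases the port ends up disabled; the only difference is what you will read in the logs, which is one more reason to collect them centrally.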
No spanning tree on uplink
I don’t configure spanning tree on the uplinks. In the past we had problems with unstable networks that were caused by flapping interfaces between switches. When a switch drops out of the spanning tree, it generates a topology change. This is a mechanism to clear the MAC tables so the switches learn the new path to access the network. When this happens once or twice, you may not even notice it. If it happens a lot, you will notice it and you will get complaints (VoIP and video services can be sensitive to this). Limiting your spanning tree domain to the minimum (one switch or stack) prevents this.
LACP on the uplinks
LACP is not only a protocol for link aggregation; it also checks that the links are still active. If you make bundles without LACP, the switch won’t notice when the other side is unavailable. LACP sends check packets every second (in fast mode) to verify that the other side is responding. It also checks via the chassis IDs that all links in the aggregate are connected to the same chassis. So if you misconnect the cables, it won’t take your network down.
If LACP is only configured on one of the two sides, the link just doesn’t become active. So no risk there to create a loop or a bad link. And if it’s configured with LACP then all links should go to the same chassis or stack.
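The checks LACP performs can be sketched as a simple selection rule. This is a simplification, not the actual protocol logic: the function and field names are invented, and picking the partner seen on the first responding link as the reference is an assumption made for the example (real implementations have their own selection rules).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemberLink:
    name: str
    # System (chassis) ID learned from the partner's LACP packets.
    # None means no LACP packets were received on this link.
    partner_system_id: Optional[str]


def select_bundle_members(links):
    """Return the names of the links allowed to carry traffic in the bundle."""
    # Assumption: the first link that hears a partner defines the reference chassis.
    reference = next(
        (link.partner_system_id for link in links if link.partner_system_id), None
    )
    if reference is None:
        return []  # LACP not answering on any link: nothing becomes active
    return [link.name for link in links if link.partner_system_id == reference]


# Hypothetical uplink bundle with one miscabled and one silent link.
uplinks = [
    MemberLink("ge-0/0/46", "aa:bb:cc:00:00:01"),
    MemberLink("ge-0/0/47", "aa:bb:cc:00:00:02"),  # cabled to a different chassis
    MemberLink("ge-0/0/45", None),                 # other side has no LACP
]
print(select_bundle_members(uplinks))
```

The miscabled link and the silent link simply stay out of the bundle, which is the behaviour that keeps a cabling mistake from turning into a loop.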
Try to keep your switches on the recommended software, or at least on software that is still supported. When you encounter a problem on devices with outdated or unsupported software, you might not get help from the vendor in solving it.
Newer software should also contain bug fixes for known problems (although it sometimes introduces new problems too).
However, try not to run the very latest version of the software unless it has features that you really need. There will always be bugs in the latest software that only get fixed in later releases. Even the recommended software can have bugs in it. Try to follow that up a bit, or check with your integrator which version they have the best experience with. You don’t want to waste a maintenance window and a lot of time installing bad software that you will have to fix later anyway.
Disable unused ports / remove default vlan
A lot of security guides advise disabling unused ports for security reasons. This is of course a good idea. However, a lot of switches are set up to provide connectivity for a whole building or floor, so administrators usually don’t bother with disabling the ports. But think about what happens when some stranger walks in with a laptop and connects to your network… There are better ways to secure your edge switches from intruders than just disabling ports, such as 802.1X.
For a datacenter there is no excuse not to disable those unused ports. The risk that malicious users connect something is smaller, but colleagues or partners could still be connecting things to your switches that you don’t know about. Better to disable those ports and get notified about the changes they need.
Anyway I’ll do a post about security later on.
It’s also a good idea to remove the default VLAN. I usually don’t use it at all. Some vendors put ports in the default VLAN if you put no configuration on them, so watch out for that. You might waste a lot of time working out why your connections don’t work, while the port was in the default VLAN all along, not connected to the rest of your network.
So this is some advice I would like to give for installing or maintaining switches in your network. I’m sure many network administrators have gathered more tips over the years. You can always leave a comment below so we can expand on these few tips. Also feel free to comment on what I am suggesting here. This is certainly open for discussion.
Thanks for reading and commenting!