Sunday, 28 August 2022

Q-in-Q (mis)adventures

    So, sometimes we offer Q-in-Q network services to our customers. Sometimes those services have to go through subcontractors. And sometimes it can be quite difficult to get those three things (Q-in-Q, customers, subcontractors) to work together.

What is Q-in-Q

    Q-in-Q (802.1ad) is a layer 2 service which essentially allows you to put VLANs inside a VLAN. VLANs are Virtual LANs: a layer 2 technology that separates switched traffic, not routed traffic. Q-in-Q is sometimes also called VLAN tag double stacking, because that is what it does: it puts another VLAN tag on Ethernet frames that already have a VLAN tag. It is often used in internet service provider networks, so that providers can offer layer 2 services to their customers and their customers can still use the full range of VLAN IDs (the ID is a 12-bit field, so 4096 values) inside the service. In that respect, the outer VLAN is usually called the S-VLAN, or service provider VLAN, while the multitude of inner VLANs are called C-VLANs, or customer VLANs.
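
    To make the double stacking concrete, here is a minimal Python sketch of how the two tags sit in a frame. It assumes the standard 802.1ad TPID (0x88A8) for the outer S-tag and the 802.1Q TPID (0x8100) for the inner C-tag; some equipment uses other outer TPID values. The function names and the VLAN IDs in the usage note are made up for illustration.

```python
import struct

TPID_S_TAG = 0x88A8  # outer tag: 802.1ad service tag (assumed; some gear uses other values)
TPID_C_TAG = 0x8100  # inner tag: ordinary 802.1Q customer tag


def vlan_tag(tpid: int, vid: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Build one 4-byte VLAN tag: 16-bit TPID followed by 16-bit TCI."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (pcp << 13) | (dei << 12) | vid  # priority, drop-eligible bit, VLAN ID
    return struct.pack("!HH", tpid, tci)


def qinq_frame(dst: bytes, src: bytes, s_vid: int, c_vid: int,
               ethertype: int, payload: bytes) -> bytes:
    """Assemble a double-tagged frame: MACs, S-tag, C-tag, EtherType, payload."""
    return (dst + src
            + vlan_tag(TPID_S_TAG, s_vid)   # added by the provider
            + vlan_tag(TPID_C_TAG, c_vid)   # already present on the customer frame
            + struct.pack("!H", ethertype)
            + payload)
```

    For example, qinq_frame(dst, src, s_vid=100, c_vid=10, ethertype=0x0800, payload=b"") gives a 22-byte header in which the S-tag occupies bytes 12 to 15 and the C-tag bytes 16 to 19.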



But, I have a story

    So, one customer orders a Q-in-Q service from us. Since we are not present at the B location, we have to involve a subcontractor. This is neither the first nor the last time this customer has ordered a Q-in-Q service from us. The subcontractor is a weird one. Some of their Q-in-Q tunnels allow native (untagged) traffic over the S-VLAN, and others do not. This is determined by the network equipment vendor's implementation as well as by configuration. There are good reasons both for and against allowing native traffic over an S-VLAN, but that is beside the point of my story.

    One day my customer calls me and tells me that their new service is not working. He says he can't reach the devices on the other end. OK, I double check, and I see no faults in my configuration. I also don't see any MAC addresses coming from the direction of the subcontractor on that service. So I call the subcontractor. Eventually, after several dozen emails and an hour on the phone, I get the subcontractor to send out their technician. They replace their local device and test everything; some issue is found and fixed.

    I call the customer, and it is still not working. The customer says they told him he had to put a VLAN on his side, and he insists that can't be right. He says that if he has to put a VLAN on his side, then the access port towards him is in trunk mode. And even when he set the VLAN, which happened to have the same tag as the S-VLAN, it still didn't work.

    That is when it clicked for me. This customer had previously had some Q-in-Q services over us and that same subcontractor, and those just worked when he sent native untagged frames over the S-VLAN. As I said earlier, on Q-in-Q this sometimes works and sometimes does not. This customer must have got used to it working.

    I quickly confirmed with the customer which C-VLAN ID he was testing with on the B-side of the connection. As a test, I terminated the S-VLAN on one of our routers, put the correct C-VLAN on our router, and voilà, it worked. On this Q-in-Q service there can be no native S-VLAN traffic.
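
    The difference between what the customer was sending and what the tunnel accepts can be sketched in a few lines of Python. This is only an illustration, not what our routers actually run: it assumes the outer tag uses TPID 0x88A8 and the inner tag 0x8100, and describe_tags is a made-up helper that looks at a raw frame as seen on the provider side.

```python
import struct

TPID_S_TAG = 0x88A8  # assumed outer (service) tag TPID
TPID_C_TAG = 0x8100  # inner (customer) tag TPID


def describe_tags(frame: bytes) -> str:
    """Report which VLAN tags a raw Ethernet frame carries."""
    # Bytes 0-11 are the destination and source MAC addresses.
    (outer_type,) = struct.unpack_from("!H", frame, 12)
    if outer_type != TPID_S_TAG:
        return "no S-tag"
    # S-tag TCI, then whatever type field follows the S-tag.
    s_tci, inner_type = struct.unpack_from("!HH", frame, 14)
    s_vid = s_tci & 0x0FFF
    if inner_type != TPID_C_TAG:
        # Customer frame was untagged: this is the "native" case from the story.
        return f"S-VLAN {s_vid}, no C-tag (native traffic)"
    (c_tci,) = struct.unpack_from("!H", frame, 18)
    return f"S-VLAN {s_vid}, C-VLAN {c_tci & 0x0FFF}"
```

    A frame with only the S-tag is the kind this particular service did not carry; a frame with both tags is the kind that went through once the C-VLAN was configured.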

    It was the customer's fault. Doubly so because this customer is technically well educated. Anyway, I call the customer and explain the situation to him. Just don't send native traffic over a Q-in-Q service. It's a Q-in-Q service, send VLANs through it.

The lessons

- Always double check your customers.

- Working with subcontractors is still a pain.

- Avoid sending native traffic over Q-in-Q services unless it is necessary. It is not a best practice.
