learning vnet:L2 vSwitch

dpdk-vpp源码解读
Published 2024-04-28 14:56:52
Included in the column: DPDK VPP源码分析

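The previous article, 《learning:vpp实现dot1q终结功能配置》, introduced basic L2 vSwitch concepts such as the BD (Bridge Domain) and the BDI (Bridge Domain Interface). This article focuses on the Layer 2 forwarding flow. An earlier article also described how to set up a DPDK & VPP learning environment on a Tencent Cloud host; here we build an L2 vSwitch environment on such a host. The configuration is shown in the figure below: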

First, create two network namespaces, pc1 and pc2, in the Linux kernel.

ip netns add pc1
ip netns add pc2

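Next, use the VPP command line to create two tap interfaces and a BD, and add both tap interfaces to the BD. The configuration is shown below. Copying and running the following snippet generates an l2_conf file automatically: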

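cat << EOF > /root/l2_conf
# Create L2 bridge domain 1
create bridge-domain 1
# Create tap interface tap1: kernel-side name tap1, kernel netns pc1, kernel-side address 192.168.1.1/24
create tap id 1 host-ns pc1 host-ip4-addr 192.168.1.1/24 host-if-name tap1
# Create tap interface tap2 the same way, in netns pc2 with address 192.168.1.2/24
create tap id 2 host-ns pc2 host-ip4-addr 192.168.1.2/24 host-if-name tap2
# Bring both tap interfaces up
set interface state tap1 up
set interface state tap2 up
# Add the tap interfaces to bridge domain 1
set interface l2 bridge tap1 1
set interface l2 bridge tap2 1
EOF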
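
If the same topology is needed with more than two hosts, the configuration file can be generated instead of typed by hand. The following is a minimal sketch, not part of the original setup: the helper name `build_l2_conf` is hypothetical, and it assumes the namespaces are named pc1..pcN, already exist, and get addresses 192.168.1.N/24. It only emits the same CLI commands used in the file above.

```python
def build_l2_conf(num_taps: int, bd_id: int = 1, subnet: str = "192.168.1") -> str:
    """Return VPP CLI text that bridges num_taps tap interfaces in one bridge domain.

    Hypothetical helper: namespace names (pcN) and addressing are assumptions
    that generalize the two-host example in this article.
    """
    lines = [f"create bridge-domain {bd_id}"]
    # One tap per namespace, named tapN on both the VPP side and the kernel side.
    for i in range(1, num_taps + 1):
        lines.append(
            f"create tap id {i} host-ns pc{i} "
            f"host-ip4-addr {subnet}.{i}/24 host-if-name tap{i}"
        )
    # Bring every tap up, then attach it to the bridge domain.
    for i in range(1, num_taps + 1):
        lines.append(f"set interface state tap{i} up")
    for i in range(1, num_taps + 1):
        lines.append(f"set interface l2 bridge tap{i} {bd_id}")
    return "\n".join(lines) + "\n"


# Reproduce the two-host configuration from this article.
print(build_l2_conf(2), end="")
```

The output can be redirected to /root/l2_conf and loaded with exec exactly as in the hand-written case.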

Then enter the VPP CLI and run exec /root/l2_conf to create the interfaces:

dpdk-vpp源码分析: exec /root/l2_conf
dpdk-vpp源码分析:
dpdk-vpp源码分析:
dpdk-vpp源码分析: show interface addr
local0 (dn):
tap1 (up):
  L2 bridge bd-id 1 idx 1 shg 0
tap2 (up):
  L2 bridge bd-id 1 idx 1 shg 0
dpdk-vpp源码分析: show bridge-domain 1 detail
  BD-ID   Index   BSN  Age(min)  Learning  U-Forwrd   UU-Flood   Flooding  ARP-Term  arp-ufwd Learn-co Learn-li   BVI-Intf
    1       1      0     off        on        on       flood        on       off       off        2    16777216     N/A
span-l2-input l2-input-classify l2-input-feat-arc l2-policer-classify l2-input-acl vpath-input-l2 l2-ip-qos-record l2-input-vtr l2-learn l2-rw l2-fwd l2-flood l2-flood l2-output

           Interface           If-idx ISN  SHG  BVI  TxFlood        VLAN-Tag-Rewrite
             tap1                1     1    0    -      *                 none
             tap2                2     1    0    -      *                 none
dpdk-vpp源码分析: show l2fib all
    Mac-Address     BD-Idx If-Idx BSN-ISN Age(min) static filter bvi         Interface-Name
 02:fe:25:07:17:52    1      2      0/1      -       -      -     -               tap2
 02:fe:7b:06:08:36    1      1      0/1      -       -      -     -               tap1
L2FIB total/learned entries: 2/2  Last scan time: 0.0000e0sec  Learn limit: 16777216                       
dpdk-vpp源码分析: clear l2fib
dpdk-vpp源码分析: show l2fib all
no l2fib entries
dpdk-vpp源码分析:

To capture the trace of the ARP request and reply, first run clear l2fib to clear the MAC table, then run trace add virtio-input 1 to enable tracing. After that, run ip netns exec pc1 ping 192.168.1.2 in the kernel.

Packet 1

00:04:05:084044: virtio-input
  virtio: hw_if_index 1 next-index 4 vring 0 len 42
    hdr: flags 0x00 gso_type 0x00 hdr_len 0 gso_size 0 csum_start 0 csum_offset 0 num_buffers 1
00:04:05:084057: ethernet-input
  frame: flags 0x1, hw-if-index 1, sw-if-index 1
  ARP: 02:fe:81:57:ec:8e -> ff:ff:ff:ff:ff:ff
00:04:05:084069: l2-input
  l2-input: sw_if_index 1 dst ff:ff:ff:ff:ff:ff src 02:fe:81:57:ec:8e [l2-learn l2-flood ]
00:04:05:084073: l2-learn
  l2-learn: sw_if_index 1 dst ff:ff:ff:ff:ff:ff src 02:fe:81:57:ec:8e bd_index 1
00:04:05:084081: l2-flood
  l2-flood: sw_if_index 1 dst ff:ff:ff:ff:ff:ff src 02:fe:81:57:ec:8e bd_index 1
00:04:05:084085: l2-output
  l2-output: sw_if_index 2 dst ff:ff:ff:ff:ff:ff src 02:fe:81:57:ec:8e data 08 06 00 01 08 00 06 04 00 01 02 fe
00:04:05:084089: tap2-output
  tap2 flags 0x00180005
  ARP: 02:fe:81:57:ec:8e -> ff:ff:ff:ff:ff:ff
  request, type ethernet/IP4, address size 6/4
  02:fe:81:57:ec:8e/192.168.1.1 -> 00:00:00:00:00:00/192.168.1.2
00:04:05:084095: tap2-tx
    buffer 0x9f7d3: current data 0, length 42, buffer-pool 0, ref-count 1, trace handle 0x0
                    l2-hdr-offset 0 l3-hdr-offset 14
  hdr-sz 0 l2-hdr-offset 0 l3-hdr-offset 14 l4-hdr-offset 0 l4-hdr-sz 0
  ARP: 02:fe:81:57:ec:8e -> ff:ff:ff:ff:ff:ff
  request, type ethernet/IP4, address size 6/4
  02:fe:81:57:ec:8e/192.168.1.1 -> 00:00:00:00:00:00/192.168.1.2

Packet 2

00:04:05:084154: virtio-input
  virtio: hw_if_index 2 next-index 4 vring 0 len 42
    hdr: flags 0x00 gso_type 0x00 hdr_len 0 gso_size 0 csum_start 0 csum_offset 0 num_buffers 1
00:04:05:084156: ethernet-input
  frame: flags 0x1, hw-if-index 2, sw-if-index 2
  ARP: 02:fe:cf:7a:3e:a8 -> 02:fe:81:57:ec:8e
00:04:05:084159: l2-input
  l2-input: sw_if_index 2 dst 02:fe:81:57:ec:8e src 02:fe:cf:7a:3e:a8 [l2-learn l2-fwd l2-flood l2-flood ]
00:04:05:084160: l2-learn
  l2-learn: sw_if_index 2 dst 02:fe:81:57:ec:8e src 02:fe:cf:7a:3e:a8 bd_index 1
00:04:05:084168: l2-fwd
  l2-fwd:   sw_if_index 2 dst 02:fe:81:57:ec:8e src 02:fe:cf:7a:3e:a8 bd_index 1 result [0x1040000000001, 1] none
00:04:05:084171: l2-output
  l2-output: sw_if_index 1 dst 02:fe:81:57:ec:8e src 02:fe:cf:7a:3e:a8 data 08 06 00 01 08 00 06 04 00 02 02 fe
00:04:05:084173: tap1-output
  tap1 flags 0x00180005
  ARP: 02:fe:cf:7a:3e:a8 -> 02:fe:81:57:ec:8e
  reply, type ethernet/IP4, address size 6/4
  02:fe:cf:7a:3e:a8/192.168.1.2 -> 02:fe:81:57:ec:8e/192.168.1.1
00:04:05:084175: tap1-tx
    buffer 0x9d0d3: current data 0, length 42, buffer-pool 0, ref-count 1, trace handle 0x1
                    l2-hdr-offset 0 l3-hdr-offset 14
  hdr-sz 0 l2-hdr-offset 0 l3-hdr-offset 14 l4-hdr-offset 0 l4-hdr-sz 0
  ARP: 02:fe:cf:7a:3e:a8 -> 02:fe:81:57:ec:8e
  reply, type ethernet/IP4, address size 6/4
  02:fe:cf:7a:3e:a8/192.168.1.2 -> 02:fe:81:57:ec:8e/192.168.1.1
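
The `data` bytes printed by the l2-output node in the traces above can be decoded by hand. The sketch below assumes those 12 bytes start at the EtherType field, which is consistent with the values shown (0x0806 is the ARP EtherType); the function name `decode_arp_prefix` is made up for illustration.

```python
import struct

ETHERTYPES = {0x0806: "ARP", 0x0800: "IPv4"}
ARP_OPERATIONS = {1: "request", 2: "reply"}


def decode_arp_prefix(hex_dump: str) -> dict:
    """Decode the EtherType and the fixed ARP header fields from a hex dump."""
    raw = bytes.fromhex(hex_dump)  # whitespace between byte pairs is ignored
    # EtherType, hardware type, protocol type, hw addr len, proto addr len, opcode
    ethertype, htype, ptype, hlen, plen, oper = struct.unpack("!HHHBBH", raw[:10])
    return {
        "ethertype": ETHERTYPES.get(ethertype, hex(ethertype)),
        "hardware_type": htype,
        "protocol_type": ETHERTYPES.get(ptype, hex(ptype)),
        "address_sizes": (hlen, plen),
        "operation": ARP_OPERATIONS.get(oper, oper),
    }


# Byte strings copied from the two l2-output trace lines above.
request = decode_arp_prefix("08 06 00 01 08 00 06 04 00 01 02 fe")
reply = decode_arp_prefix("08 06 00 01 08 00 06 04 00 02 02 fe")
print(request)
print(reply)
```

The decoded address sizes (6, 4) and the request/reply opcodes match what the tap2-output and tap1-output nodes print for the same packets.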

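The traces above show that the ARP request and the ARP reply take somewhat different node paths. The ARP request carries the broadcast MAC as its destination address: l2-learn triggers MAC learning and records the MAC of the kernel-side tap1 interface together with the receiving interface, and l2-flood floods the packet within the L2 bridge domain, so it is sent out of tap2. The ARP reply is a unicast packet: the l2-learn node triggers MAC learning and records the MAC of the kernel-side tap2 interface together with its interface. Because the MAC of tap1 was already learned during the request phase, the lookup hits the l2fib table: the l2-fwd node finds the output interface and the packet is forwarded out of tap1. The output below shows the MAC addresses of the kernel-side tap1 and tap2 interfaces and the VPP l2fib table: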
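
The learn, forward and flood decisions described above can be illustrated with a toy model. This is a simplified sketch and not VPP's implementation: the class `LearningBridge` is hypothetical, and it ignores aging, split-horizon groups, BVI and the per-packet feature bitmap that VPP actually uses to select nodes.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningBridge:
    """Toy model of one bridge domain: learn, then forward or flood."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.fib = {}  # source MAC -> port it was last seen on

    def receive(self, in_port, src, dst):
        """Return (decision, output_ports) for one received frame."""
        # Learning step: remember which port the source MAC lives behind.
        self.fib[src] = in_port
        # Forwarding step: a known unicast destination has exactly one output port.
        out_port = None if dst == BROADCAST else self.fib.get(dst)
        if out_port is not None:
            if out_port == in_port:
                return ("drop", [])  # model choice: never send back out the input port
            return ("l2-fwd", [out_port])
        # Flooding step: broadcast or unknown unicast goes to every other port.
        return ("l2-flood", [p for p in self.ports if p != in_port])


bridge = LearningBridge(["tap1", "tap2"])
# ARP request from pc1: broadcast destination, so it is flooded to tap2.
print(bridge.receive("tap1", "02:fe:81:57:ec:8e", BROADCAST))
# ARP reply from pc2: destination was learned from the request, so it is forwarded.
print(bridge.receive("tap2", "02:fe:cf:7a:3e:a8", "02:fe:81:57:ec:8e"))
print(bridge.fib)
```

After the two frames, the model's table holds one entry per tap interface, mirroring the two learned entries that show l2fib reports.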

dpdk-vpp源码分析: show l2fib bd_id 1
    Mac-Address     BD-Idx If-Idx BSN-ISN Age(min) static filter bvi         Interface-Name
 02:fe:25:07:17:52    1      2      0/1      -       -      -     -               tap2
 02:fe:7b:06:08:36    1      1      0/1      -       -      -     -               tap1
L2FIB total/learned entries: 2/2  Last scan time: 0.0000e0sec  Learn limit: 16777216
dpdk-vpp源码分析: quit
root@learning-vpp:~# ip netns exec pc1 ifconfig
tap1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::fe:7bff:fe06:836  prefixlen 64  scopeid 0x20<link>
        ether 02:fe:7b:06:08:36  txqueuelen 1000  (Ethernet)
        RX packets 17  bytes 1286 (1.2 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 17  bytes 1286 (1.2 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@learning-vpp:~# ip netns exec pc2 ifconfig
tap2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.2  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::fe:25ff:fe07:1752  prefixlen 64  scopeid 0x20<link>
        ether 02:fe:25:07:17:52  txqueuelen 1000  (Ethernet)
        RX packets 17  bytes 1286 (1.2 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 17  bytes 1286 (1.2 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
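
The comparison between the l2fib table and the kernel-side MAC addresses shown above can also be done programmatically. The following sketch parses text in the layout printed by show l2fib above; the function name `parse_l2fib` is hypothetical, and the kernel MAC addresses are copied by hand from the ifconfig output rather than queried.

```python
import re

MAC_RE = re.compile(r"^(?:[0-9a-f]{2}:){5}[0-9a-f]{2}$", re.IGNORECASE)


def parse_l2fib(text: str) -> dict:
    """Map each MAC address in the l2fib output to its interface name."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Entry rows start with a MAC address; the last column is the interface name.
        if fields and MAC_RE.match(fields[0]):
            table[fields[0].lower()] = fields[-1]
    return table


L2FIB_OUTPUT = """\
    Mac-Address     BD-Idx If-Idx BSN-ISN Age(min) static filter bvi         Interface-Name
 02:fe:25:07:17:52    1      2      0/1      -       -      -     -               tap2
 02:fe:7b:06:08:36    1      1      0/1      -       -      -     -               tap1
L2FIB total/learned entries: 2/2  Last scan time: 0.0000e0sec  Learn limit: 16777216
"""

# "ether" values copied from the ifconfig output of pc1 and pc2.
KERNEL_MACS = {"tap1": "02:fe:7b:06:08:36", "tap2": "02:fe:25:07:17:52"}

learned = parse_l2fib(L2FIB_OUTPUT)
mismatches = {name: mac for name, mac in KERNEL_MACS.items() if learned.get(mac) != name}
print(learned)
print("mismatches:", mismatches)
```

An empty mismatch dictionary means every kernel-side MAC was learned on the expected VPP interface.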

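At this point we have confirmed that VPP has learned the MAC addresses of the kernel-side tap1 and tap2 interfaces. Next, run ip netns exec pc1 ping 192.168.1.2 on Linux again to observe how subsequent packets are forwarded: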

Packet 1

00:30:46:742722: virtio-input
  virtio: hw_if_index 1 next-index 4 vring 0 len 98
    hdr: flags 0x00 gso_type 0x00 hdr_len 0 gso_size 0 csum_start 0 csum_offset 0 num_buffers 1
00:30:46:742739: ethernet-input
  frame: flags 0x1, hw-if-index 1, sw-if-index 1
  IP4: 02:fe:7b:06:08:36 -> 02:fe:25:07:17:52
00:30:46:742751: l2-input
  l2-input: sw_if_index 1 dst 02:fe:25:07:17:52 src 02:fe:7b:06:08:36 [l2-learn l2-fwd l2-flood l2-flood ]
00:30:46:742755: l2-learn
  l2-learn: sw_if_index 1 dst 02:fe:25:07:17:52 src 02:fe:7b:06:08:36 bd_index 1
00:30:46:742761: l2-fwd
  l2-fwd:   sw_if_index 1 dst 02:fe:25:07:17:52 src 02:fe:7b:06:08:36 bd_index 1 result [0x1010000000002, 2] none
00:30:46:742766: l2-output
  l2-output: sw_if_index 2 dst 02:fe:25:07:17:52 src 02:fe:7b:06:08:36 data 08 00 45 00 00 54 c3 da 40 00 40 01
00:30:46:742770: tap2-output
  tap2 flags 0x00180005
  IP4: 02:fe:7b:06:08:36 -> 02:fe:25:07:17:52
  ICMP: 192.168.1.1 -> 192.168.1.2
    tos 0x00, ttl 64, length 84, checksum 0xf37a dscp CS0 ecn NON_ECN
    fragment id 0xc3da, flags DONT_FRAGMENT
  ICMP echo_request checksum 0x6521 id 24956
00:30:46:742779: tap2-tx
    buffer 0x9fc65: current data 0, length 98, buffer-pool 0, ref-count 1, trace handle 0x0
                    l2-hdr-offset 0 l3-hdr-offset 14
  hdr-sz 0 l2-hdr-offset 0 l3-hdr-offset 14 l4-hdr-offset 0 l4-hdr-sz 0
  IP4: 02:fe:7b:06:08:36 -> 02:fe:25:07:17:52
  ICMP: 192.168.1.1 -> 192.168.1.2
    tos 0x00, ttl 64, length 84, checksum 0xf37a dscp CS0 ecn NON_ECN
    fragment id 0xc3da, flags DONT_FRAGMENT
  ICMP echo_request checksum 0x6521 id 24956

Packet 2

00:30:46:742829: virtio-input
  virtio: hw_if_index 2 next-index 4 vring 0 len 98
    hdr: flags 0x00 gso_type 0x00 hdr_len 0 gso_size 0 csum_start 0 csum_offset 0 num_buffers 1
00:30:46:742831: ethernet-input
  frame: flags 0x1, hw-if-index 2, sw-if-index 2
  IP4: 02:fe:25:07:17:52 -> 02:fe:7b:06:08:36
00:30:46:742833: l2-input
  l2-input: sw_if_index 2 dst 02:fe:7b:06:08:36 src 02:fe:25:07:17:52 [l2-learn l2-fwd l2-flood l2-flood ]
00:30:46:742835: l2-learn
  l2-learn: sw_if_index 2 dst 02:fe:7b:06:08:36 src 02:fe:25:07:17:52 bd_index 1
00:30:46:742836: l2-fwd
  l2-fwd:   sw_if_index 2 dst 02:fe:7b:06:08:36 src 02:fe:25:07:17:52 bd_index 1 result [0x1010000000001, 1] none
00:30:46:742838: l2-output
  l2-output: sw_if_index 1 dst 02:fe:7b:06:08:36 src 02:fe:25:07:17:52 data 08 00 45 00 00 54 06 8b 00 00 40 01
00:30:46:742840: tap1-output
  tap1 flags 0x00180005
  IP4: 02:fe:25:07:17:52 -> 02:fe:7b:06:08:36
  ICMP: 192.168.1.2 -> 192.168.1.1
    tos 0x00, ttl 64, length 84, checksum 0xf0ca dscp CS0 ecn NON_ECN
    fragment id 0x068b
  ICMP echo_reply checksum 0x6d21 id 24956
00:30:46:742842: tap1-tx
    buffer 0x9d58c: current data 0, length 98, buffer-pool 0, ref-count 1, trace handle 0x1
                    l2-hdr-offset 0 l3-hdr-offset 14
  hdr-sz 0 l2-hdr-offset 0 l3-hdr-offset 14 l4-hdr-offset 0 l4-hdr-sz 0
  IP4: 02:fe:25:07:17:52 -> 02:fe:7b:06:08:36
  ICMP: 192.168.1.2 -> 192.168.1.1
    tos 0x00, ttl 64, length 84, checksum 0xf0ca dscp CS0 ecn NON_ECN
    fragment id 0x068b
  ICMP echo_reply checksum 0x6d21 id 24956

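As the traces above show, both the ping request and the ping reply hit the MAC table and follow the known-unicast forwarding path. This completes our walkthrough of the forwarding flows for Layer 2 broadcast and unicast packets. A later article will analyze the code in detail.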
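
To compare the node paths of the traces above without reading every detail line, the node names can be extracted from the trace text. The sketch below assumes the layout seen in this article, where each node line starts at column 0 with a timestamp followed by the node name; the function name `node_path` and the shortened sample strings are for illustration only.

```python
import re

# A node line looks like "00:04:05:084081: l2-flood"; detail lines are indented.
NODE_LINE = re.compile(r"^\d{2}:\d{2}:\d{2}:\d{6}: (\S+)$")


def node_path(trace_text: str) -> list:
    """Return the graph node names of one traced packet, in order."""
    path = []
    for line in trace_text.splitlines():
        match = NODE_LINE.match(line)
        if match:
            path.append(match.group(1))
    return path


# Node lines copied from the ARP request trace (one detail line kept on purpose).
ARP_REQUEST = """\
00:04:05:084044: virtio-input
  virtio: hw_if_index 1 next-index 4 vring 0 len 42
00:04:05:084057: ethernet-input
00:04:05:084069: l2-input
00:04:05:084073: l2-learn
00:04:05:084081: l2-flood
00:04:05:084085: l2-output
00:04:05:084089: tap2-output
00:04:05:084095: tap2-tx
"""

# Node lines copied from the ICMP echo request trace.
ICMP_REQUEST = """\
00:30:46:742722: virtio-input
00:30:46:742739: ethernet-input
00:30:46:742751: l2-input
00:30:46:742755: l2-learn
00:30:46:742761: l2-fwd
00:30:46:742766: l2-output
00:30:46:742770: tap2-output
00:30:46:742779: tap2-tx
"""

broadcast_path = node_path(ARP_REQUEST)
unicast_path = node_path(ICMP_REQUEST)
# Pair the two paths position by position and keep only the steps that differ.
differences = [(a, b) for a, b in zip(broadcast_path, unicast_path) if a != b]
print(differences)
```

For these two packets the only differing step is l2-flood versus l2-fwd, which is exactly the broadcast versus known-unicast distinction discussed in this article.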

Originally published: 2024-04-23.

This article was shared from the WeChat official account DPDK VPP源码分析.
