The main Linux networking technologies that Docker relies on:
- Network Namespace
- Veth device pairs
- Iptables/Netfilter
- Bridges
- Routing
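<1> Network namespaces
- In essence, a namespace takes a resource that used to be shared globally by all processes and splits it into many instances, each shared only by one group of processes
- When every process in a namespace has exited, the namespace is destroyed unless something else still pins it (for example the bind mount that ip netns add creates under /var/run/netns), so it makes little sense to discuss namespaces apart from processes
- The Linux kernel has 7 types of namespace (kernels from 5.6 onward add an eighth, Time):
  - Cgroup
  - IPC
  - Network
  - Mount
  - PID
  - User
  - UTS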
    root@backup:~# ls -l /proc/$$/ns
    total 0
    lrwxrwxrwx 1 root root 0 May 21 16:59 ipc -> ipc:[4026531839]
    lrwxrwxrwx 1 root root 0 May 21 16:59 mnt -> mnt:[4026531840]
    lrwxrwxrwx 1 root root 0 May 21 16:59 net -> net:[4026531957]
    lrwxrwxrwx 1 root root 0 May 21 16:59 pid -> pid:[4026531836]
    lrwxrwxrwx 1 root root 0 May 21 16:59 user -> user:[4026531837]
    lrwxrwxrwx 1 root root 0 May 21 16:59 uts -> uts:[4026531838]
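The listing above can also be read programmatically. The following is a minimal Python sketch, assuming a Linux host with /proc mounted; the helper names parse_ns_link, namespaces_of and same_namespace are made up for this illustration. Two processes are in the same namespace exactly when the inode numbers in their links match.

```python
import os
import re

# A link target under /proc/<pid>/ns looks like "net:[4026531957]".
NS_LINK = re.compile(r"^(?P<kind>\w+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a link target such as 'net:[4026531957]' into (kind, inode)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def namespaces_of(pid="self"):
    """Return {entry name: inode} for every namespace entry of a process.

    Reading another process's entries may need extra privileges.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        result[name] = inode
    return result


def same_namespace(pid_a, pid_b, name="net"):
    """True if both processes share the namespace entry called `name`."""
    return namespaces_of(pid_a)[name] == namespaces_of(pid_b)[name]
```

Calling namespaces_of() with no argument inspects the current process, which mirrors ls -l /proc/$$/ns in a shell.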
    root@karl-v1:~# ip netns add netns1    ## ip netns add: create a network namespace
    root@karl-v1:~# ip netns exec netns1 ip link show    ## ip netns exec: run a command inside the namespace
    1: lo: mtu 65536 qdisc noop state DOWN mode DEFAULT group default
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    root@karl-v1:~# ip netns exec netns1 bash    ## ip netns exec bash: open a shell inside the namespace
    root@karl-v1:~# ip link show
    1: lo: mtu 65536 qdisc noop state DOWN mode DEFAULT group default
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    root@karl-v1:~# exit
    exit
    root@karl-v1:~# ip link show
    1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    2: eth0: mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
        link/ether 00:50:56:8d:1e:55 brd ff:ff:ff:ff:ff:ff
    3: docker0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default
        link/ether 02:42:59:33:fc:fc brd ff:ff:ff:ff:ff:ff
    root@karl-v1:~# ip link set br0 netns netns1
    Cannot find device "br0"
    root@karl-v1:~# ip link set lo netns netns1
    RTNETLINK answers: Invalid argument
    root@karl-v1:~# ethtool -k lo |grep netns    ## check whether the device can be moved to another namespace
    netns-local: on [fixed]
<2> Veth device pairs
- A veth pair enables communication between different network namespaces
    root@karl-v1:~# ip link add veth0 type veth peer name veth1
    root@karl-v1:~# ip link show
    1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    2: eth0: mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
        link/ether 00:50:56:8d:1e:55 brd ff:ff:ff:ff:ff:ff
    3: docker0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default
        link/ether 02:42:59:33:fc:fc brd ff:ff:ff:ff:ff:ff
    154: veth1@veth0: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether 5e:00:3d:72:27:76 brd ff:ff:ff:ff:ff:ff
    155: veth0@veth1: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether 02:89:03:d6:ab:29 brd ff:ff:ff:ff:ff:ff
    root@karl-v1:~# ip link set veth1 netns netns1
    root@karl-v1:~# ip link show
    1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    2: eth0: mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
        link/ether 00:50:56:8d:1e:55 brd ff:ff:ff:ff:ff:ff
    3: docker0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default
        link/ether 02:42:59:33:fc:fc brd ff:ff:ff:ff:ff:ff
    155: veth0@if154: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether 02:89:03:d6:ab:29 brd ff:ff:ff:ff:ff:ff
    root@karl-v1:~# ip netns exec netns1 ip link show
    1: lo: mtu 65536 qdisc noop state DOWN mode DEFAULT group default
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    154: veth1@if155: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether 5e:00:3d:72:27:76 brd ff:ff:ff:ff:ff:ff
    root@karl-v1:~# ip netns exec netns1 ip addr add 10.1.1.1/24 dev veth1
    root@karl-v1:~# ip addr add 10.1.1.2/24 dev veth0
    root@karl-v1:~# ip netns exec netns1 ip link set dev veth1 up
    root@karl-v1:~# ip link set dev veth0 up
    root@karl-v1:~# ip link show |grep veth0
    155: veth0@if154: mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    root@karl-v1:~# ping 10.1.1.1
    PING 10.1.1.1 (10.1.1.1) 56(84) bytes of data.
    64 bytes from 10.1.1.1: icmp_seq=1 ttl=64 time=0.107 ms
    64 bytes from 10.1.1.1: icmp_seq=2 ttl=64 time=0.042 ms
    64 bytes from 10.1.1.1: icmp_seq=3 ttl=64 time=0.042 ms
    root@karl-v1:~# ip netns exec netns1 ping 10.1.1.2
    root@karl-v1:~# ip netns exec netns1 ethtool -S veth1
    NIC statistics:
         peer_ifindex: 155
    root@karl-v1:~# ip link show |grep 155
    155: veth0@if154: mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
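The transcripts show two ways of locating the other end of a veth pair: the peer_ifindex reported by ethtool -S, and the @if154 suffix in ip link show once the peer lives in another namespace. As a rough sketch, the suffix can be extracted from one line of ip link show output as follows. The function names are hypothetical, and the pattern assumes the "index: name@link:" layout seen in the output above.

```python
import re

# Matches the start of an `ip link show` line, e.g. "155: veth0@if154: mtu 1500 ..."
LINK_LINE = re.compile(
    r"^(?P<index>\d+):\s+(?P<name>[^@:\s]+)(?:@(?P<link>[^:\s]+))?:"
)


def parse_link_line(line):
    """Return index, name and the text after '@' (or None) for one device line."""
    match = LINK_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "index": int(match.group("index")),
        "name": match.group("name"),
        "link": match.group("link"),
    }


def peer_ifindex(line):
    """Return N when the device is shown as name@ifN, else None.

    The ifN form appears when the linked device is in another namespace;
    when both ends share a namespace the suffix is the peer's name instead.
    """
    info = parse_link_line(line)
    if info is None or info["link"] is None:
        return None
    match = re.fullmatch(r"if(\d+)", info["link"])
    return int(match.group(1)) if match else None
```

For a veth device the suffix names the peer; other device types use the same notation for a parent device, so this helper only makes sense when the device is already known to be a veth.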
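<3> Bridges
- A bridge is a virtual layer-2 network device that "connects" several network interfaces so that frames can be forwarded between them
- It differs from a plain switch: a switch is purely a layer-2 device that either forwards or drops each frame it receives
- A bridge, besides forwarding and dropping, can also hand a frame up to the upper protocol stack (the network layer), so it can be viewed as both a layer-2 and a layer-3 device
- A bridge can have an IP address, and one bridge (br0) can have several Ethernet interfaces attached (such as eth0 and eth1)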
    root@karl-v1:~# ip link add veth999 type veth peer name veth998
    root@karl-v1:~# brctl addbr br999
    root@karl-v1:~# ip link |grep veth999
    159: veth998@veth999: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    160: veth999@veth998: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    root@karl-v1:~# brctl addif br999 veth999
    root@karl-v1:~# ip link |grep veth999
    159: veth998@veth999: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    160: veth999@veth998: mtu 1500 qdisc noop master br999 state DOWN mode DEFAULT group default qlen 1000
    root@karl-v1:~# ifconfig br999 172.119.119.119
    root@karl-v1:~# ifconfig br999
    br999     Link encap:Ethernet  HWaddr ea:e8:d7:21:0c:42
              inet addr:172.119.119.119  Bcast:172.119.255.255  Mask:255.255.0.0
              UP BROADCAST MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
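To make the three possible outcomes for a frame (forward, drop, hand up to the protocol stack) concrete, here is a toy Python model of a learning bridge. It is a simplified illustration rather than the kernel's implementation: the class name ToyBridge, the action labels and the port eth1 are invented for the example, and details such as VLANs, STP and MAC-table ageing are left out.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class ToyBridge:
    """Simplified learning bridge that also has its own MAC address."""

    def __init__(self, mac, ports):
        self.mac = mac            # the bridge's own MAC (its layer-3 side)
        self.ports = list(ports)  # names of the attached interfaces
        self.fdb = {}             # forwarding database: learned MAC -> port

    def receive(self, in_port, src, dst):
        """Return the list of (action, port) pairs for one incoming frame."""
        self.fdb[src] = in_port   # learn where the sender lives
        others = [p for p in self.ports if p != in_port]

        if dst == self.mac:
            # Addressed to the bridge itself: hand up to the protocol stack.
            return [("deliver_up", None)]
        if dst == BROADCAST:
            # Broadcast: the local stack sees it and every other port gets a copy.
            return [("deliver_up", None)] + [("forward", p) for p in others]

        out_port = self.fdb.get(dst)
        if out_port is None:
            # Unknown destination: flood to all other ports.
            return [("forward", p) for p in others]
        if out_port == in_port:
            # Destination is on the segment the frame came from: drop.
            return [("drop", None)]
        return [("forward", out_port)]
```

A plain switch in this model would be the same class without the deliver_up branches, which is the difference the bullets above describe.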
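<4> Iptables/Netfilter
- Linux provides a mechanism that lets users define their own packet processing
- The Linux network stack contains a set of hook points for callback functions. Hook functions registered at these points can act on packets while the stack processes them, for example by filtering, modifying, or dropping them. This hook-point technology as a whole is called Netfilter and Iptables
- Netfilter runs in kernel mode and executes the rules attached to the hooks; Iptables is a user-mode program that maintains Netfilter's rule tables in the kernel. Together they give the Linux network stack a flexible packet-processing mechanism
- Netfilter has 5 hook points where rules can be attached: PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING
- Rules are added to tables of different types. The main table types currently supported are:
  - RAW
  - MANGLE
  - NAT
  - FILTER
- Of these 4 tables, RAW has the highest priority and FILTER the lowest.
- The iptables command helps users maintain the rules. There are two ways to view the rules already in the system:
  - iptables-save: prints the Iptables contents in command-line form
  - iptables -nvL: shows the contents of a Netfilter table in another format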
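The relationship between the five hook points and the four tables can be sketched as a small lookup model. This is an illustrative simplification under stated assumptions: the traffic-kind labels and the traversal function are made up for the example, only the classic chains of each table are listed, and the table order follows the RAW-to-FILTER priority given above.

```python
# Which hook points a packet passes, depending on where it comes from and goes to.
HOOK_PATHS = {
    "incoming_local": ["PREROUTING", "INPUT"],
    "forwarded": ["PREROUTING", "FORWARD", "POSTROUTING"],
    "locally_generated": ["OUTPUT", "POSTROUTING"],
}

# Table priority, highest first, as stated in the notes.
TABLE_PRIORITY = ["raw", "mangle", "nat", "filter"]

# Hook points at which each table has a built-in chain.
# Assumption: newer kernels also give nat an INPUT chain; it is omitted here
# to keep the sketch simple.
TABLE_CHAINS = {
    "raw": {"PREROUTING", "OUTPUT"},
    "mangle": {"PREROUTING", "INPUT", "FORWARD", "OUTPUT", "POSTROUTING"},
    "nat": {"PREROUTING", "OUTPUT", "POSTROUTING"},
    "filter": {"INPUT", "FORWARD", "OUTPUT"},
}


def traversal(kind):
    """List the (hook, table) pairs a packet of the given kind passes through."""
    return [
        (hook, table)
        for hook in HOOK_PATHS[kind]
        for table in TABLE_PRIORITY
        if hook in TABLE_CHAINS[table]
    ]
```

For example, traversal("forwarded") shows that a routed packet never reaches the INPUT or OUTPUT chains, which is why filtering rules for forwarded container traffic belong in FORWARD.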
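<5> Routing
- Linux keeps at least two routing tables (more exist when policy routing is enabled): LOCAL and MAIN.
- The LOCAL table contains all local device addresses and is populated automatically when addresses are configured on network devices. The Linux protocol stack uses it to recognize local addresses and to deliver traffic between local interfaces
- The MAIN table is used to route traffic to all other IP addresses. Its entries can come from static configuration or from dynamic routing protocols.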
Viewing the LOCAL table
    root@karl-v1:~# ip route show table local type local
    10.1.1.2 dev veth0 proto kernel scope host src 10.1.1.2
    127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
    127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
    172.17.0.1 dev docker0 proto kernel scope host src 172.17.0.1
    172.21.1.11 dev eth0 proto kernel scope host src 172.21.1.11
    172.119.119.119 dev br999 proto kernel scope host src 172.119.119.119
Viewing the routing table
1) ip route list
    root@karl-v1:~# ip route list
    default via 172.21.1.14 dev eth0
    10.1.1.0/24 dev veth0 proto kernel scope link src 10.1.1.2
    172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
    172.21.1.8/29 dev eth0 proto kernel scope link src 172.21.1.11
    172.119.0.0/16 dev br999 proto kernel scope link src 172.119.119.119
2) netstat -rn
    root@karl-v1:~# netstat -rn
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
    0.0.0.0         172.21.1.14     0.0.0.0         UG        0 0          0 eth0
    10.1.1.0        0.0.0.0         255.255.255.0   U         0 0          0 veth0
    172.17.0.0      0.0.0.0         255.255.0.0     U         0 0          0 docker0
    172.21.1.8      0.0.0.0         255.255.255.248 U         0 0          0 eth0
    172.119.0.0     0.0.0.0         255.255.0.0     U         0 0          0 br999
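How the two tables work together can be approximated with a longest-prefix-match lookup that consults LOCAL before MAIN. The sketch below hard-codes the entries from the output above; the names LOCAL_TABLE, MAIN_TABLE, longest_prefix_match and lookup are invented for the illustration, and real kernel lookups also take policy rules, metrics and route types into account.

```python
import ipaddress

# (prefix, device, gateway) entries copied from the transcripts above.
LOCAL_TABLE = [
    ("10.1.1.2/32", "veth0", None),
    ("127.0.0.0/8", "lo", None),
    ("127.0.0.1/32", "lo", None),
    ("172.17.0.1/32", "docker0", None),
    ("172.21.1.11/32", "eth0", None),
    ("172.119.119.119/32", "br999", None),
]
MAIN_TABLE = [
    ("0.0.0.0/0", "eth0", "172.21.1.14"),  # the "default" route
    ("10.1.1.0/24", "veth0", None),
    ("172.17.0.0/16", "docker0", None),
    ("172.21.1.8/29", "eth0", None),
    ("172.119.0.0/16", "br999", None),
]


def longest_prefix_match(table, address):
    """Return the matching (network, dev, gateway) with the longest prefix."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, dev, gateway in table:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev, gateway)
    return best


def lookup(address):
    """Consult the LOCAL table first, then MAIN, as the default rules do."""
    for table_name, table in (("local", LOCAL_TABLE), ("main", MAIN_TABLE)):
        hit = longest_prefix_match(table, address)
        if hit is not None:
            net, dev, gateway = hit
            return {"table": table_name, "prefix": str(net),
                    "dev": dev, "via": gateway}
    return None
```

With these entries, lookup("10.1.1.1") resolves through the MAIN table to veth0 with no gateway, while lookup("10.1.1.2") is caught by the LOCAL table because that address belongs to the host itself.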