popexizhi: Problems encountered


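Tuesday, October 20, 2020

TCP packets captured by tcpdump exceed the MTU

I originally set out to solve the problem of re-splitting packets to the MTU for tcpreplay, and in the end found that the MTU can be handled at capture time with tcpdump. The problem: TCP packets captured by tcpdump were larger than the MTU.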

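1. Turn off GSO

References:

https://www.cnblogs.com/jiangz/archive/2012/12/25/2831862.html [says GSO must be turned off, but does not explain why]

http://wsfdl.com/%E8%B8%A9%E5%9D%91%E6%9D%82%E8%AE%B0/2016/07/12/tcp_package_large_then_MTU.html [explains the reason but not how to turn it off; it does recommend https://stackoverflow.com/questions/2350985/length-of-captured-packets-more-than-mtu/2351026#2351026]

Quoted from the original:

To reduce CPU load and increase outbound network bandwidth, TSO provides larger buffers to hold the packets that TCP sends, and the NIC is then responsible for splitting each buffered large packet into several packets smaller than the MTU. tcpdump and Wireshark capture packets above the NIC, so we may observe packets larger than the MTU:

[popexizhi: tcpdump captures packets from the layer above the NIC]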
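
To make the quoted explanation concrete, here is a minimal sketch of the arithmetic involved when one oversized TCP segment is split to fit the MTU. The function name and the 20-byte IP and TCP header sizes (no options) are assumptions for illustration; this is not how any particular NIC or driver is implemented.

```python
def split_into_segments(payload_len, mtu=1500, ip_header=20, tcp_header=20):
    """Return the TCP payload sizes produced when one large segment is split.

    Assumes fixed header sizes without IP or TCP options.
    """
    mss = mtu - ip_header - tcp_header  # largest TCP payload that fits in one packet
    if mss <= 0:
        raise ValueError("MTU too small for the given header sizes")
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(mss, remaining)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


if __name__ == "__main__":
    # A 4000-byte payload seen as one packet in the capture leaves the NIC as three.
    print(split_into_segments(4000))  # [1460, 1460, 1080]
```

With offload enabled, tcpdump would see the single 4000-byte payload; on the wire it would be the three smaller packets.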

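2. After turning off generic segmentation offload, testing still showed packets larger than the 1500-byte MTU, so I read the original thread in detail (below). My reasoning: if there is buffering at the NIC layer, it is worth checking whether the protocol has a similar offload setting as well.

Reference: https://stackoverflow.com/questions/2350985/length-of-captured-packets-more-than-mtu/2351026#2351026

From there, open:

https://lists.openwall.net/netdev/2008/11/14/20

It shows the following: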

On 13-11-2008 15:29, Sami Farin wrote:

...

> Oh, I had old ethtool..

> These with v 6:

> # ethtool -k eth0

> Offload parameters for eth0:

> rx-checksumming: off

> tx-checksumming: off

> scatter-gather: off

> tcp segmentation offload: off

> udp fragmentation offload: off

> generic segmentation offload: on #this is the GSO that needs to be turned off

> Wow. I turned gso off and now it works just like before.

> No packets over size of mtu anymore, either.

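3. I checked the segmentation settings and, sure enough, there is also a tcp-segmentation-offload, so I tried turning TSO off.

Checking again gives the output below, and a fresh capture test came out fine: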

[root@localhost post_http]# ethtool -k eth0|grep segmentation

tcp-segmentation-offload: off

tx-tcp-segmentation: off

tx-tcp-ecn-segmentation: off

tx-tcp6-segmentation: off

tx-tcp-mangleid-segmentation: off

generic-segmentation-offload: off

tx-fcoe-segmentation: off [fixed]

tx-gre-segmentation: off [fixed]

tx-ipip-segmentation: off [fixed]

tx-sit-segmentation: off [fixed]

tx-udp_tnl-segmentation: off [fixed]

tx-gre-csum-segmentation: off [fixed]

tx-udp_tnl-csum-segmentation: off [fixed]

tx-sctp-segmentation: off [fixed]
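
Before capturing, the `ethtool -k` output above can be checked automatically. The sketch below parses text in that format and lists every segmentation feature that is still on; the function name is hypothetical. Turning the features off is typically done with `ethtool -K eth0 gso off tso off`; the original notes do not record the exact command used.

```python
def enabled_segmentation_offloads(ethtool_output):
    """Return the names of segmentation features reported as 'on'.

    Expects text in the format printed by `ethtool -k <iface>`.
    """
    enabled = []
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "feature: state" line
        name = name.strip()
        if "segmentation" not in name:
            continue
        state = value.split()  # e.g. ["off"] or ["off", "[fixed]"]
        if state and state[0] == "on":
            enabled.append(name)
    return enabled


if __name__ == "__main__":
    sample = (
        "tcp-segmentation-offload: off\n"
        "generic-segmentation-offload: on\n"
        "tx-fcoe-segmentation: off [fixed]\n"
    )
    print(enabled_segmentation_offloads(sample))  # ['generic-segmentation-offload']
```

An empty list means captured packets should no longer exceed the MTU because of TSO or GSO.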

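Tuesday, June 30, 2020

Problem: UDP server does not receive a pcap replayed between two physical hosts, part II

Problem: between two physical hosts, the UDP server does not receive the replayed pcap traffic.

Check the protocol stack statistics: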

netstat -s --udp

Comparing the counters before and after shows:

the UDP InCsumErrors counter keeps growing while the packets are being sent
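
The same counter can be read programmatically. On Linux, `/proc/net/snmp` has a `Udp:` header line followed by a `Udp:` value line; the sketch below parses that pair from a text snapshot. The function name is an assumption for illustration.

```python
def udp_counters(snmp_text):
    """Parse the 'Udp:' header/value line pair of /proc/net/snmp into a dict."""
    udp_lines = [
        line.split()[1:]
        for line in snmp_text.splitlines()
        if line.startswith("Udp:")  # "UdpLite:" lines do not match this prefix
    ]
    if len(udp_lines) < 2:
        raise ValueError("no 'Udp:' header/value pair found")
    names, values = udp_lines[0], udp_lines[1]
    return dict(zip(names, (int(v) for v in values)))
```

Reading the file once before and once after the replay, for example with `open("/proc/net/snmp").read()`, and subtracting the two `InCsumErrors` values gives the number of datagrams dropped for a bad checksum during the replay.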

https://www.cnblogs.com/qianyuliang/p/10542747.html

It is explained there as:

. UDP Errors

type: Graph

Unit: short

Label: Datagrams out (-) / in (+)

InCsumErrors - average number of UDP datagrams with checksum errors (over 5 minutes)

metrics:

irate(node_netstat_Udp_InCsumErrors{instance=~"$node:$port",job=~"$job"}[5m])

[next] How can this checksum check be turned off?

https://cizixs.com/2018/01/13/linux-udp-packet-drop-debug/

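UDP datagram errors

If a UDP datagram is modified in transit, its checksum or its length becomes wrong. Linux validates both when it receives a UDP datagram and drops the datagram as soon as it finds an error.

If you want UDP datagrams delivered to the application even when their checksum is wrong, you can disable the UDP checksum check through a socket option: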

int disable = 1;

setsockopt(sock_fd, SOL_SOCKET, SO_NO_CHECK, (void*)&disable, sizeof(disable));
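
To see why a receiver counts a datagram under InCsumErrors, it helps to compute the checksum by hand. The sketch below follows the RFC 768 rule: a one's-complement sum over a pseudo-header (source IP, destination IP, protocol, UDP length) plus the UDP header and payload. The function names are hypothetical and only IPv4 is covered. It can be used to check whether datagrams in a rewritten pcap still carry checksums that match their new addresses.

```python
import socket
import struct


def _ones_complement_sum(data):
    """16-bit one's-complement sum of data, padded with a zero byte if odd."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return total


def _pseudo_header(src_ip, dst_ip, udp_length):
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_UDP, udp_length))


def udp_checksum(src_ip, dst_ip, udp_segment):
    """Checksum for UDP header + payload whose checksum field is zeroed."""
    pseudo = _pseudo_header(src_ip, dst_ip, len(udp_segment))
    checksum = ~_ones_complement_sum(pseudo + udp_segment) & 0xFFFF
    return checksum or 0xFFFF  # a computed 0 is transmitted as all ones


def udp_checksum_ok(src_ip, dst_ip, udp_segment):
    """True if a received UDP header + payload verifies against its checksum."""
    if udp_segment[6:8] == b"\x00\x00":
        return True  # IPv4: zero means the sender did not compute a checksum
    pseudo = _pseudo_header(src_ip, dst_ip, len(udp_segment))
    return _ones_complement_sum(pseudo + udp_segment) == 0xFFFF
```

Because the IP addresses are part of the pseudo-header, rewriting the addresses of a captured datagram without recomputing the checksum is enough to make this verification fail.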

[next]

How can this be done in Python?

https://docs.python.org/2/library/socket.html

There is a socket flag to set, in order to prevent this, socket.SO_REUSEADDR:

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

s.bind((HOST, PORT))

[try]

s.setsockopt(socket.SOL_SOCKET, socket.SO_NO_CHECK, 1)

-----[?] In testing it still does not work. Why?
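
One detail worth checking in the [try] line: as far as I know, Python's socket module does not export an SO_NO_CHECK constant, so the option has to be passed by number. The sketch below assumes the value 11 from Linux's generic socket headers; other platforms and some architectures may use a different number or lack the option. Also, as I understand the Linux implementation, this option controls whether checksums are generated on transmit rather than whether received datagrams are verified, which would be consistent with the result above. Treat both points as assumptions to verify.

```python
import socket

# Assumption: 11 is SO_NO_CHECK in Linux's asm-generic/socket.h.
# getattr() keeps working if a Python build does expose the constant.
SO_NO_CHECK = getattr(socket, "SO_NO_CHECK", 11)


def try_set_no_check(sock):
    """Try to set SO_NO_CHECK on sock; return True if the kernel accepted it."""
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_NO_CHECK, 1)
    except OSError:
        return False
    return True
```

Calling `try_set_no_check(s)` on a `socket.SOCK_DGRAM` socket reports whether the option was accepted; acceptance alone does not show that incoming checksum errors are ignored.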

Monday, June 22, 2020

Replaying a pcap in a virtual network and receiving the data on a listening socket

Reference:

https://cizixs.com/2017/02/10/network-virtualization-network-namespace/

[Add a network namespace]

ip netns add ns1

ip netns #check the result

[Add a virtual NIC pair]

ip link add vens0 type veth peer name vens1

[Assign the virtual NIC to the ns1 namespace]

ip link set vens1 netns ns1

ip netns exec ns1 ip a #check the assignment

[Assign an address to the virtual NIC]

ip netns exec ns1 ifconfig vens1 10.0.0.2

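PS: at this point vens1 shows up in the NO-CARRIER state.

If the interface at one end of the pair is DOWN, the other end detects this automatically and sets its own state to NO-CARRIER.

So bringing up its peer vens0 is enough: ifconfig vens0 up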
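
The setup steps so far can be collected into one repeatable script. The sketch below only builds the command list and prints it by default; pass `dry_run=False` (as root) to execute. It uses `ip addr` and `ip link` in place of ifconfig, and the `/24` prefix length is an assumption, since the original command gives no netmask.

```python
import subprocess


def veth_namespace_commands(ns="ns1", host_if="vens0", ns_if="vens1",
                            ns_addr="10.0.0.2/24"):
    """Build the ip(8) commands for a namespace joined to the host by a veth pair."""
    in_ns = ["ip", "netns", "exec", ns]
    return [
        ["ip", "netns", "add", ns],
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", ns_if],
        ["ip", "link", "set", ns_if, "netns", ns],
        in_ns + ["ip", "addr", "add", ns_addr, "dev", ns_if],
        in_ns + ["ip", "link", "set", ns_if, "up"],
        # The peer must be up too, otherwise ns_if stays in NO-CARRIER.
        ["ip", "link", "set", host_if, "up"],
    ]


def apply_commands(commands, dry_run=True):
    """Print each command; run it as well unless dry_run is set."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

`apply_commands(veth_namespace_commands())` prints the six commands without touching the system.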

[Rewrite the pcap]

ip a s

tcprewrite --enet-dmac=96:b8:54:1f:96:61 -C --srcipmap=10.0.0.1:10.0.0.2 --dstipmap=192.168.91.50:10.0.0.1 --infile=1000w.pcap --outfile=1000wo.pcap

[Replay]

ip netns exec ns1 tcpreplay -i vens1 -M 10 1000wo.pcap

[A socket listening on the host now receives the data]
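
For the listening side, a minimal UDP receiver sketch follows. The function names are hypothetical, and the bind address and port must match the destination of the replayed datagrams; the original notes do not record which port was used.

```python
import socket


def open_udp_listener(bind_addr, port):
    """Create a UDP socket bound to (bind_addr, port); port 0 picks a free one."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def receive_datagrams(sock, count=1, timeout=5.0):
    """Collect up to count datagrams as (sender, data), stopping on timeout."""
    received = []
    sock.settimeout(timeout)
    try:
        while len(received) < count:
            data, sender = sock.recvfrom(65535)
            received.append((sender, data))
    except socket.timeout:
        pass  # return whatever arrived before the timeout
    return received
```

If the replay is running and this returns an empty list, the datagrams are being dropped before they reach the socket, which is the situation where the protocol stack counters are worth checking.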

Open problem:

Without the virtual-network setup, a UDP pcap replayed between two physical hosts cannot be received by a listening socket.

Why is that?

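Tuesday, June 2, 2020

Problem log: Jenkins + Robot report processing fails

A strange problem, for the record:

jenkins +robot

While generating the results report, the console shows: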

Robot results publisher started...
-Parsing output xml:
Done!
-Copying log files to build dir:
Failed!

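The generated report is visible in the corresponding directory, but it cannot be opened from Jenkins. Is this caused by a problem in the statistics step?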

Monday, April 13, 2020

How to handle the Surface error "the computer restarted unexpectedly or encountered an unexpected error"

Reference: http://salutleo.blogspot.com/2016/09/asuswindows-7_15.html

After a system reset on the Surface, rebooting shows:
The computer restarted unexpectedly or encountered an unexpected error. Windows installation cannot proceed. To install Windows, click "OK" to restart the computer, and then restart the installation.

Solution:

I searched Microsoft's website but could not find a solution at all. Fortunately I found the video below, which resolves the problem. If you have a similar problem, please refer to it:

https://youtu.be/tgpw9OKh6SQ

For the procedure, see:

https://blog.pcrisk.com/windows/12886-the-computer-restarted-unexpectedly-or-encountered-an-unexpected-error

I have not tried this myself, though; noting it here for later.

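Tuesday, June 4, 2019

Notes on problems encountered with unittest

How can all the unittest tests under one directory be run in a single call? What popexizhi currently does is add a test.sh at each level plus a driver script in the top-level directory. Is there a better way?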

How should test stubs be used with unittest to better isolate the tests from the environment under test? e.g. code that uses ES, MySQL, or SSH.
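
One common approach is to pass the external dependency in as a parameter and replace it with a `unittest.mock.Mock` in the test. The sketch below does this for a database connection; `count_rows` and the table name are hypothetical, and the same pattern applies to an ES or SSH client object.

```python
import unittest
from unittest import mock


def count_rows(connection, table):
    """Code under test: needs only an object that behaves like a DB connection."""
    cursor = connection.cursor()
    cursor.execute("SELECT COUNT(*) FROM " + table)
    return cursor.fetchone()[0]


class CountRowsTest(unittest.TestCase):
    def test_returns_first_column(self):
        # The stub stands in for a real MySQL connection; no server is needed.
        fake_conn = mock.Mock()
        fake_conn.cursor.return_value.fetchone.return_value = (3,)

        self.assertEqual(count_rows(fake_conn, "users"), 3)
        fake_conn.cursor.return_value.execute.assert_called_once_with(
            "SELECT COUNT(*) FROM users"
        )
```

When the dependency is created inside the code under test and cannot be passed in, `mock.patch` can swap it out by name for the duration of a test.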

Wednesday, March 13, 2019

Solved: Cannot open include file: 'ntddk.h'

Problem 1:

When compiling the KernelResumeInject code, the DDK was installed but the compiler still reported:

"Cannot open include file: 'ntddk.h'"

Solution 1:

You need to add WDK headers path to your vcxproj include directories:
vcxproj properties -> C/C++ -> General -> Additional Include Directories

C:\Program Files (x86)\Windows Kits\10\Include\10.0.14393.0\km\

P.S.: Make sure you install SDK 10 together with WDK 10.
P.P.S: Without SDK you will get Cannot open include file: 'ntdef.h' error

Reference:
https://stackoverflow.com/questions/35777922/cannot-open-include-file-ntddk-h

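Problem 2:

This time it really did report "Cannot open include file: 'ntdef.h'".

That matches what the original answer mentions. I checked the SDK and WDK versions it talks about, but that was not the cause.

Solution 2:

After that I added #include <ntddk.h> to the source, and compiling reported "No such file or directory", so I right-clicked the project -> "Properties" -> "Configuration Properties" -> "VC++ Directories" -> "Include Directories" and added "C:\WinDDK\7600.16385.1\inc\ddk". That solved it, but the next build reported "Cannot open include file: 'ntdef.h': No such file or directory", so I added "C:\WinDDK\7600.16385.1\inc\api" the same way and compiled again... damn... the errors are in the screenshot in the original post, and it ended with "error count exceeds 100; stopping compilation".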

Reference:
https://bbs.csdn.net/topics/390197706

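Problem 3:

Well, after that I ran into the same flood of syntax errors as that author.

Still to solve:
I asked 和世 for advice, and he recommended using easysys to generate the project for building the .sys driver; otherwise there are lots of huan3 syntax errors. Next step: try this.

PS: Windows drivers, or rather Windows security coding in general, has plenty of pitfalls along the way, but it is also a good chance to practice problem solving. May every setback make you braver, with a broad, smooth road ahead :)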