FPGA PCIe endpoint prevents host from rebooting
I am working on implementing an FPGA PCIe endpoint to prototype the interface for a project.
The FPGA platform I am using is a Synopsys HAPS DX7 S6 featuring a Xilinx Virtex-7 980T device. I program the FPGA through a Xilinx cable on the JTAG interface. Since I am doing a remote internship, I log in remotely to the Linux host that is connected to both the cable and the FPGA for programming and experiments.
Currently, I start with the Xilinx DMA subsystem for PCIe (XDMA) IP, compile the example design, download the bitstream, and reboot my Linux host to enumerate the device. In detail:
- I instantiate the DMA subsystem for PCIe. (It looks like I am not allowed to embed images before earning 10 reputation, so please click the links.)
My XDMA DMA configuration: https://i.sstatic.net/slth9.png
- I modify the top-level design and the constraint file.
a. debug_clk_p/n clocks my VIO core, which provides the reset signal, because I access the board remotely as mentioned above and cannot operate any switches or buttons by hand. Therefore, sys_rst_n is commented out.
b. debug_clk_p/n are constrained to the onboard clock source. The sys_clk_p and sys_clk_n pair is constrained to package pins D7 and D8 on the PCIe bank.
FPGA platform PCIe bank pinout
- I download the bitstream to the FPGA via JTAG. I can confirm that it loaded successfully because I can control the VIO core.
a. I run the command 'lspci | grep Xilinx' but do not find the device.
b. I run the command 'echo 1 > /sys/bus/pci/rescan' to re-enumerate the PCI bus, but that does not help.
c. The next step is supposed to be rebooting the host so that it enumerates the endpoint and allocates its memory. This is where the issues come up.
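The check in steps a and b can also be scripted against sysfs, which avoids depending on the name database that lspci uses. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the endpoint still advertises the Xilinx vendor ID 0x10EE (it will not match if the vendor ID was changed in the IP configuration).

```python
import os

XILINX_VENDOR_ID = 0x10EE  # PCI vendor ID assigned to Xilinx


def find_pci_devices(vendor_id, sysfs_root="/sys/bus/pci/devices"):
    """Return the sorted PCI addresses whose sysfs 'vendor' file matches vendor_id."""
    matches = []
    if not os.path.isdir(sysfs_root):
        return matches
    for addr in sorted(os.listdir(sysfs_root)):
        vendor_path = os.path.join(sysfs_root, addr, "vendor")
        try:
            with open(vendor_path) as f:
                value = int(f.read().strip(), 16)  # file content looks like "0x10ee"
        except (OSError, ValueError):
            continue  # skip entries without a readable vendor file
        if value == vendor_id:
            matches.append(addr)
    return matches


def rescan_pci_bus(rescan_path="/sys/bus/pci/rescan"):
    """Write "1" to the rescan file, like 'echo 1 > /sys/bus/pci/rescan' (needs root)."""
    with open(rescan_path, "w") as f:
        f.write("1")


if __name__ == "__main__":
    # Read-only listing; call rescan_pci_bus() first (as root) to mirror step b.
    print(find_pci_devices(XILINX_VENDOR_ID))
```

An empty list after a rescan matches what 'lspci | grep Xilinx' reports in step a.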
Issue description:
The Linux host cannot boot up normally; I cannot log in to the host with my credentials as usual. We can confirm that the FPGA PCIe endpoint bitstream is the cause, because the reboot completes when the FPGA is programmed with a bitstream that has nothing to do with PCIe.
My colleague came to the lab and checked in person. He said that the host does boot, but Ethernet is affected, as shown in the figures. eth0 is set up in the normal cases (FPGA blank or programmed with a non-PCIe design), while eth0 fails to come up when the FPGA is programmed with the PCIe design. We deduce that this is why we cannot log in normally.
My colleague manually disconnected the FPGA from the host and rebooted the host. We found that we could log in again, but the IP address had changed. In other words, I now have to ssh to a different address to reach the host.
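One way to narrow down the eth0 behaviour would be to record which parent device each network interface sits on, once with a non-PCIe bitstream and once with the PCIe design, and compare the two. This is a diagnostic sketch only, assuming the usual Linux sysfs layout; the function name is hypothetical, and a difference between the two snapshots would only be a hint (for example that bus numbering shifted), not a confirmed cause.

```python
import os


def net_iface_parent_devices(net_root="/sys/class/net"):
    """Map each network interface to the name of its parent device, or None.

    For a PCI NIC the parent name is its PCI address (e.g. "0000:02:00.0").
    Virtual interfaces such as lo have no 'device' link and map to None.
    """
    mapping = {}
    if not os.path.isdir(net_root):
        return mapping
    for iface in sorted(os.listdir(net_root)):
        iface_dir = os.path.join(net_root, iface)
        if not os.path.isdir(iface_dir):
            continue  # ignore plain files that may live in this directory
        dev_link = os.path.join(iface_dir, "device")
        if os.path.islink(dev_link):
            mapping[iface] = os.path.basename(os.path.realpath(dev_link))
        else:
            mapping[iface] = None
    return mapping
```

Saving the returned dictionary from both boots (from a local console, since the network may be down) makes it easy to see whether eth0 is missing entirely or has moved to another address.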
Important additional points:
I think the connection between the FPGA PCIe interface and the host is okay, because when I first started, the FPGA PCIe endpoint could be detected. However, that bitstream was compiled from the Synopsys example project using Synopsys Protocompiler. Unfortunately, my colleague cannot locate the original bitstream now, and I cannot compile the example project because of tool version issues.
I also tried the example designs of other IP cores, such as AXI for PCIe and the PCIe endpoint, with a similar configuration (mostly defaults) and similar constraints. The result is the same: I cannot log in.
As the host cannot be rebooted and my colleague is still mainly working from home, we cannot troubleshoot this very efficiently. Right now, the JTAG cable is connected to a separate personal laptop so that we can still debug if the original Linux host fails to start.
To be honest, the Ethernet issue is kind of weird to me, since as I understand it this endpoint is just another peripheral, like a USB mouse or keyboard. Even if it has problems, it should not affect other devices much. Some other online posts mention BAR setting problems, but my 1 MB BAR0 should not cause any.
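Once the device does enumerate, the BAR size the host actually assigned can be compared with the configured 1 MB by reading the device's sysfs 'resource' file, where each line holds start, end and flags in hexadecimal and the first six lines correspond to BAR0 through BAR5. A small sketch of that arithmetic, with a made-up function name and made-up sample addresses:

```python
def parse_bar_sizes(resource_text, max_bars=6):
    """Return the BAR sizes in bytes from the text of a sysfs 'resource' file.

    Each line is "start end flags" in hex; an unused BAR has start == end == 0.
    """
    sizes = []
    for line in resource_text.splitlines()[:max_bars]:
        start, end, _flags = (int(field, 16) for field in line.split())
        if start == 0 and end == 0:
            sizes.append(0)  # BAR not implemented or not assigned
        else:
            sizes.append(end - start + 1)  # the range is inclusive
    return sizes


# Illustrative content only; real values come from
# /sys/bus/pci/devices/<address>/resource on the host.
sample = (
    "0x00000000f7d00000 0x00000000f7dfffff 0x0000000000040200\n"
    "0x0000000000000000 0x0000000000000000 0x0000000000000000\n"
)
print(parse_bar_sizes(sample))  # [1048576, 0]
```

A 1 MB BAR0 shows up as 1048576 bytes in the first entry.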
Thanks in advance for any suggestions (on implementation or debugging) or explanations! Please let me know if more details would help, and I will update my post.
Best,
Tao
Comments (1)
When doing PCIe bring-up, you really need a known-good external host for JTAG, and you need to capture the Linux serial output at boot time. A remotely accessible power strip is handy as well, as you can certainly cause interesting lock-ups. If you don't have access to a PCIe analyser, it can be very difficult to pinpoint problems with these cores.
Generally, I would disable the DMA core until you have the basic BAR enumerating and responding properly.
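As a rough illustration of what "responding" could mean here: after the endpoint enumerates, a 32-bit read through the BAR0 sysfs file should return something other than all ones. The sketch below is an assumption-laden example, not part of any Xilinx flow. The function name is made up, it needs root on the real file, memory decoding must already be enabled for the device, and Python's mmap does not guarantee that the read goes out as a single 32-bit access. Given the lock-ups mentioned above, it is safer to try this only with a console and power control at hand.

```python
import mmap
import struct


def read_dword(path, offset=0, length=4096):
    """Map 'path' read-only and return the little-endian 32-bit value at 'offset'."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ) as mm:
            return struct.unpack_from("<I", mm, offset)[0]
```

On the host this would be called as read_dword("/sys/bus/pci/devices/<address>/resource0"); a result of 0xFFFFFFFF usually means the read was not answered by the device.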
Not having physical access to switches doesn't seem like justification to "comment out" sys_rst_n; this needs to be tied properly for the cores to work. A faulty core can certainly bring down point-to-point links, and while you would hope that the PCIe bridges would isolate a single misbehaving device, your results suggest this might not be the case.
I don't follow how your debug clock interacts with the PCIe core, but my advice would be to leave the PCIe core clocking and reset arrangements at the board vendor's defaults until you really know what you are doing and can make and test small, isolated changes.