HtmlUnit page does not contain the 'src' attribute

Posted on 2025-01-30 13:09:56

I'm trying to get the link to the .mp4 on this page: https://www.clippituser.tv/c/xqbnrq

In Chrome devtools I can see it fine:

<video playsinline="playsinline" webkit-playsinline="" class="vjs-tech" id="vjs_video_3_html5_api" tabindex="-1" preload="auto" autoplay="" src="https://clips.clippit.tv/xqbnrq/360.mp4"></video>

My code is:

Page page = null;
try {
  webClient.waitForBackgroundJavaScript(5000);
  page = webClient.getPage(url);
} catch (IOException e) {
  e.printStackTrace();
}

DomNodeList<DomElement> source = ((HtmlPage) page).getElementsByTagName("video");
String videoUrl = source.get(0).getAttribute("src");

The output of source.get(0).asXml() is the same, except that the src attribute holding the .mp4 URL is missing:

<video playsinline="playsinline" webkit-playsinline="" class="vjs-tech" id="vjs_video_3_html5_api" tabindex="-1" preload="auto" autoplay="autoplay"/>

This code works fine for getting videos from other websites, so I'm not sure what I'm doing wrong.

Comments (1)

夏天碎花小短裙 2025-02-06 13:09:56

First, webClient.waitForBackgroundJavaScript(5000); is not an option that you set in advance. You have to call it AFTER retrieving the page.
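The call order can be sketched as follows, assuming HtmlUnit 2.x (the `com.gargoylesoftware.htmlunit` packages; 3.x uses `org.htmlunit` instead). The class name `VideoSrcFetcher`, the method `fetchVideoSrc`, and the choice to ignore script errors are illustrative assumptions, not part of the original code. For the page in the question, this ordering alone may still not produce a `src`, for the reasons given further down.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomElement;
import com.gargoylesoftware.htmlunit.html.DomNodeList;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

public class VideoSrcFetcher {

    /** Returns the src of the first <video> element, or "" if none is found. */
    static String fetchVideoSrc(String url) throws IOException {
        try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            // Assumption: script errors on the page should not abort the load.
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            // 1. Load the page first; this is what starts the page's JavaScript.
            HtmlPage page = webClient.getPage(url);

            // 2. Only now wait (up to 5 seconds) for background JavaScript jobs.
            webClient.waitForBackgroundJavaScript(5000);

            DomNodeList<DomElement> videos = page.getElementsByTagName("video");
            return videos.isEmpty() ? "" : videos.get(0).getAttribute("src");
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetchVideoSrc("https://www.clippituser.tv/c/xqbnrq"));
    }
}
```

Calling `waitForBackgroundJavaScript` before `getPage` returns immediately, because no page has been loaded yet and there are no JavaScript jobs to wait for.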

As of HtmlUnit 2.61.0 there is a bug in the XMLHttpRequest handling that leads to an ArrayIndexOutOfBoundsException. This is now fixed, and a new snapshot version will be available soon.

But after the fix the page still reports

VIDEOJS: "ERROR:" "(CODE:4 MEDIA_ERR_SRC_NOT_SUPPORTED)"
No compatible source was found for this media.
{"code":4,"message":"No compatible source was found for this media."

It looks like some JS code checks the 'browser' to figure out whether the video is playable before adding the source. The JS code for the page is complex, though, so it is not easy to figure out which check fails.

If you would like to get this fixed as well, please open an issue for HtmlUnit on GitHub and try to isolate the problem (https://htmlunit.sourceforge.io/submittingJSBugs.html).
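Until the failing check is identified, one possible workaround is to avoid rendering the page at all. In the single example from the question, the clip id at the end of the page URL (`xqbnrq`) reappears in the video URL (`https://clips.clippit.tv/xqbnrq/360.mp4`). The sketch below assumes that this pattern, including the `360.mp4` file name, holds for other clips, which nothing in this thread confirms. The class `ClipUrlGuess` and the method `guessMp4Url` are hypothetical names.

```java
import java.net.URI;

public class ClipUrlGuess {

    /**
     * Derives a direct .mp4 URL from a clip page URL.
     * ASSUMPTION: the last path segment is the clip id and the video is served
     * from https://clips.clippit.tv/<id>/360.mp4, as in the question's one example.
     */
    static String guessMp4Url(String pageUrl) {
        String path = URI.create(pageUrl).getPath();
        if (path == null) {
            throw new IllegalArgumentException("No path in: " + pageUrl);
        }
        // Take everything after the last '/' as the clip id.
        String id = path.substring(path.lastIndexOf('/') + 1);
        if (id.isEmpty()) {
            throw new IllegalArgumentException("No clip id in: " + pageUrl);
        }
        return "https://clips.clippit.tv/" + id + "/360.mp4";
    }

    public static void main(String[] args) {
        // prints https://clips.clippit.tv/xqbnrq/360.mp4
        System.out.println(guessMp4Url("https://www.clippituser.tv/c/xqbnrq"));
    }
}
```

A guessed URL should be verified (for example with an HTTP HEAD request) before relying on it, since the site may use other resolutions or paths.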
