wget not fully processing an HTTP call

Posted on 2024-10-12 12:50:35

Here is a wget command that runs a report from an HTML / PHP report suite hosted by a third party. We have no control over the PHP or HTML pages.

wget --no-check-certificate --http-user=/myacc --http-password=mypass -O /tmp/myoutput.csv "https://myserver.mydomain.com/mymodule.php?myrepcode=9999&action=exportcsv&admin=myappuserid&password=myappuserpass&startdate=2011-01-16&enddate=2011-01-16&reportby=mypreferredview"

All the elements are working perfectly:

--http-user / --http-password - the credentials a browser would request in its standard username and password popup
-O /tmp/myoutput.csv - the output file of interest

https://myserver.mydomain.com/mymodule.php?myrepcode=9999&action=exportcsv&admin=myappuserid&password=myappuserpass&startdate=2011-01-16&enddate=2011-01-16&reportby=mypreferredview

The file is generated on the fly from these parameters:

myrepcode=9999 - a reference to the report in question
action=exportcsv - written internally in the function
admin=myappuserid - the third party operates SSL to access the site, then an internal username and password stored in a database to access the functions of the site
password=myappuserpass
startdate=2011-01-16 - this and the end date are parameters specific to report 9999
enddate=2011-01-16
reportby=mypreferredview - an option in the report that selects different levels of detail or aggregation

The problem is that the reportby parameter is a radio button selection from a list of 5 options (sure enough, the default is the highest level of aggregation; I want the last one, which is the most detailed).

Here is a sample of the HTML page code for the options of reportby

The tags in the HTML are not whitelisted - so I will send the sample if requested

<td>View by</td>
<td>
   <input class="naf-radio" name="reportby" id="reportby[thedefault]" value="thedefault" type="radio">The Default                    
   <input class="naf-radio" name="reportby" id="reportby[myleastpreferred]" value="myleastpreferred" type="radio">My Least Preferred
   <input class="naf-radio" name="reportby" id="reportby[mysecondleastpreferred]" value="mysecondleastpreferred" type="radio">My Second Least Preferred
   <input class="naf-radio" name="reportby" id="reportby[mythirdleastpreferred]" value="mythirdleastpreferred" type="radio">My Third Least Preferred
   <input class="naf-radio" name="reportby" id="reportby[mypreferred]" value="mypreferred" type="radio">My Preferred
</td>

No matter which of the reportby items I select in the wget statement, the default is always executed.

Questions

1) Has anyone come across this notation in HTML (id=inputname[inputelement])?
I spoke to a senior web developer and he has never seen this notation for inputs, and after an extensive search w3schools does not appear to cover it either.

2) Can a wget command select a non-default radio item when executing the command?

This will probably be met with a "Use CURL" response at first. However, the wget approach works very well in the limited environment I am operating in, particularly as I need to download 10000 of these items.

Thanks in advance for any response.

Comments (1)

—━☆沉默づ 2024-10-19 12:50:35

A radio button is just another form element and can usually be passed through the query string. Some applications demand that parameters be passed as POST data, but this isn't that common in my experience.

What you'll need to do is find the name of the radio button and the value on the desired option. You then just add &name=value to your current URL and it should act like selecting that radio button.
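As a rough illustration of the paragraph above, the query string can be assembled in shell variables, with the radio button's name and the wanted option's value appended at the end. This is only a sketch: the host, credentials and report parameters are the placeholders from the question, and `mypreferred` is taken from the `value` attribute in the posted HTML. The question's command sent `mypreferredview`, which is not among the listed values, so it is worth checking which string the live form really uses.

```shell
# Placeholder values copied from the question; replace with the real ones.
base_url='https://myserver.mydomain.com/mymodule.php'
query='myrepcode=9999&action=exportcsv&admin=myappuserid&password=myappuserpass'
query="${query}&startdate=2011-01-16&enddate=2011-01-16"

# Radio button: name="reportby", wanted option has value="mypreferred".
radio_name='reportby'
radio_value='mypreferred'
url="${base_url}?${query}&${radio_name}=${radio_value}"

# Print the command for inspection; remove the leading "echo" to run it.
# Keep "$url" quoted, otherwise the shell treats each & as a command separator.
echo wget --no-check-certificate --http-user=/myacc --http-password=mypass \
     -O /tmp/myoutput.csv "$url"
```

Keeping the name and value in their own variables makes it easy to loop over several report options or dates when many files have to be fetched.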

The notation id=inputname[inputelement] is only the element's id attribute, and I would assume it comes from whatever code generated the HTML (bracketed names are a common convention in PHP form code). The id is never sent when the form is submitted. What the server receives is a name=value pair, the same kind of declaration you see in URLs, so here it is reportby=<value of the chosen option> and the brackets should not affect your request.

Also make sure you URL-encode any values you put in the URL so that they don't contain any illegal characters (e.g. an & or = will confuse it completely).
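wget does not encode query values for you, so the encoding has to happen before the URL is built. The function below is a minimal sketch of percent-encoding in plain POSIX shell. The name `urlencode` is my own, it is only meant for ASCII input, and the sample value is made up for illustration.

```shell
# Percent-encode every character except the RFC 3986 unreserved set.
# The body runs in a subshell so its variables and LC_ALL do not leak out.
urlencode() (
    LC_ALL=C
    string=$1
    encoded=''
    while [ -n "$string" ]; do
        rest=${string#?}            # everything after the first character
        char=${string%"$rest"}      # the first character itself
        case $char in
            [a-zA-Z0-9.~_-]) encoded="${encoded}${char}" ;;
            *) encoded="${encoded}$(printf '%%%02X' "'$char")" ;;
        esac
        string=$rest
    done
    printf '%s\n' "$encoded"
)

# Hypothetical value containing characters that would break a query string.
echo "reportby=$(urlencode 'my preferred&more')"
# prints: reportby=my%20preferred%26more
```

Values such as 2011-01-16 or mypreferred pass through unchanged, so encoding only matters once a value contains spaces, &, =, or similar characters.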

If the query string method doesn't work, wget has a --post-data switch that lets you specify data to be posted, which is what a form would do. If you use --post-data=reportby=mypreferred, I hope you will have more success.
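A sketch of the POST variant follows. Which fields the real form posts is not known from the question, so the split below (report identifiers in the URL, dates and reportby in the body) is an assumption; copy whatever the browser actually sends. `join_params` and `fetch_report` are hypothetical helper names.

```shell
# Join name=value pairs with "&" to form an application/x-www-form-urlencoded body.
join_params() (
    IFS='&'
    printf '%s\n' "$*"
)

url='https://myserver.mydomain.com/mymodule.php?myrepcode=9999&action=exportcsv&admin=myappuserid&password=myappuserpass'
post_data=$(join_params 'startdate=2011-01-16' 'enddate=2011-01-16' 'reportby=mypreferred')

# --post-data makes wget send a POST request with the given string as the body.
fetch_report() {
    wget --no-check-certificate --http-user=/myacc --http-password=mypass \
         --post-data="$post_data" -O /tmp/myoutput.csv "$url"
}

echo "POST body: $post_data"
# fetch_report    # uncomment to send the request
```

If the server reads every parameter from the POST body, the remaining pairs can be moved from the URL into the `join_params` call.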

If this still fails, I would use a tool to view your wget request as well as your request through a browser, and compare the headers and data to see what differs. One such tool is Fiddler (http://www.fiddler2.com/fiddler2/), though I'm sure many others exist.
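When a proxy tool is not available in a restricted environment, wget itself can log what it sends. The sketch below assumes a wget build with debug support compiled in, in which case --debug writes the outgoing request between "---request begin---" and "---request end---" markers. The function names and the log path are my own.

```shell
# Run the request with debug logging; $1 = URL, $2 = log file.
# The log may contain the Authorization header, so treat it as sensitive.
capture_wget_request() {
    wget --debug --server-response --no-check-certificate \
         --http-user=/myacc --http-password=mypass \
         -O /dev/null -o "$2" "$1"
}

# Print only the request section of a wget debug log; $1 = log file.
show_request() {
    sed -n '/---request begin---/,/---request end---/p' "$1"
}

# Example (not run here):
#   capture_wget_request "$url" /tmp/wget-debug.log
#   show_request /tmp/wget-debug.log
```

The printed request line, headers and any POST body can then be compared by eye with what the browser's developer tools or Fiddler show for the same report.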
