How do I set QNetworkReply properties to get the correct NCBI page?

Posted on 2024-08-27 11:28:01

I am trying to fetch the following URL using the downloadURL function shown below:

http://www.ncbi.nlm.nih.gov/nuccore/27884304

But the data returned is not what we see in a browser. I now know this is because the server needs some additional information (such as the browser type). How can I find out which information I need to set, and how do I set it? (With the setHeader function, or some other way?)

In VC++, we can use CInternetSession and CHttpConnection objects to get the correct data without setting any further details. Is there a similar way in Qt or another cross-platform C++ network library? (Yes, I need it to be cross-platform.)

QNetworkReply::NetworkError downloadURL(const QUrl &url, QByteArray &data) {
    QNetworkAccessManager manager;
    QNetworkRequest request(url);
    request.setHeader(QNetworkRequest::ContentTypeHeader, "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 (.NET CLR 3.5.30729)");
    QNetworkReply *reply = manager.get(request);

    QEventLoop loop;
    QObject::connect(reply, SIGNAL(finished()), &loop, SLOT(quit()));
    loop.exec();


    QVariant statusCodeV = reply->attribute(QNetworkRequest::RedirectionTargetAttribute);
    QUrl redirectTo = statusCodeV.toUrl();

    if (!redirectTo.isEmpty())
    {
        if (redirectTo.host().isEmpty())
        {
            const QByteArray newaddr = ("http://"+url.host()+redirectTo.encodedPath()).toAscii();
            redirectTo.setEncodedUrl(newaddr);
            redirectTo.setHost(url.host());
        }
        return (downloadURL(redirectTo, data));
    }

    if (reply->error() != QNetworkReply::NoError)
    {
        return reply->error();
    }
    data = reply->readAll();
    delete reply;
    return QNetworkReply::NoError;
}

With VC++, we can just do this, and the correct data ends up in the CHttpFile.

CString downloadURL (CString sGetFromURL)
{
    // create an internet session 
    CInternetSession csiSession;

    int pos;
    BOOL neof;

    // parse URL to get server/object/port 

    DWORD dwServiceType;
    CString sServerName;
    CString sObject;
    INTERNET_PORT nPort;
    CHttpConnection* pHTTPServer = NULL; 
    CHttpFile* pFile = NULL;


    AfxParseURL(sGetFromURL, dwServiceType, sServerName, sObject, nPort);

    // open HTTP connection
    pHTTPServer = csiSession.GetHttpConnection(sServerName, nPort);

    // get HTTP object
    pFile = pHTTPServer->OpenRequest(CHttpConnection::HTTP_VERB_GET, sObject, NULL, 1, NULL, NULL, INTERNET_FLAG_RELOAD);

    pFile->SendRequest();

}

Comments (2)

酸甜透明夹心 2024-09-03 11:28:01

You set the wrong header: Content-Type. The value you provided belongs in the User-Agent header.

靑春怀旧 2024-09-03 11:28:01

Close, but you aren't setting the correct header. You need to do:

request.setRawHeader("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 (.NET CLR 3.5.30729)" );
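
To make the difference between the two headers concrete, here is a minimal sketch in plain C++ (no Qt) of what a GET request looks like on the wire. buildGetRequest is a hypothetical helper written only for illustration; it is not part of Qt, and Qt assembles the real request itself and may add further headers of its own. The point is only that each header is a separate "Name: value" line, so a browser string sent under Content-Type is never read by the server as a User-Agent.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a Qt API): formats a minimal HTTP/1.1 GET request.
// Every header is its own "Name: value" line, so the field name decides how
// the server interprets the value.
std::string buildGetRequest(
    const std::string &host,
    const std::string &path,
    const std::vector<std::pair<std::string, std::string>> &headers)
{
    std::string request = "GET " + path + " HTTP/1.1\r\n";
    request += "Host: " + host + "\r\n";
    for (const auto &header : headers) {
        request += header.first + ": " + header.second + "\r\n";
    }
    request += "\r\n";  // an empty line ends the header section
    return request;
}
```

Calling buildGetRequest("www.ncbi.nlm.nih.gov", "/nuccore/27884304", {{"User-Agent", "Mozilla/5.0"}}) produces a request whose third line is "User-Agent: Mozilla/5.0". That is the kind of line setRawHeader("User-Agent", ...) is meant to add, whereas the question's code sent the same string under Content-Type.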