C#: How do I check whether a URL exists / is valid?

Posted on 2024-07-21 22:23:54

I am making a simple program in Visual C# 2005 that looks up a stock symbol on Yahoo! Finance, downloads the historical data, and then plots the price history for the specified ticker symbol.

I know the exact URL that I need to acquire the data, and if the user inputs an existing ticker symbol (or at least one with data on Yahoo! Finance) it works perfectly fine. However, I have a run-time error if the user makes up a ticker symbol, as the program tries to pull data from a non-existent web page.

I am using the WebClient class, and using the DownloadString function. I looked through all the other member functions of the WebClient class, but didn't see anything I could use to test a URL.

How can I do this?

Comments (14)

阳光下慵懒的猫 2024-07-28 22:23:54

Here is another implementation of this solution:

using System.Net;

/// <summary>
/// Checks whether the remote file exists.
/// </summary>
/// <param name="url">The URL of the remote file.</param>
/// <returns>True if the file exists, false otherwise.</returns>
private bool RemoteFileExists(string url)
{
    try
    {
        // Create the HttpWebRequest.
        HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
        // Use HEAD so that no body is downloaded; GET would also work.
        request.Method = "HEAD";
        // Get the response and dispose it once the status has been read.
        using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
        {
            // True only if the status code is 200.
            return response.StatusCode == HttpStatusCode.OK;
        }
    }
    catch
    {
        // Any exception (including 4xx/5xx responses) yields false.
        return false;
    }
}

From: http://www.dotnetthoughts.net/2009/10/14/how-to-check-remote-file-exists-using-c/

不寐倦长更 2024-07-28 22:23:54

You could issue a "HEAD" request rather than a "GET".
That lets you test a URL without the cost of downloading the content:

// using MyClient from linked post
using(var client = new MyClient()) {
    client.HeadOnly = true;
    // fine, no content downloaded
    string s1 = client.DownloadString("http://google.com");
    // throws 404
    string s2 = client.DownloadString("http://google.com/silly");
}

Wrap the DownloadString call in a try/catch to check for errors; if no exception is thrown, the URL exists.


With C# 2.0 (VS2005):

private bool headOnly;
public bool HeadOnly {
    get {return headOnly;}
    set {headOnly = value;}
}

and

using(WebClient client = new MyClient())
{
    // code as before
}
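
The MyClient class itself lives in the linked post and is not reproduced on this page. As a rough sketch only, a WebClient subclass of that shape might look like the following. The class name and the HeadOnly property come from the snippets above; the GetWebRequest override is an assumption about how the verb gets switched, not the original author's code. It uses an auto-property, so on C# 2.0 the backing-field version of HeadOnly shown above would be needed instead.

```csharp
using System;
using System.Net;

// Hypothetical reconstruction of the MyClient helper. Only the class name
// and the HeadOnly property appear in the answer; the rest is assumed.
class MyClient : WebClient
{
    public bool HeadOnly { get; set; }

    // WebClient calls this for every request it builds, which makes it a
    // convenient place to change the HTTP verb.
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        // Switch to HEAD only when asked to, and only for a plain GET.
        if (HeadOnly && request.Method == "GET")
        {
            request.Method = "HEAD";
        }
        return request;
    }
}
```

With HeadOnly set, DownloadString returns an empty string for a URL that exists, because a HEAD response carries no body, and still throws a WebException for a 404.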
人│生佛魔见 2024-07-28 22:23:54

These solutions are pretty good, but they forget that there may be status codes other than 200 OK. This is a solution that I've used in production environments for status monitoring and such.

If there is a URL redirect or some other condition on the target page, this method returns true. Also, for 4xx and 5xx responses GetResponse() throws a WebException, so you never get to read a StatusCode from the returned response. You need to catch the exception and check for a ProtocolError.

Any 400- or 500-level status code will return false. All others return true.
This code is easily modified to suit your needs for specific status codes.

/// <summary>
/// This method will check a url to see that it does not return server or protocol errors
/// </summary>
/// <param name="url">The path to check</param>
/// <returns></returns>
public bool UrlIsValid(string url)
{
    try
    {
        HttpWebRequest request = HttpWebRequest.Create(url) as HttpWebRequest;
        request.Timeout = 5000; //set the timeout to 5 seconds to keep the user from waiting too long for the page to load
        request.Method = "HEAD"; //Get only the header information -- no need to download any content

        using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
        {
            int statusCode = (int)response.StatusCode;
            if (statusCode >= 100 && statusCode < 400) //Good requests
            {
                return true;
            }
            else if (statusCode >= 500 && statusCode <= 510) //Server Errors
            {
                //log.Warn(String.Format("The remote server has thrown an internal error. Url is not valid: {0}", url));
                Debug.WriteLine(String.Format("The remote server has thrown an internal error. Url is not valid: {0}", url));
                return false;
            }
        }
    }
    catch (WebException ex)
    {
        if (ex.Status == WebExceptionStatus.ProtocolError) //400 errors
        {
            return false;
        }
        else
        {
            // "log" is a logger field that is not shown in this snippet; substitute your own logging call.
            log.Warn(String.Format("Unhandled status [{0}] returned for url: {1}", ex.Status, url), ex);
        }
    }
    catch (Exception ex)
    {
        log.Error(String.Format("Could not test url {0}.", url), ex);
    }
    return false;
}
贵在坚持 2024-07-28 22:23:54

A lot of the answers predate HttpClient (it was introduced with .NET Framework 4.5) or don't use async/await, so I decided to post my own solution:

private static async Task<bool> DoesUrlExists(String url)
{
    try
    {
        using (HttpClient client = new HttpClient())
        {
            //Send only a HEAD request to avoid downloading the full file
            var response = await client.SendAsync(new HttpRequestMessage(HttpMethod.Head, url));

            if (response.IsSuccessStatusCode) {
                //The URL is available if we get a success status code
                return true;
            }
            return false;
        }                
    } catch {
            return false;
    }
}

I use HttpClient.SendAsync with HttpMethod.Head to make only a HEAD request and not download the whole file. As David and Marc already said, 200 is not the only HTTP status that means OK, so I use IsSuccessStatusCode to allow all success status codes.

嗼ふ静 2024-07-28 22:23:54

If I understand your question correctly, you could use a small method like this to give you the results of your URL test:

WebRequest webRequest = WebRequest.Create(url);  
WebResponse webResponse;
try 
{
  webResponse = webRequest.GetResponse();
}
catch //If exception thrown then couldn't get response from address
{
  return 0;
} 
return 1;

You could wrap the above code in a method and use it to perform validation. I hope this answers the question you were asking.

尬尬 2024-07-28 22:23:54

I have always found that exceptions are much slower to handle.

Perhaps a less intensive way would yield a better, faster result?

public bool IsValidUri(Uri uri)
{
    using (HttpClient Client = new HttpClient())
    {
        HttpResponseMessage result = Client.GetAsync(uri).Result;
        HttpStatusCode StatusCode = result.StatusCode;

        switch (StatusCode)
        {
            case HttpStatusCode.Accepted:
                return true;
            case HttpStatusCode.OK:
                return true;
            default:
                return false;
        }
    }
}

Then just use:

IsValidUri(new Uri("http://www.google.com/censorship_algorithm"));
硪扪都還晓 2024-07-28 22:23:54

Try this (make sure you have a using directive for System.Net):

public bool checkWebsite(string URL) {
   try {
      // Dispose the WebClient once the download attempt has finished
      using (WebClient wc = new WebClient()) {
         string HTMLSource = wc.DownloadString(URL);
      }
      return true;
   }
   catch (Exception) {
      return false;
   }
}

When the checkWebsite() function gets called, it tries to get the source code of
the URL passed into it. If it gets the source code, it returns true. If not,
it returns false.

Code Example:

//The checkWebsite command will return true:
bool websiteExists = this.checkWebsite("https://www.google.com");

//The checkWebsite command will return false:
bool fakePageExists = this.checkWebsite("https://www.thisisnotarealwebsite.com/fakepage.html");
梨涡少年 2024-07-28 22:23:54
WebRequest request = WebRequest.Create("http://www.google.com");
try
{
     request.GetResponse();
}
catch //If exception thrown then couldn't get response from address
{
     MessageBox.Show("The URL is incorrect");
}
欢你一世 2024-07-28 22:23:54

This solution seems easy to follow:

public static bool isValidURL(string url) {
    WebRequest webRequest = WebRequest.Create(url);
    WebResponse webResponse;
    try
    {
        webResponse = webRequest.GetResponse();
    }
    catch //If an exception is thrown, no response could be obtained from the address
    {
        return false;
    }
    return true;
}
浮世清欢 2024-07-28 22:23:54

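Here is another option. Note that it only checks that the host name in the URL resolves in DNS, not that the page itself exists:

public static bool UrlIsValid(string url)
{
    try {
        // Dns.Resolve is obsolete and expects a host name, not a full URL,
        // so extract the host and use Dns.GetHostEntry instead.
        string host = new Uri(url).Host;
        Dns.GetHostEntry(host);
        return true;
    }
    catch (UriFormatException) {
        return false;
    }
    catch (SocketException) {
        return false;
    }
}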
铁轨上的流浪者 2024-07-28 22:23:54

A lot of the other answers use WebRequest, which is now obsolete.

Here is a method that has minimal code and uses up-to-date classes and methods.

I have also tested the other most up-voted functions, which can produce false positives.
I tested with these URLs, which point to the Visual Studio Community installer, found on this page.

//Valid URL
https://aka.ms/vs/17/release/vs_community.exe

//Invalid URL, redirects. Produces a false positive with other methods.
https://aka.ms/vs/14/release/vs_community.exe

using System.Net;
using System.Net.Http;

//HttpClient is not meant to be created and disposed frequently.
//Declare it statically in the class so that it can be reused.
static HttpClient client = new HttpClient();

/// <summary>
/// Checks if a remote file at the <paramref name="url"/> exists, and if access is not restricted.
/// </summary>
/// <param name="url">URL to a remote file.</param>
/// <returns>True if the file at the <paramref name="url"/> is able to be downloaded, false if the file does not exist, or if the file is restricted.</returns>
public static bool IsRemoteFileAvailable(string url)
{
    //Reject strings that cannot be parsed as an absolute URI
    //(new Uri(url) would throw here, outside the try block)
    if (!Uri.TryCreate(url, UriKind.Absolute, out Uri uri))
        return false;

    try
    {
        using (HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Head, uri))
        using (HttpResponseMessage response = client.Send(request))
        {
            return response.IsSuccessStatusCode  && response.Content.Headers.ContentLength > 0;
        }
    }
    catch
    {
        return false;
    }
}

Note that this will not work on .NET Framework, because HttpClient.Send does not exist there (the synchronous Send was added in .NET 5).
To get it working on .NET Framework, change client.Send(request) to client.SendAsync(request).Result.
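
If blocking on .Result is undesirable, the same check can be written with async/await, which also compiles on .NET Framework 4.5 and later. The following is a sketch under assumptions rather than the author's code: the class name RemoteFileCheck and the method name IsRemoteFileAvailableAsync are made up for illustration, and the success criteria are copied from the method above.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

static class RemoteFileCheck
{
    // One shared instance, as the answer above recommends.
    private static readonly HttpClient client = new HttpClient();

    // Hypothetical async counterpart of IsRemoteFileAvailable.
    public static async Task<bool> IsRemoteFileAvailableAsync(string url)
    {
        Uri uri;
        // Only absolute http/https URLs can be sent through HttpClient.
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri)
            || (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
        {
            return false;
        }

        try
        {
            using (var request = new HttpRequestMessage(HttpMethod.Head, uri))
            using (HttpResponseMessage response = await client.SendAsync(request))
            {
                // Same criteria as above: a success status and a non-empty body length.
                return response.IsSuccessStatusCode
                    && response.Content.Headers.ContentLength > 0;
            }
        }
        catch (HttpRequestException)
        {
            return false; // DNS failure, refused connection, TLS error, ...
        }
        catch (TaskCanceledException)
        {
            return false; // the request timed out
        }
    }
}
```

A caller would use it as `bool ok = await RemoteFileCheck.IsRemoteFileAvailableAsync(url);` from another async method.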

沐歌 2024-07-28 22:23:54

Web servers respond with an HTTP status code indicating the outcome of the request, e.g. 200 (sometimes 202) means success, 404 means not found, etc. (see here). Assuming the server address part of the URL is correct and you are not getting a socket timeout, the exception is most likely telling you that the HTTP status code was something other than 200. I would suggest checking the class of the exception and seeing whether the exception carries the HTTP status code.

IIRC, the call in question throws a WebException or a descendant. Check the class name to see which one, and wrap the call in a try block to trap the condition.
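
To illustrate the suggestion above: when the failure is an HTTP error, the WebException exposes the server's reply through its Response property, so the status code can be read from there. The sketch below assumes the WebRequest-era API that matches the question; the class name UrlProbe and the method name GetStatusCode are hypothetical, and the 5-second timeout is an arbitrary choice.

```csharp
using System;
using System.Net;

static class UrlProbe
{
    // Returns the HTTP status code for a HEAD request, or null when no HTTP
    // response was received at all (malformed URL, DNS failure, timeout, ...).
    public static int? GetStatusCode(string url)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri)
            || (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
        {
            return null;
        }

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.Method = "HEAD";
        request.Timeout = 5000; // milliseconds; an assumed value

        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                return (int)response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // For 4xx/5xx replies the response travels on the exception.
            using (HttpWebResponse errorResponse = ex.Response as HttpWebResponse)
            {
                if (errorResponse == null)
                {
                    return null; // no HTTP reply, e.g. name resolution failed
                }
                return (int)errorResponse.StatusCode;
            }
        }
    }
}
```

The caller can then decide what counts as "exists", for example treating 404 as a made-up ticker symbol and null as a connectivity problem.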

沐歌 2024-07-28 22:23:54

I have a simpler way to determine whether a URL is valid (note that this only checks the syntax of the string, not whether the resource exists):

if (Uri.IsWellFormedUriString(uriString, UriKind.RelativeOrAbsolute))
{
   //...
}
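
A small sketch of what this check does and does not tell you. The helper name LooksLikeAbsoluteUrl is made up for illustration, and it uses UriKind.Absolute rather than RelativeOrAbsolute, which is a deliberate change: with RelativeOrAbsolute most plain strings pass as relative URIs.

```csharp
using System;

static class UriSyntaxDemo
{
    // Hypothetical helper: true when the string is a well-formed absolute URI.
    // No network request is made, so this says nothing about existence.
    public static bool LooksLikeAbsoluteUrl(string candidate)
    {
        return Uri.IsWellFormedUriString(candidate, UriKind.Absolute);
    }
}
```

For example, `LooksLikeAbsoluteUrl("http://example.invalid/missing")` is true even though that host can never resolve, while a string with an unescaped space such as `"http://www.contoso.com/path???/file name"` is rejected. A syntax check like this is useful as a cheap first step before one of the HEAD-request approaches above.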
浪漫人生路 2024-07-28 22:23:54

Following on from the examples already given, I'd say it's best practice to also wrap the response in a using block, like this:

    public bool IsValidUrl(string url)
    {
        try
        {
            var request = WebRequest.Create(url);
            request.Timeout = 5000;
            request.Method = "HEAD";

            // The using block disposes the response, so no explicit Close() is needed
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode == HttpStatusCode.OK;
            }
        }
        catch (Exception)
        {
            return false;
        }
    }