Downloading a website using sockets

Posted on 2024-12-09 05:12:23

I'm trying to download the source of a site using sockets. Currently I can download the headers, but after that I just terminate the connection because I don't know how long I should keep receiving data. This is the code:

    private void HandleConnect(SocketAsyncEventArgs e)
    {
        if (e.ConnectSocket != null)
        {
            // simply start sending
            bool completesAsynchronously = e.ConnectSocket.SendAsync(e);

            // check if the completed event will be raised.
            // if not, invoke the handler manually.
            if (!completesAsynchronously)
            {
                SocketAsyncEventArgs_Completed(e.ConnectSocket, e);
            }
        }
    }

    private void HandleReceive(SocketAsyncEventArgs e)
    {
        string responseL = Encoding.UTF8.GetString(e.Buffer, 0, e.Buffer.Length);
        response += responseL;
        temp += responseL;

        string[] lines = Regex.Split(response, "\r\n\r\n");
        if (lines.Length > 1 && header == "")
        {
            header = lines[0].ToString() + "\r\n";
            lines[0] = "";
            response = lines.ToString();
        }
        if (header == "")
        {
            bool completesAsynchronously = e.ConnectSocket.ReceiveAsync(e);
        }
        else
        {
            System.Windows.Deployment.Current.Dispatcher.BeginInvoke(delegate()
            {
                _callback(false, this);
            });
        }
    }

I tried searching for \r\n, but it didn't help :/

Please help!

Thank you in advance :)


Comments (2)

蹲墙角沉默 2024-12-16 05:12:23

I use this code to send headers to the site and then read its content. I hope you find it useful.

        ReadStateObject stateObject; //Info below
        mytcpclient = new TcpClient();
        mytcpclient.Connect(host, port);
        mysocket = mytcpclient.Client;
        SendHeader(mysocket);//Info below
        ns = mytcpclient.GetStream();
        if (ns.CanRead)
        {
            stateObject = new ReadStateObject(ns, 1024);
            ns.BeginRead(stateObject.ReadBuffer, 0, stateObject.ReadBuffer.Length, new AsyncCallback(ReadCallBack), stateObject);
        }

ReadStateObject is a small class that carries the AsyncState object passed to BeginRead:

class ReadStateObject
{
    public NetworkStream Stream {get; set;}
    public byte[] ReadBuffer;

    public ReadStateObject(NetworkStream _stream, int bufferSize)
    {
        Stream = _stream;
        ReadBuffer = new byte[bufferSize];
    }
}

And this is the callback method passed to BeginRead:

    private void ReadCallBack(IAsyncResult result)
    {
        ReadStateObject stateObject = (ReadStateObject)result.AsyncState;
        NetworkStream myNetworkStream = stateObject.Stream;
        int numberofbytesread = 0;
        StringBuilder sb = new StringBuilder();
        numberofbytesread = myNetworkStream.EndRead(result);
        sb.Append(Encoding.ASCII.GetString(stateObject.ReadBuffer, 0, numberofbytesread));

        /*It seems, if there is no delay, the DataAvailable may not be true even when there are still data to be received from the site, so I added this delay. Any suggestions, how to avoid this are welcome*/

        Thread.Sleep(500);

            while (myNetworkStream.DataAvailable)
            {
                byte[] mydata = new byte[1024];
                numberofbytesread = myNetworkStream.Read(mydata, 0, mydata.Length);
                sb.Append(Encoding.ASCII.GetString(mydata, 0, numberofbytesread)); 

            }

        Console.WriteLine(sb.ToString());
        mytcpclient.Close();
    }
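
The delay is needed because DataAvailable only reports bytes that have already arrived, not whether the server has finished sending. One way to avoid it, sketched here under the assumption that the server sends a Content-Length header (chunked responses would need extra handling), is to accumulate everything received so far and check after each read whether the whole response is in. HttpFraming and its method names are hypothetical, not part of the code above.

```csharp
using System;
using System.Text;

static class HttpFraming
{
    // Index of the first body byte, or -1 if the blank line that ends
    // the headers ("\r\n\r\n") has not arrived yet.
    public static int FindBodyStart(byte[] data, int count)
    {
        for (int i = 0; i + 3 < count; i++)
        {
            if (data[i] == '\r' && data[i + 1] == '\n' &&
                data[i + 2] == '\r' && data[i + 3] == '\n')
            {
                return i + 4;
            }
        }
        return -1;
    }

    // Value of the Content-Length header, or -1 if it is missing or unparseable.
    public static long GetContentLength(string headers)
    {
        foreach (string line in headers.Split(new[] { "\r\n" }, StringSplitOptions.None))
        {
            int colon = line.IndexOf(':');
            if (colon <= 0) continue;
            string name = line.Substring(0, colon).Trim();
            if (name.Equals("Content-Length", StringComparison.OrdinalIgnoreCase) &&
                long.TryParse(line.Substring(colon + 1).Trim(), out long value))
            {
                return value;
            }
        }
        return -1;
    }

    // True once the headers plus Content-Length body bytes are all in data[0..count).
    public static bool IsComplete(byte[] data, int count)
    {
        int bodyStart = FindBodyStart(data, count);
        if (bodyStart < 0) return false;           // headers still incomplete
        string headers = Encoding.ASCII.GetString(data, 0, bodyStart);
        long length = GetContentLength(headers);
        if (length < 0) return false;              // no length: fall back to reading until close
        return count - bodyStart >= length;
    }
}
```

In the callback you would append each chunk to a growing buffer and issue another BeginRead until IsComplete returns true or EndRead returns 0, which signals that the server closed the connection.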

And this is where the headers are sent to the site:

    public void SendHeader(Socket mySocket)
    {
        String sBuffer = "";
        sBuffer = sBuffer + "GET /"+pathquery+" HTTP/1.1" + "\r\n";
        sBuffer = sBuffer + "Host: "+ hostname + "\r\n";
        sBuffer = sBuffer + "Content-Type: text/html\r\n";
        sBuffer = sBuffer + "\r\n";
        // Encode once and reuse the buffer instead of re-encoding for each argument.
        Byte[] bSendData = Encoding.ASCII.GetBytes(sBuffer);
        mySocket.Send(bSendData, bSendData.Length, SocketFlags.None);
    }
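
A variation on the request above, as a minimal sketch: adding a `Connection: close` header asks the server to close the socket after the response, so the reader can simply read until a read returns 0 bytes. HttpRequestText and BuildGetRequest are hypothetical names. The Content-Type header is left out here because a GET request carries no body.

```csharp
using System;
using System.Text;

static class HttpRequestText
{
    // Builds the raw bytes of a GET request that asks the server to close
    // the connection once the response has been sent.
    public static byte[] BuildGetRequest(string hostname, string pathAndQuery)
    {
        // The request target must start with a slash.
        if (!pathAndQuery.StartsWith("/", StringComparison.Ordinal))
        {
            pathAndQuery = "/" + pathAndQuery;
        }

        var sb = new StringBuilder();
        sb.Append("GET ").Append(pathAndQuery).Append(" HTTP/1.1\r\n");
        sb.Append("Host: ").Append(hostname).Append("\r\n");
        sb.Append("Connection: close\r\n");
        sb.Append("\r\n"); // blank line ends the header block
        return Encoding.ASCII.GetBytes(sb.ToString());
    }
}
```

An HTTP/1.1 server may still use chunked transfer encoding for the body, in which case the chunk sizes appear in the received bytes and have to be decoded separately.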

陪你到最终 2024-12-16 05:12:23

Maybe you should use WebClient or HttpWebRequest instead of sockets.
Using sockets and interpreting the HTTP protocol yourself can be painful.
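
For comparison, a minimal sketch of the WebClient route, assuming desktop .NET where the synchronous DownloadString method is available (Silverlight and Windows Phone only expose the asynchronous variants, and newer .NET versions mark WebClient as obsolete in favor of HttpClient). PageDownloader and DownloadSource are hypothetical names.

```csharp
using System;
using System.Net;

static class PageDownloader
{
    // Downloads the resource at the given address and returns it as a string.
    // WebClient handles headers, Content-Length, and chunked encoding internally.
    public static string DownloadSource(Uri address)
    {
        using (var client = new WebClient())
        {
            return client.DownloadString(address);
        }
    }
}
```

Calling `PageDownloader.DownloadSource(new Uri("http://example.com/"))` would return the page markup without any manual header parsing.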
