3

How do you login to a webpage and retrieve its content in C#?

andynormancx
user73330

7 Answers

5

That depends on what's required to log in. You could use a WebClient to send the login credentials to the server's login page (via whatever method it requires, GET or POST), but a stock WebClient won't persist the session cookie between requests. There is a way to make a WebClient handle cookies, though, so you could POST the login info to the server, request the page you want with the same WebClient, and then do whatever you want with the page.
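One common way to get that cookie handling is to subclass WebClient and attach a shared CookieContainer to every request it creates. The following is only a sketch under assumptions: the class name `CookieAwareWebClient`, the helper `FetchAfterLogin`, the URLs, and the form field names `username` and `password` are all placeholders, and the real field names have to be read from the login form of the site in question.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;

// Hypothetical helper: a WebClient that keeps cookies between requests.
public class CookieAwareWebClient : WebClient
{
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        // Only HTTP(S) requests carry cookies; other schemes pass through untouched.
        if (request is HttpWebRequest http)
        {
            http.CookieContainer = Cookies;
        }
        return request;
    }
}

public static class LoginExample
{
    // loginUrl, pageUrl and the form field names are placeholders (assumptions).
    public static string FetchAfterLogin(string loginUrl, string pageUrl, string user, string password)
    {
        using (var client = new CookieAwareWebClient())
        {
            var form = new NameValueCollection
            {
                { "username", user },
                { "password", password }
            };
            // UploadValues sends the collection as a form-encoded POST by default.
            client.UploadValues(loginUrl, form);
            // Same client, so any cookie set during login is sent with this request.
            return client.DownloadString(pageUrl);
        }
    }
}
```

Because both requests go through the same client instance, whatever session cookie the login response sets should be replayed on the second request. Sites that also require a hidden anti-forgery token in the form would need an extra GET of the login page first.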

Alex Fort
3

Look at System.Net.WebClient, or for more advanced requirements System.Net.HttpWebRequest/System.Net.HttpWebResponse.

As for actually applying these: you'll have to study the HTML source of each page you want to scrape in order to learn exactly which HTTP requests it expects.

Joel Coehoorn
2

What do you mean by "login"?

If the subfolder is protected at the OS level, and the browser pops up a login dialog when you go there, you will need to set the Credentials property on the HttpWebRequest.

If the website has its own cookie-based membership/login system, you will have to use HttpWebRequest to first respond to the login form.
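For the first case (the browser shows a login dialog), setting Credentials might look like the sketch below. The class `ProtectedPageExample`, its method names, and all parameters are hypothetical, introduced only for illustration.

```csharp
using System.IO;
using System.Net;

public static class ProtectedPageExample
{
    // Builds a request that carries HTTP-level credentials.
    // url, user and password are placeholders supplied by the caller.
    public static HttpWebRequest CreateRequest(string url, string user, string password)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Credentials = new NetworkCredential(user, password);
        return request;
    }

    // Sends the request and returns the response body as a string.
    public static string Fetch(string url, string user, string password)
    {
        HttpWebRequest request = CreateRequest(url, user, password);
        using (WebResponse response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

This only helps when the server challenges with an HTTP authentication scheme. A site with its own login form ignores these credentials, and the form has to be posted instead.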

James Curran
2
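// Requires: using System.IO; using System.Net; using System.Text;
// URL is the address the login form posts to; the field names and values are placeholders.
string postData = "userid=ducon";
postData += "&username=camarche";
byte[] data = Encoding.ASCII.GetBytes(postData);

WebRequest req = WebRequest.Create(URL);
req.Method = "POST";
req.ContentType = "application/x-www-form-urlencoded";
req.ContentLength = data.Length;

// Write the form body; disposing the stream closes it.
using (Stream newStream = req.GetRequestStream())
{
    newStream.Write(data, 0, data.Length);
}

// Read the response as ISO-8859-1 text.
string coco;
using (WebResponse resp = req.GetResponse())
using (StreamReader reader = new StreamReader(resp.GetResponseStream(), Encoding.GetEncoding("iso-8859-1")))
{
    coco = reader.ReadToEnd();
}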
user73330
1

Use the WebClient class.

Dim Html As String

Using Client As New System.Net.WebClient()
    Html = Client.DownloadString("http://www.google.com")
End Using
Josh Stodola
  • He asked for C# code, probably (it wasn't me that downvoted it) – JohnFx Mar 04 '09 at 15:46
  • The C# is almost identical: just add parentheses, braces, and a semicolon, and change the case of a few keywords. – Joel Coehoorn Mar 04 '09 at 16:08
  • Come on, people. If you can read/write C#, you can read/write VB. Open your mind! – Josh Stodola Mar 04 '09 at 16:09
  • I downvoted because he mentioned that he needs to login before he downloads the page. The stock webclient wouldn't support cookies for a login session, so the solution didn't exactly fit the problem. – Alex Fort Mar 04 '09 at 16:31
  • The C# answer (mine) got downvoted too. I suppose it was the authentication thing. – JohnFx Mar 11 '09 at 17:54
1

You can use the built-in WebClient object instead of creating the request yourself.

// Credentials covers HTTP-level authentication, not a form-based login.
WebClient wc = new WebClient();
wc.Credentials = new NetworkCredential("username", "password");
string url = "http://foo.com";
try
{
    using (Stream stream = wc.OpenRead(new Uri(url)))
    using (StreamReader reader = new StreamReader(stream))
    {
        return reader.ReadToEnd();
    }
}
catch (WebException e)
{
    // Error handling
}
Matthew M. Osborn
-2

Try this:

public string GetContent(string url)
{
    using (System.Net.WebClient client = new System.Net.WebClient())
    {
        return client.DownloadString(url);
    }
}
JohnFx