1

I would like to create an application to keep a history of some information that is provided on a webpage.

An example of what a page would look like: http://csgolounge.com/match?m=4961

My idea is to embed a browser control in a form, navigate to a page, and click a button to save the information on it. In the code-behind, the application would take the current page, read its source (or something similar), pick out the relevant data, and store it.

The data I would like to capture is: Team1, Team2, the winner, the percentage for each team, and the bet ratios.

I would simply like to know whether this is possible, or whether there is a better way of doing it. I'm not sure if the website has an API or anything.

No need for code, as I haven't started yet.

Cleaven

3 Answers

0

Beautiful Soup is made for scraping data off of web pages. It is written in Python, so it is really easy to pick up and learn too.

From their website:

Beautiful Soup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need. It doesn't take much code to write an application

There is a good walkthrough example here: http://www.crummy.com/software/BeautifulSoup/bs4/doc/

DarkBee
kodybrown
  • I forgot to mention that if there were an API of some kind, that would probably be the best and easiest way to get the data, especially if the website changes its HTML/layout often. – kodybrown Aug 14 '15 at 15:47
  • Thanks for the response. I don't have enough rep to upvote your answer, but it's something I'll look into during the week when I get some free time. I did look for an API, but I didn't find one for the site. – Cleaven Aug 16 '15 at 13:17
0

Html Agility Pack is the C# option. With it you can accomplish the same things as with Beautiful Soup.

There is a great answer already on SO from @bouvard here: https://stackoverflow.com/a/170856/139793
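
To give a feel for it, here is a minimal sketch of extracting team names and percentages with Html Agility Pack. It parses an inline HTML string rather than the live page, and that markup is an assumption made up for illustration (loosely modelled on "team name in a b tag, percentage in an i tag, both inside a link"); the real page structure may differ, so the XPath expressions would need adjusting. The TeamResult and MatchParser names are hypothetical too.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party NuGet package: HtmlAgilityPack

// Hypothetical container for one team's scraped values.
public class TeamResult
{
    public string Name;
    public string Percent;
    public bool Won;
}

public static class MatchParser
{
    // Assumed markup for illustration only; the real page may look different.
    public const string SampleHtml =
        "<div><a href=\"#\"><b>Team A (win)</b><i>62%</i></a></div>" +
        "<div><a href=\"#\"><b>Team B</b><i>38%</i></a></div>";

    public static List<TeamResult> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<TeamResult>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//div/a");
        if (links == null) return results;

        foreach (HtmlNode link in links)
        {
            HtmlNode nameNode = link.SelectSingleNode("./b");
            HtmlNode percentNode = link.SelectSingleNode("./i");
            if (nameNode == null || percentNode == null) continue;

            // InnerText drops the tags; DeEntitize turns entities such as &amp; back into text.
            string name = HtmlEntity.DeEntitize(nameNode.InnerText).Trim();
            bool won = name.Contains("(win)");

            results.Add(new TeamResult
            {
                Name = name.Replace("(win)", "").Trim(),
                Percent = percentNode.InnerText.Trim(),
                Won = won
            });
        }

        return results;
    }
}

public class Program
{
    public static void Main()
    {
        foreach (TeamResult t in MatchParser.Parse(MatchParser.SampleHtml))
        {
            Console.WriteLine(t.Name + " - " + t.Percent + (t.Won ? " (winner)" : ""));
        }
    }
}
```

To run this against a real page you would fetch the HTML first (for example with HtmlWeb.Load from the same package) and pass it through the same kind of parsing; the null checks matter because a layout change on the site would otherwise surface as a NullReferenceException.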

Sorry for the second answer, I just noticed the C# tag.

Community
kodybrown
0

Have you done web scraping before? If not, that looks like what you're trying to do. Web scraping usually falls into a gray area as to whether it's legal, but if your app is for non-commercial purposes, I don't think you should have any problems.

There are lots of web scraping libraries. For example, CsQuery and Html Agility Pack are well-known HTML parsing libraries for .NET that are often used for scraping.

I'd recommend using one of these libraries. Here is how you'd scrape the page with something like CsQuery (fiddle: https://dotnetfiddle.net/0ugatU):

using System;
using System.Text.RegularExpressions;
using CsQuery;

public static class Scraper
{ 
    public static string RemoveHTMLTags(string html)
    {
        return Regex.Replace(html, "<.*?>", string.Empty);
    }

    // A team is treated as the winner if its label contains the "(win)" marker.
    public static bool FindWinner(string item)
    {
        return item.Contains("(win)");
    }
}

public class Program
{
    public static void Main()
    {
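        // Download and parse the match page.
        CQ dom = CQ.CreateFromUrl("http://csgolounge.com/match?m=4961");
        // Team names are read from <b> elements and percentages from <i> elements inside the team links.
        CQ bold = dom["div > a b"];
        CQ italic = dom["div > a i"];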

        string team1 = Scraper.RemoveHTMLTags(bold[0].Render());
        string team2 = Scraper.RemoveHTMLTags(bold[1].Render());
        string team1Percent = Scraper.RemoveHTMLTags(italic[0].Render());
        string team2Percent = Scraper.RemoveHTMLTags(italic[1].Render());           

        if(Scraper.FindWinner(team1))
        {
            Console.WriteLine("-- Winner --");
            Console.WriteLine(team1 + " - " + team1Percent);
            Console.WriteLine("-- Loser --");
            Console.WriteLine(team2 + " - " + team2Percent);            
        }
        else
        {                               
            // Pair each team with its own percentage.
            Console.WriteLine("-- Winner --");
            Console.WriteLine(team2 + " - " + team2Percent);
            Console.WriteLine("-- Loser --");
            Console.WriteLine(team1 + " - " + team1Percent);
        }       
    }   
}

Note: Install CsQuery as a NuGet package.

Aswin Ramakrishnan
  • Wow, thank you for the detailed answer. When I get the chance to carry on working on this, I'll test it out. Thank you nonetheless; if it's what I'm working towards, I'll mark it as the answer. Also, I have never done web scraping, so this will be a first xD – Cleaven Aug 16 '15 at 13:18
  • No worries. I'm sure this is what you're working towards. It's just a matter of which web scraper you choose. CsQuery and Html Agility Pack are both really good for what you're trying to accomplish. Try fiddling with both of them (or even others) to get a feel for them and work out your preference. – Aswin Ramakrishnan Aug 16 '15 at 16:51