Screen Scraping (AKA Web Fetching) using ASP.NET
Screen scraping, in programmer's terms, means fetching data from a website
into your application. More technically, it is a way for your application to
extract data from the output of another program. The technique comes down to
making a request and parsing the response.
This can be very useful. You can scrape all the products from a website and
put them in your application or save them in a spreadsheet, or you can scrape
data from multiple sites for comparison, research, or analysis.
To perform screen scraping in ASP.NET, we will use the WebRequest and
WebResponse objects. For this you will need to import the System.Net
namespace, along with System.IO for the StreamReader used to read the response.
I am attaching the code; you can download the example Screen Scraping
Visual Studio 2005 project.
The start page (i.e. startpage.aspx) looks as shown in the figure below:
When you click the "Click to view" button, the data from my HTML page is
fetched into the second page of the application (i.e. WebForm1.aspx), as shown
in the figure below:
In this application I have screen scraped one of the HTML pages designed by
me, and its location is hardcoded, so you can simply go to the lines:
// You need to replace this string with any web site URL that you require
string str = "C:/Documents and Settings/Sairam/Desktop/agro_prod.htm";
home.Text = screenscrape(str);
and put in the URL you wish to fetch content from. Or, if you wish to run the
example as it is, you will need to change the path to match the location where
you saved the HTML file. I have attached both the .NET project and the
"agro_prod.html" file. Go ahead and try it.
Here "screenscrape" is a method I have defined, and it does most of the work:
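private string screenscrape(string url)
{
    string r;
    // Create a request for the given URL (or local file path)
    WebRequest request = WebRequest.Create(url);
    // The using blocks dispose the response and the reader when done
    using (WebResponse response = request.GetResponse())
    using (StreamReader sr = new StreamReader(response.GetResponseStream()))
    {
        // Read the whole response body into a string
        r = sr.ReadToEnd();
    }
    return r;
}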
Once you have the whole content of a page, you can parse the data in it: for
example, extract a table from it, or anything else your requirements call for.
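As a rough sketch of that parsing step, the snippet below pulls the cell text out of table rows in a scraped HTML string. The TableExtractor class and its ExtractRows method are hypothetical names made up for this illustration and are not part of the attached project. It assumes simple, well-formed markup; regular expressions are a quick shortcut here, and a real HTML parser would be more robust for messy or nested tables.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TableExtractor
{
    // Hypothetical helper: returns the text of each <td>/<th> cell, grouped by <tr> row.
    public static List<List<string>> ExtractRows(string html)
    {
        List<List<string>> rows = new List<List<string>>();
        // Singleline lets '.' match newlines, since rows often span several lines
        RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Singleline;

        foreach (Match row in Regex.Matches(html, @"<tr\b[^>]*>(.*?)</tr>", options))
        {
            List<string> cells = new List<string>();
            string rowHtml = row.Groups[1].Value;

            foreach (Match cell in Regex.Matches(rowHtml, @"<t[dh]\b[^>]*>(.*?)</t[dh]>", options))
            {
                // Drop any tags nested inside the cell and trim surrounding whitespace
                string text = Regex.Replace(cell.Groups[1].Value, @"<[^>]+>", "").Trim();
                cells.Add(text);
            }
            rows.Add(cells);
        }
        return rows;
    }
}
```

You could pass the string returned by screenscrape straight to TableExtractor.ExtractRows and then loop over the rows to display them or write them out.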
In this demo I have concentrated on only one way of fetching data. There are
other methods too. Two of them are listed below:
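// Method 1: WebClient (needs System.Net and System.Text)
WebClient obj = new WebClient();
byte[] result = obj.DownloadData("http://myssiteToScrape.com");
// Decode the downloaded bytes as UTF-8 text
UTF8Encoding encoding = new UTF8Encoding();
string strResult = encoding.GetString(result);
Label1.Text = strResult;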
// Method 2: Server.Execute (needs System.IO); runs a page in the same application
TextWriter writer = new StringWriter();
Server.Execute("startpage.aspx", writer);
Response.Output.Write(writer.ToString());
Note that TextWriter is an abstract class, so it cannot be instantiated
directly; that is why the code above creates a StringWriter and assigns it to
the TextWriter variable.
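To make the point about the abstract class concrete, here is a minimal sketch that works outside of ASP.NET. The WriterDemo class and its Capture method are hypothetical names used only for illustration. It shows a StringWriter being used through a TextWriter reference, which is the same pattern Server.Execute relies on when it writes the page output into the writer you pass in.

```csharp
using System;
using System.IO;

public static class WriterDemo
{
    // Hypothetical helper: writes text through a TextWriter reference
    // and returns whatever the underlying StringWriter captured.
    public static string Capture(string text)
    {
        // TextWriter writer = new TextWriter();  // would not compile: TextWriter is abstract
        using (TextWriter writer = new StringWriter())
        {
            writer.Write(text);
            // StringWriter.ToString() returns everything written so far
            return writer.ToString();
        }
    }
}
```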