Dr Alexiei Dingli defines e-government as: "The use of information and communication technology to provide and improve government services, transactions and interactions with citizens, businesses, and other arms of government."
This article explains the approach I adopted to create a Web 2.0 e-government website that gathers data from existing services (both government and private) in order to provide additional value to the end user. In my project I integrated the popular AutoTrader website with the local car valuation website.
The main features of the website are:
- A highly scalable website – the website is flexible enough to be integrated with any private car imports website. For the time being, it has been integrated with the popular AutoTrader website.
- Extremely user friendly – The user is not required to fill in long forms as in the current government car registration website.
- Fast and effective – Searching for and valuing cars is quick and efficient, at the cost of a small, known sacrifice in accuracy.
- Search Engine Friendly – SEO was taken into consideration from the initial stages of the project.
- Accessible – An intuitive yet accessible user interface that conforms to the FITA accessibility audit checklist.
Despite the ongoing Web 2.0 buzz, the vast majority of web pages are still very Web 1.0: they heavily mix presentation with content. This makes it hard for a computer to separate the meaningful data from the rest of the page elements. To remedy this problem, some sites provide access to their content through web services or RSS.
Unfortunately, neither the ADT (car valuation website) nor the AutoTrader website provides access in this manner. As a solution to this problem, web scraping techniques have been adopted. Web scraping involves observing the page structure and extracting the relevant records from it. Since the website has been developed using Ruby on Rails, a Ruby library called Hpricot has been used to make web scraping easy and fast. For more information about this library, see: www.rubyrailways.com/data-extraction-for-web-20-screen-scraping-in-rubyrails/
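To make the idea of "observing the page structure and extracting the records" concrete, here is a minimal sketch. It is not the project's actual scraper: it deliberately uses only Ruby's standard library (a regular expression with String#scan) instead of Hpricot, so that it runs without any gems, and the markup, the class names and the Car struct are all hypothetical. Pattern matching of this kind is far more brittle than a real HTML parser, so treat it purely as an illustration.

```ruby
# Hypothetical listing markup; the real AutoTrader page structure is
# not reproduced here.
SAMPLE_HTML = <<~HTML
  <div class="header"><h1>Used cars</h1></div>
  <ul id="results">
    <li class="car">
      <span class="make">Ford</span>
      <span class="model">Focus</span>
      <span class="price">&pound;4,500</span>
    </li>
    <li class="car">
      <span class="make">Toyota</span>
      <span class="model">Yaris</span>
      <span class="price">&pound;3,250</span>
    </li>
  </ul>
  <div class="footer">Terms and conditions</div>
HTML

# Hypothetical record type for one scraped listing.
Car = Struct.new(:make, :model, :price)

# The structure observed on the page, with one capture group per field.
# Presentation-only markup (header, footer) simply never matches.
CAR_PATTERN = %r{
  <li\s+class="car">\s*
  <span\s+class="make">(.*?)</span>\s*
  <span\s+class="model">(.*?)</span>\s*
  <span\s+class="price">\D*?([\d,]+)</span>
}mx

# Returns one Car per listing found in the HTML string.
def extract_cars(html)
  html.scan(CAR_PATTERN).map do |make, model, price|
    # Drop the thousands separator so the price becomes an integer.
    Car.new(make.strip, model.strip, price.delete(",").to_i)
  end
end

cars = extract_cars(SAMPLE_HTML)
cars.each { |car| puts "#{car.make} #{car.model}: #{car.price}" }
```

A parser-based library such as Hpricot replaces the hand-written pattern with element and attribute lookups, which tends to survive small markup changes better.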
In the case of the ADT website, web scraping is more complex still. The website is wizard-driven, with its data scattered over multiple pages (i.e. multiple HTTP requests). Also, since the website is stateful (it uses traditional ASP.NET Web Forms), the web scraping algorithm needed to take into consideration the ASP.NET view state, event validation and cookies.
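The sketch below illustrates what carrying that state between wizard steps can look like. It is an assumption-laden illustration, not the project's code: the page markup, the field values and the ddlMake control name are made up, while __VIEWSTATE, __EVENTVALIDATION and ASP.NET_SessionId are the standard ASP.NET Web Forms names. It again uses only the standard library and does not decode HTML entities in field values.

```ruby
require "uri"

# Hypothetical first wizard page; the hidden field values are invented.
PAGE_ONE = <<~HTML
  <form method="post" action="Step1.aspx">
    <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dDwtMTA4NzczMzUxMjs7Pg==" />
    <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEWAgK+abc/123=" />
    <select name="ddlMake"><option value="12">Ford</option></select>
  </form>
HTML

# Collects every hidden input so that it can be echoed back to the
# server with the next request.
def hidden_fields(html)
  html.scan(/<input\b[^>]*type="hidden"[^>]*>/i).each_with_object({}) do |tag, fields|
    name  = tag[/\bname="([^"]*)"/i, 1]
    value = tag[/\bvalue="([^"]*)"/i, 1]
    fields[name] = value.to_s if name
  end
end

# Builds the URL-encoded POST body for the next step: the server's
# hidden state plus the values this step of the wizard submits.
def next_request_body(html, form_values)
  URI.encode_www_form(hidden_fields(html).merge(form_values))
end

# Turns raw Set-Cookie header values (for example the array returned by
# Net::HTTP's response.get_fields("Set-Cookie")) into a Cookie header,
# keeping only the name=value part of each cookie.
def cookie_header(set_cookie_values)
  set_cookie_values.map { |cookie| cookie.split(";", 2).first.strip }.join("; ")
end

body = next_request_body(PAGE_ONE, { "ddlMake" => "12" })
cookies = cookie_header(["ASP.NET_SessionId=abc123; path=/; HttpOnly", "lang=en; path=/"])
puts body
puts cookies
```

Each response in the wizard carries fresh hidden field values, so the extraction has to be repeated on every page before the next POST is sent.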
Links: The project: http://www.karozzi.co.uk