Model 4: Web Hub/Aggregator

Finding the information you want on the web is sometimes a pain in the ass. In the bad old days before Google, users could spend hours clicking through the 1 million results that AltaVista would return for search terms as esoteric as "Shaving a HedgeHog with a Pound of Cotton Candy". For those of us who remember those days, and for people who can't seem to settle on one place to do their shopping or investing, there are Aggregator services. These websites take content from all around the web and assemble it in one place. While a search engine is the most commonly used form of Aggregator, services like Google News take information feeds from all over the world and assemble them in one easily searched location. The RSS standard was designed to meet the needs of these types of aggregating organizations. If it's important to get your message out, you can publish it as an RSS feed, and your content can appear on other people's websites throughout the world.
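
To make the aggregation idea concrete, here is a minimal sketch of how a site might merge several RSS 2.0 feeds into one list, newest first. The two feeds, their titles, and the example.com links are invented for illustration, and the feeds are inlined as strings rather than fetched over the network; a real aggregator would also need to cope with missing dates and malformed XML.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two made-up RSS 2.0 documents standing in for feeds from different sites.
FEED_A = """<rss version="2.0"><channel>
  <title>Hedgehog Weekly</title>
  <item><title>Grooming tips</title><link>http://example.com/a1</link>
    <pubDate>Mon, 06 Jan 2003 10:00:00 GMT</pubDate></item>
  <item><title>Cotton candy recall</title><link>http://example.com/a2</link>
    <pubDate>Sat, 04 Jan 2003 09:30:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel>
  <title>Investor Daily</title>
  <item><title>Markets open higher</title><link>http://example.com/b1</link>
    <pubDate>Tue, 07 Jan 2003 08:00:00 GMT</pubDate></item>
</channel></rss>"""


def parse_feed(xml_text):
    """Return one dict per <item> in an RSS 2.0 document."""
    channel = ET.fromstring(xml_text).find("channel")
    source = channel.findtext("title", default="")
    entries = []
    for item in channel.findall("item"):
        entries.append({
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS dates use the RFC 822 format, which the email module parses.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return entries


def aggregate(feeds):
    """Merge the items of several feeds into one list, newest first."""
    merged = [entry for xml_text in feeds for entry in parse_feed(xml_text)]
    merged.sort(key=lambda entry: entry["published"], reverse=True)
    return merged


if __name__ == "__main__":
    for entry in aggregate([FEED_A, FEED_B]):
        print(entry["published"].date(), entry["source"], "-", entry["title"])
```

The publisher only has to emit the feed once; any number of aggregators can run something like `aggregate` against it without coordinating with the publisher.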

This diagram discusses the development and promotion of an engineering standard for data exchange. This may be naive or far-thinking depending on who you talk to. There already exists a very popular exchange standard which is constantly being abused worldwide. The lingua franca (as a reflective pretentious ass I wonder if that is the right term) of the web is HTML. Screen Scraping is the process of grabbing an HTML document and cleverly extracting just the information you want from it. One of my friends has a company that aggregates investors' information on one website. They do this without the express approval of those websites. The method? They store your login information, log in to the site as you, take the HTML document that is sent to them, and cleverly scrape out all the information they want. They then store this in their database. When you log in to their financial portal, the information from the ten websites you trade through is located in one spot. People pay money for this sort of service, and it didn't require the development of a new standard; it just involves a practice which pisses off a bunch of investing companies.

* Tristan Cohen is in no way affiliated with Google.com and in fact is mortified that a search for his name turns up anything.
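
The scraping step described above can be sketched roughly as follows. This is only an illustration under loud assumptions: the account page, its `holdings` table, and the `symbol` and `shares` class names are all invented, and the login step is left out entirely. A real scraper depends on whatever markup the target site happens to send, which is exactly why it breaks whenever that site is redesigned.

```python
from html.parser import HTMLParser

# A made-up account page; real sites will use different markup.
SAMPLE_PAGE = """
<html><body>
  <h1>Welcome back</h1>
  <table id="holdings">
    <tr><th>Symbol</th><th>Shares</th></tr>
    <tr><td class="symbol">ACME</td><td class="shares">120</td></tr>
    <tr><td class="symbol">INIT</td><td class="shares">35</td></tr>
  </table>
  <p>Ad: open a new account today!</p>
</body></html>
"""


class HoldingsScraper(HTMLParser):
    """Collect the text of <td> cells, keyed by each cell's class attribute."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per table row that contained data cells
        self._row = None    # the row currently being read, if any
        self._field = None  # class name of the <td> currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td" and self._row is not None:
            self._field = dict(attrs).get("class")

    def handle_data(self, data):
        # Text outside a classed <td> (headings, ads, headers) is ignored.
        if self._field and data.strip():
            self._row[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr":
            if self._row:  # header rows leave the dict empty and are skipped
                self.rows.append(self._row)
            self._row = None


def scrape_holdings(html):
    """Return (symbol, shares) pairs found in the page."""
    scraper = HoldingsScraper()
    scraper.feed(html)
    return [
        (row["symbol"], int(row["shares"]))
        for row in scraper.rows
        if "symbol" in row and "shares" in row
    ]


if __name__ == "__main__":
    print(scrape_holdings(SAMPLE_PAGE))
```

Everything the scraper knows about the page is hard-coded into tag and class names, so the pairs it returns are only as reliable as the site's layout is stable.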

tristancohen@yahoo.com