A mashup is a complex form of data visualization. On the web, mashup often refers to an integrated application created by combining geographical location and other information with a service such as Google Maps or Microsoft Virtual Earth. The term has been widely used to describe this kind of web application since Google introduced its public Google Maps API[1] in 2005. Though not restricted to the web, mashups have become an increasingly popular internet paradigm, leading to the creation of a variety of web-based mashups. Tim O'Reilly lists mashups as one of the Web 2.0 technologies.[2]
Before the availability of the Google Maps API, mashup-like applications were developed mainly with proprietary, complex geographic information systems (GIS) software packages. Such GIS applications have been available commercially since the 1980s, but it is only since the early 2000s that people who are not computer experts have had the tools that allowed such combinations of maps and user-specific data to proliferate on the web. Mashups that do not use spatial or mapping data are also possible, but the mapping application is likely the first kind that comes to mind when one says "mashup" in the context of the World Wide Web.
Mashups are a convergent technology of sorts. Convergence of communications is a recognition that a variety of communications can run over the same Internet Protocol-based infrastructure, without building a separate infrastructure for each service. From the standpoint of communications engineers convergence is not necessarily about the user interface or the merging of technologies. That may be a beneficial side effect, but it is not the focus of the groups concerned with convergence, such as the Multimedia Forum. To a communications engineer, mashups are not clearly distinguished from a multi-windowed interface, or even a structured dashboard, presenting multiple services to the end user.
Thanks to Google Maps, Internet mashups have become popular in recent years; however, the concept of mashups has been around for a long time in a context completely unfamiliar to typical Internet engineers. Before internet mashups became popular, mashups referred to music. Music mashups are the fusion of two or more songs by overlaying their tunes and lyrics to form a new song. They have been around since the beginning of recorded music. Before the term became a popular buzzword, the technique was called multitrack recording and rerecording, in which the Beatles made notable advances. Today, music mashups have been extended to incorporate videos and are still prevalent in the entertainment industry. Websites like http://www.mashup-charts.com/ are used to rate amateur music mashups.
The general purpose of mashups can therefore be stated as merging or overlaying entities in the hope of obtaining a comprehensive product that is more useful or interesting, and that presents a broader perspective, than the individual entities on their own.
Before music mashups, the concept of merging entities for a specific purpose was used in epidemiology. The results were not referred to as mashups at the time, but they served a similar purpose to modern-day internet mashups. John Snow (1813 - 1858) was a British physician who is often considered one of the founders of epidemiology.[3] Prior to the early 1800s, experts in the medical field believed that cholera was airborne. John Snow refuted that belief and published an essay in 1849 called On the Mode of Communication of Cholera expressing his views on the subject. Without a concrete way to prove his assertions, however, he did not make much headway in convincing others.
In August 1854, a tragic outbreak of cholera occurred in Soho. By plotting the outbreaks of cholera on a map, John Snow was able to identify a water pump as the source of the disease. After having the handle of the pump removed, the cases of cholera immediately began to diminish. This incident helped to prove that cholera was transmitted by the consumption of water from the pump, and got into the body through the mouth. Today, cartographic data is studied by various research institutions like the Centers for Disease Control and Prevention (CDC) and in academia as Geographical Information Science. It is used for the display, storage and analysis of spatial data. The concept of mashups therefore lends itself to various fields.
Following the trend in history, it is no surprise therefore that it took Google Maps, a geographical tool, to popularize web mashups. However, web mashups are not restricted to maps and geographical data. There are mashups that combine travel information, news, shopping information and social networking. Because mashups are created from already existing technology, they are restricted only by the technologies they emulate.
A tally of tags for mashups recorded on http://www.programmableweb.com indicates recent mashup trends.
Google Maps is the poster child of web mashups, but Google was not the first company to introduce internet mapping technology. Prior to its existence, MapQuest and Yahoo Maps dominated the scene. These sites were used mostly to get driving directions and for address lookups. In 2005, Google re-introduced internet mapping with a twist. It not only provided a resource for driving directions and address lookups, it also extended the technology by creating an API through which users could create personalized Google Maps widgets. Yahoo and Microsoft followed suit with the Yahoo Maps and Virtual Earth APIs respectively. Since then, mashups have received considerable attention.
An enabling factor of this growth is the fact that Web 2.0 is gaining traction in the enterprise. Web 2.0 embodies the belief that the World Wide Web is breaking away from its origins and evolving into the next stage of human interaction with a computer and the global community[4]. The concept encourages collaboration, reusability, personalization and standardization, which are properties that have fostered the development of mashups, one of the many trends in Web 2.0 (others include blogging, wikis and podcasting). Gradually, the Web is becoming a distribution network of content and services, as evidenced by mashups.
Another factor that has helped make mashups popular is that web browsers now have better Ajax support, which makes web applications more responsive. Businesses would find desktop applications much more attractive than web-based services if the latter were extremely slow.
Also, open source software has grown more popular[5]. The implication is that many more people are getting involved in developing content that can be used by the general public.
Logically, a mashup can be viewed as being composed of three different participants, which are usually physically separated too. They are
This refers to the providers of the content being used in a mashup application. The sources of content are disparate and often controlled by different parties. The most popular ways of exposing content for retrieval are
An Application Programming Interface (API) enables the creation of a web-based mashup by providing a defined means (rules and procedures) of gaining access to an application or its content, e.g. Google Maps. Because every client follows the same rules, independently written software can remain compatible with the provider. APIs should be made as simple as possible if their use is to be encouraged.
APIs can be
A Web API is usually accessed via HTTP by making a call to some script on a remote server.
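As a rough illustration of such a call, the sketch below only assembles the URL of an HTTP GET request. The endpoint https://api.example.com/listings, the parameter names and the helper buildApiUrl are hypothetical and do not belong to any provider mentioned in this article.

```javascript
// Build the URL for an HTTP GET call to a web API.
// The endpoint and parameter names used below are hypothetical.
function buildApiUrl(endpoint, params) {
  const url = new URL(endpoint);
  for (const [name, value] of Object.entries(params)) {
    // searchParams takes care of URL-encoding each value.
    url.searchParams.set(name, String(value));
  }
  return url.toString();
}

const requestUrl = buildApiUrl("https://api.example.com/listings", {
  city: "Philadelphia",
  bedrooms: 3,
  key: "YOUR_API_KEY",
});
console.log(requestUrl);
```

In a browser, a URL built this way could then be passed to fetch() or XMLHttpRequest to send the request to the remote server.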
Popular websites that offer open APIs include Amazon.com, AOL, eBay, Google, MapQuest, MSN, Shopping.com, UPS.com and the US Postal Service.
The contents of a web site that lacks an open API can still be accessed via a process referred to as screen scraping, in which unstructured text is pulled from a website.
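A minimal sketch of screen scraping follows, assuming a page that marks each listing title with an h2 element of class "listing-title". The markup, the sample page and the helper scrapeTitles are hypothetical; real pages are usually messier, and a dedicated HTML parser is often a better choice than a regular expression.

```javascript
// Pull listing titles out of raw HTML with a regular expression.
// The markup structure (class="listing-title") is a hypothetical example.
function scrapeTitles(html) {
  const pattern = /<h2 class="listing-title">([^<]*)<\/h2>/g;
  const titles = [];
  for (const match of html.matchAll(pattern)) {
    titles.push(match[1].trim());
  }
  return titles;
}

const page = `
  <div><h2 class="listing-title"> 2BR near Rittenhouse Square </h2></div>
  <div><h2 class="listing-title">Studio in Old City</h2></div>
`;
console.log(scrapeTitles(page));
```

Because the pattern is tied to the page's presentation, it stops matching as soon as the site changes its markup.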
An example of JavaScript code used to display a Google map of the Philadelphia area is given below. The example makes use of the Google Maps API. The map can be used to show the location of apartments in a web site that offers such apartment listing services.
<html>
<head>
  <title>Apartment Listing</title>
  <script src="http://maps.google.com/maps?file=api&v=2&key=ABQIAAAAni0_HyJTfcbhvyNrGunJdhQuvnbIrZPj1yxxzdYDS-DWipzTChQL8GeWLFZ2SA-_q3wsWjD16IYlVg" type="text/javascript"></script>
  <script type="text/javascript" language="javascript">
    function initMap() {
      var phillyMap = new GMap2(document.getElementById("phillyMap"));
      phillyMap.addControl(new GLargeMapControl());
      phillyMap.addControl(new GMapTypeControl());
      phillyMap.setCenter(new GLatLng(39.953333, -75.17), 12);
      phillyMap.setMapType(G_NORMAL_MAP);
    }
  </script>
</head>
<body onLoad="initMap()">
  <div id="phillyMap" style="width: 800px; height: 500px"></div>
</body>
</html>
To use Google Maps, you need to request an API key from Google, which is a relatively easy process. You load the Google Maps API in your website using an HTML <script> tag. The URL specified in the src attribute of the <script> tag points to the location of the JavaScript file that includes all of the symbols and definitions you need for using the Google Maps API. You should replace the key in this attribute with the key that was assigned to you. The key in the example above is ABQIAAAAni0_HyJTfcbhvyNrGunJdhQuvnbIrZPj1yxxzdYDS-DWipzTChQL8GeWLFZ2SA-_q3wsWjD16IYlVg.
The HTML <div> tag acts as a placeholder for the map on your web page. It also specifies a size for the map and assigns itself an identity, phillyMap.
GMap2 is a class that represents a map. We create an instance of this class using the new operator to define a map, and pass it the phillyMap element described in the previous paragraph as the container for our map.
Next, we need to initialize our map using the setCenter() method, which takes a GLatLng coordinate and a zoom level as parameters. GLatLng is an object that specifies the latitude and longitude to be used as the center point of the map. The example above uses the coordinates of Philadelphia.
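When several apartments are to be shown, the center point could be derived from the listings rather than hard-coded. The sketch below is one possible approach and is not part of the Google Maps API: the helper averageCenter and the sample coordinates are hypothetical, and a plain average of latitudes and longitudes is only a reasonable approximation for points that lie close together.

```javascript
// Compute a center point for a set of nearby locations by averaging
// their latitudes and longitudes. averageCenter is a hypothetical helper,
// not part of the Google Maps API.
function averageCenter(points) {
  if (points.length === 0) {
    throw new Error("at least one point is required");
  }
  let latSum = 0;
  let lngSum = 0;
  for (const point of points) {
    latSum += point.lat;
    lngSum += point.lng;
  }
  return { lat: latSum / points.length, lng: lngSum / points.length };
}

// Hypothetical apartment locations in the Philadelphia area.
const apartments = [
  { lat: 39.95, lng: -75.17 },
  { lat: 39.97, lng: -75.15 },
];
const center = averageCenter(apartments);
console.log(center);
```

The resulting lat and lng values could then be used to build the GLatLng passed to setCenter().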
Information feeds are a common mashup source because they are in a standardized form and are readily available on the internet. Web feeds like RSS and Atom are in XML format and are therefore easy to parse. The parsed data is then used to create new information feeds or different mashup types. For example, a user can parse an RSS feed for the New York Times, run the extracted data through content analysis and generate a map with Flickr photos relating to the locations referenced in the New York Times articles.[6]
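The parsing step can be sketched as follows for an RSS 2.0 document. The feed content and the helper parseRssItems are made up for illustration, and regular expressions are used only to keep the sketch free of dependencies; an application would more likely rely on a proper XML parser such as the browser's DOMParser.

```javascript
// Extract the title and link of each <item> in an RSS 2.0 document.
// Regular expressions keep this sketch dependency-free; they do not
// handle CDATA sections, attributes or namespaces.
function parseRssItems(xml) {
  const items = [];
  for (const itemMatch of xml.matchAll(/<item>([\s\S]*?)<\/item>/g)) {
    const body = itemMatch[1];
    const title = /<title>([\s\S]*?)<\/title>/.exec(body);
    const link = /<link>([\s\S]*?)<\/link>/.exec(body);
    items.push({
      title: title ? title[1].trim() : "",
      link: link ? link[1].trim() : "",
    });
  }
  return items;
}

// A hypothetical feed used only for illustration.
const feed = `<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example News</title>
  <item><title>Flooding in Trenton</title><link>https://news.example.com/1</link></item>
  <item><title>Transit strike ends</title><link>https://news.example.com/2</link></item>
</channel></rss>`;
console.log(parseRssItems(feed));
```

The extracted titles could then be passed on to a content-analysis step, for example to look for place names to plot on a map.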
XML is a standard format for data transfer over the internet. For example, RSS and Atom feeds are XML documents, and XML is the data format from which Ajax (Asynchronous JavaScript and XML) takes its name. All data that is stored on the web in XML form can be retrieved, parsed and analyzed. After analysis, users can create intelligible mashups from the resulting data. XML in its bare format is used when no tool provides the data to the user in a packaged form (e.g. RSS). In such a case, the user will most likely scrape the web (screen scraping) for information. Unlike APIs and information feeds, raw XML requires a relatively high level of programming expertise to analyze. Also, if the source content changes (which happens often), the code written to extract the data breaks, since it depends on how the data is presented.
JSON is another data interchange format that can be used to create mashups. As with XML, data in JSON format is retrieved, parsed and analyzed before it is used in the mashup.
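A small sketch of the parse-and-analyze step for JSON is given below. The payload layout (a "listings" array whose entries have "address" and "rent" fields) and the helper averageRent are assumptions made for this example only.

```javascript
// Parse a JSON payload and compute a simple statistic from it.
// The payload layout is a hypothetical example.
function averageRent(jsonText) {
  const data = JSON.parse(jsonText);
  if (data.listings.length === 0) {
    return null; // nothing to average
  }
  const total = data.listings.reduce((sum, listing) => sum + listing.rent, 0);
  return total / data.listings.length;
}

const payload = JSON.stringify({
  listings: [
    { address: "1500 Market St", rent: 1200 },
    { address: "200 S Broad St", rent: 1800 },
  ],
});
console.log(averageRent(payload));
```

Since JSON maps directly onto JavaScript objects and arrays, the parsing step is a single call to JSON.parse, which is part of the reason the format is convenient for browser-based mashups.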
In general, any structured data interchange format can be used for mashups, as long as both provider and user understand the format (i.e., syntax) and meaning (i.e., semantics) of said data. Two images, for example, can have a compatible syntax, but one could be a summer and one could be a winter photograph of the same area; climate could not be inferred by combining them in a mashup. A street map and a geographic photograph may both be in a compatible graphic format, but they are of different scales (i.e., magnification) and different coordinate systems.
This refers to where the mashup is hosted. It is the application that is created by drawing on content from content providers. This application can be generated using client-side scripting such as JavaScript within the client's browser. Server-side technologies that generate content dynamically could also be used. Examples of these server-side technologies include Java servlets, CGI, PHP and ASP. Client-side scripting has the advantage of reduced communication overhead with the mashup server. After accessing the page, subsequent operations are carried out by communicating directly with the content provider. This is what Google Maps uses.
This is where users access and interact with the mashup.
Most mashups are developed using one or more of the following languages
However, there are many tools available today that require no coding at all. Some examples are given in the mashup tools/editors section below.
A number of organizations have developed or are developing tools to allow users to develop, deploy and share their own mashups. Some of these tools require substantial programming skills, while others require none at all. Some of these editors are listed below; there are many others available.
Mashups that combine visual elements and data from multiple sources.
An example of a consumer mashup is http://www.housingmaps.com which gets rental listings from Craigslist and displays these listings on a Google Map by using Google Maps' API. The results displayed below are searches for 3+ bedroom apartments in Philadelphia that cost between $1000 and $1500.
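The core of such a consumer mashup, turning listings from one source into markers for a map from another, might look like the sketch below. The listing fields, the sample data and the helper listingsToMarkers are hypothetical and are not taken from Craigslist, HousingMaps or the Google Maps API.

```javascript
// Select listings that match the search criteria and turn each one
// into a simple marker description that a map widget could display.
function listingsToMarkers(listings, { minBedrooms, minRent, maxRent }) {
  return listings
    .filter(
      (l) => l.bedrooms >= minBedrooms && l.rent >= minRent && l.rent <= maxRent
    )
    .map((l) => ({
      lat: l.lat,
      lng: l.lng,
      label: `${l.bedrooms}BR, $${l.rent}`,
    }));
}

// Hypothetical rental listings.
const listings = [
  { bedrooms: 3, rent: 1400, lat: 39.95, lng: -75.16 },
  { bedrooms: 2, rent: 1200, lat: 39.96, lng: -75.17 },
  { bedrooms: 4, rent: 2000, lat: 39.94, lng: -75.18 },
];
const markers = listingsToMarkers(listings, {
  minBedrooms: 3,
  minRent: 1000,
  maxRent: 1500,
});
console.log(markers);
```

Each marker description could then be handed to the mapping API of choice to be drawn on the map.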
Mashups that combine multiple data sources (e.g. RSS feeds) into a single data source.
An example of a data mashup is the travel site http://www.kayak.com. Kayak is a comprehensive travel search engine which gets its data from over 100 other travel sites. Kayak therefore does not sell directly to customers but serves as a portal through which customers can be directed to travel agencies that can serve their needs. The results displayed below are searches for flights from Philadelphia to New York. Kayak displays flights from http://www.cheaptickets.com and http://www.orbitz.com.
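Combining several data sources into one can be sketched as below. The site names, fare records and the helper mergeFares are invented for illustration and say nothing about how Kayak is actually implemented.

```javascript
// Merge fare lists from several sources into one list, tagging each
// fare with the site it came from and sorting by ascending price.
function mergeFares(...sources) {
  return sources
    .flatMap(({ site, fares }) => fares.map((fare) => ({ ...fare, site })))
    .sort((a, b) => a.price - b.price);
}

// Hypothetical results returned by two travel sites.
const siteA = {
  site: "site-a.example",
  fares: [
    { flight: "AA100", price: 189 },
    { flight: "AA200", price: 149 },
  ],
};
const siteB = {
  site: "site-b.example",
  fares: [{ flight: "BB300", price: 165 }],
};
const merged = mergeFares(siteA, siteB);
console.log(merged);
```

Keeping the originating site on every record is what lets a portal of this kind direct the customer back to the provider that offered the fare.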
Mashups that are similar to consumer mashups but solve business problems. Many enterprises are embracing mashups for various reasons. Some need their software systems to change often to keep up with the rapid rate at which their business needs change. Such businesses find mashups an attractive solution: they make use of available components that have already been developed and tested, and can launch their software in less time than if they had to build it from scratch. Other businesses do not have the resources or competence required to develop some applications themselves and are thus eager to incorporate existing ones.
Scatter plots of statistics about occurrences of death or illness have been used to correlate harm with causes such as pollution from the smokestacks of industrial installations. Where the occurrences bunch up, a shape matching the downwind spread of air from the smokestack becomes visible on the mashup map.
Special consideration needs to be given to mashups developed for businesses, especially businesses with sensitive data. With the plethora of services available on the World Wide Web that could serve as mashup content, two concerns arise. The first is designing your enterprise’s systems to allow the incorporation of services and applications from outside your enterprise. The second is designing your firewall to allow you to access these services and applications without compromising your security.
Mashup preparation can be divided into six stages