With the growth of e-commerce, there has been increasing buzz around the term ‘Big Data’. For example, in order to reach the growing Indian market, Amazon has tied up with IRCTC, which has one of the biggest customer bases. Companies like Flipkart, Snapdeal and Amazon are continuously trying to provide a better customer experience with the help of such data. Beyond that, companies in sectors such as FMCG and apparel collect data from retailers such as hypermarkets, department stores and supermarkets through sales data or loyalty cards.
According to IBM, “Every day, we create 2.5 quintillion bytes of data - so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few. This data is big data.”
So, big data is exactly what it sounds like: a lot of data, accumulated since the rise of the Internet. It's been estimated that in all the time leading up to the year 2003, only 5 exabytes of data were generated, which is equal to 5 billion gigabytes. But from 2003 to 2012, the amount reached around 2.7 zettabytes (or 2,700 exabytes, or 2.7 trillion gigabytes) [source: Intel]. According to Berkeley researchers, we are now producing roughly 5 quintillion bytes of data every two days (5 exabytes in decimal units, or around 4.3 exbibytes in binary units) [source: Romanov].
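The unit conversions in the paragraph above can be checked with a few lines of Python. This is only a rough sketch: the constant and function names are my own, and it assumes decimal (SI) units for gigabytes, exabytes and zettabytes. The "around 4.3" figure only works out if you count in binary units (2^60 bytes), which is the assumption made in the last line.

```python
# Decimal (SI) byte units, plus one binary unit for comparison.
GIGABYTE = 10**9
EXABYTE = 10**18
ZETTABYTE = 10**21
EXBIBYTE = 2**60  # binary "exabyte"; assumed to explain the 4.3 figure


def convert(num_bytes, unit):
    """Express a byte count in the given unit."""
    return num_bytes / unit


pre_2003_bytes = 5 * EXABYTE              # everything generated before 2003
total_2012_bytes = 27 * ZETTABYTE // 10   # 2.7 zettabytes as an exact integer
two_day_bytes = 5 * 10**18                # 5 quintillion bytes

print(convert(pre_2003_bytes, GIGABYTE))           # billions of gigabytes
print(convert(total_2012_bytes, EXABYTE))          # exabytes
print(convert(total_2012_bytes, GIGABYTE))         # gigabytes
print(round(convert(two_day_bytes, EXBIBYTE), 1))  # binary exabytes
```

In decimal units, 5 quintillion bytes is simply 5 exabytes, so it pays to check which convention a source is using before comparing figures.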
The term 'big data' is usually used to refer to massive, rapidly expanding, varied and often unstructured sets of digitized data that are difficult to maintain using traditional databases. It can include all the digital information floating around out there in the ether of the Internet, the proprietary information of companies with whom we've done business and official government records, among a great many other things. Today these data are analyzed for purposes such as finding consumer preferences, measuring loyalty and targeting advertisements.
We generate lots of such data by making online purchases and participating in social media, but that is just the tip of the iceberg. Big data can include digitized documents, photographs, videos, audio files, tweets and other social networking posts, e-mails, text messages, phone records, search engine queries, RFID tag and barcode scans and financial transaction records, though those aren't the only sources. Data are produced every time you do anything online, leaving a digital trail from which others can come along and gather useful information.
The number and types of devices that produce data have been continuously increasing as well. Besides home computers and retailers' point-of-sale systems, we have Internet-connected smartphones, WiFi-enabled scales that tweet our weight, fitness sensors that track and sometimes share health-related data, cameras that can automatically post photos and videos online and Global Positioning System (GPS) devices that can pinpoint our location on the globe, to name a few. Don't forget weather and traffic sensors, surveillance cameras, sensors in cars and airplanes and other things not connected with individuals that are constantly collecting data. The large number of electronic devices that generate and upload data has given rise to the term "the Internet of Things" (IoT).
You'll find multiple definitions of big data out there, so not everyone agrees entirely on what is included, but it can be anything that someone might want to know and that can be subjected to computer analysis. And these large, unwieldy sets of data require new methods to collect, store, process and analyze them.
How Big Data is Analyzed and Used
Big data has to be collected, massaged, linked together and interpreted for it to be of any use to anyone. Companies and other entities need to filter the vast amount of available data to get to what's most relevant to them. Fortunately, hardware and software that can process, store and analyze huge amounts of information are becoming cheaper and faster, so the work no longer requires massive and prohibitively expensive supercomputers. Some of the software is becoming more user friendly so that it doesn't necessarily take a team of programmers and data scientists to wrangle the data (although it never hurts to have knowledgeable people who can understand your requirements).
Companies take advantage of cloud computing services so that they don't even have to buy their own computers to do all that data crunching. Data centers, also called server farms, can distribute batches of data for processing over multiple servers, and the number of servers can be scaled up or down quickly as needed. This scalable distributed computing is accomplished using tools and approaches such as Apache Hadoop, MapReduce and Massively Parallel Processing (MPP). NoSQL databases have been developed as more easily scalable alternatives to traditional SQL-based database systems.
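To make the MapReduce idea a little more concrete, here is a minimal single-machine sketch of the classic word-count example in Python. It is not Hadoop's API, and the function names (map_phase, shuffle, reduce_phase, map_reduce) are hypothetical names chosen for illustration. In a real cluster, each chunk would be mapped on a different server and the grouped results would be reduced in parallel; here everything runs in one process just to show the flow.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)


def map_reduce(chunks):
    """Run map, shuffle and reduce over a list of text chunks."""
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Each string stands in for a batch of data sent to a separate server.
chunks = ["big data is big", "data is everywhere"]
counts = map_reduce(chunks)
print(counts)
```

Because each chunk is mapped independently, adding more servers lets you process more chunks at the same time, which is the scaling behaviour described above.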
Much of this big data processing and analysis is aimed at finding patterns and correlations that provide insights that can be exploited or used to make decisions. Businesses can now mine massive amounts of data for information about consumer habits, their products' popularity or more efficient ways to do business. Companies can use big data analytics to target relevant ads, products and services at the customers they believe are most likely to buy them, or to create ads that are more likely to appeal to the public at large. Companies are now even starting to do things like send real-time ads and coupons to people via their smartphones for places that are near locations where they have recently used their credit cards.
It's not just for making us buy stuff, however. Businesses can use the information to improve efficiency and practices, such as finding the most cost-effective delivery routes or stocking merchandise more appropriately. Government agencies can analyze traffic patterns, crime, utility usage and other statistics to improve policy decisions and public service. Intelligence agencies can use it to, well, spy, and hopefully foil criminal and terrorist plots. News outfits can use it to find trends and develop stories, and, of course, write more articles about big data.
In essence, big data allows entities to use nearly real-time data to inform decisions, rather than relying mostly on old information as in the past. But this ability to see what's going on with us in the present, and even sometimes to predict our future behavior, can be a bit creepy.
In conclusion, 'Big Data' is a necessity today. If a company wants to gain an edge over its competitors, it has to understand the importance of big data as soon as possible. If you want more information like this on market updates, follow my blog. Do comment below if I have missed something that would add value.