Do We Still Need Relational Databases in the Big Data Era?

December 2014 | Dios Kurniawan

The relational database has dominated the way we store data in the data warehouse for the last 30 years. Whatever data sources you have in your organization, the data must be stored neatly in a perfect structure, that is, in tables with rows and columns.

Relational databases need a schema to be defined in advance, before the data is loaded. You can choose a normalized data model, a star schema or other similar models to structure your data. The pitfall is that later changes, even the slightest ones, require significant effort in altering the tables. But things change. In the era of big data technology, the relational database may soon become less relevant, particularly in data warehousing implementations. Big data technologies such as Hadoop let us store and analyze massive amounts of data of any type without the need to follow a predefined schema structure, and at much lower cost.

Since Dr Codd invented the relational database concept in the 1970s, it has grown so important in the computing industry that it is taught as a compulsory course to all computer science students. At the heart of the relational concept, the third normal form (3NF) model was largely designed to solve the problem of disk space usage, among other things. The 3NF model promises efficient use of disk space by eliminating redundancy in the data stored on disk. Disk storage was expensive in the 1970s, and any effort to save storage space, such as 3NF, was highly rewarding at that time.

But that was then. Today, disk storage is abundant and cheap. The cost of storing 1TB of data in a Hadoop cluster is now less than $500 (in 1980, a 5MB hard drive cost $1500). It makes much less sense today to design a data warehouse using 3NF, because conserving disk space has become a much less pressing need. For applications that are transactional in nature, 3NF may still be the best fit, but for data warehousing and the world of analysis (query, reporting, data mining etc.), there is no absolute need to use 3NF anymore.
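To put the two prices quoted above on the same scale, here is a rough back-of-the-envelope sketch. It uses only the figures from the paragraph; treating 1TB as 1,000,000MB (decimal units) is my own assumption, and the variable names are purely illustrative.

```python
# Figures quoted in the text. Decimal units (1 TB = 1,000,000 MB) are an assumption.
MB_PER_TB = 1_000_000

drive_price_1980_usd = 1500     # price of a 5MB hard drive in 1980
drive_capacity_1980_mb = 5
hadoop_cost_per_tb_usd = 500    # upper bound quoted for a Hadoop cluster

# Scale the 1980 price up to a full terabyte.
cost_per_mb_1980 = drive_price_1980_usd / drive_capacity_1980_mb
cost_per_tb_1980 = cost_per_mb_1980 * MB_PER_TB

# How many times cheaper a terabyte is at the Hadoop price.
price_drop_factor = cost_per_tb_1980 / hadoop_cost_per_tb_usd

print(f"1980 price scaled to 1TB: ${cost_per_tb_1980:,.0f}")
print(f"Hadoop price per 1TB:     ${hadoop_cost_per_tb_usd:,}")
print(f"Price drop factor:        {price_drop_factor:,.0f}x")
```

On these numbers a terabyte in 1980 would have cost about $300 million, roughly 600,000 times the Hadoop figure, which is why saving disk space no longer drives the design.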

As an alternative to 3NF, the star schema, which was introduced by Dr Ralph Kimball, has for years been regarded as the more acceptable standard method of storing information in a data warehouse. Data is stored in fact and dimension tables, still in a relational database. This makes analysis easier for business users because the data is organized by subject area. Like 3NF, a star schema must be defined for a particular analysis purpose, so changes in business definitions lead to the cumbersome task of modifying the database. Also like 3NF, a star schema requires users to write a lot of joins to execute complex queries.
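To make the point about joins concrete, here is a minimal sketch of a star schema using Python's built-in sqlite3 module. The table names (fact_sales, dim_customer, dim_product), the columns and the sample rows are all hypothetical, invented for illustration only. Even a simple "sales by city and category" question already needs two joins.

```python
import sqlite3

# In-memory database with one fact table and two dimension tables.
# All table names, columns and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales   (customer_id INTEGER, product_id INTEGER, amount REAL);

    INSERT INTO dim_customer VALUES (1, 'Jakarta'), (2, 'Surabaya');
    INSERT INTO dim_product  VALUES (10, 'Voice'), (20, 'Data');
    INSERT INTO fact_sales   VALUES (1, 10, 50.0), (1, 20, 30.0), (2, 20, 20.0);
""")

# The fact table only holds keys, so every descriptive attribute
# has to be pulled in from a dimension table with a join.
rows = conn.execute("""
    SELECT c.city, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_customer c ON c.customer_id = f.customer_id
    JOIN dim_product  p ON p.product_id  = f.product_id
    GROUP BY c.city, p.category
    ORDER BY c.city, p.category
""").fetchall()

for city, category, total in rows:
    print(city, category, total)
```

Adding a new attribute to the analysis means adding another dimension table or column, and usually another join in every query that uses it.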

Today, in the era of big data technology and data science, the preference has shifted to a “flat” data model. This means data is stored as is, or is stored by integrating multiple sources of information into a single, flat table, eliminating the need for table joins. It emphasizes denormalization, a completely different route from the relational model. This is the method usually preferred by data scientists, and it can easily be implemented in Hadoop. They create a flattened data model with huge tables and long records. Yes, there will be redundancies and inefficiencies, but disk storage is cheap anyway. A flat model may also consume a lot of computing resources, but providing abundant processing power at lower cost is what Hadoop is all about.
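The flattening idea can be sketched in a few lines of plain Python. This is only an illustration with made-up data (the customers, products and sales structures are hypothetical), not how a Hadoop job would actually be written, but it shows the trade-off: the descriptive attributes are copied into every record, so there is redundancy, yet the analysis becomes a single scan with no lookups.

```python
# Hypothetical lookup tables and sales records, for illustration only.
customers = {1: {"city": "Jakarta"}, 2: {"city": "Surabaya"}}
products = {10: {"category": "Voice"}, 20: {"category": "Data"}}
sales = [
    {"customer_id": 1, "product_id": 10, "amount": 50.0},
    {"customer_id": 1, "product_id": 20, "amount": 30.0},
    {"customer_id": 2, "product_id": 20, "amount": 20.0},
]


def flatten(sale):
    """Copy the customer and product attributes into the sales record itself."""
    return {
        **sale,
        **customers[sale["customer_id"]],
        **products[sale["product_id"]],
    }


# Denormalize once, when the data is prepared.
flat_sales = [flatten(sale) for sale in sales]

# Analysis is now a single pass over one wide table, with no joins.
jakarta_total = sum(r["amount"] for r in flat_sales if r["city"] == "Jakarta")

print(flat_sales[0])
print("Jakarta total:", jakarta_total)
```

Note that "Jakarta" is now stored once per sale instead of once per customer. That is exactly the redundancy 3NF was designed to remove, and the one cheap disks let us tolerate.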

The emergence of the “schema on read” approach further weakens our dependency on the relational model in data warehousing. It allows a much more flexible way to store and consume data: simply store the data in Hadoop and start exploring the information inside it. We are no longer stuck with a predefined, rigid schema. But one would ask, what about data integrity? For decades, the ACID (atomicity, consistency, isolation and durability) properties have been the strong points, the bread and butter of the relational database. From the 1970s to the 1990s, enterprise data was “mission-critical”: it was very important and should never get corrupted.
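A small sketch may help show what “schema on read” means in practice. The raw records below are stored exactly as they arrived, with inconsistent fields and types, and a schema is applied only at the moment somebody reads them. The data, the read_with_schema helper and the two views are hypothetical names of my own; real tools in the Hadoop ecosystem do this at a much larger scale and with their own APIs.

```python
import json

# Raw records stored as is: fields and types vary from line to line.
raw_lines = [
    '{"user": "a", "amount": "12.5"}',
    '{"user": "b", "amount": 7, "channel": "web"}',
    '{"user": "c"}',
]


def read_with_schema(lines, schema):
    """Apply a schema while reading: select fields, cast types, use None if missing."""
    for line in lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }


# Two different readers impose two different schemas on the same stored data.
billing_view = list(read_with_schema(raw_lines, {"user": str, "amount": float}))
channel_view = list(read_with_schema(raw_lines, {"user": str, "channel": str}))

print(billing_view)
print(channel_view)
```

Nothing had to be altered in storage when the second view was added. The cost is that every reader must cope with missing or badly typed values, which a schema defined in advance would have rejected at load time.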

The relational database system was designed for data consistency and integrity, not allowing a single record to be lost. But today, in a land flooded with petabytes of data, it is not economically feasible, and not even necessary, to keep and scrutinize every bit of data in our data warehouse. When you have billions of records, losing a few thousand of them would be quite acceptable and would not make the result of your analysis significantly wrong; insights and discoveries can still be obtained. Big data platforms focus on extracting value from the data straight away, and data scientists are willing to sacrifice consistency for speed and flexibility.
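The claim about lost records can be checked with a quick simulation. This is a scaled-down sketch under my own assumptions: 200,000 synthetic values instead of billions, 200 of them lost (a much larger share than a few thousand out of billions), and the losses happening at random, unrelated to the values themselves. If losses were concentrated in one customer segment or one day, the picture could be very different.

```python
import random

rng = random.Random(42)          # fixed seed so the run is repeatable
n_records, n_lost = 200_000, 200

# Synthetic measurements between 0 and 100 (purely illustrative data).
values = [rng.uniform(0, 100) for _ in range(n_records)]

# Drop a random subset of records to imitate data loss.
lost = set(rng.sample(range(n_records), n_lost))
surviving = [v for i, v in enumerate(values) if i not in lost]

full_mean = sum(values) / len(values)
partial_mean = sum(surviving) / len(surviving)
relative_error = abs(partial_mean - full_mean) / full_mean

print(f"mean with all records:  {full_mean:.4f}")
print(f"mean after losing {n_lost}: {partial_mean:.4f}")
print(f"relative error:         {relative_error:.6%}")
```

For an aggregate such as an average, the result barely moves. The same would not hold for a financial ledger, where every single record matters.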

There has been a lot of buzz about Hadoop these days, and Hadoop has indisputably changed the landscape of the data warehousing industry forever. For the first time, we have the choice of NOT using a relational database for our data warehousing needs. Does it mean the end of the relational database in data warehousing? Well, not really. At least not now.

Hadoop indeed promises a lot of good things, yet I would not say that it is the silver bullet for all your data warehousing requirements. There are reports and analyses that are still better served by a relational database, such as the ever-important corporate financial reports. Relational database technology is very mature, very well understood and very widely used. The relational database has its own place in the computing world and will still find its way into data warehousing applications, but Hadoop will certainly end its dominance.

What Will Happen to Jet Planes When Fossil Fuels Are Gone?

July 2014 | Dios Kurniawan

Let’s face it. Fossil fuels (gasoline, oil, natural gas, coal) that people have been exploiting for the last 100+ years are being depleted. Their prices are constantly going up. Oil reserves are draining while the discovery of new oil wells has slowed in the last decade. Experts predict that by the end of this century, nearly all fossil fuels in the world will be gone, or at least will no longer be economically feasible to produce.

However, that will not mean the end of the world. Technology has introduced into our lives many renewable energy sources such as solar power and biofuels, lessening our dependence on fossil fuels. We have seen the emergence of electric cars and hydrogen-powered vehicles. Trains have been electric-powered for a long time. There are ships already propelled by nuclear power today, and in the future, as fossil fuel becomes more expensive, nuclear-powered ships could become the norm. But what will happen to jet planes? I have never seen an aeroplane powered by coal. Or by electric batteries. There was an experiment in the 1950s to carry a nuclear reactor inside a B-36 bomber, but the reactor never actually powered the plane.

What about solar energy? There are solar-powered aircraft today, but most are experimental, and none can carry more than two people. Solar power is not efficient when applied to aviation. Let’s see it this way: the best solar panels today produce less than 100W per square meter under sunlight. A Boeing 737, the world’s most popular jetliner, has a wing area of only 125 square meters, which translates into 100×125 = 12,500W, or 12.5kW, of electricity produced by the solar panels. OK, you can put a few more solar panels on top of the fuselage; let’s say that adds 10kW, for a total of 22.5kW. Even that is still very small compared to the power a jet engine can produce, which is in the order of 50-100 megawatts. The solar panels may be enough to power the lighting and air conditioning of the plane, but a Boeing 737 weighs no less than 40 tons, so the solar power would definitely never be enough to move the plane, even on the ground.
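The arithmetic above is easy to reproduce. The sketch below uses only the figures quoted in the paragraph (100W per square meter, 125 square meters of wing, 10kW extra from the fuselage, and the low end of the 50-100 megawatt range for a jet engine); the variable names are my own. The wing and fuselage contributions add up to 22.5kW.

```python
# Figures taken from the text; variable names are illustrative.
panel_output_w_per_m2 = 100     # upper bound quoted for solar panel output
wing_area_m2 = 125              # Boeing 737 wing area quoted in the text
fuselage_extra_kw = 10          # allowance for panels on top of the fuselage
jet_engine_power_mw = 50        # low end of the quoted 50-100 megawatt range

# Solar power collected by the wings, converted from watts to kilowatts.
wing_kw = panel_output_w_per_m2 * wing_area_m2 / 1000
total_solar_kw = wing_kw + fuselage_extra_kw

# Share of a single jet engine's output that the panels could cover.
solar_share = total_solar_kw / (jet_engine_power_mw * 1000)

print(f"wings:        {wing_kw} kW")
print(f"total solar:  {total_solar_kw} kW")
print(f"share of one {jet_engine_power_mw} MW engine: {solar_share:.3%}")
```

Even against the lowest engine figure, the panels would supply less than a twentieth of one percent of the power needed.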

Biofuel could be our only hope, but at this moment it does not seem promising considering the amount of land needed to produce it. Competition with food supply is a major consequence of biofuels. The impact on the environment is huge, and it seems biofuel is still many decades away from being economically viable as jet fuel.

Looking at the currently available renewable energy technology, a replacement for fossil fuel in air travel simply does not exist. When the world faces the end of fossil fuel in about 90-100 years from now, people will still be able to travel by car or by train, but not by air. Even if there is still fossil fuel left, flying will become unaffordable to most people. So I guess by the year 2100, all of us will once again have to travel by ship if we want to visit a distant country, just like our grandparents did …

Why Did Merpati Fail?

February 2014 | Dios Kurniawan

When Merpati stopped its operations last week, it confirmed a prediction I made a long time ago: Merpati would NOT survive.

Merpati was born in an era when the government controlled every aspect of every airline; the regulator decided which routes airlines could fly, set their ticket prices, and even regulated what planes airlines could buy (remember that in the 80’s only Garuda was allowed to fly jet planes). Merpati, being a government-owned company, was heavily subsidized. If the government thought Merpati needed more money to cover the operational costs of flying thin routes in remote areas, it simply injected fresh money. For many years Merpati saw no competition in the industry, as there were no other major airlines operating in Indonesia’s skies other than its big brother Garuda. For Merpati, life was sweet in those days.

But that was then, this is now.

A Fokker F28 in Cengkareng, a remnant of Merpati waiting to be scrapped (photo: Dios K)

Indonesia is Asia’s most rapidly expanding air travel market, fuelled by the country’s burgeoning economy in the last five years. The industry growth rate is more than 20% per year, the fastest in the region, and probably in the world. Only six years ago, Indonesian airlines carried fewer than 30 million passengers a year in total. Today, the number has quadrupled to 100 million. Soaring demand for domestic air travel brought many airlines into the playground, but somehow Merpati failed to take advantage of the opportunity.

So why did Merpati not succeed? This is my analysis:

1. It failed to expand its fleet. It does not operate a new generation of fuel-efficient planes. Instead, Merpati continues to rely on a small number of older aircraft such as classic Boeing 737-300/400s and Twin Otters, most of which are more than 20 years old. Like ageing cars, they consume more fuel and are becoming more expensive to operate and maintain. Newer airlines like Lion have been on a shopping spree, placing the largest-ever order of 200+ airplanes, while Merpati has not been able to procure any new planes for decades (with the exception of several Chinese-built MA-60 turboprops, which are not in the same league as their American and European counterparts).

2. It failed to develop a workable business plan. Merpati never positioned itself correctly in the marketplace. Historically, Merpati served thin routes in isolated areas, flying turboprops into small airports that could not accommodate jetliners. It should have strengthened its position in that particular market, serving regional routes unserved by jet-operating airlines. Instead, Merpati chose to fly on already very competitive trunk routes such as Jakarta-Surabaya and Jakarta-Bali, and did so with legacy, less efficient aircraft. With multiple airlines serving the same routes, airline tickets are now like a commodity; customers will fly with whoever offers the best service and/or the lowest price. Flying is flying, no matter who is selling.

The airline business is notoriously tough. It has always been regarded as a high-risk, high-cost and low-margin business. Planes are extremely expensive to buy and to operate; a brand-new Airbus A320 costs up to $80 million (1 trillion Rupiah) apiece, and its maintenance costs can easily reach $1-2 million per year. Operating costs are constantly increasing; fuel prices never go down, and neither do pilot salaries. Competition is fierce and yields are low. Many new entrants have already paid the price; Adam Air, Jatayu, Mandala and Indonesia Air collapsed several years ago. Batavia Air followed suit last year.

Merpati has tried repeatedly in recent years to restructure its huge debts and to attract private investment, with little success. It does not take a genius to see that the cash-strapped Merpati, with its huge $600 million debt and shrinking market share, is not in a good position to attract investors. It has been in serious trouble for many years. Only today is it becoming clear that the problem is more serious than ever.

As for Merpati, I believe the end is near. There is no way anyone can save it now.