Big Data Ethics

July 2017 | Dios Kurniawan

So you run a business, and you have a lot of customers. Thousands, or even millions of them. You hold all the customer data and transaction history in your systems, most likely in your data warehouse. You realize that you have tons of information at your fingertips, and you start thinking about using that data to make money.

Purchase behavior, mobility patterns, and demographic information are the bread and butter of Big Data (photo: Dios K)

You install a Hadoop cluster and start pouring data into it to analyze your customers’ demographics and behavior. You examine what they purchase, analyze their spending habits, and even put the places your customers visit under the microscope.

Before you know it, you have crossed the line between protecting your customers’ sensitive personal data and exploiting it.

There is no denying that we live in the era of big data. You have the data in your hands, but the ethical question becomes: do you have the right to do anything you like with it?

The answer is yes, but only if you protect the personally identifiable information (PII). Every customer has the right to keep his or her private information private. In some countries, this is a legal requirement (Indonesia is a bit relaxed on this matter). The question is, do you really respect your customers’ rights?

Therefore, before you perform data mining, do not forget to anonymize your customers’ data. Replace real names and phone numbers with hash codes and delete the original values. Obscure sensitive information right inside the database tables.
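As a rough illustration of the hashing step, here is a minimal Python sketch. It is only one possible approach, not a complete anonymization scheme. It uses a keyed hash (HMAC-SHA256) rather than a plain hash, on the assumption that phone numbers are too easy to guess and re-hash otherwise. The field names (`name`, `phone`), the helper functions and the secret key are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key: in practice, keep it outside the analytics
# environment (for example in a secrets manager), not in source code.
SECRET_KEY = b"replace-with-a-long-random-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 hash code (64 hex characters) for a PII value."""
    # Normalize so that "Jane Doe" and " jane doe " map to the same code.
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_record(record: dict, pii_fields=("name", "phone")) -> dict:
    """Return a copy of the record with the PII fields replaced by hash codes."""
    cleaned = dict(record)  # leave the caller's record untouched
    for field in pii_fields:
        if cleaned.get(field) is not None:
            cleaned[field] = pseudonymize(str(cleaned[field]))
    return cleaned


# Made-up example record; the non-PII column is kept as-is for analysis.
customer = {"name": "Jane Doe", "phone": "+62-811-000-000", "total_spend": 125000}
anonymized = anonymize_record(customer)
print(anonymized)
```

The same customer always gets the same hash code, so purchases can still be grouped per customer without exposing who that customer is. Hashing names and phone numbers alone may not be enough, though: combinations of other columns, such as location and purchase times, can sometimes still identify a person.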

The Safest Seats in An Aircraft

April 2017 | Dios Kurniawan

Last week I was having a discussion with friends about which seats we should choose when checking in for our upcoming flight. Some prefer window seats, while others like aisle seats; nobody really wants the middle seats, which is quite understandable. That is mostly a matter of personal preference. But which part of the aircraft would you choose: the front section, the middle, or the back of the plane?

For me, my answer is always the same: I always choose to sit in the back.

Seats in an Airbus A380 (photo: Dios K)

Why? Statistically speaking, the last few rows at the rear of an aircraft, near the tail, give you the highest chance of survival in the event of a crash.

To support my argument, I compiled data from real-world airline accidents in the last 30-40 years. The list below might not be exhaustive, but it should give you a picture of why sitting near the tail is safer than sitting in other parts of an aircraft.

Here is the list:

  1. Japan Airlines Flight 123 crashed into a mountain near Tokyo in 1985. This is the most notable of these tragedies: more than 500 people died in the crash, with only 4 survivors. All survivors were seated in the last 3 rows at the rear of the plane.
  2. Mandala Airlines Flight 91 crashed in Medan in 2005, with more than 100 fatalities. The 17 passengers who survived the crash were seated at the rear of the plane.
  3. Air Florida Flight 90 crashed into the Potomac River in 1982. Four passengers survived, while 78 people lost their lives. Again, the survivors sat in the last few rows of the plane.
  4. Lion Air Flight 538 overran the runway at Adisumarmo Airport during landing in 2004, killing 25 passengers, all of whom were seated in the front section of the cabin. Passengers who sat in the aft section survived.
  5. Garuda Indonesia Flight 035 struck a power line near Polonia Airport in 1987. The tail section separated from the plane on impact with the ground, allowing 22 passengers and crew to escape. Sadly, 23 people who sat in the other sections of the cabin lost their lives.
  6. Garuda Indonesia Flight 200 skidded off the runway at Yogyakarta Airport in 2007 and caught fire. The accident resulted in 21 fatalities, most of whom were seated in the front and middle sections. The remaining passengers and crew, more than 100 people, managed to escape the fire.

The cases above all point to the same conclusion: sitting near the tail increases your chance of survival. Yes, there have been many other airline accidents whose survival rates cannot be directly related to seating position, but almost none favors sitting in the front of the plane.

Why is it safer in the back? There is no simple answer, but one thing is certain: the middle section of a commercial airliner is usually where the main fuel tank is located, so passengers there are practically sitting on top of highly flammable fuel at all times. The front section is the part that absorbs most of the impact in a crash because, well, most crashes happen nose-first.

No one wants to be involved in an accident, but when booking the seat for your next flight, my little piece of advice is to pick the last 2-3 rows at the rear of the plane. You might see me there, too.

Digital Dark Age is Real

February 2017 | Dios Kurniawan

A few days ago I realized that all my family video tapes, recorded in MiniDV format between 2002 and 2011, were simply unreadable. Not because they are defective, but because my MiniDV camcorder refuses to turn on, most likely because it is 15 years old. It is terrifying to suddenly lose years of memories just because I do not have the hardware to play them back. Sure, I can always buy a new camcorder, but MiniDV is an obsolete video format and not many manufacturers still produce the hardware today.

Luckily I had already transferred all the videos to DVD discs, but the original DV footage (which is of higher quality than what is stored in the DVD’s MPEG-2 format) remains on those tapes, so I am left with a pile of video tapes which are as good as useless.

My video album in MiniDV tapes

This made me realize that the threat of a “Digital Dark Age” is real. Digital Dark Age refers to a possible dystopian situation in which future generations cannot read the historical records that we store on digital media. This can be compared to the first “Dark Age” in the Middle Ages, after the fall of the Roman Empire, when much of the historical record of civilization was lost.

MiniDV is a relatively young digital format, introduced in the mid-1990s, but with the rise of flash storage technology, tape has slowly faded away from consumer electronics. Imagine that the collection of memorable moments, photos, videos and documents you have amassed on tapes over the last 20 years becomes unreadable because you did not migrate quickly enough to a new technology. The Digital Dark Age is looming over our lives.

Another example that the digital dark age is upon us: in 1997, I published a book. A few thousand copies were printed and they sold pretty well. Now, 20 years later, I still have the physical copy of my book, but I no longer have the digital copy because the computer I used to write the book in 1997 is long gone.

Many digital formats have become obsolete and have finally gone extinct. Remember the floppy disk? It used to be the most popular medium for storing computer files in the 1980s. Hardly anybody uses it anymore, but there must be tons of files still stored on floppy disks which have not been migrated to new storage media.

The same is true for CD-ROMs, DVDs, hard drives, USB flash disks, etc. Not one person in this world can guarantee that in 50 to 100 years, someone will still own a device that can read them and extract the information.

My PATA hard disk, CD-ROM and Floppy Disks (remember them?)

Even if the files on legacy digital media can be restored, there is still a good chance that we no longer possess the software for reading the files in their original format. Those who grew up in the 1980s and early 1990s have most likely used PCs to write documents with old word processing software that is no longer in common use. Remember WordStar and WordPerfect? Can we still open those files properly today?

The JPEG format may be the de facto standard for storing digital pictures today, but who can guarantee that the algorithm to decompress JPEG images will still be known to future generations 100 or 200 years from now?

Cloud storage is also a vulnerability. We are accustomed to storing our photos in Google Photos, Dropbox or Apple iCloud, and we think they will be safe there. Are we 100% sure that Google and Apple will still be in business 100 years from now?

Large organizations now rely on Big Data technology to store and process large amounts of data. They put files in the Hadoop Distributed File System (HDFS) using multiple different compression formats. How can we ensure that the data will still be readable in 20 years?

If we do not take control of how we store digital data, we risk the possibility that our grandchildren will never be able to read our records. History would be lost forever. I recommend that from now on, all of us make physical copies of our most important photos and documents to preserve them.