Installing OpenCV with Anaconda

May 2019 | Dios Kurniawan

OpenCV is an open-source computer vision library that can be used from Python. Installing OpenCV on macOS is a bit cumbersome because the website does not provide prebuilt binaries, but you can do it easily with the help of Anaconda. Here are the steps I use to install OpenCV with Anaconda:

  1. Open a terminal and create a new environment: conda create --name ComputerVision python=3.7
  2. Activate the environment: source activate ComputerVision
  3. Install OpenCV into the active environment: conda install -c menpo opencv
  4. Test your installation by running this in Python: import cv2 as cv

That’s all.

Pandas for Data Transformation

December 2018 | Dios Kurniawan

Market Basket Analysis (MBA, as many call it) is an analytical method widely used in the retail business to gather insights on which products are usually purchased together by consumers. This time around, I was given the problem of analyzing transaction data from a client in the Food & Beverages business, finding purchase patterns so that management can later decide which meals and drinks to bundle into a ‘paket hemat’ (an economy bundle).

To start with, I had to extract the transaction data from the Point of Sale (POS) system, which sits in a SQL database, into a CSV file. The data, as one might suspect, was in raw format and required preprocessing. The data set was small, fewer than 20,000 rows, so I decided to simply use Python and run the process on my laptop.

Below is an example of the original transaction data format (I changed the product names to obscure the real values for publication in this blog). The data has gone through some cleansing to eliminate nulls and inconsistent values.

I was looking for a quick way to transpose the transactional data above into a one-hot encoded format that spans to the right, with one column per product. Sure, I could do that in SQL, but that would require writing a long query and re-extracting the data from the POS. I did not want to do that. Pandas came to the rescue:

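import pandas as pd

# Load the POS extract; the raw string keeps the backslash in the Windows path literal
penjualan1 = pd.read_csv(r'D:\data1.csv', parse_dates=['TRX_TS'], index_col=['TRX_ID'])
# One row per transaction, one column per product; products not bought become 0
pivot1 = penjualan1.pivot_table(index='TRX_ID', columns='PRODUCT_NAME', values='PRICE').fillna(0)
# Turn any positive price into 1 to get the one-hot flags
pivot1[pivot1 > 0] = 1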

It results in exactly the format I need:

Voila! The data is transformed within a few seconds. With such a short program, just a few lines of code, my data is ready for further analysis. Pandas is just great.
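
To make the transformation easier to follow without the real POS extract, here is a small sketch on made-up data. The sample rows, the product names and the helper name to_basket are all hypothetical; only the column names (TRX_ID, PRODUCT_NAME, PRICE) come from my actual file. The last two lines are one possible next step, counting how often two products appear in the same transaction, and not necessarily the analysis I ended up running:

```python
import pandas as pd

# Hypothetical sample rows standing in for the real POS extract
sample = pd.DataFrame({
    "TRX_ID": [1, 1, 2, 3, 3, 3],
    "PRODUCT_NAME": ["Tea", "Rice", "Tea", "Rice", "Noodle", "Tea"],
    "PRICE": [5000, 12000, 5000, 12000, 15000, 5000],
})


def to_basket(df):
    """Pivot transaction rows into one row per transaction with 0/1 product flags."""
    pivot = df.pivot_table(index="TRX_ID", columns="PRODUCT_NAME", values="PRICE").fillna(0)
    # Any positive price means the product was part of the transaction
    return (pivot > 0).astype(int)


basket = to_basket(sample)
print(basket)

# Product-by-product matrix: entry (a, b) counts transactions containing both a and b
pairs = basket.T.dot(basket)
print(pairs)
```

In this toy data, Rice and Tea appear together in transactions 1 and 3, so the Rice/Tea entry of the second table is 2.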

Installing Scikit-learn in MacOS

August 2018 | Dios Kurniawan 

Scikit-learn is an interesting library for doing machine learning work in Python. It offers regression, classification and much more, and the good thing is it’s free. However, I found it a bit challenging to install scikit-learn on Mac OS X. The documentation on scikit-learn.org was not enough to get me going. I knew I could use the Anaconda distribution, but I was looking for a more ‘manual’ way.

After some trial and error, I successfully installed scikit-learn on my MacBook. For those who might have the same problem, I am sharing the installation steps in this blog post. Just follow the steps below. This assumes you don’t have Python installed on your Mac yet.

  1. Install Xcode
    To begin with, download the latest Xcode from the Apple App Store if you haven’t done so already. Xcode is a powerful IDE, but some will ask: we only want to run Python, so why do we need Xcode? The reason is its command line tools, which provide the compilers that MacPorts relies on in the subsequent steps.

    After Xcode is successfully installed on your Mac, you must install the command line tools. Open a Terminal and issue this command:

    xcode-select --install
    

    Follow the instructions until the installation finishes. Then continue with this command to accept the licence agreement:

    sudo xcodebuild -license
  2. Install MacPorts
    Apart from Xcode, you will need MacPorts to install development tool packages. You can download it from macports.org. Just follow the installation instructions. After the installation is finished, optionally execute the following command to check for updates:

    sudo port selfupdate
  3. Install Python using MacPorts
    Now comes the Python package itself. Using MacPorts, you can easily download and install Python. At the time of writing, the current version is 3.7, so use “python37” as the argument for the following command:

    sudo port install python37

    To make sure the default Python is set to this version, execute these commands:

    sudo port select --set python python37
    sudo port select --set python3 python37

    After that you may want to verify the installation. Open a new Terminal and execute:

    python -V
    

    Check if the default version is correct.

  4. Install Numpy, Scipy, Pandas and other libraries
    Because scikit-learn is built on other libraries like Scipy and Numpy, you will need to install these packages too:

    sudo port install py37-numpy 
    sudo port install py37-scipy
    sudo port install py37-matplotlib 
    sudo port install py37-pandas 
    sudo port install py37-statsmodels 
    sudo port install py37-pip
    sudo port select --set pip pip37
    
  5. Finally, install the scikit-learn package
    Use pip to download and install. Execute:

    sudo pip install -U scikit-learn

Check if it is correctly installed by running this in Python:

import sklearn
print(sklearn.__version__)

It should return the installed version like this:

0.19.2

That’s it! Now you can use scikit-learn in your Python program.
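
If you also want to confirm the supporting libraries from step 4 in one go, here is a minimal sketch. The helper name installed_versions and the package list are my own choices for illustration; the script simply tries to import each module and reports its version attribute where one exists:

```python
import importlib
import sys

# Import names of the libraries installed in the steps above
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "statsmodels", "sklearn"]


def installed_versions(names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Not every module exposes __version__, so fall back to a placeholder
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


print("Python:", "{}.{}".format(*sys.version_info[:2]))
for name, version in installed_versions(PACKAGES).items():
    print(name + ":", version if version is not None else "NOT INSTALLED")
```

Any line that says NOT INSTALLED points to a port or pip install from the steps above that needs to be repeated.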