Fine-tuning the cloud – Making machines learn faster.

In a previous post I discussed how easy it is to set up machine learning libraries in python using virtual machines (vm) hosted in the cloud. Today, I am going to add more detail: this post covers how to make machine learning code run faster by compiling a tensorflow installer optimized for the cloud. The speed advantage comes from building the installer to take advantage of the processor instruction set extensions available on the CPUs that power the cloud vm.

The steps described below will help us compile a tensorflow installer with accelerated floating point and integer operations, using instruction set extensions for the AMD64 architecture. The CPU we will be dealing with today is an Intel Xeon processor running at 2.3GHz. It is a standard server CPU that supports AVX, AVX2, SSE floating-point math and SSE4.2.

In the previous post, we used the simple pip installer for tensorflow. The pip installer is a quick and easy way to set up tensorflow, but this installation method is not optimized for the extended instruction sets present in the advanced CPUs powering the vm in Google Cloud Platform.
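For reference, the baseline installation from the previous post was a plain pip install, something along these lines:

```
pip3 install tensorflow
```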

If we compile an installer optimized for these instruction set extensions, we can speed up many of the computation tasks. When tensorflow is used as a back-end for keras, the console constantly reminds you to optimize your tensorflow installation. The instructions below will also help you get rid of those warning messages in your console.

The first step in compiling an optimized tensorflow installer is to fulfill all the linux package dependencies. Run the line below to install the missing packages needed to compile the tensorflow installer.
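A minimal sketch of that step, assuming a stock Ubuntu vm (the exact package list may differ from the original post):

```
sudo apt-get update
sudo apt-get install -y build-essential curl git python3-dev python3-pip
```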

To compile tensorflow, we need a build tool called Bazel. Bazel is an open-source tool developed by Google, used to automate software building and testing. Since tensorflow is at the leading edge of the machine learning world, features are added, bugs are fixed and progress is made at a dizzying pace compared with the relatively pedestrian pace of traditional software development. In this rapid development, testing and deployment environment, Bazel helps users manage the process more efficiently. Here is the set of code to install Bazel.
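A sketch of the apt-based Bazel installation as documented by the Bazel project around the time of writing; double-check the repository line and key URL against the current Bazel documentation:

```
sudo apt-get install -y openjdk-8-jdk
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install -y bazel
```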

Once Bazel is installed on your machine, the next step is to clone the tensorflow source files from GitHub.
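Cloning the official repository and moving into the source directory:

```
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
```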

After the source files are copied from the GitHub repository to your local machine, we have to do some housekeeping. We need to ensure the python environment that will run tensorflow has all the necessary libraries installed. To fulfill the library dependencies in python, we need to install numpy, the python development headers, pip and wheel using the code below:
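One way to pull in those four dependencies on Ubuntu:

```
sudo apt-get install -y python3-numpy python3-dev python3-pip python3-wheel
```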

The next step is to configure the build process for the installer. Before we do the actual configuration, we are going to preview the configuration dialog. This will help us understand which parameters we should know beforehand to complete the configuration process successfully. The configuration dialog for building the installer looks as follows:
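An abridged sketch of the ./configure prompts from tensorflow versions of that period; the exact wording and set of questions vary by version, so treat this as illustrative rather than exact:

```
Please specify the location of python. [Default is /usr/bin/python]:
Please input the desired Python library path to use. Default is [/usr/local/lib/python3.5/dist-packages]
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]
Do you wish to build TensorFlow with OpenCL support? [y/N]
Do you wish to build TensorFlow with CUDA support? [y/N]
```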

We need four important pieces of information before we go ahead and configure the build process: 1) the location of the python installation, 2) the python library path, 3) the g++ location and 4) the gcc location. The last two are optional and only needed to enable OpenCL. If your cloud vm supports OpenCL and CUDA, the instructions to compile tensorflow are slightly different; I will not cover them in this post. Identifying the python installation location and library path can be done using the following code. I have also included the steps for finding the locations of the gcc and g++ compilers in the code below:
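A minimal way to collect all four paths from the terminal:

```
which python3
python3 -c "import site; print(site.getsitepackages())"
which gcc
which g++
```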

We now have all the information needed to configure the build process. We can proceed using the following line:
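Run from inside the tensorflow source directory:

```
./configure
```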

If you encounter the following error:

Purge openjdk-9 and reinstall the jdk-8 version, using the instructions below:
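A sketch of the purge-and-reinstall step; the openjdk-9 package name is an assumption, so adjust it to whatever jdk 9 package is actually installed on your vm:

```
sudo apt-get purge openjdk-9-jdk
sudo apt-get install -y openjdk-8-jdk
sudo apt-get autoremove
```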

Now, try ./configure again. Once the build process is configured properly, we can go ahead and build the installer using the following commands:
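A sketch of the build, package and install sequence, with --copt flags matching the instruction sets mentioned above; the wheel filename in the last line is only illustrative, since the real name depends on the tensorflow and python versions:

```
bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-msse4.2 \
    //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip3 install /tmp/tensorflow_pkg/tensorflow-1.0.1-cp35-cp35m-linux_x86_64.whl
```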

Everything should proceed smoothly, but the build process is going to take some serious time. An octa-core 2.3GHz Intel Xeon powered virtual machine needs around 30 minutes to complete this process, so plan ahead. A short-notice deployment is impossible if one is looking to build the installer from scratch.

If the last step above threw a file not found error, it can be resolved by peeking into the build directory for the correct installer name.
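Assuming the package was written to /tmp/tensorflow_pkg as in the sketch above, listing that directory reveals the actual wheel name:

```
ls /tmp/tensorflow_pkg/
```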

Once you have the correct installer name, update the last line of code above with it and the installation should finish without any error messages. If we manage to finish all the steps above, we have successfully built and installed an optimized tensorflow. The installed tensorflow library is compiled to take advantage of the processor instruction set extensions.

Finally, to check whether tensorflow can be imported into python, use the next few lines of code. The steps are: first, exit the tensorflow source directory, then invoke python and import tensorflow.
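A minimal check; the lines prefixed with >>> are typed at the python prompt:

```
cd ~
python3
>>> import tensorflow as tf
>>> print(tf.__version__)
```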

The import tensorflow line in the python environment should proceed with no errors. Hurrah, we have a working tensorflow library that can be imported in python. We will need to do some tests to ensure that everything is in order, but no errors up to this point means smooth sailing ahead.

I have described a step-by-step process of building and installing tensorflow, based on what I found to be a logical progression of steps. These steps have worked extremely well for me so far. The optimized tensorflow installation has cut the run-time of some of my tasks by a factor of ten. As an added advantage, I no longer see warning messages in the python console asking me to optimize my tensorflow installation.

Happy machine learning in the cloud everyone.

This work is done as part of our startup project nanøveda. For continuing nanøveda’s wonderful work, we are running a crowdfunding campaign using gofundme’s awesome platform. Donation or not, please share our crowdfunding campaign and support our cause.

Donate here: gofundme page for nanøveda.

In the cloud – Geography agnostic enterprise using Google Cloud Platform.

The concept of a geography/location agnostic enterprise is very simple. Whether I am in Beijing or Boston, Kent or Kentucky, New Delhi or New York, Silicon Valley or Silicon Wadi, Qatar or Quebec, I should have access to a standard set of tools to run the company. Moving across geographic locations is a hard challenge for any enterprise. One of the first problems we wanted to solve is how quickly we can deploy some of our tools to countries around the world. Part of the reason I envision nanøveda as a geography/location agnostic enterprise is that the problem we are trying to solve is universal. We want people around the world to have uniform access to our technology. It will also help us become better at business.

I had been searching for an answer to this problem, and I found a brilliant solution a few days back. Recently, I got an invite for a trial of Google Cloud Platform (GCP). Google was kind enough to give me a $300 credit towards creating applications on their cloud platform. I was very excited to try this cloud computing platform from one of the leaders in computing, and last Friday I finally decided to explore GCP. For me, cloud computing brings two advantages: 1) zero time and cost overhead of maintaining a set of in-house linux servers; 2) creating a truly geography agnostic business. I run an Ubuntu 16.10 workstation for machine learning experiments, and most of the tasks handled by this workstation have already started to overwhelm its intended design purpose. Therefore, I was actively looking for ways to expand the total compute infrastructure available to me. It was right at this moment that I turned to Google to help solve our compute infrastructure problem.

I had never used GCP before, so I had to go through a phase of experimentation and learning, which took approximately a day. Once I learned how to create a virtual machine, everything started to fall into place. To check that the vm is seen properly by the guest os, I ran some basic diagnostic tests.
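The original diagnostics are not shown here; a few typical commands for checking what the guest os sees would be:

```
uname -a      # kernel and architecture
lscpu         # virtual CPU count and model
free -h       # available memory
df -h         # attached disk space
```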

GCP has an interesting flat-namespace object storage feature called Buckets. Buckets allow the virtual machine to share data with a remote computer very conveniently, even over a web interface. Google has a command-line tool called gsutil to help users streamline the management of their cloud environment. One of the first commands I learned was to transfer files from my local computer to the object storage space. Here is an example:
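A sketch of a local-to-bucket copy with gsutil; the file and bucket names are placeholders:

```
gsutil cp training-data.csv gs://my-example-bucket/
```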

Once I learned file transfer, the next step was to learn how to host a virtual machine. After I set up an Ubuntu 16.10 virtual machine in the cloud, I needed to configure it properly. Installing the necessary linux packages and python libraries was easy and fast.
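As a rough sketch of that configuration step (the exact library list depends on the project; these are the usual machine learning suspects):

```
sudo apt-get update
sudo apt-get install -y python3-pip
pip3 install numpy scipy pandas scikit-learn keras
```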

After the vm was configured to run the code I wrote, the next step was to test file transfer to the vm itself. Since the vm and the object storage are beautifully integrated, file transfer was super quick and convenient.
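Pulling the same file from the bucket into the vm is the mirror image of the earlier command (bucket name again a placeholder):

```
gsutil cp gs://my-example-bucket/training-data.csv .
```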

After the code and all the dependent files were inside the cloud vm, the next step was to test the code in python. The shell command below executed the code in the file:
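With a hypothetical script name standing in for the actual file:

```
python3 training_script.py
```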

Since some of the code execution takes hours to complete, I needed a way to create a persistent ssh connection. Google offers a web-browser based ssh client. The browser ssh client is a simple, basic way of accessing a virtual machine, but for longer sessions, creating a persistent session is ideal. Since my goal is to make most of the computing tasks as geography agnostic as possible, I found a great tool for linux called screen. Installing screen was very straightforward.
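On the Ubuntu vm, screen installs from the standard repositories:

```
sudo apt-get install -y screen
```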

Once screen was installed, I created a screen session by typing screen in the terminal. The screen session works like the regular terminal, but if you are using ssh, it keeps the commands running in the terminal even after the ssh connection is disconnected. To detach from screen (leaving the session running), just use the keyboard shortcut: ctrl+a followed by ctrl+d.

To resume a screen session, just type screen -r in vm terminal. If there are multiple screen sessions running, then one will have to specify the specific session that needs to be restored.
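Listing sessions and reattaching to a specific one looks like this (the session id is a placeholder):

```
screen -ls            # list running screen sessions
screen -r 12345       # reattach to a specific session by its id
```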

The ssh + screen option is a life saver for tasks that require routine management and need a lot of time to execute. It allows a vm administrator to turn any network connection into a highly persistent ssh session.

The combination of Google cloud object storage, easy networking with the vm, ssh and screen has allowed me to move some of the complex computing tasks to the cloud in less than a day. The simplicity and lack of cognitive friction of GCP has taken me by surprise. The platform is extremely powerful and sophisticated, and yet very easy to use. I have future updates planned on the progress and evolution of my usage of GCP for our start-up's computing needs.

I am still amazed by how easy it was for me to take one of the most important steps in creating a truly geography/location agnostic enterprise with the help of Google Cloud Platform. I have to thank the amazing engineering team at Google for this brilliant and intuitive cloud computing solution.

This work is done as part of our startup project nanøveda. For continuing nanøveda’s wonderful work, we are running a crowdfunding campaign using gofundme’s awesome platform. Donation or not, please share our crowdfunding campaign and support our cause.

Donate here: gofundme page for nanøveda.

Pi day – Calculate 2017 digits of pi using Python3

Here is a short and elegant piece of code to calculate 2017 digits of pi. It is implemented in python3, in just three lines of very simple code.
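The original three-liner is not reproduced here; one way to do it in three lines, assuming the mpmath library is installed (pip3 install mpmath), is:

```
from mpmath import mp
mp.dps = 2017          # set the working precision to 2017 significant digits
print(mp.pi)           # evaluate and print pi at that precision
```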

The output is:

Happy 2017 pi day.

The code is available on GitHub.

This work is done as part of our startup project nanøveda. For continuing nanøveda’s wonderful work, we are running a crowdfunding campaign using gofundme’s awesome platform. Donation or not, please share our crowdfunding campaign and support our cause.

Donate here: gofundme page for nanøveda.

Installation notes – OpenCV in Python 3 and Ubuntu 17.04.

These are my installation notes for OpenCV in Ubuntu 17.04 linux for python3. These notes will help you start a computer vision project from scratch. OpenCV is one of the most widely used libraries for image recognition tasks. The first step is to fire up the linux terminal and type in the following commands. This first set of commands will fulfill the OS dependencies required for installing OpenCV in Ubuntu 17.04.


Update the available packages by running:
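A typical refresh looks like this:

```
sudo apt-get update
sudo apt-get upgrade
```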


Install developer tools for compiling OpenCV 3.0:
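Something along these lines (package names as used in common OpenCV build guides):

```
sudo apt-get install -y build-essential cmake git pkg-config
```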


Install packages for handling image and video formats:
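A sketch of the image format libraries; libjasper and libpng are handled separately in the next step, since they were dropped from the Ubuntu 17.04 repositories:

```
sudo apt-get install -y libjpeg8-dev libtiff5-dev
```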

Add an installer repository from an earlier version of Ubuntu to install libjasper and libpng, and install these two packages:
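One commonly used workaround is to pull both packages from the xenial (16.04) security repository; treat the repository line as an assumption and verify it for your setup:

```
sudo add-apt-repository "deb http://security.ubuntu.com/ubuntu xenial-security main"
sudo apt-get update
sudo apt-get install -y libjasper-dev libpng12-dev
```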

Install libraries for handling video files:
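Typical video I/O dependencies used in OpenCV build guides:

```
sudo apt-get install -y libavcodec-dev libavformat-dev libswscale-dev libv4l-dev libxvidcore-dev libx264-dev
```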


Install GTK for OpenCV GUI and other package dependencies for OpenCV:
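GTK for the highgui module, plus the usual optimization dependencies:

```
sudo apt-get install -y libgtk-3-dev libatlas-base-dev gfortran
```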


The next step is to check whether our python3 installation is properly configured, by typing the following command in the terminal:

The output of the command above shows two location pointers for the python configuration file: the target directory and the current directory. These two locations must match before we proceed with the installation of OpenCV.

An example of a working python configuration location, needing no modification, will look something like this:

The output from the terminal has a specific order: the first pointer indicates the target location and the second pointer indicates the current location of the config file. If the two location pointers don't match, use the cp shell command to copy the configuration file to the target location. We use the sudo prefix to give the terminal administrative rights to perform the copy. If your linux environment is password protected, you will very likely need the sudo prefix; otherwise, one can simply skip the sudo prefix and execute the rest of the line in the terminal. I am using sudo here because my linux installation requires it.

Step 6 (Optional):

Set up a virtual environment for python:
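A sketch using virtualenv and virtualenvwrapper, which is what the later mkvirtualenv and workon commands assume:

```
sudo pip3 install virtualenv virtualenvwrapper
```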

The next step is to update the ~/.bashrc file. Open the file in a text editor; here we are using nano.
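Opening the file:

```
nano ~/.bashrc
```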

At the end of the file, paste the following text to set up the virtual environment parameters:
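A sketch of the virtualenvwrapper settings; the WORKON_HOME directory and the path to virtualenvwrapper.sh are assumptions, so adjust them to where pip actually placed the script:

```
# virtualenv and virtualenvwrapper settings
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh
```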

Now, either open a new terminal window or enforce the changes made to the bashrc file by:
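Reloading the file in the current shell:

```
source ~/.bashrc
```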

Create a virtual environment for python named OpenCV:
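With virtualenvwrapper in place, the environment is created in one line:

```
mkvirtualenv OpenCV -p python3
```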


Add the python developer tools and numpy to the python environment in which we want to run OpenCV. Run the code below:
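A sketch of that step, installing the dev headers system-wide and numpy inside the OpenCV environment:

```
sudo apt-get install -y python3-dev
workon OpenCV
pip install numpy
```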


Now we have to create a directory for the OpenCV source files and download them from GitHub. This can be achieved using the mkdir and cd commands along with the git command. Here is an example:
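The directory name below is just an example:

```
mkdir ~/opencv_build && cd ~/opencv_build
git clone https://github.com/opencv/opencv.git
```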

We also need the OpenCV contrib repo for access to standard keypoint detectors and local invariant descriptors (such as SIFT, SURF, etc.) and newer OpenCV 3.0 features like text detection in images.
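Cloned alongside the main repository:

```
git clone https://github.com/opencv/opencv_contrib.git
```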


The final step is to build, compile and install the packages from the files downloaded from GitHub. One thing to keep in mind is that we are now working in the newly created folder in the terminal, not in the home directory. An example of linux terminal code to set up the installation of OpenCV is given below. Again, I am using the sudo prefix to give the terminal elevated privileges while executing the commands. It may not be necessary on all systems, depending on the nature of the linux installation and how system privileges are configured.
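A sketch of the cmake configuration, assuming the directory layout from the clone step above:

```
cd ~/opencv_build/opencv
mkdir build && cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE \
      -D CMAKE_INSTALL_PREFIX=/usr/local \
      -D OPENCV_EXTRA_MODULES_PATH=~/opencv_build/opencv_contrib/modules \
      -D BUILD_EXAMPLES=ON ..
```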


The final step of compiling and installing from source is a very time-consuming process. I have tried to speed this process up by using all the available processors to compile. This is achieved by passing the output of nproc --all to the -j argument of the make command.

Here are the command line instructions to install OpenCV from the installer we just built:
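Compiling on all cores and then installing:

```
make -j "$(nproc --all)"
sudo make install
sudo ldconfig
```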

For OpenCV to work in python, we need to update the binding files. Go to the install directory and note the file name of the installed OpenCV binding; it is located in either dist-packages or site-packages.

The terminal command lines are the following:
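Assuming python 3.5 (the version shipped with Ubuntu 17.04), the binding usually lands in one of these two locations:

```
ls /usr/local/lib/python3.5/site-packages/
ls /usr/local/lib/python3.5/dist-packages/
```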

Now we need to update the bindings in the python environment we are going to use. We also need to name the symbolic link cv2 to ensure we can import OpenCV in python as cv2. The terminal commands are as follows:
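A sketch of creating the symlink inside the OpenCV virtual environment; the exact .so filename (here a typical cpython-35m name) should be taken from the listing in the previous step:

```
cd ~/.virtualenvs/OpenCV/lib/python3.5/site-packages/
ln -s /usr/local/lib/python3.5/site-packages/cv2.cpython-35m-x86_64-linux-gnu.so cv2.so
```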

Once OpenCV is installed, type cd ~ (or simply cd) to return to the home directory in the terminal. Then type python3 to launch the python interpreter. Once you have launched python3 in your terminal, try importing OpenCV to verify the installation.

Let us deactivate the current environment by:
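Deactivating, if you happen to be inside a virtualenv at this point:

```
deactivate
```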

First we need to ensure we are in the correct environment. In this case we should activate the virtual environment called OpenCV and then launch the python interpreter. Here are the terminal commands:
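Activating the OpenCV environment and starting the interpreter:

```
workon OpenCV
python3
```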

Let us try to import OpenCV and get the version number of the installed OpenCV version. Run these commands in the terminal:
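Inside the python3 interpreter:

```
>>> import cv2
>>> cv2.__version__
```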

If the import command in the python3 console returned no errors, python3 can successfully import OpenCV. We will still have to do some basic testing to verify that OpenCV has been installed correctly in your Ubuntu 17.04 linux environment, but if one manages to reach this point, it is a great start. Installing OpenCV is one of the first steps in preparing a linux environment to solve machine vision and machine learning problems.

A more concise version of the instructions is also available on GitHub. The Moad Ada dev-ops server for deep learning ships with a linux environment that comes pre-installed with OpenCV, which makes it easier for artificial intelligence and deep learning application developers to build machine vision applications. The dev-ops server can be ordered here: Moad online store.

Fair trade – How hard can it be?

This year's United States Presidential address to Congress featured an impassioned plea by Pres. Donald J. Trump to focus more on fair trade. His reasoning was clear: fair global trade will have a less disruptive effect on societies. His concerns were directed toward ordinary American tax payers and his interest in protecting their livelihood. But his call for incorporating fair trade into globalization has another beneficiary: the entire world itself. This is an unintended consequence for a US president who has admitted to eliminating any pretensions of acting as "the leader of the free world". American policies will be a powerful force in dictating the direction of the global economy for decades to come. It may not be a deliberate attempt to change the world, but a mere result of being the most powerful and richest country on earth.

To tackle fair trade, a key issue the president raised was creating a better system of taxation across nation states. The goal of his proposed revised taxation structure is to make trade more equitable between nation states. The example he used to illustrate his logic on fair trade and taxation focused on the Milwaukee, Wisconsin based Harley Davidson. Harley Davidson has been part of the American journey for the past 114 years. Yet, it has difficulty competing in some of the world's largest motorcycle markets, including India.

The North American motorcycle market accounts for only 2% of global motorcycle sales volume; it is tiny compared to the rest of the global motorcycle marketplace. The largest motorcycle manufacturer in the world is Honda, based in Minato, Tokyo, Japan. India is one of Honda's largest volume markets. In India, Honda has managed to establish a large local manufacturing facility to produce popular mass-market, low-displacement motorcycles. The sheer volume of monthly sales of Honda motorcycles overshadows the annual sales of Harley Davidson, not just in India, but around the world.

The reason Harley Davidson is overshadowed by motorcycle manufacturers from the rest of the world is partly its strategy of catering to an exclusive set of customers. The company positions itself as a lifestyle brand more than a motorcycle brand. Most of the sales volume in Asia is in commodity, commuter, low-displacement motorcycles, and Harley Davidson has no products to compete in this segment. In European markets, Harley Davidson again fails to cater to the sports bike segment. Harley Davidson's struggle in global markets is not due to taxation and duties on motorcycles alone.

If the interest in making global trade fairer is genuine, one has to consider another key component: the human factor. That even the world's most powerful democracy cries foul when it comes to global trade makes one wonder about the real consequences of global trade in this world.

Recently, the privately held food manufacturer Cargill came under the microscope for its questionable environmental practices in bringing food to the table for millions of Americans. Cargill and its local suppliers circumventing Brazilian laws meant to prevent deforestation is another great example of the desperate need to incorporate fair trade into globalization. The Brazilian suppliers of Cargill are empowered by the capital and resources of the North American market, which even local governments can't fight against.

The Brazilian soybeans could have been replaced by produce sourced from North American farmers, who adhere to more stringent environmental standards than Cargill's Brazilian counterparts. Instead, Cargill's decision to cut upfront costs for livestock feed explicitly demonstrates the flaws in global trade. A call for fair free trade also means placing restrictions on companies like Cargill. The current trade practices allow unhinged environmental damage in the process of bringing fast-food burgers to the American market. Therefore, a call for a fairer traded world also means better protection of the Brazilian rain-forests.

The global trading of coffee beans is another great example of the difficulty of implementing fair trade. Coffee is one of the largest-volume traded commodities in the world, yet only 30% of the coffee beans produced worldwide meet the current definition of fair trade. Even the millennial-favorite coffee chain Starbucks has managed to source only about 8% of its coffee beans through fair trade. The current mechanisms to create fair traded coffee beans are riddled with massive economic and social problems. An important issue that comes to my mind: despite the coffee chains marketing fair traded coffee at a premium price, only a fraction of the added price paid by the customer reaches the producer.

The discussion on fair global free trade is a great running start toward creating a more equitable world. Creating uniform taxation rules across nation states is the first logical step towards this goal, but the concept of fair trade extends way beyond mere taxation. I am excited that the US president has initiated a conversation on fair trade. It is an important topic that has more substance to it than just the matter of selling motorcycles made in Milwaukee to the milkmen in Mumbai.

Descriptions of the photographs featured in this post, from top to bottom: 1) Photograph of the White House, 1600 Pennsylvania Ave NW, Washington, DC 20500, United States, at dusk, via Wikipedia. 2) A vintage-style Harley Davidson advertisement poster featuring model Marisa Miller, obtained from the public domain via Pinterest and reproduced under fair usage rights. 3) Jaguar (Panthera onca) photographed at the Toronto zoo, obtained via Wikipedia. The jaguar is a near-threatened apex predator. Its natural habitat includes the Brazilian rain-forests, which are currently under threat of massive deforestation due to unfair trade practices followed by food suppliers like Cargill. 4) Commercial coffee farm in Jinotega, Nicaragua. Source: Lillo Caffe, obtained through a Dartmouth college blog post on subsistence agriculture in Nicaragua.