Installation notes – OpenCV in Python 3 and Ubuntu 17.04.

These are my installation notes for OpenCV with python3 on Ubuntu 17.04 linux. They will help you start a computer vision project from scratch. OpenCV is one of the most widely used libraries for image recognition tasks. The first step is to fire up the linux terminal and type in the following commands. This first set of commands fulfills the OS dependencies required for installing OpenCV on Ubuntu 17.04.

Step 1:

Update the available packages by running:
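    sudo apt-get update
    sudo apt-get upgrade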

Step 2:

Install developer tools for compiling OpenCV 3.0:
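    sudo apt-get install build-essential cmake git pkg-config

(The package names in this and the following steps are the typical ones for recent Ubuntu releases; the exact list may vary slightly.)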

Step 3:

Install packages for handling image and video formats:
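    sudo apt-get install libjpeg8-dev libtiff5-dev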

Add a package repository from an earlier Ubuntu release to install libjasper and libpng, and install these two packages:
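    sudo add-apt-repository "deb http://security.ubuntu.com/ubuntu xenial-security main"
    sudo apt-get update
    sudo apt-get install libjasper-dev libpng12-dev

(The xenial-security repository is one commonly used source for libjasper-dev and libpng12-dev, which were dropped from newer Ubuntu releases.)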

Install libraries for handling video files:
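    sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
    sudo apt-get install libxvidcore-dev libx264-dev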

Step 4:

Install GTK for OpenCV GUI and other package dependencies for OpenCV:
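    sudo apt-get install libgtk-3-dev libatlas-base-dev gfortran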

Step 5:

The next step is to check whether our python3 installation is properly configured, by typing the following command in the terminal:

The output of the command above shows two paths: the target directory and the current directory of the python configuration file. These two locations must match before we proceed with the installation of OpenCV.

An example of a working python configuration location, without any modification, will look something like this:

The output from the terminal has a specific order: the first pointer indicates the target location and the second pointer indicates the current location of the config file. If the two location pointers don’t match, use the cp shell command to copy the configuration file to the target location. We use the sudo prefix to give the command administrative rights to perform the copy. If your linux environment is password protected, you will very likely need the sudo prefix; otherwise, you can simply skip it and execute the rest of the line in the terminal. I am using sudo here because my linux installation requires it.
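For illustration only, with hypothetical placeholder paths (substitute the two locations printed by the check above):

    sudo cp /current/location/of/python3-config-file /target/location/of/python3-config-file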

Step 6 (Optional):

Set up a virtual environment for python:
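If pip for python3 is not already present, install it first, then install the virtual environment tools:

    sudo apt-get install python3-pip
    sudo pip3 install virtualenv virtualenvwrapper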

The next step is to update the ~/.bashrc file. Open the file in a text editor; here we are using nano:
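    nano ~/.bashrc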

At the end of the file, paste the following text to update the virtual environment parameters:
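The virtualenvwrapper.sh path below assumes a system-wide pip3 install; adjust it if pip placed the script elsewhere (for example under ~/.local/bin):

    # virtualenv and virtualenvwrapper
    export WORKON_HOME=$HOME/.virtualenvs
    export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
    source /usr/local/bin/virtualenvwrapper.sh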

Now, either open a new terminal window or apply the changes made to the ~/.bashrc file by running:
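    source ~/.bashrc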

Create a virtual environment for python named OpenCV:
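    mkvirtualenv OpenCV -p python3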

Step 7:

Add the python developer tools and numpy to the python environment in which we want to run OpenCV. Run the following commands:
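    sudo apt-get install python3-dev
    workon OpenCV        # activate the environment from step 6, if you created one
    pip install numpy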

Step 8:

Now we have to create a local directory and download the OpenCV source files from GitHub into it. This can be achieved using the mkdir and cd commands along with the git command. Here is an example:
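The folder name opencv_build here is just an example; any working directory will do:

    mkdir ~/opencv_build
    cd ~/opencv_build
    git clone https://github.com/opencv/opencv.git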

We also need the OpenCV contrib repository for access to standard keypoint detectors and local invariant descriptors (such as SIFT, SURF, etc.) and newer OpenCV 3.0 features like text detection in images:
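    git clone https://github.com/opencv/opencv_contrib.git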

Step 9:

The next step is to configure the build for compiling and installing the packages from the files downloaded from GitHub. One thing to keep in mind here: we are now working in the newly created folder in the terminal, not in the home directory. An example of the terminal commands to configure the build is given below. The sudo prefix can be added to these commands to give the terminal elevated privileges while executing them; it may not be necessary on all systems, depending on how the linux installation and system privileges are configured.
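A typical cmake configuration looks like this; the exact flags can be tuned to taste, and the contrib path must point to wherever the opencv_contrib repository was cloned:

    cd ~/opencv_build/opencv
    mkdir build
    cd build
    cmake -D CMAKE_BUILD_TYPE=RELEASE \
          -D CMAKE_INSTALL_PREFIX=/usr/local \
          -D INSTALL_PYTHON_EXAMPLES=ON \
          -D OPENCV_EXTRA_MODULES_PATH=~/opencv_build/opencv_contrib/modules \
          -D BUILD_EXAMPLES=ON ..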

Step 10:

The final step of compiling and installing from source is a very time-consuming process. I have tried to speed this process up by using all the available processors to compile. This is achieved by passing -j$(nproc --all) to the make command:
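    make -j"$(nproc --all)"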

Here are the command line instructions to install OpenCV from the build we just compiled:
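    sudo make install
    sudo ldconfig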

For OpenCV to work in python, we need to update the binding files. Go to the install directory and get the file name of the OpenCV library that was installed. It is located in either dist-packages or site-packages.

The terminal command lines are the following:
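For a python3.5 build the listing typically looks like this; substitute your python version, and check site-packages if dist-packages is empty:

    cd /usr/local/lib/python3.5/dist-packages/
    ls -l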

Now we need to link the bindings into the python environment we are going to use. We also need to name the symbolic link cv2.so to ensure we can import OpenCV in python as cv2. The terminal commands are as follows:
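The .so filename below is illustrative; use the exact name printed by the previous listing, and point the link at wherever your virtual environments live:

    cd ~/.virtualenvs/OpenCV/lib/python3.5/site-packages/
    ln -s /usr/local/lib/python3.5/dist-packages/cv2.cpython-35m-x86_64-linux-gnu.so cv2.so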

Once OpenCV is installed, type cd ~ to return to the home directory in the terminal. Then type python3 to launch the python interpreter. Once you have launched python3 in your terminal, try importing OpenCV to verify the installation.

Let us deactivate the current environment by running:
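    deactivate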

First, we need to ensure we are in the correct environment. In this case, we should activate the virtual environment called OpenCV and then launch the python interpreter. Here are the terminal commands:
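    workon OpenCV
    python3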

Let us try to import OpenCV and get the version number of the installed build. Run these commands in the python interpreter:
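    >>> import cv2
    >>> cv2.__version__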

If the import command in the python3 console returned no errors, python3 can successfully import OpenCV. Some basic testing is still needed to verify that OpenCV works correctly in your Ubuntu 17.04 linux environment, but reaching this point is a great start. Installing OpenCV is one of the first steps in preparing a linux environment to solve machine vision and machine learning problems.

A more concise version of these instructions is also available on GitHub. On the Moad Ada dev-ops server for deep learning, the linux environment comes pre-installed with OpenCV, which makes it easier for artificial intelligence and deep learning developers to build machine vision applications. The dev-ops server can be ordered here: Moad online store.

Fair trade – How hard can it be?

This year’s United States Presidential address to Congress featured an impassioned plea by Pres. Donald J. Trump to focus more on fair trade. His reasoning was clear: fairer global trade will have a less disruptive effect on societies. His concerns were directed toward ordinary American taxpayers and his interest in protecting their livelihood. But his call for incorporating fair trade into globalization has another beneficiary: the world itself. This is an unintended consequence for a US president who has admitted to shedding any pretension of acting as “the leader of the free world”. American policies will be a powerful force in dictating the direction of the global economy for decades to come. It may not be a deliberate attempt to change the world, but merely a result of being the most powerful and richest country on earth.

A key issue the president raised to tackle fair trade was creating a better system of taxation across nation states. The goal of his proposed revised taxation structure is to make trade more equitable between nation states. The example he used to reiterate his logic on fair trade and taxation focused on Milwaukee, Wisconsin based Harley Davidson. Harley Davidson has been part of the American journey for the past 114 years. Yet it has difficulty competing in some of the world’s largest motorcycle markets, including India.

The North American motorcycle market accounts for only 2% of global motorcycle sales volume; it is tiny compared to the rest of the global marketplace. The largest motorcycle manufacturer in the world is Honda, based in Minato, Tokyo, Japan, and India is one of its largest volume markets. In India, Honda has established a large local manufacturing facility to produce popular, mass-market, low-displacement motorcycles. The sheer volume of Honda’s monthly motorcycle sales overshadows Harley Davidson’s annual sales, not just in India but around the world.

Harley Davidson is overshadowed by motorcycle manufacturers from the rest of the world partly because of its strategy of catering to an exclusive set of customers. It positions itself as a lifestyle brand more than a motorcycle brand. Most of the sales volume in Asia is for commodity, commuter, low-displacement motorcycles, and Harley Davidson has no products to compete in this segment. In European markets, Harley Davidson again fails to cater to the sports bike segment. Harley Davidson’s struggle in global markets is not just due to taxation and duties on motorcycles.

If the interest in making global trade fairer is genuine, one has to consider another key component: the human factor. That even the world’s most powerful democracy cries foul over global trade makes one wonder about the real consequences of global trade for the rest of the world.

Recently, the privately held food manufacturer Cargill came under the microscope for its questionable environmental practices in bringing food to the table for millions of Americans. Cargill and its local suppliers circumventing Brazilian laws meant to prevent deforestation is another great example of the desperate need to incorporate fair trade into globalization. Cargill’s Brazilian suppliers are empowered by the capital and resources of the North American market, which even local governments can’t fight against.

The Brazilian soybeans could have been replaced by produce sourced from North American farmers, who adhere to more stringent environmental standards than Cargill’s Brazilian counterparts. Instead, Cargill’s decision to cut upfront costs for livestock feed plainly demonstrates the flaws in global trade. A call for fair free trade also means placing restrictions on companies like Cargill. Current trade practices allow unchecked environmental damage in the process of bringing fast-food burgers to the American market. A call for a fairer-traded world therefore also means better protection of the Brazilian rain-forests.

The global coffee bean trade is another great example of the difficulty of implementing fair trade. Coffee is one of the largest traded commodities by volume in the world, yet only 30% of the coffee beans produced worldwide meet the current definition of fair trade. Even Starbucks, the millennials’ favorite coffee chain, has managed to source only about 8% of its coffee beans through fair trade. The current mechanisms for creating fair-traded coffee beans are riddled with massive economic and social problems. An important issue that comes to my mind: despite the coffee chains marketing fair-traded coffee at a premium price, only a fraction of the added price paid by the customer reaches the producer.

The discussion on fair global free trade is a great running start toward creating a more equitable world. Creating uniform taxation rules across nation states is the first logical step toward this goal. But the concept of fair trade extends way beyond mere taxation. I am excited that the US president has initiated a conversation on fair trade. It is an important topic with more substance to it than just the mettle of selling motorcycles made in Milwaukee to the milkmen in Mumbai.

Descriptions for photographs featured in this post, from top to bottom: 1) Photograph of the White House, 1600 Pennsylvania Ave NW, Washington, DC 20500, United States, at dusk, via Wikipedia. 2) A vintage-style Harley Davidson advertisement poster featuring model Marisa Miller, obtained from the public domain via Pinterest and reproduced under fair usage rights. 3) Jaguar (Panthera onca) photographed at the Toronto zoo, obtained via Wikipedia. The jaguar is a near-threatened apex predator. Its natural habitat includes the Brazilian rain-forests, which are currently under threat of massive deforestation due to unfair trade practices followed by food suppliers like Cargill. 4) Commercial coffee farm in Jinotega, Nicaragua. Source: Lillo Caffe, obtained through a Dartmouth college blog-post on subsistence agriculture in Nicaragua.

Formula One 2017 – Exciting changes for team Mercedes.

Today is the first day of pre-season testing for the 2017 formula one season. I am very excited about the 2017 season. The new regulations are expected to make the cars go faster. One of the criticisms of hybrid-era formula one racing has been the lack of excitement. Faster cars are definitely going to make this sport very exciting indeed.

Along with the car changes there is also a change in the driver lineup. Last year’s world champion, Nico Rosberg, retired from the sport. Filling his spot is Finnish driver Valtteri Bottas, who moved from Williams-Martini Racing, a Mercedes-powered team, to the factory team. Lewis Hamilton is the veteran driver for the team. I am expecting Hamilton to be the faster of the two, and unlike last year, I also hope he has better luck with reliability. Mercedes AMG Petronas have made some significant changes to the 2017 car.

The video below explains all the regulation changes in place for the 2017 season.

Here are my thoughts on some of the interesting details I found out about the Mercedes AMG Petronas W08 2017 car.

One of the important changes to the 2017-season W08 EQ+ powered car is the delta-shaped front wing with a 12.5-degree back-sweep angle. It features cascade winglets similar to those from the 2016 season. The delta front wing is a regulation change for the 2017 season.

The W08 has a slender nose, shorter camera pods and a new S-duct design with deeper grooves and an embedded antenna in the middle. The four-element turning vanes attached to the new nose design feature their own individual footplates, which are again divided into a total of seven individual aerofoils. There is also a boomerang-shaped fin sitting above the barge-boards.

The car also features a taller front suspension, achieved by placing the upper wishbone higher. A taller wishbone frees up space between suspension elements and creates cleaner, increased airflow into the aerodynamic components behind it.

The primary barge-board occupies almost all of the box area set out by the FIA regulations. It also features a number of perforations along the base to help optimize the airflow and seal off the splitter behind. The perforated bodywork is designed to create elongated vortices and optimize the surrounding free flow of air. There is also a more detailed out-swept floor-board design to optimize airflow underneath the car. Another interesting addition to the floor board is nine perforations ahead of the rear tires to displace turbulent airflow, a relatively simpler design compared to previous years. These floor-board perforations will allow cleaner airflow over the rear tires by preventing the formation of vortices.

The overall length of the car has been increased by 15 cm. It also features a very large, complexly sculpted vane-tower design on either side of the barge-boards. The side pods are extended for better engine air intake. Since the 2017 design increases the airflow to the engine, I am expecting an increase in power output from the otherwise largely unchanged internal combustion design. The side-pods also feature highly detailed three-element flow conditioners for maximizing the deflection of the wake from the side-pod undercuts.

 

The rear of the car features an FIA-mandated slanted, bow-shaped rear wing, which is shallower at the tips than at the center. There is also an open leading-edge, slotted end-plate design for the rear wing. A narrow ‘T-shaped’ mini-wing is also placed ahead of the rear tires. The new wing design and the mini-wing are aimed at making the car more aerodynamically balanced. I am expecting slight changes to the wing design depending on the nature of the race circuit.

Even though the initial reveal at Silverstone showed a car with a subtle ‘shark-fin’ element over the rear engine cover, the race car that debuted at Barcelona has a more prominent ‘shark-fin’ aero-structure. This engine cover element also features a clever intake point, possibly to assist in cooling some of the kinetic energy recovery (MGU-K) components.

Another area of important change is the hybrid power train. Here is a video from the Mercedes team explaining all the significant engine changes for the 2017 season.

According to Andy Cowell, the chief of Mercedes High Performance Powertrains, the new engine features updated high-power switches for better efficiency. The new engine design is aimed at taking advantage of the added down-force and grip from the new aerodynamic package. Since the full-throttle time is projected to increase for the 2017 races, both the engine and the motor generator units (MGUs) for the energy recovery systems needed the upgrade. The drive-cycle change expected for the 2017 races has led to the development of a more efficient MGU-H and an updated MGU-K system. I am expecting increased reliability from the power train. The removal of the token system for engine development means there is room for significant improvement in performance as the season evolves.

An interesting addition to the new wing mirrors is the integration of infrared cameras for tire temperature monitoring. The Mercedes team has been partnering with mobile communications giant Qualcomm, and this partnership has produced some exciting and significant improvements in telemetry and data acquisition from the new car.

The tires have also changed for the 2017 season. The new Pirellis on the W08 are wider than last year’s, with 305 mm front and 405 mm rear tires. These tires will improve overall grip and allow better transfer of power to the road, which means increased cornering speeds.

With the wider tires, improved aerodynamics for better stability and down-force, and an updated power-train, I am expecting to see a 3 to 3.5 second improvement in lap-times with the new car over the 2016 season.

Here is an awesome video from the Mercedes AMG Petronas team featuring their drivers: Lewis Hamilton and Valtteri Bottas.

From the looks of it, team Mercedes has another winning package. Hamilton and Bottas are two very talented drivers currently in formula one. The 2017 race car has significant improvements over last year’s in the areas that matter. The engineers at Mercedes have managed to retain all the key elements of the W07 that worked very well for the team last year. The new W08 features extensive updates but remains an evolutionary design over last year’s car. Considering the dominance of the W07 over its rivals last season, an evolutionary design should work very well for the Mercedes team.

If the early track testing is any indication, the 2017 formula one season is going to be hugely exciting. My bet is on a close Mercedes-Ferrari battle, with Mercedes having a slight upper hand.

All photographs featured in this blog post were taken by Steve Etherington at the 2017 Silver Arrows ‘Collateral Day’ session at the Silverstone race circuit, Towcester NN12 8TN, UK. The first picture at the beginning of this post features, from left to right, Lewis Hamilton, Toto Wolff and Valtteri Bottas behind the Mercedes AMG Petronas Silver Arrows W08 formula one race car.

New kid in the block – Homomorphic encryption.

Healthcare data poses an important challenge from a cryptography standpoint: it has to be both private and useful. At first glance, these two requirements appear completely contradictory. Data encrypted using traditional techniques loses its usability; in a traditional encryption scheme, unless the end user holding the encrypted data has the decryption key, the data is completely useless.

But what if there were a new way of encrypting data: a technique where the end user can perform relevant computations on the encrypted data without ever decrypting it? As it turns out, there is a mechanism for accomplishing this. It is called a homomorphic encryption scheme.

This scheme was first proposed by Ronald L. Rivest, Leonard Adleman and Michael Dertouzos.  The general expression for a fully homomorphic encryption scheme is:
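One common way of writing the defining property, using Enc, Dec and Eval to denote encryption, decryption and evaluation of an arbitrary function f on ciphertexts (this notation is an assumption for the sketch below), is:

    \mathrm{Dec}\big(\mathrm{Eval}(f, \mathrm{Enc}(m_1), \ldots, \mathrm{Enc}(m_t))\big) = f(m_1, \ldots, m_t)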

There are currently a few cryptographic libraries that can be classified as fully homomorphic encryption schemes.

The key advantage of a fully homomorphic encryption scheme is the ability to perform mathematical calculations on the ciphertext. For healthcare data to be useful, one needs to perform these calculations on the data. Using a fully homomorphic encryption scheme, these computations can be performed without ever decrypting the data.
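As a toy illustration in python (not a fully homomorphic scheme, just the idea of computing on ciphertext): textbook RSA is multiplicatively homomorphic, so multiplying two ciphertexts gives the encryption of the product of the two plaintexts. Fully homomorphic schemes extend this idea to arbitrary computations.

    # Tiny textbook RSA key (p = 61, q = 53), for illustration only
    n, e = 3233, 17          # public key
    d = 2753                 # private key

    def enc(m):
        return pow(m, e, n)

    def dec(c):
        return pow(c, d, n)

    m1, m2 = 6, 7
    c_product = (enc(m1) * enc(m2)) % n   # computed on ciphertexts only
    assert dec(c_product) == m1 * m2      # decrypts to 42 without exposing m1 or m2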

Homomorphic encryption is the next big step in big data and artificial intelligence. As more and more healthcare organizations look to reduce the cost of their IT infrastructure by adopting cloud computing, a fully homomorphic encryption scheme will not only protect the data, but also allow useful insights to be drawn from these massive data-sets without ever compromising privacy.

Some of this work is done as part of our startup project nanoveda. For continuing nanoveda’s wonderful work, we are running a crowdfunding campaign using gofundme’s awesome platform. Donation or not, please share our crowdfunding campaign and support our cause.

Donate here: gofundme page for nanoveda.

(The picture of rows of protein data creating a colorful sequence display at the Genomic Research Center in Gaithersburg, Maryland. This image was created by © Hank Morgan – Rainbow/Science Faction/Corbis and obtained from public domain via nationalgeographic.com)

Research and development – When is a good time to invest?

Businesses have limited resources, and managing them efficiently is an art. A key controversial area of spending is always research and development (R&D). As a start-up, we are even more constrained than a regular, well-established business. Therefore, the question I often encounter is: is it a good idea to invest in research and development?

I have given some thought to that question. My answer is a resounding yes. I am pitching this idea on top of Amar G. Bose’s vision of the role of research and development in business. A trendy approach for most corporations is to keep research and development efforts to a bare minimum, mostly in the name of shareholder or investor value. Most businesses don’t view themselves as flag bearers of innovation; they orient themselves to protect the status quo of their commercial enterprise. When the going gets tough, these enterprises cut spending on R&D to shift the blame from having had a poor product portfolio in the first place.

This is a counter-intuitive, yet widely adopted, practice in the world of business. Amar Bose had a very different take. According to Bose, when the economy is going through a recession, or when a company is struggling to find a better place in the market, is the best time to invest in research and development. His reasoning was that cutting money from R&D starves the company of the oxygen it needs to come up with the newer products and innovations it needs in the first place. By the time the recession is over, or when customers realize there is a gap between their expectations and what the product delivers, the company will no longer be in a position to meet the increased expectations of the customers or the business environment. New competitors will fill the gap.

My suggestion is: always invest in R&D and be bullish about those investments. Even if the business is just a mom and pop store in a highly popular tourist neighborhood, R&D will work. Especially in an era when social media and data science have become the lifeblood of businesses, all types of businesses, whether small or large, need to invest in R&D. By R&D I don’t mean running a lab with a bunch of scientists in white coats. Research and development includes how to improve supply chain efficiency, how to improve communication and PR, how to improve cash inflow, how to develop better targeted marketing, and so on.

Science and business go hand in hand. Science takes an empirical view of the world; businesses need an empirical view of financial performance. When the two merge, it is a recipe for growing into a great business. An approach of R&D-heavy investment will help businesses spot emerging blind-spots in the marketplace and address them as quickly as possible.

A great example is Exxon-Mobil. Despite being heavily invested in fossil fuels, the company put billions of dollars into research on climate science. When the results started coming out, they were completely unexpected for the executives at first, but they still provided a valuable tool for foreseeing the evolving energy market. How Exxon-Mobil dealt with the unexpected results is highly controversial, but I admire the ability of an organization to fund scientific research that had far-reaching consequences for its traditional business model.

My view of R&D is that it is the stethoscope of the marketplace. It allows us to listen for small shifts in rhythm well before those shifts turn into a disastrous event. This listening tool will help enterprises avoid being blind-sided by large-scale disruptive changes in the marketplace.

This work is done as part of our startup project nanoveda. For continuing nanoveda’s wonderful work, we are running a crowdfunding campaign using gofundme’s awesome platform. Donation or not, please share our crowdfunding campaign and support our cause.

Donate here: gofundme page for nanoveda.

(The picture of the International Space Station (ISS) taken on 19 Feb. 2010, backdropped by Earth’s horizon and the blackness of space. This image was photographed by a Space Transportation System (STS) -130 crew member on space shuttle Endeavour after the station and shuttle began their post-undocking relative separation. Undocking of the two spacecraft occurred at 7:54 p.m. (EST) on Feb. 19, 2010. The picture was obtained from public domain via nasa.gov)

Understanding brain imaging data – 65,000 shades of gray.

I wanted to explain how the structural and functional MRI image format NIFTI works. NIFTI stands for Neuroimaging Informatics Technology Initiative. It is a data storage format that supports signed and unsigned types from 8 bits to 128 bits, with 16-bit signed integers being the most common implementation. NIFTI images usually have the file extension *.nii or come as (*.hdr and *.img) pairs. This image format is important for two reasons: 1) it stores the spatio-temporal imaging details, and 2) it supports compression for better space management.
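As a quick illustration, the nibabel library in python reads both *.nii files and *.hdr/*.img pairs; the filename below is a hypothetical example.

    import nibabel as nib

    img = nib.load('subject01_T1.nii')        # also accepts .nii.gz and .hdr/.img pairs
    volume = img.get_data()                   # the voxel data as a 3D (or 4D) numpy array
    print(img.header.get_data_dtype())        # e.g. int16, the common 16-bit signed storage
    print(volume.shape)                       # spatial (and temporal) dimensions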

Anyone who has undergone a brain scan knows that the picture of the brain from an MRI or CT scanner is usually a grayscale image with 65,536 shades of gray. The raw files from the scanner are usually in the DICOM (Digital Imaging and Communications in Medicine) format, with the extension *.dcm. The DICOM format is similar to the RAW image format for cameras: instead of the pixel readout stored in RAW images, DICOM images store scanner readouts.

Each scan of a subject usually contains several DICOM files. This is both an advantage and a disadvantage. For sharing specific image slices, DICOM is extremely useful. But for most interpretation purposes, the analysis requires full image sets, and a few slices from the scanner become less useful. This is where the NIFTI format comes to the rescue.

Since the format stores the entire sequence in a single file, the issue of managing a large number of files is eliminated. Because of the ordered arrangement of images, interpreting a specific image based on the images preceding and succeeding it also becomes easier.

There is another important advantage of NIFTI. From an analytical point of view, brain imaging data is most useful when treated as a 3D data structure. Even though the individual components of a NIFTI file are 2D images, interpretation becomes more reproducible if we treat them as a 3D volume. For this purpose, the NIFTI format is the best format to work with.

An example is the use of a machine learning tool called a 3D convolutional neural network (CNN). 3D CNNs provide the 3D spatial context of a voxel. For image sequences like brain scans, identifying various structures or abnormalities requires the 3D spatial context of a voxel. The 3D CNN approach is very similar to looking at a video and trying to identify what the scene is about; instead of video scene recognition, a 3D CNN can be trained to detect specific features in a brain scan.
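A minimal sketch of such a network in Keras, assuming single-channel volumes of 64 x 64 x 64 voxels and a binary label (for example, abnormality present or absent):

    from keras.models import Sequential
    from keras.layers import Conv3D, MaxPooling3D, Flatten, Dense

    model = Sequential([
        Conv3D(16, (3, 3, 3), activation='relu', input_shape=(64, 64, 64, 1)),  # local 3D features
        MaxPooling3D((2, 2, 2)),                                                # downsample the volume
        Conv3D(32, (3, 3, 3), activation='relu'),
        MaxPooling3D((2, 2, 2)),
        Flatten(),
        Dense(64, activation='relu'),
        Dense(1, activation='sigmoid'),                                         # binary output
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])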

This work is done as part of our startup project nanoveda. For continuing nanoveda’s wonderful work, we are running a crowdfunding campaign using gofundme’s awesome platform. Donation or not, please share our crowdfunding campaign and support our cause.

Donate here: gofundme page for nanoveda.

Virtualization – Matryoshka dolls of computing.

A few weeks back, I talked about various open operating systems for efficiently running some of the deep learning and simulation models. I switched back and forth between six different flavors of linux before finally settling on one. This experimentation phase is helpful in the long run.

But for folks who want to run one particular toolkit from the convenience of their preferred operating system, there is an alternative: virtualization, and one piece of software in particular: Docker.

Virtualization is the computing equivalent of Matryoshka dolls. A host computer can have multiple operating systems running inside it, or one can nest a virtual machine within a virtual machine within a physical machine. This layering approach to operating systems has made software applications somewhat platform agnostic.

I love Docker on Windows 10. The caveat is that the OS has to be 64-bit and the processor should support the extensions that allow hardware-level virtualization, often referred to as VT-x on Intel x86 processors. Docker prefers Microsoft Hyper-V to run its linux virtual machines inside Windows. On systems that don’t meet these requirements, it is possible to force Docker to use VirtualBox’s implementation of virtual machines inside Windows.
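For example, with the older Docker Toolbox / docker-machine route, a VirtualBox-backed machine can be created explicitly (illustrative commands, assuming docker-machine and VirtualBox are installed):

    docker-machine create --driver virtualbox default
    eval $(docker-machine env default)
    docker run hello-world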

The backbone of modern internet applications, including cloud computing, is virtualization. With the advent of hardware extensions supporting virtual machines, the performance difference between a physical machine and a well-configured virtual machine on the right host is practically non-existent.

For newcomers to linux who are still more comfortable dealing with off-the-shelf consumer hardware, virtualization is an easy entry point to start using some of the awesome tools for deep learning and computational simulations. And once you learn the tricks of the trade, there is always the option to move all your applications to a physical machine or run them inside a cloud provider.

This work is done as part of our startup project nanoveda. For continuing nanoveda’s wonderful work, we are running a crowdfunding campaign using gofundme’s awesome platform. Donation or not, please share our crowdfunding campaign and support our cause.

Donate here: gofundme page for nanoveda.

(The image of Docker whale logo is from Docker blog)

Democracy and science – What South Sudan teaches us.

I had an incredible opportunity to participate in a Doctors Without Borders (MSF) initiative to fill in missing geographic information on satellite images. This is an incredibly important process for figuring out a huge number of operational details for aid agencies and non-profit organizations like MSF.

These include resource allocation, rapid disaster relief, quick response to public health crises like epidemic outbreaks, administration of vaccines to children and many other important life-saving efforts. Our mission for the day was to help fill in housing details in a region called Aweil in South Sudan.

I consider MSF one of the most important organizations in the world. In 1999, when MSF won the Nobel Peace Prize, I was a teenager hoping to find my next mission in life, and reading about some of the organization's incredible life-saving efforts was one of the strong motivators for me to become a doctor.

But yesterday’s mission of helping fill in the missing mapping information in South Sudan made me realize another important fact: the importance of investing in transparent democracy, science and technology, even in resource-poor settings.

According to 2013 World Bank data, South Sudan has a per-capita GDP of $1,044.99. The country has a steady source of revenue from exporting oil, which accounts for nearly 40% of its GDP. This gives South Sudan another important distinction: it is the most oil-dependent economy in the world.

At a time when most nations around the world are pledging to invest more in renewable energy and the global economy is shifting away from oil, how can a young upstart like South Sudan cope with these changes? The answer: early investment in science and technology.

Despite robust oil revenues, systemic inefficiencies in the South Sudanese economy mean that most of this money never benefits the citizens of this East African nation. Even today, because of these inefficiencies, oil revenue is a negligible contributor to economic development in South Sudan.

The financial infrastructure in South Sudan is virtually non-existent, and the military acts as a bank to distribute currency to the public. This creates a huge conflict of interest. Due to the lack of transparency in how the country’s revenues are handled, most of the cash distribution system fails to address the poverty and social issues that riddle South Sudanese society.

At a time when democracies around the world are racing to reinvent themselves as opaque, protectionist and self-serving institutions, this little East African nation serves as a warning beacon against such policies.

Despite all the challenges, I see hope for a small country like South Sudan. With a little external help and guidance, its democratic and financial institutions can be made more efficient. Much-needed access to healthcare is currently provided by brilliant organizations like MSF, but developing local skills and training will be extremely important for South Sudanese society to flourish and be healthy.

There is an incredible opportunity for this nation to invest in good educational infrastructure. This will create more empowered citizens, a much needed resource for a fairly new country. Investment in education is a necessity for a healthy democracy.

Another key area that needs investment is the development of a technological backbone to support independent public and private financial institutions. This will create a more accountable economy and reduce financial inefficiencies.

These are very hard tasks, even for developed economies. But investing in these key basic goals will elevate South Sudan from a languishing new democracy back to the beacon of hope it was a few years ago.

Read more about MSF activities in South Sudan. Here is a short movie showing what a day in Aweil is like for members of MSF.

Please consider making a donation to Médecins Sans Frontières (Doctors without borders) to help MSF continue their incredible work of bringing accessible healthcare to some of the poorest societies in the world. Organizations like MSF are the epitome of hope and inspiration for free societies around the world.

Donate here: through MSF website page for donations.

(The picture of a mother with a new born taken at Aweil, Médecins Sans Frontières hospital, South Sudan, retrieved from MSF UK website, © Diana Zeyneb Alhindawi.)

Linux distros – The art of selecting one.

I have decided to migrate all of my programming environments to linux. The reason is the simplicity of running Python and R on linux. I am often befuddled by common dependency issues, which linux seems to avoid; this is especially true for Python. An added advantage is the ability to run very sophisticated deep learning tools, including Nvidia DIGITS and Google TensorFlow. If one is serious about machine learning, embrace linux.

Right now, I am divided between two major linux distros: Ubuntu and Fedora. Ubuntu has the advantage of wider support. Fedora has Project Wayland, which makes Fedora 24 way more secure than other X-server based linux distros. For now, the decision is a virtual tie, and I am experimenting with a very minimalist Ubuntu-based OS called Elementary.

I initially ran three tensorflow experiments in Elementary. The OS has major issues with Anaconda and Docker, but since I don’t care much about either of those, I was very happy as long as the tensorflow experiments performed well. The chief attraction of Elementary OS was the distraction-free, minimalist UI. Being a Sherlock fan, the name was also a subjective point of attraction for me.

After two days of experimentation with Elementary, I decided to stick with plain vanilla Ubuntu 16.10. The biggest issue for me was the lack of a stable package manager: a simple Docker installation routine broke it, and then came the errors in tensorflow. The UI in Elementary is beautiful, and it is a very beginner-friendly distro. But for advanced applications, I have decided to stick with Ubuntu.

This work is done as part of our startup project nanoveda. For continuing nanoveda’s wonderful work, we are running a crowdfunding campaign using gofundme’s awesome platform. Donation or not, please share our crowdfunding campaign and support our cause.

Donate here: gofundme page for nanoveda.

(The image of  Elementary OS Loki 0.4 desktop from Elementary OS blog.)

Quantifying trust – An essential necessity for any business.

This post is an evolving set of ideas for quantifying trust in decision-making systems and processes. For me, an empirical definition of trust is the relative distance of a decision from the ground truth. Quantifying and optimizing trust in decision-making systems is therefore highly important. This process will make the systems involved in decision making perform more consistently and closer to the future reality.

The first step to optimizing such systems, human or computational, will be to develop an algorithmic approach to quantify and optimize trust.

The first experiment uses a measurement of distance from the center. The idea here is that, as the overall trustworthiness of a decision-making system improves over time, the system sits at a very short distance from the mean. Patterns that delineate systems which consistently lag behind in real-world prediction problems can also be easily identified.

The code I started from was an example pulled from CRAN for experimenting with k-centroids cluster analysis.
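As a sketch of that kind of experiment using the flexclust package from CRAN (the toy data here is randomly generated; a real run would use the decision-system outputs):

    install.packages("flexclust")   # one-time install from CRAN
    library(flexclust)

    set.seed(42)
    decisions <- matrix(rnorm(200), ncol = 2)   # toy stand-in for decision-system outputs

    fit <- kcca(decisions, k = 3, family = kccaFamily("kmeans"))
    clusters <- predict(fit)                    # cluster membership for each decision
    centers <- fit@centers                      # the k centroids

    # distance of each decision from its centroid
    dists <- sqrt(rowSums((decisions - centers[clusters, ])^2))
    summary(dists)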

Another approach to quantifying decision systems is to use log-loss. Log-loss is very interesting because of the increased penalty it places on systems that are very far off from the ground reality.

Here is a simple implementation of the log-loss function. But this function has a series of downsides, which I will discuss below.
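In R, for binary outcomes, it can be written along these lines:

    log_loss <- function(actual, predicted) {
      -mean(actual * log(predicted) + (1 - actual) * log(1 - predicted))
    }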

The term log(1 - predicted) is the one I am wary of. What if the algorithm used for making predictions returns a value greater than 1? For most applications, simply constraining the prediction values to the range between 0 and 1 will fix the issue of >1 values. But there are circumstances where >1 values are needed as outputs of prediction problems. An excellent scenario is regression problems using machine learning.

In regression problems, there is no real way to tell whether a probability function returning a value slightly higher than 1, when plugged into another equation, matches the real observation or not. To handle >1 probability values, I have modified the code to include an absolute-value function. This will prevent the log function from returning imaginary values, or, in most programming environments, NaN values. The modified code is included below:
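    # abs() keeps the argument of log() positive when predictions stray slightly above 1
    log_loss <- function(actual, predicted) {
      -mean(actual * log(abs(predicted)) + (1 - actual) * log(abs(1 - predicted)))
    }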

Quantifying trust in decision processes, especially for artificial intelligence systems, is important. I visualize AI systems as very similar to constructing a bridge across a deep ravine with a river flowing at break-neck speed.

If someone builds a rickety rope bridge (very low trust scores), people have the intuition to not use the bridge to cross the ravine. On the other hand, when we build a strong steel suspension bridge with a service lifespan of 300 years and a load carrying capacity way higher than anything currently imaginable (very high trust scores), folks will use the bridge without ever thinking about the risks. The reason is quite simple: the statistical probability of the well engineered steel suspension bridge failing is very close to zero.

But the problem for AI systems currently is that there are no straightforward and intuitive solutions for quantifying the trustworthiness of these systems. The metrics that I am trying to develop will help visualize and quantify the trustworthiness of AI systems. It is very similar to the human cognitive approach to the bridge-crossing problem, but applied to AI and decision systems.

Note: this is an evolving post; the content will change as I add more.

This work is done as part of our startup project nanoveda. For continuing nanoveda’s wonderful work, we are running a crowdfunding campaign using gofundme’s awesome platform. Donation or not, please share our crowdfunding campaign and support our cause.

Donate here: gofundme page for nanoveda.

(The image of “Mother and Child, 1921” by Pablo Picasso, Spanish, worked in France, 1881–1973, from Art Institute of Chicago and published under fair use rights.
© 2016 Estate of Pablo Picasso / Artists Rights Society (ARS), New York, 

The image of island rope bridge, Sa Pa, Vietnam, is an edited version of a public domain photograph obtained through Google image search. )