Introduction to Core ML

At WWDC 2017, Apple released a lot of exciting frameworks and APIs for us developers to use. Among all the new frameworks, one of the most popular is definitely Core ML. Core ML is a framework that can be harnessed to integrate machine learning models into your app. The best part about Core ML is that it doesn't require extensive knowledge about neural networks or machine learning. Another bonus is that you can use pre-trained data models as long as you convert them into the Core ML model format. For demonstration purposes, we will be using a Core ML model that is available on Apple's developer website. Without further ado, let's start learning Core ML.

What’s Core ML

Core ML lets you integrate a broad variety of machine learning model types into your app. In addition to supporting extensive deep learning with over 30 layer types, it also supports standard models such as tree ensembles, SVMs, and generalized linear models. Because it’s built on top of low level technologies like Metal and Accelerate, Core ML seamlessly takes advantage of the CPU and GPU to provide maximum performance and efficiency. You can run machine learning models on the device so data doesn’t need to leave the device to be analyzed.

Apple’s official documentation about Core ML

Core ML is a brand new machine learning framework, announced during this year's WWDC, that comes along with iOS 11. With Core ML, you can integrate machine learning models into your app. Let's back up a little bit. What is machine learning? Simply put, machine learning is the field of giving computers the ability to learn without being explicitly programmed. A trained model is the result of combining a machine learning algorithm with a set of training data.

trained-model

As an application developer, our main concern is how we can apply this model to our app to do some really interesting things. Luckily, with Core ML, Apple has made it so simple to integrate different machine learning models into our apps. This opens up many possibilities for developers to build features such as image recognition, natural language processing (NLP), text prediction, etc.

Now you may be wondering if it is very difficult to bring this type of AI to your app. This is the best part. Core ML is very easy to use. In this tutorial, you will see that it only takes us 10 lines of code to integrate Core ML into our apps.

Cool, right? Let’s get started.

Demo App Overview

The app we are trying to make is fairly simple. Our app lets the user either take a picture of something or choose a photo from their photo library. Then, the machine learning algorithm will try to predict what the object in the picture is. The result may not be perfect, but you will get an idea of how you can apply Core ML in your app.

coreml-app-demo

Getting Started

To begin, first go to Xcode 9 and create a new project. Select the single-view application template for this project, and make sure the language is set to Swift.

Creating the User Interface

Editor’s note: If you do not want to build the UI from scratch, you can download the starter project and jump to the Core ML section directly.

Let's begin! The first thing we want to do is head over to Main.storyboard and add a couple of UI elements to the view. Select the view controller in the storyboard, then go up to the Xcode menu and click Editor > Embed In > Navigation Controller. Once you have done that, you should see a navigation bar appear at the top of your view. Set the title of the navigation bar to Core ML (or whatever you see fit).

Next, drag in two bar button items: one on each side of the navigation bar title. For the bar button item on the left, go to the Attributes inspector and change the System Item to "Camera". For the right bar button item, name it "Library". These two buttons let the user pick a photo from the photo library or take one with the camera.

The final two objects you need are a UILabel and a UIImageView. Take the UIImageView and center it in the middle of the view. Change the width and height of the image view to 299×299, making it a square. Now for the UILabel, place it at the very bottom of the view and stretch it so that it touches both edges. That completes the UI for the app.

While I am not covering how to add Auto Layout constraints to these views, it is highly recommended that you do so to avoid misplaced views. If you can't accomplish this, build the storyboard for the specific device you will be using to run the app.

Implementing the Camera and Photo Library Functions

Now that we have designed the UI, let’s move onto the implementation. We will implement both the library and camera buttons in this section. In ViewController.swift, first adopt the UINavigationControllerDelegate protocol that will be required by the UIImagePickerController class.

Then add two outlets for the label and the image view. For simplicity, I have named the UIImageView imageView and the UILabel classifier. Your code should look like this:
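Here is a minimal sketch of what ViewController.swift might look like at this point (the outlet names follow the text above; everything else is boilerplate from the template):

    import UIKit

    class ViewController: UIViewController, UINavigationControllerDelegate {

        @IBOutlet weak var imageView: UIImageView!
        @IBOutlet weak var classifier: UILabel!

        override func viewDidLoad() {
            super.viewDidLoad()
        }
    }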

Next, you need to create the actions for the two bar button items. Insert the following action methods into the ViewController class:
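Something like the following should do. The action names camera and openLibrary are just illustrative (use whatever you connect in the storyboard), and I have added a quick check that the camera is actually available before presenting it:

    @IBAction func camera(_ sender: Any) {
        // Make sure the device actually has a camera before presenting it.
        guard UIImagePickerController.isSourceTypeAvailable(.camera) else {
            return
        }

        let cameraPicker = UIImagePickerController()
        cameraPicker.delegate = self
        cameraPicker.sourceType = .camera
        cameraPicker.allowsEditing = false

        present(cameraPicker, animated: true)
    }

    @IBAction func openLibrary(_ sender: Any) {
        let picker = UIImagePickerController()
        picker.delegate = self
        picker.sourceType = .photoLibrary
        picker.allowsEditing = false

        present(picker, animated: true)
    }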

To summarize what we did in each of these actions: we create a constant that is a UIImagePickerController, make sure the user can't edit the photo that is taken (whether from the photo library or the camera), set its delegate to self, and finally present the UIImagePickerController to the user. (The camera action also checks that a camera is actually available before presenting it.)

Because ViewController does not yet conform to the UIImagePickerControllerDelegate protocol, we will receive an error. We will adopt the protocol using an extension:
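A minimal version of the extension (we will add more to it shortly):

    extension ViewController: UIImagePickerControllerDelegate {

        func imagePickerControllerDidCancel(_ picker: UIImagePickerController) {
            // Dismiss the picker if the user cancels.
            dismiss(animated: true, completion: nil)
        }
    }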

The method above dismisses the picker if the user cancels taking or choosing an image, and the extension makes ViewController conform to UIImagePickerControllerDelegate.

Make sure you go back to the storyboard and connect all the outlet variables and action methods.

To access your camera and photo library, there is still one last thing you must do. Go to your Info.plist and add two entries: Privacy – Camera Usage Description and Privacy – Photo Library Usage Description. Starting from iOS 10, you need to specify the reason why your app accesses the camera and the photo library.

Okay, that’s it. You’re now ready to move onto the core part of the tutorial. Again, if you don’t want to build the demo app from scratch, download the starter project here.

Integrating the Core ML Data Model

Now, let’s switch gears for a bit and integrate the Core ML Data Model into our app. As mentioned earlier, we need a pre-trained model to work with Core ML. You can build your own model, but for this demo, we will use the pre-trained model available on Apple’s developer website.

Go to Apple’s Developer Website on Machine Learning, and scroll all the way down to the bottom of the page. You will find 4 pre-trained Core ML models.

coreml-pretrained-model

For this tutorial, we use the Inception v3 model but feel free to try out the other three. Once you have the Inception v3 model downloaded, add it into the Xcode Project and take a look at what is displayed.

Core ML Inception v3 model
Note: Please make sure the model file's Target Membership is checked for your app target; otherwise, your app will not be able to access the file.

In the above screen, you can see the type of the data model, which is a neural network classifier. The other information to take note of is the model evaluation parameters, which describe the input the model takes and the output it returns. Here it takes a 299×299 image and returns the most likely category, plus the probability for each category.

The other thing you will notice is the model class. This is the class (Inceptionv3) generated from the machine learning model so that we can use it directly in our code. If you click the arrow next to Inceptionv3, you can see the source code of the class.

inceptionv3-class

Now, let’s add the model in our code. Go back to ViewController.swift. First, import the CoreML framework at the very beginning:
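Only one import is needed:

    import CoreML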

Next, declare a model variable in the class for the Inceptionv3 model, and initialize it in the viewWillAppear() method:
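A sketch of what that looks like (Inceptionv3 is the class Xcode generated from the model file in the previous step):

    var model: Inceptionv3!

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)

        // Initialize the Core ML model when the view is about to appear.
        model = Inceptionv3()
    }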

I know what you’re thinking.

“Well Sai, why don’t we initialize this model earlier?”

“What’s the point of defining it in the viewWillAppear function?”

Well, dear friends, the point is that when your app tries to recognize the object in your image, it will be a lot faster.

Now if we go back to Inceptionv3.mlmodel, we see that the only input this model takes is an image with dimensions of 299×299. So how do we convert an image into these dimensions? Well, that is what we will be tackling next.

Converting the Images

In the extension in ViewController.swift, update the code as shown below. We implement the imagePickerController(_:didFinishPickingMediaWithInfo:) method to process the selected image:
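Here is a sketch of that method (add it to the UIImagePickerControllerDelegate extension; the 299×299 size comes from the model's input requirement, and pixelBuffer is a name we will reuse in the next section):

    func imagePickerController(_ picker: UIImagePickerController,
                               didFinishPickingMediaWithInfo info: [String : Any]) {
        picker.dismiss(animated: true)

        // Retrieve the original image the user picked or shot.
        guard let image = info[UIImagePickerControllerOriginalImage] as? UIImage else {
            return
        }

        // Redraw the image as a 299x299 square, the size Inceptionv3 expects.
        UIGraphicsBeginImageContextWithOptions(CGSize(width: 299, height: 299), true, 2.0)
        image.draw(in: CGRect(x: 0, y: 0, width: 299, height: 299))
        let newImage = UIGraphicsGetImageFromCurrentImageContext()!
        UIGraphicsEndImageContext()

        // Create a CVPixelBuffer to hold the image's pixels in main memory.
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
        var pixelBuffer: CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         Int(newImage.size.width),
                                         Int(newImage.size.height),
                                         kCVPixelFormatType_32ARGB,
                                         attrs,
                                         &pixelBuffer)
        guard status == kCVReturnSuccess else {
            return
        }

        CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
        let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer!)

        // Draw the image into the pixel buffer using a device-dependent RGB color space.
        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        let context = CGContext(data: pixelData,
                                width: Int(newImage.size.width),
                                height: Int(newImage.size.height),
                                bitsPerComponent: 8,
                                bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer!),
                                space: rgbColorSpace,
                                bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)

        // Flip the coordinate system so the image isn't rendered upside down.
        context?.translateBy(x: 0, y: newImage.size.height)
        context?.scaleBy(x: 1.0, y: -1.0)

        UIGraphicsPushContext(context!)
        newImage.draw(in: CGRect(x: 0, y: 0, width: newImage.size.width, height: newImage.size.height))
        UIGraphicsPopContext()
        CVPixelBufferUnlockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))

        imageView.image = newImage
    }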

Here is what the code does:

  1. In the first few lines of the method, we retrieve the selected image from the info dictionary (using the UIImagePickerControllerOriginalImage key) and dismiss the UIImagePickerController.
  2. Since our model only accepts images with dimensions of 299×299, we redraw the selected image as a square of that size and assign it to the constant newImage.
  3. Next, we convert newImage into a CVPixelBuffer. For those of you not familiar with CVPixelBuffer, it is basically an image buffer which holds the pixels in main memory. You can find out more about CVPixelBuffer here.
  4. We then take all the pixels in the image and convert them into a device-dependent RGB color space. By putting this data into a CGContext, we can easily call on it whenever we need to render (or change) some of its underlying properties, which is what we do in the next two lines by translating and scaling the image.
  5. Finally, we make the graphics context the current context, render the image, pop the context off the stack, and set imageView.image to newImage.

Now, if you do not understand most of that code, no worries. This is really some advanced Core Graphics and Core Video code, which is out of the scope of this tutorial. All you need to know is that we converted the image into something the data model can accept. I recommend playing around with the numbers and observing the results to get a better understanding.

Using Core ML

Anyway, let's shift the focus back to Core ML. We use the Inceptionv3 model to perform object recognition, and with Core ML, all we need is just a few lines of code. Paste the following code snippet below the imageView.image = newImage line.
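A sketch of those lines, reusing the pixelBuffer created above and the prediction(image:) method from the generated Inceptionv3 class:

    // Ask the model to classify the resized image.
    guard let prediction = try? model.prediction(image: pixelBuffer!) else {
        return
    }

    // Show the most likely label in the UI.
    classifier.text = "I think this is a \(prediction.classLabel)."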

That's it! The Inceptionv3 class has a generated method called prediction(image:) that predicts the object in the given image. Here we pass the method the pixelBuffer variable, which holds the resized image. Once the prediction is returned, we update the classifier label with the recognized class label (a String).

Now it's time to test the app! Build and run the app in the simulator or on your iPhone (with the iOS 11 beta installed). Pick a photo from your photo library or take a photo with the camera. The app will tell you what it thinks the image is about.

coreml-successful-case

While testing out the app, you may notice that it doesn't always correctly predict what you point it at. This is not a problem with your code, but rather with the trained model.

coreml-failed-case

Before We Begin…

The purpose of this tutorial is to help you learn how to convert data models in various formats into the Core ML format. However, before we begin, I should give you some background about machine learning frameworks. There are many popular deep learning frameworks out there which provide developers the tools to design, build, and train their own models. The model we are going to use is from Caffe. Caffe was developed by the Berkeley Artificial Intelligence Research (BAIR) lab and is one of the most commonly used frameworks for creating machine learning models.

Apart from Caffe, there are plenty of other frameworks such as Keras, TensorFlow, and scikit-learn. All of these frameworks have their own advantages and disadvantages, which you can learn about here.

In machine learning, everything starts with the model, the system that makes predictions or identifications. Teaching a computer to learn involves a machine learning algorithm and training data for it to learn from. The output generated from training is usually known as a machine learning model. There are different types of machine learning models that solve the same problem (e.g. object recognition) but with different algorithms. Neural networks, tree ensembles, and SVMs are some of these machine learning algorithms.

Editor's note: If you're interested in learning more about machine learning models, you can take a look at this and this article.

At the time of publication, Core ML doesn’t support the conversion of all of these models from different frameworks. The following image, provided by Apple, shows the models and third-party tools supported by Core ML.

model-supported-by-coreml-tool

To convert data models to the Core ML format, we use a tool called Core ML Tools. In the next section, we will use Python to install the tools and perform the conversion.

Installing Python and Setting Up the Environment

Lots of researchers and engineers have made Caffe models for different tasks with all kinds of architectures and data. These models are learned and applied for problems ranging from simple regression, to large-scale visual classification, to Siamese networks for image similarity, to speech and robotics applications.

– Caffe Model Zoo

You can find different pre-trained Caffe models on GitHub. To make sharing models easier, BAIR introduced the model zoo framework, and you can find some of the available models here. In this tutorial, I use this Caffe model to show you how to convert it to the Core ML format, as well as how to use it to implement flower identification.

To begin, download the starter project here. If you open the project and look at the code, you can see that the code required to access the camera and photo library is already filled in. You might recognize it from the previous tutorial. What is missing is the Core ML model.

You should also notice 3 more files in the project bundle: oxford102.caffemodel, deploy.prototxt and class_labels.txt. These are the Caffe model and its supporting files that we will use for the demo. I will discuss them in detail later.

To use the Core ML Tools, the first step is to install Python on your Mac. First, download Anaconda (choose the Python 2.7 version). Anaconda is a super easy way to run Python on your Mac without causing any problems. Once you have Anaconda installed, head over to Terminal and type the following:
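Something along these lines (the exact patch version may differ by the time you read this):

    conda install python=2.7.13
    conda update python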

With these two commands, we install the Python version we want. At the time this tutorial was written, the latest version of Python 2 was 2.7.13. Just in case, once Python is installed, run the second command so it updates to the latest version.

install-python-terminal

The next step is to create a virtual environment. In a virtual environment, you can work with different versions of Python or different sets of packages. To create a new virtual environment, type the following:
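For example (flowerrec is the environment name we will use throughout; pinning Python 2.7 is my assumption, since that is what coremltools required at the time):

    conda create -n flowerrec python=2.7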

When Terminal prompts you to proceed ([y]/n), type "y" for yes. Congrats! You now have a virtual environment named flowerrec!

Lastly, type the following command to install the Core ML Tools:
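The install is a single pip command:

    pip install -U coremltools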

Converting The Caffe Model

Open Terminal again and type the code that will take you to your virtual environment:
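With Anaconda, activating the environment we created earlier looks like this:

    source activate flowerrec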

Then change to the directory of your starter project that contains the three files: class_labels.txt, deploy.prototxt and oxford102.caffemodel.

Once you are in the folder, it's time to launch Python. Simply type python and you will be taken to the interactive Python prompt within Terminal. The first step is to import the Core ML Tools, so that is exactly what we'll do:
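Just one import:

    import coremltools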

The next line is really important, so please pay attention. Type the following line but don’t press enter.
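It should look something like this (passing the .caffemodel and the .prototxt together as a tuple is, as far as I know, how the Caffe converter expects them):

    coreml_model = coremltools.converters.caffe.convert(('oxford102.caffemodel', 'deploy.prototxt'), image_input_names='data', class_labels='class_labels.txt')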

Now, while this is just one line, there is a lot going on here. Let me first explain what these three files are about.

  1. deploy.prototxt – describes the structure of the neural network.
  2. oxford102.caffemodel – the trained data model in Caffe format.
  3. class_labels.txt – contains a list of all the flowers that the model is able to recognize.

In the statement above, we define a model named coreml_model as the result of the coremltools.converters.caffe.convert function, which converts the Caffe model to Core ML. The last two parameters of this line are:

  1. image_input_names='data'
  2. class_labels='class_labels.txt'

These two parameters define the input and output we want our Core ML model to accept. Let me put it this way: computers only understand numbers. So if we do not add these two parameters, our Core ML model will only accept numbers as input and output, rather than an image as input and a string as output.

Now, you can press ENTER and treat yourself to a break. Depending on the computational power of your machine, it will take some time for the converter to run. When the converter finishes, you will be greeted with a simple >>>.

python-convert-ml

Now that the Caffe model is converted, you need to save it. You can do this by typing:
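The file name is up to you; Flower.mlmodel below is just an example name:

    coreml_model.save('Flower.mlmodel')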

The .mlmodel file will be saved in the current directory.

coreml-model-ready