Machine Learning | The Blog Pros

January 11, 2019

How AI Can Help Solve Unsolved Crimes

The use of AI in policing to date has largely been in areas such as facial recognition and helping to deploy resources in the most effective ways, but a recent study led by Northumbria University highlights how it can also help to crack unsolved crimes, especially by providing insights into the weapons used in committing the crime.

“Machine learning uses a series of algorithms to model complex data relationships,” the authors say. “Through careful fine-tuning, these can be applied to predict important characteristics of the ammunition used in a particular shooting event from those of the respective gunshot residue (GSR) deposited on surrounding surfaces or items, such as spent cases, wounds, and, potentially, also the shooter’s hands.”

January 11, 2019January 22, 2019

JVM Advent Calendar: Two Years in the Life of AI, ML, DL, and Java [Links]

Now, back to what I was going to write about. If you ask me, I’ll already admit that I have NOT even scraped the surface of these topics. What I share here is a glimpse of what’s out there and each one of you might have discovered many more aspects of these topics as part of your daily professional and personal pursuits.

One of my motivations of putting this post and the links below together comes from the discussion we had during the LJC Unconference in November 2018, where Jeremie, Michael Bateman, and I, along with a number of LJC JUG members, gathered at a session discussing a similar topic. And, the questions raised by some were along the lines of "where does Java stand in the world of AI-ML-DL?," "how do I do any of these things in Java?, and "which libraries and frameworks should I use?"

January 11, 2019

Survey Analysis for SAE Institute: A Case Study

SAE Institute from Australia is considered the front-runner in creative studies. They recently ran a survey to understand the general sentiment of students and identify key areas of improvement. This case study describes how ParallelDots carried out a detailed survey analysis to make sense of the student responses and derived specific insights from this data. 750 students responded to the survey, resulting in a total of 4500 comments.

The Challenges in Traditional Survey Analysis

Traditional methods of survey analysis suffer from multiple human biases and are intensive in terms of effort as well as time. Knowing this, the institute decided to outsource the project to ParallelDots.

January 9, 2019

Powerful Image Analysis With Google Cloud Vision And Python

Bartosz Biskupski

2019-01-09T13:45:32+01:00 2019-01-10T04:13:47+00:00

Quite recently, I’ve built a web app to manage user’s personal expenses. Its main features are to scan shopping receipts and extract data for further processing. Google Vision API turned out to be a great tool to get a text from a photo. In this article, I will guide you through the development process with Python in a sample project.

If you’re a novice, don’t worry. You will only need a very basic knowledge of this programming language — with no other skills required.

Let’s get started, shall we?

Never Heard Of Google Cloud Vision?

It’s an API that allows developers to analyze the content of an image through extracted data. For this purpose, Google utilizes machine learning models trained on a large dataset of images. All of that is available with a single API request. The engine behind the API classifies images, detects objects, people’s faces, and recognizes printed words within images.

To give you an example, let’s bring up the well-liked Giphy. They’ve adopted the API to extract caption data from GIFs, what resulted in significant improvement in user experience. Another example is realtor.com, which uses the Vision API’s OCR to extract text from images of For Sale signs taken on a mobile app to provide more details on the property.

Machine Learning At A Glance

Let’s start with answering the question many of you have probably heard before — what is the Machine Learning?

The broad idea is to develop a programmable model that finds patterns in the data its given. The higher quality data you deliver and the better the design of the model you use, the smarter outcome will be produced. With ‘friendly machine learning’ (as Google calls their Machine Learning through API services), you can easily incorporate a chunk of Artificial Intelligence into your applications.

Recommended reading: Getting Started With Machine Learning

How To Get Started With Google Cloud

Let’s start with the registration to Google Cloud. Google requires authentication, but it’s simple and painless — you’ll only need to store a JSON file that’s including API key, which you can get directly from the Google Cloud Platform.

Download the file and add it’s path to environment variables:

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/apikey.json

Alternatively, in development, you can support yourself with the from_serivce_account_json() method, which I’ll describe further in this article. To learn more about authentication, check out Cloud’s official documentation.

Google provides a Python package to deal with the API. Let’s add the latest version of google-cloud-vision==0.33 to your app. Time to code!

How To Combine Google Cloud Vision With Python

Firstly, let’s import classes from the library.

from google.cloud import vision
from google.cloud.vision import types

When that’s taken care of, now you’ll need an instance of a client. To do so, you’re going to use a text recognition feature.

client = vision.ImageAnnotatorClient()

If you won’t store your credentials in environment variables, at this stage you can add it directly to the client.

client = vision.ImageAnnotatorClient.from_service_account_file(
'/path/to/apikey.json'
)

Assuming that you store images to be processed in a folder ‘images’ inside your project catalog, let’s open one of them.

Image of receipt that could be processed by Google Cloud Vision — An example of a simple receipt that could be processed by Google Cloud Vision. (Large preview)

image_to_open = 'images/receipt.jpg'

with open(image_to_open, 'rb') as image_file:
    content = image_file.read()

Next step is to create a Vision object, which will allow you to send a request to proceed with text recognition.

image = vision.types.Image(content=content)

text_response = client.text_detection(image=image)

The response consists of detected words stored as description keys, their location on the image, and a language prediction. For example, let’s take a closer look at the first word:

[
...
description: "SHOPPING"
bounding_poly {
  vertices {
    x: 1327
    y: 1513
  }
  vertices {
    x: 1789
    y: 1345
  }
  vertices {
    x: 1821
    y: 1432
  }
  vertices {
    x: 1359
    y: 1600
  }
}
...
]

As you can see, to filter text only, you need to get a description “on all the elements”. Luckily, with help comes Python’s powerful list comprehension.

texts = [text.description for text in text_response.text_annotations]

['SHOPPING STORE\nREG 12-21\n03:22 PM\nCLERK 2\n618\n1 MISC\n1 STUFF\n$0.49\n$7.99\n$8.48\n$0.74\nSUBTOTAL\nTAX\nTOTAL\nCASH\n6\n$9. 22\n$10.00\nCHANGE\n$0.78\nNO REFUNDS\nNO EXCHANGES\nNO RETURNS\n', 'SHOPPING', 'STORE', 'REG', '12-21', '03:22', 'PM', 'CLERK', '2', '618', '1', 'MISC', '1', 'STUFF', '$0.49', '$7.99', '$8.48', '$0.74', 'SUBTOTAL', 'TAX', 'TOTAL', 'CASH', '6', '$9.', '22', '$10.00', 'CHANGE', '$0.78', 'NO', 'REFUNDS', 'NO', 'EXCHANGES', 'NO', 'RETURNS']

If you look carefully, you can notice that the first element of the list contains all text detected in the image stored as a string, while the others are separated words. Let’s print it out.

print(texts[0])

SHOPPING STORE
REG 12-21
03:22 PM
CLERK 2
618
1 MISC
1 STUFF
$0.49
$7.99
$8.48
$0.74
SUBTOTAL
TAX
TOTAL
CASH
6
$9. 22
$10.00
CHANGE
$0.78
NO REFUNDS
NO EXCHANGES
NO RETURNS

Pretty accurate, right? And obviously quite useful, so let’s play more.

What Can You Get From Google Cloud Vision?

As I’ve mentioned above, Google Cloud Vision it’s not only about recognizing text, but also it lets you discover faces, landmarks, image properties, and web connections. With that in mind, let’s find out what it can tell you about web associations of the image.

web_response = client.web_detection(image=image)

Okay Google, do you actually know what is shown on the image you received?

web_content = web_response.web_detection
web_content.best_guess_labels
>>> [label: "Receipt"]

Good job, Google! It’s a receipt indeed. But let’s give you a bit more exercise — can you see anything else? How about more predictions expressed in percentage?

predictions = [
(entity.description, '{:.2%}'.format(entity.score))) for entity in web_content.web_entities
]

>>> [('Receipt', '70.26%'), ('Product design', '64.24%'), ('Money', '56.54%'), ('Shopping', '55.86%'), ('Design', '54.62%'), ('Brand', '54.01%'), ('Font', '53.20%'), ('Product', '51.55%'), ('Image', '38.82%')]

Lots of valuable insights, well done, my almighty friend! Can you also find out where the image comes from and whether it has any copies?

web_content.full_matching_images
 >>> [
url: "http://www.rcapitalassociates.com/wp-content/uploads/2018/03/receipts.jpg", 
url:"https://media.istockphoto.com/photos/shopping-receipt-picture-id901964616?k=6&m=901964616&s=612x612&w=0&h=RmFpYy9uDazil1H9aXkkrAOlCb0lQ-bHaFpdpl76o9A=", 
url: "https://www.pakstat.com.au/site/assets/files/1172/shutterstock_573065707.500x500.jpg"
]

I’m impressed. Thanks, Google! But one is not enough, can you please give me three examples of similar images?

web_content.visually_similar_images[:3]
>>>[
url: "https://thumbs.dreamstime.com/z/shopping-receipt-paper-sales-isolated-white-background-85651861.jpg", 
url: "https://thumbs.dreamstime.com/b/grocery-receipt-23403878.jpg", 
url:"https://image.shutterstock.com/image-photo/closeup-grocery-shopping-receipt-260nw-95237158.jpg"
]

Sweet! Well done.

Is There Really An Artificial Intelligence In Google Cloud Vision?

As you can see in the image below, dealing with receipts can get a bit emotional.

Man screaming and looking stressed while holding a long receipt — An example of stress you can experience while getting a receipt. (Large preview)

Let’s have a look at what the Vision API can tell you about this photo.

image_to_open = 'images/face.jpg'

with open(image_to_open, 'rb') as image_file:
    content = image_file.read()
image = vision.types.Image(content=content)

face_response = client.face_detection(image=image)
face_content = face_response.face_annotations

face_content[0].detection_confidence
>>> 0.5153166651725769

Not too bad, the algorithm is more than 50% sure that there is a face in the picture. But can you learn anything about the emotions behind it?

face_content[0]
>>> [
...
joy_likelihood: VERY_UNLIKELY
sorrow_likelihood: VERY_UNLIKELY
anger_likelihood: UNLIKELY
surprise_likelihood: POSSIBLE
under_exposed_likelihood: VERY_UNLIKELY
blurred_likelihood: VERY_UNLIKELY
headwear_likelihood: VERY_UNLIKELY
...
]

Surprisingly, with a simple command, you can check the likeliness of some basic emotions as well as headwear or photo properties.

When it comes to the detection of faces, I need to direct your attention to some of the potential issues you may encounter. You need to remember that you’re handing a photo over to a machine and although Google’s API utilizes models trained on huge datasets, it’s possible that it will return some unexpected and misleading results. Online you can find photos showing how easily artificial intelligence can be tricked when it comes to image analysis. Some of them can be found funny, but there is a fine line between innocent and offensive mistakes, especially when a mistake concerns a human face.

With no doubt, Google Cloud Vision is a robust tool. Moreover, it’s fun to work with. API’s REST architecture and the widely available Python package make it even more accessible for everyone, regardless of how advanced you are in Python development. Just imagine how significantly you can improve your app by utilizing its capabilities!

Recommended reading: Applications Of Machine Learning For Designers

How Can You Broaden Your Knowledge On Google Cloud Vision

The scope of possibilities to apply Google Cloud Vision service is practically endless. With Python Library available, you can utilize it in any project based on the language, whether it’s a web application or a scientific project. It can certainly help you bring out deeper interest in Machine Learning technologies.

Google documentation provides some great ideas on how to apply the Vision API features in practice as well as gives you the possibility to learn more about the Machine Learning. I especially recommend to check out the guide on how to build an advanced image search app.

One could say that what you’ve seen in this article is like magic. After all, who would’ve thought that a simple and easily accessible API is backed by such a powerful, scientific tool? All that’s left to do is write a few lines of code, unwind your imagination, and experience the boundless potential of image analysis.

(rb, ra, il)

December 17, 2018

Collective #477

Rallax.js

With Rallax.js you can easily implement a dynamic parallax scrolling effect.

Check it out

Free Font: Optician Sans

Optician Sans is a free font that is based on the same visual principles as the LogMAR chart, adjusted to be used as a fully functional display typeface. It was created by ANTI Hamar and typographer Fábio Duarte Martins.

Get it

tutpad

In case you didn’t know about it: Tutpad is an online learning community for designers with high quality design tutorials.

Check it out

From Our Blog

Ambient Canvas Backgrounds

Five ambient webpage backgrounds created using the HTML5 Canvas API and jwagner’s Simplex Noise library.

Check it out

Collective #477 was written by Pedro Botelho and published on Codrops.

August 19, 2017

Collective #342

Inspirational Website of the Week: Formigari

An elegant design with great details and modern effects. Our pick this week.

Get inspired

Our Sponsor

Collective #342 was written by Pedro Botelho and published on Codrops.