Many existing databases are unlabeled because large amounts of data make it difficult for humans to manually label the categories of each instance. Moreover, human labelling is expensive and subjective. Hence, unsupervised learning is needed.

Besides being unlabeled, several applications are characterized by high-dimensional data (e.g., text, images, gene expression data). However, not all of the features domain experts use to represent these data are important for the learning task.

**Unsupervised means there is no teacher, in the form of class labels.** One type of unsupervised learning problem is clustering. **The goal of clustering is to group “similar” objects together.** …

We will cover an introduction to working with text in TensorFlow. We start by introducing how word embeddings work and using the bag of words method; then we move on to implementing more advanced embeddings such as Word2vec and Doc2vec.

The topic discussed here is **bag of words.**

If we want to use text, we must find a way to convert the text into numbers.

There are many ways to do this and we will explore a few common ways this is achieved.

If we consider the sentence **TensorFlow makes machine learning easy**, we could convert the words to numbers in the order that we observe them. This would make the sentence become …
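As a minimal sketch of this order-of-appearance indexing (the sentence is the example from the text; the variable names are illustrative):

```python
# Map each word to a number in the order we first observe it.
sentence = "TensorFlow makes machine learning easy"

word_to_index = {}
for word in sentence.split():
    # Assign the next unused index the first time we see each word.
    if word not in word_to_index:
        word_to_index[word] = len(word_to_index)

indices = [word_to_index[w] for w in sentence.split()]
print(indices)  # [0, 1, 2, 3, 4]
```

Repeated words would reuse their existing index, which is the starting point for the bag of words representation.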

Matrices are well known in mathematics and have their representation in NumPy as well. Universal functions (ufuncs) work on arrays, element-by-element, or on scalars; they expect a set of scalars as input and produce a set of scalars as output. Universal functions can typically be mapped to mathematical counterparts such as add, subtract, divide, multiply, and so on. We will also be introduced to trigonometric, bitwise, and comparison universal functions.
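A brief sketch of element-wise ufuncs (the arrays here are illustrative):

```python
import numpy as np

# Universal functions operate element-by-element on arrays.
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

print(np.add(a, b))       # [11 22 33 44]
print(np.subtract(b, a))  # [ 9 18 27 36]
print(np.multiply(a, 2))  # ufuncs also broadcast scalars: [2 4 6 8]
print(np.greater(b, 15))  # a comparison ufunc: [False  True  True  True]
```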

**Matrices in NumPy are subclasses of ndarray. Matrices can be created using a special string format.** They are, just like in mathematics, two-dimensional. Matrix multiplication is, as you would expect, different from the normal NumPy element-wise multiplication. The same is true for the power operator. …
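A minimal sketch of the string-format constructor and the matrix operators (note that modern NumPy code generally prefers plain `ndarray` with the `@` operator, but `np.matrix` illustrates the behaviour described here):

```python
import numpy as np

# np.matrix accepts a MATLAB-like string: rows separated by ';'.
A = np.matrix('1 2; 3 4')
B = np.matrix('0 1; 1 0')

# '*' on matrices is true matrix multiplication, not element-wise.
print(A * B)   # [[2 1]
               #  [4 3]]

# '**' is the matrix power: A ** 2 == A * A.
print(A ** 2)  # [[ 7 10]
               #  [15 22]]
```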

Apache Spark is a powerful open-source processing engine originally developed by Matei Zaharia as a part of his PhD thesis while at UC Berkeley. The first version of Spark was released in 2012.

- What is Apache Spark?
- Spark Jobs and APIs
- Resilient Distributed Dataset

Apache Spark is a powerful open-source distributed querying and processing engine. It provides the flexibility and extensibility of MapReduce, but at significantly higher speeds: up to 100 times faster than Apache Hadoop when data is stored in memory, and up to 10 times faster when accessing the disk.

*Flexible functional-style API*

1. Manipulates immutable data collections (RDD)

2. …

In the prior article, we implemented fully connected layers. We will expand our knowledge of various layers in this article.

We have explored how to connect between data inputs and a fully connected hidden layer. TensorFlow has more types of layers available as built-in functions. **The most popular layers used are convolutional layers and maxpool layers.** We will show you how to create and use such layers with input data and with fully connected data. First, we will look at how to use these layers on one-dimensional data, and then on two-dimensional data.

While neural networks can be layered in any fashion, one of the most common designs is to use convolutional layers and fully connected layers first, to create features. If we have too many features, it is common to add a maxpool layer. After these layers, non-linear activation functions are commonly introduced. …
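The article goes on to use TensorFlow's built-in layers; as a library-free sketch of what a 1-D convolutional layer and a maxpool layer actually compute (the kernel, window size, and strides here are illustrative):

```python
import numpy as np

def conv1d(x, kernel):
    """Valid 1-D convolution (cross-correlation), stride 1."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel)
                     for i in range(len(x) - k + 1)])

def maxpool1d(x, size=2, stride=2):
    """Max pooling: keep only the maximum of each window."""
    return np.array([x[i:i + size].max()
                     for i in range(0, len(x) - size + 1, stride)])

x = np.array([1., 3., 2., 5., 4., 1.])
feat = conv1d(x, np.array([1., -1.]))  # differences between neighbours
print(feat)             # [-2.  1. -3.  1.  3.]
print(maxpool1d(feat))  # [1. 1.]
```

The maxpool step halves the number of features while keeping the strongest responses, which is the "too many features" remedy mentioned above.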

Carrying on from our previous article, we will learn how to build a one-layer neural network using TensorFlow. We will be using the iris dataset.

We will implement a neural network with one hidden layer. It is important to understand that a fully connected neural network is based mostly on matrix multiplication. As such, it is very important that the dimensions of the data and the matrices line up correctly.

Since this is a regression problem, we will use the mean squared error as the loss function.
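As a sketch of how the dimensions line up in a one-hidden-layer network with a mean squared error loss (the layer sizes and random data here are illustrative, not the actual iris values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes: 4 input features, 5 hidden units, 1 output.
X = rng.normal(size=(6, 4))          # a batch of 6 samples
y = rng.normal(size=(6, 1))          # regression targets

W1 = rng.normal(size=(4, 5))         # dims must line up: (6,4) @ (4,5) -> (6,5)
b1 = np.zeros(5)
W2 = rng.normal(size=(5, 1))         # (6,5) @ (5,1) -> (6,1)
b2 = np.zeros(1)

hidden = np.maximum(0, X @ W1 + b1)  # hidden layer with ReLU activation
pred = hidden @ W2 + b2              # output layer

mse = np.mean((pred - y) ** 2)       # mean squared error loss
print(pred.shape, mse)
```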

To create the computational graph, we'll start by loading the necessary libraries:

```python
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn import …
```

Now that we can link together operational gates, we will want to run the computational graph output through an activation function. Here we introduce common activation functions.

We will compare and contrast two different activation functions: the sigmoid and the rectified linear unit (ReLU). Recall that the two functions are given by the following equations:
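The two standard definitions, sketched in plain NumPy (the comments state the equations):

```python
import numpy as np

def sigmoid(x):
    # sigmoid(x) = 1 / (1 + e^(-x)); squashes any input into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # relu(x) = max(0, x); zero for negatives, identity for positives
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # approximately [0.119 0.5   0.881]
print(relu(x))     # [0. 0. 2.]
```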

Neural networks are currently breaking records in tasks such as image and speech recognition, reading handwriting, understanding text, image segmentation, dialogue systems, autonomous car driving, and so much more. It is important to introduce neural networks as an easy-to-implement machine learning algorithm so that we can expand on it later.

The concept of a neural network has been around for decades. However, it has only recently gained traction because we now have the computational power to train large networks, thanks to advances in processing power, algorithm efficiency, and data sizes.

The important trick with neural networks is called ‘backpropagation’. Backpropagation is a procedure that allows us to update the model variables based on the learning rate and the output of the loss function. …
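A minimal illustration of that update rule, using a single weight and a squared-error loss (the target relationship y = 3x is assumed purely for the example):

```python
# Minimal backpropagation sketch: one weight, squared-error loss.
x, y = 2.0, 6.0               # one training example (assumes y = 3 * x)
w = 0.0                       # the model variable we want to learn
lr = 0.05                     # learning rate

for _ in range(100):
    pred = w * x              # forward pass
    loss = (pred - y) ** 2    # output of the loss function
    grad = 2 * (pred - y) * x # backward pass: d(loss)/dw
    w -= lr * grad            # update the variable via the learning rate

print(round(w, 3))  # converges toward 3.0
```

The same three steps (forward pass, gradient of the loss, learning-rate-scaled update) are what TensorFlow automates across all the variables of a deep network.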

*We will use a multi-class SVM to categorize the three types of flowers in the iris dataset.*

By design, SVM algorithms are binary classifiers. However, there are a few strategies employed to get them to work on multiple classes. The two main strategies are called one versus all, and one versus one.

One versus one is a strategy where a binary classifier is created for each possible pair of classes. A point is then predicted to belong to the class that receives the most votes. This can be computationally expensive, as we must create **k!/((k − 2)! 2!) = k(k − 1)/2** classifiers for k classes.
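The pair count for one versus one works out to the binomial coefficient C(k, 2); a quick check:

```python
from math import comb

# One versus one needs a classifier per unordered pair of classes:
# k! / ((k - 2)! * 2!) == C(k, 2) == k * (k - 1) / 2.
for k in (3, 4, 10):
    print(k, comb(k, 2))
# 3 classes -> 3 classifiers, 4 -> 6, 10 -> 45
```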

Another way to implement multi-class classifiers is to do a one versus all strategy where we create a classifier for each of the classes. The predicted class of a point will be the class that creates the largest SVM margin. …

Support vector machines are a method of binary classification. The basic idea is to find a linear separating line (or hyperplane) between the two classes. We first assume that the binary class targets are -1 or 1, instead of the prior 0 or 1 targets. Since there may be many lines that separate two classes, we define the best linear separator that maximizes the distance between both classes.
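A toy numeric illustration of the ±1 targets and a linear decision rule sign(w·x + b); the points and the separating hyperplane here are made up for illustration:

```python
import numpy as np

# Toy data: targets in {-1, +1}, as SVMs assume.
X = np.array([[1.0, 1.0], [2.0, 2.5], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

# An assumed separating hyperplane w.x + b = 0 for this toy data;
# the SVM would choose the (w, b) that maximizes the margin.
w = np.array([1.0, 1.0])
b = 0.0

pred = np.sign(X @ w + b)  # which side of the hyperplane each point is on
print(pred)                # [ 1.  1. -1. -1.]
print(np.all(pred == y))   # True: every point is correctly classified
```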

We will apply a non-linear kernel to split a dataset.

We will implement the Gaussian kernel SVM on real data. We will load the iris dataset and create a classifier for *I. setosa* (versus non-setosa). …
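As a sketch of the Gaussian (RBF) kernel itself, which makes the non-linear split possible (the gamma value is an assumption for illustration):

```python
import numpy as np

def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

a = np.array([1.0, 2.0])
b = np.array([1.0, 2.0])
c = np.array([4.0, 6.0])

print(gaussian_kernel(a, b))  # 1.0 -- identical points
print(gaussian_kernel(a, c))  # near 0 -- distant points
```

Because similarity decays with distance, the kernel lets the SVM draw a non-linear boundary in the original feature space while still solving a linear problem internally.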
