Installing kerasR

This is a quick reference for installing kerasR, a slim R wrapper around Keras, starting with the required Python packages.

Python packages

Create a virtualenv:

$ virtualenv pydata --python=/usr/bin/python3
Running virtualenv with interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /home/brian/pydata/bin/python3
Also creating executable in /home/brian/pydata/bin/python
Installing setuptools, pip, wheel...done.
$ source pydata/bin/activate
(pydata) $

Install keras. This will also install some of the usual prerequisites for doing any sort of data-sciency stuff in Python (numpy, scipy) as well as Theano. TensorFlow will be installed in the next step.

(pydata) $ pip install keras
Collecting keras
Collecting six (from keras)
Using cached six-1.10.0-py2.py3-none-any.whl
Collecting theano (from keras)
Collecting pyyaml (from keras)
Collecting scipy>=0.14 (from theano->keras)
  Downloading scipy-0.19.0-cp35-cp35m-manylinux1_x86_64.whl (47.9MB)
    100% |████████████████████████████████| 47.9MB 27kB/s
Collecting numpy>=1.9.1 (from theano->keras)
  Downloading numpy-1.13.0-cp35-cp35m-manylinux1_x86_64.whl (16.9MB)
    100% |████████████████████████████████| 16.9MB 66kB/s
Installing collected packages: six, numpy, scipy, theano, pyyaml, keras
Successfully installed keras-2.0.4 numpy-1.13.0 pyyaml-3.12 scipy-0.19.0 six-1.10.0 theano-0.9.0

Install TensorFlow:

(pydata) $ pip install tensorflow

kerasR

In R, install the kerasR package:

> install.packages("kerasR")
Installing package into ‘/home/brian/R/x86_64-pc-linux-gnu-library/3.4’
...
** testing if installed package can be loaded
successfully loaded keras
* DONE (kerasR)

This may also install the reticulate package, which is an interface to Python objects and methods.

A guide to using kerasR is provided as a vignette.

Troubleshooting

If you get an error message when executing library(kerasR) saying:

> library(kerasR)

keras not available
See reticulate::use_python() to set python path,
then use kerasR::keras_init() to retry

this means kerasR (or more specifically, reticulate) can't find the keras Python package. To fix it, start R after activating your virtualenv:

$ source pydata/bin/activate
(pydata) $ R
> library(kerasR)
Using TensorFlow backend.
successfully loaded keras
>
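Before starting R, it can help to confirm that the activated virtualenv really exposes keras. The following is a minimal sketch, not part of kerasR or reticulate: the helper name module_available is made up for illustration, and it only reports what the python on your PATH can import, which is a reasonable hint (not a guarantee) of what reticulate will find.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if this interpreter can locate the named top-level module."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # With the virtualenv active, this path should point inside pydata/bin.
    print("interpreter:", sys.executable)
    for name in ("keras", "tensorflow"):
        status = "found" if module_available(name) else "missing"
        print(name, status)
```

Run it with the virtualenv active. If keras shows up as missing here, R will not find it either.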

Philippine Startups Wordcloud

In [2]:
import pandas as pd
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from string import punctuation

I went to the Kickstart and Ideaspace websites and scraped the descriptions of the startups they funded.

And by scraped, I mean I cut and pasted stuff into a Google Sheets document.

In [3]:
raw = pd.read_csv("../files/Philippine Startups - Sheet1.csv")
In [4]:
descriptions = raw['Long Description']
descriptions.head()
Out[4]:
0    Arthrologic designs and develops a TKA (Total ...
1    ​BluLemons Gaming Studio is an all-Filipino th...
2    Croo enables people to swiftly send informatio...
3    The Company has the opportunity to create the ...
4    Despite current transponder technologies avail...
Name: Long Description, dtype: object
In [5]:
raw_words = word_tokenize(" ".join(descriptions))
In [6]:
stop_words = set(stopwords.words('english') + list(punctuation))

words = [w.lower() for w in raw_words if w.lower() not in stop_words and not w.isdigit() and len(w) > 3]
In [7]:
words[:20]
Out[7]:
['arthrologic',
 'designs',
 'develops',
 'total',
 'knee',
 'arthroplasty',
 'system',
 'simple',
 'evidence-based',
 'utilizing',
 'successful',
 'clinical',
 'data',
 'improve',
 'surgical',
 'skills',
 'easy-to-use',
 'surgeon-friendly',
 'instrumentation',
 'assure']
In [8]:
word_str = " ".join(words)
word_str[:1000]
Out[8]:
'arthrologic designs develops total knee arthroplasty system simple evidence-based utilizing successful clinical data improve surgical skills easy-to-use surgeon-friendly instrumentation assure successful predictable results offer competitive cost provide greater majority patients access technology improve living \u200bthe product asian-fit 2-component total knee arthroplasty system definitive surgical treatment severe end-stage osteoarthritic knees \u200bblulemons gaming studio all-filipino theme mobile gaming studio develop games based filipino culture \u200bthey believe creating mobile games great avenue showcase philippines offer globally vision create games impact filipino youth across cultures croo enables people swiftly send information loved ones need arises without typing anything calling anyone button accessory clicked sent predetermined emergency contacts smartphone application text message contains important information person’s current location nearby landmarks person’s contacts equipped '
In [9]:
with open("../files/phstartupwords.txt","w") as f:
    f.write(word_str)

Lazy Wordcloud Visualization

Paste the contents of the generated file into http://www.wordclouds.com/, and manually remove the words that occur fewer than 3 times.
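The manual cleanup could also be done before writing the file. Here is a small sketch using collections.Counter; the function name drop_rare_words and the sample list are made up for illustration, and in the notebook you would pass in the words list instead.

```python
from collections import Counter


def drop_rare_words(words, min_count=3):
    """Keep only the words that appear at least min_count times, preserving order."""
    counts = Counter(words)
    return [w for w in words if counts[w] >= min_count]


# Tiny stand-in for the real `words` list built in the notebook.
sample = ["mobile", "games", "mobile", "knee", "mobile", "games", "games"]
frequent = drop_rare_words(sample)
print(" ".join(frequent))
```

Repeated words are kept on purpose, since the word cloud sizes each word by how often it occurs in the text.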

Disabling/Enabling the Asus UX303 Touchscreen in Ubuntu 16.04

Find the Atmel touchscreen device:

$ xinput --list
⎡ Virtual core pointer                        id=2    [master pointer  (3)]
⎜   ↳ Virtual core XTEST pointer                      id=4    [slave  pointer  (2)]
⎜   ↳ FocalTechPS/2 FocalTech FocalTech Touchpad      id=17   [slave  pointer  (2)]
⎜   ↳ Logitech USB Optical Mouse                      id=20   [slave  pointer  (2)]
⎜   ↳ Atmel                                           id=10   [slave  pointer  (2)]
⎣ Virtual core keyboard                       id=3    [master keyboard (2)]
    ↳ Virtual core XTEST keyboard                     id=5    [slave  keyboard (3)]
    ↳ Power Button                                    id=6    [slave  keyboard (3)]
    ↳ Sleep Button                                    id=9    [slave  keyboard (3)]
    ↳ USB2.0 UVC HD Webcam                            id=13   [slave  keyboard (3)]
    ↳ Video Bus                                       id=7    [slave  keyboard (3)]
    ↳ AT Translated Set 2 keyboard                    id=16   [slave  keyboard (3)]
    ↳ Video Bus                                       id=8    [slave  keyboard (3)]
    ↳ Asus WMI hotkeys                                id=15   [slave  keyboard (3)]

The Atmel device is our touchscreen.

Use the xinput disable and enable commands to turn the touchscreen off or on again:

$ xinput disable Atmel
$ xinput enable Atmel

Both commands are silent, unless you specify a device that doesn't exist.
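If you want to look the device up from a script instead of reading the listing by eye, the text printed by xinput --list can be parsed. This is a rough sketch: the function name find_device_id is made up, the sample text is a shortened copy of the listing above, and the regular expression assumes device names start with a letter or digit.

```python
import re


def find_device_id(xinput_output, device_name):
    """Return the numeric id of the device with exactly this name, or None."""
    for line in xinput_output.splitlines():
        # Skip the tree-drawing prefix, then capture the name and the id.
        match = re.match(r"\W*(.*?)\s+id=(\d+)", line)
        if match and match.group(1) == device_name:
            return int(match.group(2))
    return None


sample_listing = """\
⎡ Virtual core pointer                        id=2    [master pointer  (3)]
⎜   ↳ Logitech USB Optical Mouse                      id=20   [slave  pointer  (2)]
⎜   ↳ Atmel                                           id=10   [slave  pointer  (2)]
⎣ Virtual core keyboard                       id=3    [master keyboard (2)]
    ↳ Power Button                                    id=6    [slave  keyboard (3)]
"""

print(find_device_id(sample_listing, "Atmel"))
```

In a real script the listing would come from running xinput --list, for example through subprocess.run, and the returned id can be given to xinput disable in place of the name.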

Creating EC2 keypairs with AWS CLI

It is easy to create EC2 keypairs with the AWS CLI:

$ aws ec2 create-key-pair --key-name mynewkeypair > keystuff.json

After creating the keypair, it should appear in your EC2 key pairs listing. The keystuff.json file will contain the RSA private key you need to connect to any instances you create with the keypair, as well as the name of the key and its fingerprint.

{
    "KeyMaterial": "-----BEGIN RSA PRIVATE KEY-----\n<your private key>==\n-----END RSA PRIVATE KEY-----",
    "KeyName": "mynewkeypair",
    "KeyFingerprint": "53:47:ee:01:3a:35:9b:52:1c:4f:99:6f:87:b0:0f:8b:ed:83:55:3b"
}

To extract the private key into a separate file, use the jq JSON filter.

$ jq -r '.KeyMaterial' keystuff.json > mynewkey.pem
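If jq isn't installed, the same extraction can be done with Python's standard json module. This is a minimal sketch: the function name write_private_key and the sample key text are made up for illustration. The file is created with mode 0600 because ssh refuses private keys that other users can read. Note that the mode only applies when the file is newly created.

```python
import json
import os
import tempfile


def write_private_key(json_text, pem_path):
    """Save the KeyMaterial field of create-key-pair output to pem_path."""
    key_material = json.loads(json_text)["KeyMaterial"]
    # Request owner-only permissions at creation time.
    fd = os.open(pem_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(key_material)
        if not key_material.endswith("\n"):
            f.write("\n")
    return pem_path


# Demo with placeholder key text; real input would be the contents of keystuff.json.
sample = json.dumps({
    "KeyMaterial": "-----BEGIN RSA PRIVATE KEY-----\nabc==\n-----END RSA PRIVATE KEY-----",
    "KeyName": "mynewkeypair",
})
with tempfile.TemporaryDirectory() as tmp:
    path = write_private_key(sample, os.path.join(tmp, "mynewkey.pem"))
    with open(path) as f:
        saved = f.read()
print(saved.splitlines()[0])
```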

GitLab Weirdness

If you're using GitLab.com for hosting your repositories, you may have encountered a strange problem wherein your newly-created repository's dashboard doesn't update.

/images/gitlab-weirdness.thumbnail.png

That is, when you git push your changes to the repository, the interface still looks like a newly-created repository, and neither your files nor your commits are visible in the web UI. This is weird because the remote repository works in all other respects. You can push code up to it, clone it, etc. You just can't see it on the GitLab website.

I've seen this happen a couple of times, and so far I've found that the quick fix is to run Housekeeping on the repository from the Edit Project page.

/images/gitlab-housekeeping.thumbnail.png

Housekeeping can take a couple of minutes, but most of the time it works and you can see your repository's files and commit history after running it. If it doesn't work, you have to delete the repository in GitLab and re-create it, then push your code up again.

Installing Python 2.7.11 on CentOS 7

CentOS 7 ships with Python 2.7.5 by default. We have some software that requires 2.7.11. It's generally a bad idea to clobber your system Python, since other system-supplied software may rely on it being a particular version.

Our strategy for running 2.7.11 alongside the system Python is to build it from source, then create virtualenvs that will run our software.

Step 1. Update CentOS and install development tools

# as root
yum upgrade -y
yum groupinstall 'Development Tools' -y
yum install zlib-devel openssl-devel

Step 2. Download the Python source tarball

# As a regular user (avoid doing mundane things as root)
$ cd /tmp
$ wget https://www.python.org/ftp/python/2.7.11/Python-2.7.11.tgz
$ tar -zxf Python-2.7.11.tgz
$ cd Python-2.7.11

Step 3. Configure, build and install into /opt (replace with /usr/local/ if you prefer)

$ ./configure --prefix=/opt/
$ make
$ sudo make install

Step 4. Install pip and virtualenv for the system Python

You have to be root for this.

# curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
# python get-pip.py
# pip install virtualenv

Step 5. Use the system virtualenv to create a venv for your updated Python

You can now create virtualenvs; just point --python at the 2.7.11 interpreter:

$ mkdir env
$ virtualenv --python=/opt/bin/python2.7 env/pyenv
$ source env/pyenv/bin/activate
$ python --version
Python 2.7.11