Keystroke Behavioural Analysis For Fraud Detection: Deep Learning as-a-service Infrastructure

User identification is a fundamental yet still open problem in fraud detection.
Traditional approaches resort to user account information or browsing history.
However, such information can pose security and privacy risks, and it is not robust because it can
easily change, e.g., when the user switches to a new device or uses a different application.
Monitoring biometric information, including a user’s typing behaviour, tends to produce
consistent results over time while being less disruptive to the user’s experience.
For this purpose, I collaborated on a project aimed at defining novel
solutions to identify users by learning from patterns in their keystroke
behaviours. In particular, I’ve been in charge of defining the infrastructure required
to effectively deploy a machine learning process as a remote cloud service.
In this talk, I will go through the technical details of frameworks and tools I used
to create RESTful APIs serving the acquisition of keystrokes gathered
from web and mobile devices as complex `json` objects. I will present `eve`,
a library sitting on top of `flask` and `mongodb`, to sanitise and validate
data in a fast and effective way.
Then, I will discuss the `luigi`-based data pipeline that performs feature extraction
and triggers the machine learning predictions.
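
To give a flavour of the feature extraction step, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the payload layout, the field names (`user_id`, `events`, `key`, `down`, `up`) and the helper `extract_features` are assumptions made for illustration. The two features shown, dwell time and flight time, are common choices in keystroke dynamics, and timestamps are assumed to be in milliseconds.

```python
import json
from statistics import mean

# Hypothetical keystroke payload; the field names are assumptions,
# not the schema used in the project.
payload = json.loads("""
{
  "user_id": "u-001",
  "events": [
    {"key": "h", "down": 0,   "up": 90},
    {"key": "i", "down": 150, "up": 230},
    {"key": "!", "down": 400, "up": 460}
  ]
}
""")


def extract_features(events):
    """Return mean dwell and flight times (ms) for a list of key events."""
    if not events:
        raise ValueError("at least one key event is required")
    # Process events in the order the keys were pressed.
    ordered = sorted(events, key=lambda e: e["down"])
    # Dwell time: how long each key is held down.
    dwell = [e["up"] - e["down"] for e in ordered]
    # Flight time: gap between releasing one key and pressing the next.
    flight = [nxt["down"] - cur["up"] for cur, nxt in zip(ordered, ordered[1:])]
    return {
        "mean_dwell_ms": mean(dwell),
        "mean_flight_ms": mean(flight) if flight else 0.0,
    }


features = extract_features(payload["events"])
print(features)
```

In a pipeline like the one described above, a function of this kind would plausibly run as one task whose output feeds the prediction step; how the real tasks are organised is not specified here.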

Ernesto Arbitrio
Lead Software Architect, FBK

Ernesto has been using Python and related technologies since the early 2000s.
He is a big data architect at FBK/MPBA, developing heterogeneous infrastructures for analytics
of complex data. He has worked with several pharmaceutical and food manufacturers, helping with
production and research analysis. Ernesto is an active member of the Italian Python
community, and one of the organisers of PyCon Italy.

Valerio Maggio
Data Scientist, FBK

Valerio Maggio has a Ph.D. in Computational Science from the University of Naples “Federico II” and is currently a postdoctoral researcher in the FBK – MPBA team in
Trento, Italy. His research interests focus on machine and deep learning. Valerio is very much involved in the scientific Python community, and he is an active
speaker at many Python conferences, including EuroPython and EuroSciPy. He uses Python as the main language for his deep/machine learning code, making
intensive use of `pandas`, `numpy`-based libraries (e.g. `sklearn`), and `keras` + `tensorflow` to crunch, filter, and learn from data.
Valerio is a member of the Italian Python community, and the lead organiser of the PyData Italy conferences, held in Florence since 2015. He also enjoys playing basketball and drinking tea.