Keystroke Behavioural Analysis For Fraud Detection: A Deep Learning Solution

User identification is a fundamental yet still open problem in fraud detection.
Traditional approaches resort to user account information or browsing history.
However, such information can pose security and privacy risks, and it is not robust because it can easily change, e.g., when the user switches to a new device or uses a different application. By contrast, monitoring biometric information such as a user’s typing behaviour tends to produce consistent results over time while being less disruptive to the user’s experience.

In this talk I will present the Machine Learning pipeline I set up to prevent fraud in user
authentication. The challenges of processing and filtering real data from users
accessing bank accounts from web and mobile devices will be discussed, along
with the deep neural networks adopted to learn to detect impostors.
During the talk, I will present the Pythonic tools (e.g. `pandas`) and data formats
(i.e. `hdf5` and `json`) I used to collect and store data, as well as those used to
configure the machine learning process (i.e. `scipy.cluster`, `sklearn` and `keras`).
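
As a rough illustration of the kind of preprocessing such a pipeline might involve, the sketch below loads a few key events from `json` into `pandas` and derives two timing features commonly used in keystroke analysis: dwell time (how long a key is held) and flight time (the gap between releasing one key and pressing the next). The field names (`user`, `key`, `press_ms`, `release_ms`), the sample values, and the choice of features are assumptions made for this example; they are not taken from the pipeline presented in the talk.

```python
import json

import pandas as pd

# Hypothetical raw key events; field names and values are illustrative only.
raw_json = """[
  {"user": "u1", "key": "h", "press_ms": 0,   "release_ms": 95},
  {"user": "u1", "key": "e", "press_ms": 140, "release_ms": 220},
  {"user": "u1", "key": "y", "press_ms": 310, "release_ms": 400}
]"""

events = pd.DataFrame(json.loads(raw_json))
events = events.sort_values(["user", "press_ms"]).reset_index(drop=True)

# Dwell time: how long each key is held down.
events["dwell_ms"] = events["release_ms"] - events["press_ms"]

# Flight time: gap between releasing the previous key and pressing this one,
# computed separately for each user (the first key of a user has no predecessor).
previous_release = events.groupby("user")["release_ms"].shift(1)
events["flight_ms"] = events["press_ms"] - previous_release

# One row of summary features per user, e.g. as input to a downstream model.
features = events.groupby("user").agg(
    dwell_mean=("dwell_ms", "mean"),
    flight_mean=("flight_ms", "mean"),
)
print(features)
```

Per-user summaries like these are only one possible representation; a sequence model could equally consume the per-key timings directly.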

The talk is meant for data scientists, as well as for practitioners with no specific background in machine or deep learning. Basic knowledge of `pandas` and other `numpy`-based scientific libraries is assumed.

Ernesto Arbitrio
Lead Software Architect, FBK

Ernesto has been using Python and related technologies since the early 2000s.
He is a big data architect at FBK/MPBA, developing heterogeneous infrastructures for the analytics
of complex data. He has worked with several pharmaceutical and food manufacturers, helping with
production and research analysis. Ernesto is an active member of the Italian Python
community, and one of the organisers of PyCon Italy.

Valerio Maggio
Data Scientist, FBK

Valerio Maggio has a Ph.D. in Computational Science from the University of Naples “Federico II” and is currently a postdoctoral researcher in the FBK – MPBA team in
Trento, Italy. His research interests are focused on Machine and Deep Learning. Valerio is very much involved in the scientific Python community, and he is an active
speaker at many Python conferences, including EuroPython and EuroSciPy. He uses Python as the main language for his deep/machine learning code, making
intensive use of `pandas`, `numpy`-based libraries (e.g. `sklearn`), and `keras` + `tensorflow` to crunch, filter, and learn from data.
Valerio is a member of the Italian Python community, and the lead organiser of the PyData Italy conferences, held in Florence since 2015. He also enjoys playing basketball and drinking tea.