A reproducible bike ride in Hamburg (…or elsewhere in Europe)
The talk focuses on the use case analysis “Bike sharing usage in Hamburg”. It will discuss techniques for making this base analysis reproducible and for extending it to other locations in Europe. Poster for the base analysis: https://www.user2017.brussels/uploads/Kruse_poster-session.pdf
Reproducible research is gaining traction in academia as well as in enterprise environments. The lack of it can lead to significant issues in science, business, health care, and policy making. Data scientists must provide not only the results, but also the data, the code, and the environment in which the analysis is executed.
After the first iteration, when the application works as desired, decisions need to be made: How reproducible should the report, research, or analysis be? Which data archiving strategy should be used? Which data should be archived? Which steps are the most cost- and time-effective for keeping the analysis reproducible in the future? Should best practices be followed, or is it better to step away from them? All those trade-offs and more are part of the craft of a data scientist.
The application language is R, although the same concepts apply to other languages, environments, and use cases.
Data Scientist, Reliable Dynamics
Carles CG is a Data Scientist at Reliable Dynamics. His passion for wind energy led him to a Danish-based manufacturer applying Six Sigma concepts to reduce lead time. This shifted the focus of his career toward understanding and improving the link between data and the physical world.
Since then, he has been developing and applying analytical skills in the energy business, from wind operations to energy trading, as well as advising different companies on big data, reliability engineering, and, recently, blockchain technology.
In addition, he enjoys writing code, developing through trial and error, and being an active member of the Copenhagen R-user community!