PyData London 2016
09 May 2016
I had the great pleasure of attending PyData London this weekend. My main goal was to catch up on new developments and see how my current data science skills compare. I took part in the conference as well as the workshops on Friday.
The things I really enjoyed were:
- Amazing keynote by Andreas Freise discussing the LIGO project. Two weeks before, I gave a presentation about this, and it was interesting to see Andreas's take on it. One of the main takeaways was his discussion of why Python phased out MATLAB at his lab (and the lack of a replacement for Simulink).
- Benjamin's talk about mining large social networks. He discussed using MinHash to reduce complexity and provide a real-time system on a laptop.
- Learned about Python implementations of survival analysis during a workshop and a talk. Results of the academic life analysis suggest: publish a lot with a large number of co-authors, or die. The Lifelines package looks really interesting.
- Talks about building pipelines with Luigi. It was especially great to see how a system can be scaled within a short period.
- Great introduction to machine learning
- Daniel Slater's workshop demonstrated building and training a Ping Pong playing AI.
- Tahid's introduction to neural networks.
- Check out OpenAI Gym, which provides a toolkit for developing and comparing reinforcement learning algorithms.
- Mark's Neo4j workshop, introducing this graph database.
- Bayesian analysis using PyMC3. There were a number of talks about this, and the overall takeaway is that I need to look carefully at this library. The only caveat: it uses Theano.
- Great talk about pandas.
- Travis's talk on how Python can be used to scale. My takeaway is to look at Numba and Blaze.
- Tetiana's talk on hacking your way into data science in 6 months. I think my main takeaways were:
- plan and define your objectives.
- competition is where learning happens.
- limit your uncertainty, stick to your goals until you fail beyond reasonable doubt.
- it's all about the environment and the people you interact with.
- Using Python and UAVs to support conservation of orangutans in Borneo. Dirk spent 8 months building and testing the project, showing that simplicity and robustness are the key to success. The most important takeaway was his final slide: "Conservation is creating hope when there is none."
- The time series talk allowed me to fully understand the colours of noise. This might sound stupid, but I had never properly checked the definition before.
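As a quick refresher on that last point: the colour of a noise signal describes how its power spectral density falls off with frequency, roughly as 1/f^β. White noise is flat (β ≈ 0), pink noise has β ≈ 1, and brown (red) noise has β ≈ 2. The sketch below is not taken from the talk; it is a minimal standard-library illustration, and the function names (`white_noise`, `brown_noise`, `spectral_slope`) are made up for this example. It builds brown noise as a running sum of white noise and fits the log-log slope of a naive periodogram over the low frequencies.

```python
import cmath
import math
import random


def white_noise(n, seed=0):
    """Independent Gaussian samples: flat power spectrum."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def brown_noise(n, seed=0):
    """Running sum of white noise (a random walk): power falls off as ~1/f^2."""
    total = 0.0
    out = []
    for step in white_noise(n, seed):
        total += step
        out.append(total)
    return out


def power_spectrum(signal, max_k):
    """Naive periodogram for frequency bins 1..max_k (O(n * max_k), fine for small n)."""
    n = len(signal)
    powers = []
    for k in range(1, max_k + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        powers.append(abs(coeff) ** 2 / n)
    return powers


def spectral_slope(signal, max_k=None):
    """Least-squares slope of log(power) against log(frequency bin).

    Only the lowest eighth of the bins is used by default (an assumption of
    this sketch), because a discrete random walk deviates from a pure 1/f^2
    law near the Nyquist frequency.
    """
    if max_k is None:
        max_k = len(signal) // 8
    powers = power_spectrum(signal, max_k)
    xs = [math.log(k) for k in range(1, max_k + 1)]
    ys = [math.log(p) for p in powers]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    print("white slope:", round(spectral_slope(white_noise(1024, seed=1)), 2))
    print("brown slope:", round(spectral_slope(brown_noise(1024, seed=1)), 2))
```

The fitted slope should land somewhere near 0 for the white signal and near -2 for the brown one, although a single periodogram is noisy, so the exact values vary with the seed and signal length.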
Overall it was a great three days, thanks to the effort put in by the organisers, volunteers and presenters. A really huge thanks. I do look forward to next year.
Ah, almost forgot. You can watch the main conference here.
Main Takeaways?
- A lot of people use Python 3.x. I am seriously contemplating installing python3 alongside python2.
- Do not try to install Theano under Windows. I tried to install it on the fly during a workshop. On Windows. Bad idea. I definitely recommend using Docker instead.
- From discussions, and some presentations, I understand that if you want to play with deep learning, a very sensible start is to use a cloud service. This is also where Docker comes into play. I will definitely explore this option.