This year's CVPR (Computer Vision and Pattern Recognition) conference accepted 900+ papers. This blog post gives an overview of some of them: notes that I captured together with my amazing colleague Tingting Zhao.
The main conference ran the following presentation tracks over three days:
- Special session: Workshop Competitions
- Object Recognition and Scene Understanding
- Analyzing Humans in Images
- 3D Vision
- Machine Learning for Computer Vision
- Video Analytics
- Computational Photography
- Image Motion and Tracking
Below are some trends and topics worth mentioning:
- Video analysis: captioning, action classification, predicting which direction a person (pedestrian) will move.
- Visual sentiment analysis.
- Agent orientation in space (rooms) and virtual-room datasets: topics related to enabling machines to perform tasks.
- Person re-identification in video feeds.
- Style transfer (GAaaaNs) is still a theme.
- Adversarial attacks analysis.
- Image enhancements: removing raindrops, removing shadows.
- NLP+Computer Vision.
- Image and video saliency.
- Efficient computation on edge devices.
- Weakly supervised learning for computer vision.
- Domain adaptation.
- Interpretable Machine Learning.
- Applications of Reinforcement Learning to CV: optimizing network architecture, data selection, and the NN training process.
- Lots of interest in the data-labeling area.
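To make one of these topics concrete: the simplest attack usually discussed in the adversarial-attacks literature is the fast gradient sign method (FGSM). Below is a toy sketch of the idea on a linear classifier; the function name, numbers, and the linear model are all my own illustration, not from any CVPR paper:

```python
import numpy as np

def fgsm(x, grad_wrt_x, eps=0.5):
    """Fast Gradient Sign Method: nudge the input along the sign of the gradient."""
    return x + eps * np.sign(grad_wrt_x)

# Toy linear "classifier": score = w @ x, predicted class = sign(score).
w = np.array([1.0, -2.0, 0.5])
x = np.array([0.2, -0.1, 0.4])   # clean input; score = 0.2 + 0.2 + 0.2 = 0.6 > 0

# For a linear score, the gradient of the score w.r.t. x is just w,
# so to push the score *down* we move along -w.
x_adv = fgsm(x, grad_wrt_x=-w)

clean_score = w @ x       # positive: classified as class +1
adv_score = w @ x_adv     # the small perturbation flips the sign
```

The same one-line perturbation, applied to the loss gradient of a deep network, is what makes imperceptible adversarial examples possible.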
The notes below are loosely grouped into subsections.
Here is a nice compilation of person re-identification papers (in Mandarin; online translators do an OK job 🙂).
For more info, please dig into the presentations and workshops archive.
Videos from sessions are here.
Below are videos worth checking out from the recent Spark+AI Summit.
Building the Software 2.0 Stack by Andrej Karpathy (Tesla) – Andrej's talk at Spark+AI Summit. If I had time to watch only one, it would be this one.
A lot of our code is in the process of being transitioned from Software 1.0 (code written by humans) to Software 2.0 (code written by an optimization, commonly in the form of neural network training). In the new paradigm, much of the attention of a developer shifts from designing an explicit algorithm to curating large, varied, and clean datasets, which indirectly influence the code. I will provide a number of examples of this ongoing transition, cover the advantages and challenges of the new stack, and outline multiple opportunities for new tooling.
Using AI to Build a Self-Driving Query Optimizer
I came across this nice compilation of current efforts on Fairness in ML by Google researchers. I recommend taking a look.
This year I will be presenting two sessions at the [de:code] conference, covering topics ranging from meta-learning and object detection at the edge to Hierarchical Attention Networks. Thank you, Daiyu Hatakeyama-san, for warming up the audience 😉.
Presenting https://www.microsoft.com/developerblog/2018/03/06/sequence-intent-classification/ at The Datascience Conference: I loved all the questions from the audience and the ideas on how to apply our work!
One of my favorites is using our malware-classification case study to analyze application logs 😉
It's pretty amazing what Hierarchical Attention Networks can do with text data: we can classify malware from disassembled code, or classify the behavior of processes from their logs. Interested? Read more in the blog post here.
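To give a flavor of the idea (and only the flavor, not the exact architecture from the blog post), here is a minimal numpy sketch of the two-level attention pooling at the heart of a HAN: tokens are attention-pooled into sentence vectors, and sentence vectors into a document vector. In a real HAN the encoders are bidirectional GRUs and the context vectors are learned; everything here, including the names, is illustrative:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(vectors, context):
    """Weight each vector by its similarity to a context vector, then sum."""
    weights = softmax(vectors @ context)   # one scalar weight per vector
    return weights @ vectors               # weighted sum, same dim as one vector

# Toy document: 2 "sentences" (e.g. log lines), 3 tokens each, 4-dim embeddings.
rng = np.random.default_rng(0)
doc = rng.normal(size=(2, 3, 4))
word_ctx = rng.normal(size=4)   # word-level context vector (learned in practice)
sent_ctx = rng.normal(size=4)   # sentence-level context vector (ditto)

# Level 1: pool tokens into sentence vectors; level 2: pool sentences into a doc vector.
sentence_vecs = np.array([attention_pool(s, word_ctx) for s in doc])
doc_vec = attention_pool(sentence_vecs, sent_ctx)
```

The resulting `doc_vec` is what a classifier head would consume; the attention weights themselves are a nice side effect, since they show which lines and tokens drove the prediction.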
(Photo by Markus Spiske on Unsplash)
Last week Long Beach, CA hosted the annual NIPS (Neural Information Processing Systems) conference, with a record-breaking number of attendees (8000+). The conference is considered one of the biggest events in the ML/DNN research community.
Below are thoughts and notes on what was going on at NIPS. Hopefully these brief (and sometimes abrupt) statements will be intriguing enough to inspire further research ;).
- Deep learning everywhere – pervasive across the other topics listed below. Lots of vision/image processing applications. Mostly CNNs and variations thereof. Two new developments: Capsule Networks and WaveNet.
- Reinforcement Learning – a strong comeback, with multiple sessions and papers on Deep RL and multi-armed bandits.
- Meta-Learning and One-Shot Learning are often mentioned in the Robotics and RL context.
- GANs – still popular, with several variations to speed up training/convergence and address the mode-collapse problem. Variational Autoencoders are also popular.
- Bayesian NNs are an area of active research.
- Fairness in ML – a keynote and several papers on dealing with (and raising awareness of) bias in models, plus approaches to generating explanations.
- Explainable ML is getting lots of attention.
- Tricks and approaches to speed up SGD.
- Graphical models are back! Deep learning meets probabilistic graphical modeling.
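One of the classic tricks in the "speed up SGD" bucket is momentum. A minimal sketch on a toy quadratic objective (plain numpy; the function name and all the numbers are my own illustration, not from any particular NIPS paper):

```python
import numpy as np

def sgd_momentum(grad, x0, lr=0.1, beta=0.9, steps=200):
    """Gradient descent with heavy-ball momentum."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v + grad(x)   # accumulate a velocity from past gradients
        x = x - lr * v           # step along the velocity, not the raw gradient
    return x

# Toy objective: f(x) = 0.5 * ||x||^2, so grad(x) = x; the minimum is at the origin.
x_min = sgd_momentum(lambda x: x, x0=[5.0, -3.0])
```

The velocity term lets the iterate keep moving through flat or noisy regions and damps oscillation across narrow valleys, which is why momentum (and its relatives like Nesterov acceleration) shows up in nearly every SGD-speedup discussion.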
Doing ML for a good cause is inspiring. If it involves pictures of cute fluffy birds, even better! Check out our recent blog post here.
The last two days of the conference were workshops, which actually had less rock-star content.
Overall, ICML this year was well organized (well, minus the pass holders that emit a constant cowbell-like tinkling) and rich in content. I did not notice any breakthrough papers, though. Lots of RNNs, LSTMs, language/speech-related work, GANs, and Reinforcement Learning.
Toolset-wise, it "feels" like mostly TensorFlow, Caffe, and PyTorch; even MATLAB was mentioned a few times.
Principled Approaches to Deep Learning
This track was about the theoretical understanding of DNN architectures.
Do GANs actually learn the distribution? I personally had higher expectations for this talk. The main point was that, yes, it is hard to quantify the success of a GAN training algorithm, and that mode collapse is a problem. That was pretty much it.