CVPR 2018 — recap, notes and trends

This year's CVPR (Computer Vision and Pattern Recognition) conference accepted 900+ papers. This blog post gives an overview of some of them. Here you can find notes that I captured together with my amazing colleague Tingting Zhao.

The main conference had the following presentation tracks over 3 days:

  • Special session: Workshop Competitions
  • Object Recognition and Scene Understanding
  • Analyzing Humans in Images
  • 3D Vision
  • Machine Learning for Computer Vision
  • Video Analytics
  • Computational Photography
  • Image Motion and Tracking
  • Applications

Below are some trends and topics worth mentioning:

  • Video analysis: captioning, action classification, predicting the direction in which a person (pedestrian) will move.
  • Visual sentiment analysis.
  • Agent orientation in space (rooms) and virtual room datasets: topics related to enabling machines to perform tasks.
  • Person re-identification in video feeds.
  • Style transfer (GAaaaNs) is still a theme.
  • Analysis of adversarial attacks.
  • Image enhancement: removing raindrops, removing shadows.
  • NLP + Computer Vision.
  • Image and video saliency.
  • Efficient computation on edge devices.
  • Weakly supervised learning for computer vision.
  • Domain adaptation.
  • Interpretable Machine Learning.
  • Applications of Reinforcement Learning to CV: optimizing the network, the data, and the NN learning process.
  • Lots of interest in the data-labeling area.

The notes below are semi-grouped into the following subsections:

Here is a nice compilation of papers related to person re-identification (in Mandarin; online translators do an OK job 🙂).

For more info, please dig into the presentations and workshops archive.

Videos from the sessions are here.


Spark+AI gems (from the Summit)

Below are videos worth checking out from the recent Spark+AI Summit.

Building the Software 2.0 Stack, Andrej Karpathy (Tesla) – Andrej’s talk at the Spark+AI Summit. If I had time to watch only one, I’d pick this one.

A lot of our code is in the process of being transitioned from Software 1.0 (code written by humans) to Software 2.0 (code written by an optimization, commonly in the form of neural network training). In the new paradigm, much of the attention of a developer shifts from designing an explicit algorithm to curating large, varied, and clean datasets, which indirectly influence the code. I will provide a number of examples of this ongoing transition, cover the advantages and challenges of the new stack, and outline multiple opportunities for new tooling.

Using AI to Build a Self-Driving Query Optimizer