Ross Girshick (rbg)
Research Scientist
Facebook AI Research (FAIR)
r...@gmail.com
arXiv / Google scholar / cv

Research

I'm interested in algorithms for visual perception (object recognition, localization, segmentation, pose estimation, ...) and visual reasoning (answering complex queries, often in natural language, about images). My work explores topics in computer vision and machine/deep/statistical learning.

I'm looking for a postdoc to join me at FAIR in Seattle, starting around September 2017. Contact me directly to apply.

Four papers accepted to CVPR 2017 -- congrats to the FAIR interns involved!

About me

I finished my Ph.D. in computer vision at The University of Chicago under the supervision of Pedro Felzenszwalb in April 2012. Then, I enjoyed two wonderful years as a postdoc at UC Berkeley under Jitendra Malik. After Berkeley, I worked for one year as a Researcher at Microsoft Research, Redmond. Now, I'm a Research Scientist with the terrific group of researchers and engineers at Facebook AI Research (FAIR).

During my Ph.D., I spent time as a research intern at Microsoft Research Cambridge, UK working on human pose estimation from (Kinect) depth images. I also participated in several first-place entries into the PASCAL VOC object detection challenge, and was awarded a "lifetime achievement" prize for my work on deformable part models. I think this refers to the lifetime of the PASCAL challenge—and not mine!

More recently, in work with wonderful colleagues, I pioneered R-CNN (Region-based Convolutional Neural Networks), currently the dominant approach to object detection.

New tech reports

Detecting and Recognizing Human-Object Interactions
Georgia Gkioxari, Ross Girshick, Piotr Dollár, and Kaiming He
arXiv preprint Apr., 2017 / bibtex
@article{gkioxari2017,
  Author    = {Georgia Gkioxari and Ross Girshick and
               Piotr Doll\'{a}r and Kaiming He},
  Title     = {Detecting and Recognizing Human-Object Interactions},
  Journal   = {arXiv preprint arXiv:1704.07333},
  Year      = {2017}}
    
Mask R-CNN
Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick
arXiv preprint Mar., 2017 / bibtex
@article{he2017maskrcnn,
  Author    = {Kaiming He and Georgia Gkioxari and
               Piotr Doll\'{a}r and Ross Girshick},
  Title     = {{Mask R-CNN}},
  Journal   = {arXiv preprint arXiv:1703.06870},
  Year      = {2017}}
    
Low-shot Visual Recognition by Shrinking and Hallucinating Features
Bharath Hariharan, Ross Girshick
arXiv preprint Nov., 2016 / bibtex
@article{hariharan2016lowshot,
  Author    = {Bharath Hariharan and Ross Girshick},
  Title     = {Low-shot Visual Recognition by Shrinking and
               Hallucinating Features},
  Journal   = {arXiv preprint arXiv:1606.02819},
  Year      = {2016}}
    

CVPR 2017

Learning Features by Watching Objects Move
Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan
To appear in CVPR 2017 / bibtex
  @inproceedings{pathak2016motion,
    Author    = {Deepak Pathak and Ross Girshick and
                 Piotr Doll\'{a}r and Trevor Darrell and
                 Bharath Hariharan},
    Title     = {Learning Features by Watching Objects Move},
    Booktitle = {{CVPR}},
    Year      = {2017}}
      
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, Serge Belongie
To appear in CVPR 2017 / bibtex
  @inproceedings{lin2016fpn,
    Author    = {Tsung-Yi Lin and Piotr Doll\'{a}r and
                 Ross Girshick and Kaiming He and
                 Bharath Hariharan and Serge Belongie},
    Title     = {Feature Pyramid Networks for Object Detection},
    Booktitle = {{CVPR}},
    Year      = {2017}}
      
Aggregated Residual Transformations for Deep Neural Networks (code now available)
Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He
To appear in CVPR 2017 / bibtex / github (code)
  @inproceedings{xie2016groups,
    Author    = {Saining Xie and Ross Girshick and
                 Piotr Doll\'{a}r and Zhuowen Tu and
                 Kaiming He},
    Title     = {Aggregated Residual Transformations for
                 Deep Neural Networks},
    Booktitle = {{CVPR}},
    Year      = {2017}}
      

Selected older publications

All publications and tech reports (Google scholar)

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
Neural Information Processing Systems (NIPS), 2015
Python code / Matlab code / bibtex
@inproceedings{ren2015faster,
  Author = {Shaoqing Ren and Kaiming He and
            Ross Girshick and Jian Sun},
  Title = {Faster {R-CNN}: Towards Real-Time Object Detection
           with Region Proposal Networks},
  Booktitle = {Neural Information Processing Systems ({NIPS})},
  Year = {2015}
}
    
Fast R-CNN
Ross Girshick
IEEE International Conference on Computer Vision (ICCV), 2015
oral presentation
code / slides / bibtex
@inproceedings{girshick15fastrcnn,
  Author = {Ross Girshick},
  Title = {Fast {R-CNN}},
  Booktitle = {Proceedings of the International
               Conference on Computer Vision ({ICCV})},
  Year = {2015}
}
    
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
R. Girshick, J. Donahue, T. Darrell, J. Malik
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
oral presentation
arXiv tech report / supplement / code / poster / slides / bibtex
@inproceedings{girshick2014rcnn,
  Author    = {Ross Girshick and
               Jeff Donahue and
               Trevor Darrell and
               Jitendra Malik},
  Title     = {Rich feature hierarchies for accurate
               object detection and semantic segmentation},
  Booktitle = {Proceedings of the IEEE Conference on
               Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2014}}
    
This paper proposes R-CNN, a state-of-the-art visual object detection system that combines bottom-up region proposals with rich features computed by a convolutional neural network. At the time of its release, R-CNN improved the previous best detection performance on PASCAL VOC 2012 by 30% relative, going from 40.9% to 53.3% mean average precision.

Efficient Human Pose Estimation from Single Depth Images
J. Shotton, R. Girshick, A. Fitzgibbon, T. Sharp, M. Cook, M. Finocchio, R. Moore, P. Kohli, A. Criminisi, A. Kipman, A. Blake
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35, No. 12, Dec. 2013
abstract / bibtex
@article{shotton2013kinect,
  Author    = {J. Shotton and
               R. Girshick and
               A. Fitzgibbon and
               T. Sharp and
               M. Cook and
               M. Finocchio and
               R. Moore and
               P. Kohli and
               A. Criminisi and
               A. Kipman and
               A. Blake},
  Title     = {Efficient Human Pose Estimation
               from Single Depth Images},
  Volume    = {35},
  Number    = {12},
  Journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  Year      = {2013}}
    
An integrated description of the original Kinect pose estimation algorithm and our ICCV 2011 algorithm.

Object Detection with Discriminatively Trained Part Based Models
P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 9, Sep. 2010
abstract / PAMI code / latest code (voc-release5) / bibtex
@article{felzenszwalb2010dpm,
  Author    = {P. Felzenszwalb and
               R. Girshick and
               D. McAllester and
               D. Ramanan},
  Title     = {Object Detection with Discriminatively
               Trained Part Based Models},
  Volume    = {32},
  Number    = {9},
  Journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  Year      = {2010}}
    
Deformable part models (DPM).

See also, CACM Research Highlight:
Visual Object Detection with Deformable Part Models
P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan
Communications of the ACM, no. 9 (2013): 97-105

Erdős = 3 (via two paths)


I like this website