@DocXavi
Module 6
Deep Learning for Video:
Object Tracking
22nd March 2018
Xavier Giró-i-Nieto
[http://pagines.uab.cat/mcv/]
Wang, Naiyan, and Dit-Yan Yeung. "Learning a deep compact image representation for visual tracking." NIPS 2013
[Project page with code]
DNNs were first used only to learn features, which a separate tracker then consumed.
Object Tracking: Feature Extractor only
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." ICCV 2015. [code]
Object Tracking: FCNT: Localization
Despite being trained for image classification, the feature maps in conv5-3 enable object localization, but they are not discriminative enough to distinguish between instances of the same class.
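As a rough illustration of how feature maps can localize an object, the sketch below sums activations over channels and takes the peak as the location estimate. This is a simplification under stated assumptions, not the FCNT procedure: the function name `localize` and the toy 3x3 maps are hypothetical, and real conv5-3 maps would come from a VGG network.

```python
def localize(feature_maps):
    """Return the (row, col) peak of the channel-summed activation map.

    feature_maps: list of C channels, each an H x W list of lists
    (a stand-in for conv5-3 activations; values here are made up).
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    # Sum activations over channels to get a coarse saliency map.
    saliency = [
        [sum(channel[r][c] for channel in feature_maps) for c in range(width)]
        for r in range(height)
    ]
    # The strongest response is taken as the object location.
    peak = max(
        ((r, c) for r in range(height) for c in range(width)),
        key=lambda rc: saliency[rc[0]][rc[1]],
    )
    return peak, saliency


# Two hypothetical 3x3 channels that both respond near row 1, column 2.
toy_maps = [
    [[0, 0, 0], [0, 1, 2], [0, 0, 1]],
    [[0, 1, 0], [0, 0, 3], [0, 0, 0]],
]
peak, saliency = localize(toy_maps)
print(peak)
```

Because the map only encodes where the class responds, two objects of the same class would produce two similar peaks, which is the limitation the slide points out.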
Object Tracking: FCNT: Localization
conv4-3 (specific) conv5-3 (general)
On the other hand, feature maps from conv4-3 are more sensitive to intra-class
appearance variation…
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object Tracking: FCNT: Localization
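One way to picture how the two networks might be combined is a distractor check on the general heat map: if too much confidence falls outside the current target box, fall back to the specific network. The sketch below is loosely inspired by the distractor-detection idea in the FCNT paper. The ratio test, the `threshold` value, the function names, and the toy heat maps are all assumptions made for illustration.

```python
def distractor_ratio(heat_map, box):
    """Ratio of confidence outside the target box to confidence inside it.

    box = (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    inside = outside = 0.0
    for r, row in enumerate(heat_map):
        for c, value in enumerate(row):
            if top <= r < bottom and left <= c < right:
                inside += value
            else:
                outside += value
    return outside / inside if inside > 0 else float("inf")


def choose_heat_map(gnet_map, snet_map, box, threshold=0.2):
    """Use the GNet map unless it shows a likely distractor (hypothetical rule)."""
    if distractor_ratio(gnet_map, box) > threshold:
        return "SNet", snet_map
    return "GNet", gnet_map


box = (0, 0, 2, 2)  # target occupies the top-left 2x2 cells
# GNet also fires on a same-class object in the last column.
gnet_map = [[0.9, 0.8, 0.0, 0.7], [0.8, 0.9, 0.0, 0.8], [0.0, 0.0, 0.0, 0.0]]
# SNet, updated online, suppresses that second object.
snet_map = [[0.9, 0.8, 0.0, 0.1], [0.8, 0.9, 0.0, 0.1], [0.0, 0.0, 0.0, 0.0]]

chosen, _ = choose_heat_map(gnet_map, snet_map, box)
print(chosen)
```

In this toy case the general map places a large share of its confidence on the second object, so the selection falls back to the specific network.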
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." CVPR 2016.
Object Tracking: MDNet
Valmadre, Jack, Luca Bertinetto, João F. Henriques, Andrea Vedaldi, and Philip HS Torr. "End-to-end
representation learning for Correlation Filter based tracking." CVPR 2017
Object Tracking: CFNet
VOT-17
Learned !!
Object Tracking: RNN
P. Ondruska and I. Posner, “Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks,” AAAI 2016 [code]
Ning, Guanghan, Zhi Zhang, Chen Huang, Zhihai He, Xiaobo Ren, and Haohong Wang. "Spatially Supervised Recurrent
Convolutional Neural Networks for Visual Object Tracking." ISCAS 2017.
Object Tracking: ROLO
Girdhar, Rohit, Georgia Gkioxari, Lorenzo Torresani, Manohar Paluri, and Du Tran. "Detect-and-Track:
Efficient Pose Estimation in Videos." CVPR 2018.
Keypoint tracking
Real, Esteban, Jonathon Shlens, Stefano Mazzocchi, Xin Pan, and Vincent Vanhoucke. "YouTube-BoundingBoxes: A large high-precision human-annotated data set for object detection in video." CVPR 2017.
Object Tracking: Datasets: YouTube-BB
Object Tracking: Challenges
Leal-Taixé, Laura, Anton Milan, Konrad Schindler, Daniel Cremers, Ian Reid, and Stefan Roth. "Tracking
the trackers: an analysis of the state of the art in multiple object tracking."
Deep Learning online courses by UPC:
● MSc course (2017)
● BSc course (2018)
● 1st edition (2016)
● 2nd edition (2017)
● 3rd edition (2018)
● 1st edition (2017)
● 2nd edition (2018)
Next edition Autumn 2018
Next edition Winter/Spring 2019
Summer School (late June 2018)

More Related Content

What's hot

Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Universitat Politècnica de Catalunya
 
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC BarcelonaSelf-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC BarcelonaUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Learning with Videos (D4L4 2017 UPC Deep Learning for Computer Vision)
Learning with Videos  (D4L4 2017 UPC Deep Learning for Computer Vision)Learning with Videos  (D4L4 2017 UPC Deep Learning for Computer Vision)
Learning with Videos (D4L4 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
Wav2Pix: Speech-conditioned face generation using Generative Adversarial Netw...
Wav2Pix: Speech-conditioned face generation using Generative Adversarial Netw...Wav2Pix: Speech-conditioned face generation using Generative Adversarial Netw...
Wav2Pix: Speech-conditioned face generation using Generative Adversarial Netw...Universitat Politècnica de Catalunya
 
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...Universitat Politècnica de Catalunya
 
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...Universitat Politècnica de Catalunya
 

What's hot (20)

Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
 
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
 
Deep Learning for Video: Action Recognition (UPC 2018)
Deep Learning for Video: Action Recognition (UPC 2018)Deep Learning for Video: Action Recognition (UPC 2018)
Deep Learning for Video: Action Recognition (UPC 2018)
 
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC BarcelonaSelf-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Learning with Videos (D4L4 2017 UPC Deep Learning for Computer Vision)
Learning with Videos  (D4L4 2017 UPC Deep Learning for Computer Vision)Learning with Videos  (D4L4 2017 UPC Deep Learning for Computer Vision)
Learning with Videos (D4L4 2017 UPC Deep Learning for Computer Vision)
 
Deep Language and Vision by Amaia Salvador (Insight DCU 2018)
Deep Language and Vision by Amaia Salvador (Insight DCU 2018)Deep Language and Vision by Amaia Salvador (Insight DCU 2018)
Deep Language and Vision by Amaia Salvador (Insight DCU 2018)
 
Neural Architectures for Video Encoding
Neural Architectures for Video EncodingNeural Architectures for Video Encoding
Neural Architectures for Video Encoding
 
Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)
Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)
Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)
 
Disentangle motion, Foreground and Background Features in Videos
Disentangle motion, Foreground and Background Features in VideosDisentangle motion, Foreground and Background Features in Videos
Disentangle motion, Foreground and Background Features in Videos
 
One Perceptron to Rule Them All: Language and Vision
One Perceptron to Rule Them All: Language and VisionOne Perceptron to Rule Them All: Language and Vision
One Perceptron to Rule Them All: Language and Vision
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)
Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)
Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)
 
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
 
Wav2Pix: Speech-conditioned face generation using Generative Adversarial Netw...
Wav2Pix: Speech-conditioned face generation using Generative Adversarial Netw...Wav2Pix: Speech-conditioned face generation using Generative Adversarial Netw...
Wav2Pix: Speech-conditioned face generation using Generative Adversarial Netw...
 
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...
Closing, Course Offer 17/18 & Homework (D5 2017 UPC Deep Learning for Compute...
 
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019
 
Multimodal Deep Learning
Multimodal Deep LearningMultimodal Deep Learning
Multimodal Deep Learning
 
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vi...
 
Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)
 

Similar to Deep Learning for Video: Object Tracking (UPC 2018)

Real Time Object Detection with Audio Feedback using Yolo v3
Real Time Object Detection with Audio Feedback using Yolo v3Real Time Object Detection with Audio Feedback using Yolo v3
Real Time Object Detection with Audio Feedback using Yolo v3ijtsrd
 
Deep Learning for Computer Vision (2/4): Object Analytics @ laSalle 2016
Deep Learning for Computer Vision (2/4): Object Analytics @ laSalle 2016Deep Learning for Computer Vision (2/4): Object Analytics @ laSalle 2016
Deep Learning for Computer Vision (2/4): Object Analytics @ laSalle 2016Universitat Politècnica de Catalunya
 
Artificial Neural Networks For Spatial Perception: Towards Visual Object Loca...
Artificial Neural Networks For Spatial Perception: Towards Visual Object Loca...Artificial Neural Networks For Spatial Perception: Towards Visual Object Loca...
Artificial Neural Networks For Spatial Perception: Towards Visual Object Loca...Juxi Leitner
 
IRJET- A Survey on Object Detection using Deep Learning Techniques
IRJET- A Survey on Object Detection using Deep Learning TechniquesIRJET- A Survey on Object Detection using Deep Learning Techniques
IRJET- A Survey on Object Detection using Deep Learning TechniquesIRJET Journal
 
Deep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text GenerationDeep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text Generationijtsrd
 
Top Cited Articles in Computer Graphics and Animation
Top Cited Articles in Computer Graphics and AnimationTop Cited Articles in Computer Graphics and Animation
Top Cited Articles in Computer Graphics and Animationijcga
 
Deep convolutional neural networks and their many uses for computer vision
Deep convolutional neural networks and their many uses for computer visionDeep convolutional neural networks and their many uses for computer vision
Deep convolutional neural networks and their many uses for computer visionFares Al-Qunaieer
 
Survey on video object detection & tracking
Survey on video object detection & trackingSurvey on video object detection & tracking
Survey on video object detection & trackingijctet
 
Remote Sensing Image Scene Classification
Remote Sensing Image Scene ClassificationRemote Sensing Image Scene Classification
Remote Sensing Image Scene ClassificationGaurav Singh
 
Video Analysis with Convolutional Neural Networks (Master Computer Vision Bar...
Video Analysis with Convolutional Neural Networks (Master Computer Vision Bar...Video Analysis with Convolutional Neural Networks (Master Computer Vision Bar...
Video Analysis with Convolutional Neural Networks (Master Computer Vision Bar...Universitat Politècnica de Catalunya
 
A novel enhanced algorithm for efficient human tracking
A novel enhanced algorithm for efficient human trackingA novel enhanced algorithm for efficient human tracking
A novel enhanced algorithm for efficient human trackingIJICTJOURNAL
 
Review of Pose Recognition Systems
Review of Pose Recognition SystemsReview of Pose Recognition Systems
Review of Pose Recognition Systemsvivatechijri
 
Bio-inspired Active Vision System
Bio-inspired Active Vision SystemBio-inspired Active Vision System
Bio-inspired Active Vision SystemMartin Peniak
 
ALife in Humanoid Robots #ecal2013
ALife in Humanoid Robots #ecal2013ALife in Humanoid Robots #ecal2013
ALife in Humanoid Robots #ecal2013Juxi Leitner
 
IRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET Journal
 

Similar to Deep Learning for Video: Object Tracking (UPC 2018) (20)

Real Time Object Detection with Audio Feedback using Yolo v3
Real Time Object Detection with Audio Feedback using Yolo v3Real Time Object Detection with Audio Feedback using Yolo v3
Real Time Object Detection with Audio Feedback using Yolo v3
 
Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)
Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)
Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)
 
seminar ppt.pptx
seminar ppt.pptxseminar ppt.pptx
seminar ppt.pptx
 
Deep Learning for Computer Vision (2/4): Object Analytics @ laSalle 2016
Deep Learning for Computer Vision (2/4): Object Analytics @ laSalle 2016Deep Learning for Computer Vision (2/4): Object Analytics @ laSalle 2016
Deep Learning for Computer Vision (2/4): Object Analytics @ laSalle 2016
 
Artificial Neural Networks For Spatial Perception: Towards Visual Object Loca...
Artificial Neural Networks For Spatial Perception: Towards Visual Object Loca...Artificial Neural Networks For Spatial Perception: Towards Visual Object Loca...
Artificial Neural Networks For Spatial Perception: Towards Visual Object Loca...
 
IRJET- A Survey on Object Detection using Deep Learning Techniques
IRJET- A Survey on Object Detection using Deep Learning TechniquesIRJET- A Survey on Object Detection using Deep Learning Techniques
IRJET- A Survey on Object Detection using Deep Learning Techniques
 
Deep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text GenerationDeep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text Generation
 
Top Cited Articles in Computer Graphics and Animation
Top Cited Articles in Computer Graphics and AnimationTop Cited Articles in Computer Graphics and Animation
Top Cited Articles in Computer Graphics and Animation
 
Deep convolutional neural networks and their many uses for computer vision
Deep convolutional neural networks and their many uses for computer visionDeep convolutional neural networks and their many uses for computer vision
Deep convolutional neural networks and their many uses for computer vision
 
Survey on video object detection & tracking
Survey on video object detection & trackingSurvey on video object detection & tracking
Survey on video object detection & tracking
 
Remote Sensing Image Scene Classification
Remote Sensing Image Scene ClassificationRemote Sensing Image Scene Classification
Remote Sensing Image Scene Classification
 
Video Analysis with Convolutional Neural Networks (Master Computer Vision Bar...
Video Analysis with Convolutional Neural Networks (Master Computer Vision Bar...Video Analysis with Convolutional Neural Networks (Master Computer Vision Bar...
Video Analysis with Convolutional Neural Networks (Master Computer Vision Bar...
 
Deep Learning for Computer Vision: Video Analytics (UPC 2016)
Deep Learning for Computer Vision: Video Analytics (UPC 2016)Deep Learning for Computer Vision: Video Analytics (UPC 2016)
Deep Learning for Computer Vision: Video Analytics (UPC 2016)
 
Learning with Unpaired Data
Learning with Unpaired DataLearning with Unpaired Data
Learning with Unpaired Data
 
A novel enhanced algorithm for efficient human tracking
A novel enhanced algorithm for efficient human trackingA novel enhanced algorithm for efficient human tracking
A novel enhanced algorithm for efficient human tracking
 
Review of Pose Recognition Systems
Review of Pose Recognition SystemsReview of Pose Recognition Systems
Review of Pose Recognition Systems
 
Bio-inspired Active Vision System
Bio-inspired Active Vision SystemBio-inspired Active Vision System
Bio-inspired Active Vision System
 
ALife in Humanoid Robots #ecal2013
ALife in Humanoid Robots #ecal2013ALife in Humanoid Robots #ecal2013
ALife in Humanoid Robots #ecal2013
 
image classification
image classificationimage classification
image classification
 
IRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and Python
 

More from Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Universitat Politècnica de Catalunya
 
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Universitat Politècnica de Catalunya
 
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Universitat Politècnica de Catalunya
 

More from Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
 
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
 
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
 

Deep Learning for Video: Object Tracking (UPC 2018)

  • 1. @DocXavi Module 6 Deep Learning for Video: Object Tracking 22nd March 2018 Xavier Giró-i-Nieto [http://pagines.uab.cat/mcv/]
  • 2. Wang, Naiyan, and Dit-Yan Yeung. "Learning a deep compact image representation for visual tracking." NIPS 2013 [Project page with code] DNNs were first used for feature learning, with the learned features later used by a tracker. Object Tracking: Feature Extractor only
  • 3. Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." ICCV 2015. [code] Object Tracking: FCNT: Localization. Although trained for image classification, the feature maps in conv5-3 enable object localization, but they are not discriminative enough to distinguish between instances of the same class.
  • 4. Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." ICCV 2015. [code] Object Tracking: FCNT: Localization. conv4-3 (specific) vs. conv5-3 (general). On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation…
  • 5. SNet = Specific Network (online update). GNet = General Network (fixed). Object Tracking: FCNT: Localization. Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." ICCV 2015. [code]
  • 6. Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." CVPR 2016.
  • 7. Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." CVPR 2016. Object Tracking: MDNet
  • 8. Valmadre, Jack, Luca Bertinetto, João F. Henriques, Andrea Vedaldi, and Philip HS Torr. "End-to-end representation learning for Correlation Filter based tracking." CVPR 2017. Object Tracking: CFNET. VOT-17. Learned !!
  • 9. Wang, Naiyan, and Dit-Yan Yeung. "Learning a deep compact image representation for visual tracking." NIPS 2013 [Project page with code] Object Tracking: RNN
  • 10. P. Ondruska and I. Posner, "Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks," AAAI 2016 [code]
  • 11. Ning, Guanghan, Zhi Zhang, Chen Huang, Zhihai He, Xiaobo Ren, and Haohong Wang. "Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking." ISCAS 2017. Object Tracking: ROLO
  • 12. Ning, Guanghan, Zhi Zhang, Chen Huang, Zhihai He, Xiaobo Ren, and Haohong Wang. "Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking." ISCAS 2017.
  • 13. Girdhar, Rohit, Georgia Gkioxari, Lorenzo Torresani, Manohar Paluri, and Du Tran. "Detect-and-Track: Efficient Pose Estimation in Videos." CVPR 2018. Keypoint tracking
  • 14. Real, Esteban, Jonathon Shlens, Stefano Mazzocchi, Xin Pan, and Vincent Vanhoucke. "Youtube-boundingboxes: A large high-precision human-annotated data set for object detection in video." CVPR 2017. Object tracking: Datasets: YouTube-BB
  • 15. Object Tracking: Challenges. Leal-Taixé, Laura, Anton Milan, Konrad Schindler, Daniel Cremers, Ian Reid, and Stefan Roth. "Tracking the trackers: an analysis of the state of the art in multiple object tracking."
  • 16. Deep Learning online courses by UPC: ● MSc course (2017) ● BSc course (2018) ● 1st edition (2016) ● 2nd edition (2017) ● 3rd edition (2018) ● 1st edition (2017) ● 2nd edition (2018). Next edition Autumn 2018. Next edition Winter/Spring 2019. Summer School (late June 2018)
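
The split between a fixed general network (GNet) and an online-updated specific network (SNet) on slides 3 to 5 can be illustrated with a minimal sketch. It is not the FCNT authors' code: the function names, the `target_box` layout and the `ratio` threshold are hypothetical, and the two heat maps are assumed to be already computed from conv5-3 (GNet) and conv4-3 (SNet) features. The idea shown is only the selection step: trust the general map unless it also responds strongly outside the target region, in which case fall back to the more instance-specific map.

```python
import numpy as np


def peak(heatmap):
    """Return (row, col) of the strongest response in a 2-D heat map."""
    idx = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return tuple(int(i) for i in idx)


def has_distractor(heatmap, target_box, ratio=0.8):
    """Hypothetical check: is there a response outside target_box that
    reaches at least `ratio` times the strongest response inside it?

    target_box is (row_start, row_end, col_start, col_end), end-exclusive.
    """
    r0, r1, c0, c1 = target_box
    inside = heatmap[r0:r1, c0:c1].max()
    outside = heatmap.astype(float)      # astype returns a copy
    outside[r0:r1, c0:c1] = -np.inf      # mask out the target region
    return bool(outside.max() >= ratio * inside)


def locate(gnet_map, snet_map, target_box, ratio=0.8):
    """Pick the general map by default; switch to the specific map
    when the general map shows a likely same-class distractor."""
    if has_distractor(gnet_map, target_box, ratio):
        return peak(snet_map)
    return peak(gnet_map)


# Toy 8x8 maps: the general map fires on the target and on a similar object.
gnet_map = np.zeros((8, 8))
gnet_map[2, 2] = 1.0   # target
gnet_map[6, 6] = 0.9   # distractor of the same class
snet_map = np.zeros((8, 8))
snet_map[2, 3] = 1.0   # specific map responds only near the target

position = locate(gnet_map, snet_map, target_box=(1, 4, 1, 4))
```

In this toy case the general map is ambiguous, so the position comes from the specific map; with the distractor removed, the general map's own peak would be used instead.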