
Please use this identifier to cite or link to this item: http://ntour.ntou.edu.tw:8080/ir/handle/987654321/53900

Title: Unsupervised Learning of Space-time Symmetric Patterns in RGB-D Videos for Human Activity Detection
Authors: Chen, Yun-Jhu (陳韻竹)
Contributors: NTOU: Department of Computer Science and Engineering
Keywords: video activity analysis; classification
Date: 2018
Issue Date: 2020-07-09T02:41:28Z
Abstract: This thesis proposes a method that uses a 3D moment technique to obtain space-time activity vector maps from a video shot. Conventional video classification methods segment a video into clips and then classify the key frame of each clip, without considering the temporal correlation between frames. The object motion between consecutive key frames is an important cue for activity analysis, so this thesis first preprocesses the video sequence to extract the implicit motion vector features of overlapping clips, combines them with the original image data, and feeds the result into a recurrent neural network (RNN) built on a convolutional neural network for video activity analysis.

The thesis has two main parts: data preprocessing that extracts the motion vectors of video clips, and activity analysis that feeds these features, combined with the original image data, into the CNN-based RNN. The preprocessing stage first cuts the frames containing a specific activity class into multiple mutually overlapping video clips. The middle frame of each clip is defined as its key frame. Centered on the edge points of the key frame, each clip is divided into multiple video cubes, and the proposed 3D moment method extracts the space-time features of the objects in each local cube: the spatial features describe the local shape of an object, and the temporal features describe the object motion within the cube. By collecting the space-time features of every pixel in the key frame, we construct the space-time feature image of that key frame. The proposed preprocessing thus converts each frame of the video into the space-time feature image of its corresponding key frame. The resulting sequence of space-time feature images is fed into the RNN, which further combines them to strengthen the space-time features needed for accurate activity analysis. Because the proposed space-time feature image representation supplies space-time information beyond the raw image sequence, it yields more accurate activity analysis results.
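The 3D moment extraction described in the abstract can be sketched as follows. This is a minimal illustration only: it assumes the space-time features are standard central 3D moments computed over each local cube, since the exact moment orders and normalization used in the thesis are not specified in this record, and the function names are hypothetical.

    import numpy as np

    def central_moment_3d(cube, p, q, r):
        """Central 3D moment of order (p, q, r) over one space-time cube.
        cube: ndarray of shape (T, H, W) with the grayscale intensities of a
        local video cube centered on an edge point of the key frame."""
        T, H, W = cube.shape
        t, y, x = np.meshgrid(np.arange(T), np.arange(H), np.arange(W),
                              indexing="ij")
        m000 = cube.sum()
        if m000 == 0:
            return 0.0
        # Intensity centroid in space and time.
        tc = (t * cube).sum() / m000
        yc = (y * cube).sum() / m000
        xc = (x * cube).sum() / m000
        return float(((x - xc) ** p * (y - yc) ** q * (t - tc) ** r
                      * cube).sum())

    def spacetime_feature(cube):
        """Assumed three-component descriptor for one cube: two spatial
        second-order moments (local object shape) and one temporal
        second-order moment (object motion within the cube)."""
        return np.array([
            central_moment_3d(cube, 2, 0, 0),  # spatial spread along x
            central_moment_3d(cube, 0, 2, 0),  # spatial spread along y
            central_moment_3d(cube, 0, 0, 2),  # temporal variation (motion)
        ])

For each edge point of the key frame, one such cube would be sampled from the overlapping clip, and its feature vector becomes one pixel of the space-time feature image described above.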
This paper presents an approach to obtain the space-time motion vectors in a video shot using the proposed 3D moment method. The recognition of human activity in a video can be formulated as a video classification problem: first segment the input video into video shots, estimate the activity score of each shot with a classifier, and accumulate the scores of consecutive shots to determine the activity boundaries. However, classification accuracy degrades dramatically when time-related features, i.e., motion vectors or optical flows across frames, are not considered during training. This paper proposes an approach that embeds the implicit motion vector map into the individual frames of a video. This enlarges the number of channels per frame from 3 (4) to 6: the first three channels are the original RGB (RGB-D) data and the remaining three are the three-dimensional space-time motion vector. The effectiveness of the space-time motion features is first verified with a classifier based on the k-nearest-neighbor rule (k-NNR), which uses kernel PCA to construct an activity voting dictionary for activity detection. The motion feature maps are also embedded into video shots and used to train a recurrent neural network (RNN) for further analysis of the behavior in a video. Built on the 3D moment method, the proposed video classifier includes a pixel-wise space-time motion vector that considers both the appearance changes and the temporal motion of objects across frames, improving the accuracy of human activity recognition and detection. Experimental results demonstrate the effectiveness of the proposed method in terms of recognition accuracy and execution speed.
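The per-frame channel embedding described above (original RGB data plus a three-dimensional motion vector per pixel) can be sketched as below. The helper name and stacking order are assumptions for illustration, not the thesis's actual code; the sketch shows the RGB case (3 + 3 = 6 channels).

    import numpy as np

    def embed_motion_channels(rgb_frame, motion_map):
        """Stack an RGB key frame (H, W, 3) with its per-pixel space-time
        motion vector map (H, W, 3), growing the frame from 3 to 6 channels.
        An RGB-D frame (H, W, 4) would be stacked analogously."""
        assert rgb_frame.shape[:2] == motion_map.shape[:2]
        return np.concatenate([rgb_frame, motion_map], axis=-1)

    # Example: a 6-channel frame ready to feed to the CNN-based RNN.
    frame = np.random.rand(240, 320, 3)
    motion = np.random.rand(240, 320, 3)
    fused = embed_motion_channels(frame, motion)  # shape (240, 320, 6)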
URI: http://ethesys.lib.ntou.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=G0010557026.id
http://ntour.ntou.edu.tw:8080/ir/handle/987654321/53900
Appears in Collections: [Department of Computer Science and Engineering] Master's and Doctoral Theses

Files in This Item:

File        Size  Format
index.html  0 Kb  HTML


All items in NTOUR are protected by copyright, with all rights reserved.