Given an input video sequence of a single person performing a series of continuous actions, we consider the problem of jointly segmenting and recognizing those actions.
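If a person performs a series of continuous actions that are captured in a video, the problem to consider is how to segment and recognize the actions from that video.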
Each frame of the input sequence is segmented into arbitrarily shaped image regions, called video object planes (VOPs), such that each VOP describes one semantically meaningful object or piece of video content of interest.
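Each frame of the input video sequence is divided into arbitrarily shaped video object planes (VOPs), so that each VOP describes one semantically meaningful object or item of video content of interest.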