@datafire/google_videointelligence
v6.0.0
Client library for Cloud Video Intelligence API
Installation and Usage
npm install --save @datafire/google_videointelligence
let google_videointelligence = require('@datafire/google_videointelligence').create({
  access_token: "",
  refresh_token: "",
  client_id: "",
  client_secret: "",
  redirect_uri: ""
});

google_videointelligence.videointelligence.videos.annotate({}).then(data => {
  console.log(data);
});
Description
Detects objects, explicit content, and scene changes in videos. It also specifies the region for annotation and transcribes speech to text. Supports both asynchronous API and streaming API.
Actions
oauthCallback
Exchange the code passed to your redirect URI for an access_token
google_videointelligence.oauthCallback({
"code": ""
}, context)
Input
- input
object
- code required
string
Output
- output
object
- access_token
string
- refresh_token
string
- token_type
string
- scope
string
- expiration
string
oauthRefresh
Exchange a refresh_token for an access_token
google_videointelligence.oauthRefresh(null, context)
Input
This action has no parameters
Output
- output
object
- access_token
string
- refresh_token
string
- token_type
string
- scope
string
- expiration
string
videointelligence.videos.annotate
Performs asynchronous video annotation. Progress and results can be retrieved through the google.longrunning.Operations interface. Operation.metadata contains AnnotateVideoProgress (progress). Operation.response contains AnnotateVideoResponse (results).
google_videointelligence.videointelligence.videos.annotate({}, context)
Input
- input
object
- body GoogleCloudVideointelligenceV1p3beta1_AnnotateVideoRequest
- $.xgafv
string
(values: 1, 2): V1 error format.
- access_token
string
: OAuth access token.
- alt
string
(values: json, media, proto): Data format for response.
- callback
string
: JSONP
- fields
string
: Selector specifying which fields to include in a partial response.
- key
string
: API key. Your API key identifies your project and provides you with API access, quota, and reports. Required unless you provide an OAuth 2.0 token.
- oauth_token
string
: OAuth 2.0 token for the current user.
- prettyPrint
boolean
: Returns response with indentations and line breaks.
- quotaUser
string
: Available to use for quota purposes for server-side applications. Can be any arbitrary string assigned to a user, but should not exceed 40 characters.
- upload_protocol
string
: Upload protocol for media (e.g. "raw", "multipart").
- uploadType
string
: Legacy upload protocol for media (e.g. "media", "multipart").
Output
- output GoogleLongrunning_Operation
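As a sketch of how a request body for this action might be assembled, the helper below builds the `input` object for `videos.annotate`. The bucket URI and feature names are illustrative placeholders, not values taken from this package:

```javascript
// Sketch: build an input object for the annotate action.
// inputUri and features are caller-supplied; the example values in the
// usage comment below are hypothetical.
function buildAnnotateRequest(inputUri, features) {
  return {
    body: {
      inputUri: inputUri,   // e.g. a gs:// Cloud Storage URI
      features: features    // e.g. ["LABEL_DETECTION", "SHOT_CHANGE_DETECTION"]
    }
  };
}

// Usage (assumes google_videointelligence was created as shown above):
// google_videointelligence.videointelligence.videos
//   .annotate(buildAnnotateRequest("gs://my-bucket/video.mp4", ["LABEL_DETECTION"]), context)
//   .then(operation => console.log(operation.name));
```

The returned GoogleLongrunning_Operation must then be polled until it completes, at which point its response field contains the AnnotateVideoResponse.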
Definitions
GoogleCloudVideointelligenceV1_AnnotateVideoProgress
- GoogleCloudVideointelligenceV1_AnnotateVideoProgress
object
: Video annotation progress. Included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
- annotationProgress
array
: Progress metadata for all videos specified in AnnotateVideoRequest.
GoogleCloudVideointelligenceV1_AnnotateVideoResponse
- GoogleCloudVideointelligenceV1_AnnotateVideoResponse
object
: Video annotation response. Included in the response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
- annotationResults
array
: Annotation results for all videos specified in AnnotateVideoRequest.
GoogleCloudVideointelligenceV1_DetectedAttribute
- GoogleCloudVideointelligenceV1_DetectedAttribute
object
: A generic detected attribute represented by name in string format.
- confidence
number
: Detected attribute confidence. Range [0, 1].
- name
string
: The name of the attribute, for example, glasses, dark_glasses, mouth_open. A full list of supported type names will be provided in the document.
- value
string
: Text value of the detection result. For example, the value for "HairColor" can be "black", "blonde", etc.
GoogleCloudVideointelligenceV1_DetectedLandmark
- GoogleCloudVideointelligenceV1_DetectedLandmark
object
: A generic detected landmark represented by name in string format and a 2D location.
- confidence
number
: The confidence score of the detected landmark. Range [0, 1].
- name
string
: The name of this landmark, for example, left_hand, right_shoulder.
- point GoogleCloudVideointelligenceV1_NormalizedVertex
GoogleCloudVideointelligenceV1_Entity
- GoogleCloudVideointelligenceV1_Entity
object
: Detected entity from video analysis.
- description
string
: Textual description, e.g., Fixed-gear bicycle.
- entityId
string
: Opaque entity ID. Some IDs may be available in the Google Knowledge Graph Search API.
- languageCode
string
: Language code for description in BCP-47 format.
GoogleCloudVideointelligenceV1_ExplicitContentAnnotation
- GoogleCloudVideointelligenceV1_ExplicitContentAnnotation
object
: Explicit content annotation (based on per-frame visual signals only). If no explicit content has been detected in a frame, no annotations are present for that frame.
- frames
array
: All video frames where explicit content was detected.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1_ExplicitContentFrame
- GoogleCloudVideointelligenceV1_ExplicitContentFrame
object
: Video frame level annotation results for explicit content.
- pornographyLikelihood
string
(values: LIKELIHOOD_UNSPECIFIED, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY): Likelihood of pornographic content.
- timeOffset
string
: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.
GoogleCloudVideointelligenceV1_FaceAnnotation
- GoogleCloudVideointelligenceV1_FaceAnnotation
object
: Deprecated. No effect.
- frames
array
: All video frames where a face was detected.
- segments
array
: All video segments where a face was detected.
- thumbnail
string
: Thumbnail of a representative face view (in JPEG format).
GoogleCloudVideointelligenceV1_FaceDetectionAnnotation
- GoogleCloudVideointelligenceV1_FaceDetectionAnnotation
object
: Face detection annotation.
- thumbnail
string
: The thumbnail of a person's face.
- tracks
array
: The face tracks with attributes.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1_FaceFrame
- GoogleCloudVideointelligenceV1_FaceFrame
object
: Deprecated. No effect.
- normalizedBoundingBoxes
array
: Normalized bounding boxes in a frame. There can be more than one box if the same face is detected in multiple locations within the current frame.
- timeOffset
string
: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.
GoogleCloudVideointelligenceV1_FaceSegment
- GoogleCloudVideointelligenceV1_FaceSegment
object
: Video segment level annotation results for face detection.
GoogleCloudVideointelligenceV1_LabelAnnotation
- GoogleCloudVideointelligenceV1_LabelAnnotation
object
: Label annotation.
- categoryEntities
array
: Common categories for the detected entity. For example, when the label is Terrier, the category is likely dog. In some cases there might be more than one category, e.g., Terrier could also be a pet.
- entity GoogleCloudVideointelligenceV1_Entity
- frames
array
: All video frames where a label was detected.
- segments
array
: All video segments where a label was detected.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1_LabelFrame
- GoogleCloudVideointelligenceV1_LabelFrame
object
: Video frame level annotation results for label detection.
- confidence
number
: Confidence that the label is accurate. Range: [0, 1].
- timeOffset
string
: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.
GoogleCloudVideointelligenceV1_LabelSegment
- GoogleCloudVideointelligenceV1_LabelSegment
object
: Video segment level annotation results for label detection.
- confidence
number
: Confidence that the label is accurate. Range: [0, 1].
- segment GoogleCloudVideointelligenceV1_VideoSegment
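Since label segments carry per-segment confidence scores, a common post-processing step is to drop low-confidence hits. A minimal sketch, assuming an annotation object shaped like the LabelAnnotation definition above (the 0.5 threshold is an arbitrary example):

```javascript
// Sketch: keep only the segments of a LabelAnnotation whose confidence
// meets a caller-chosen threshold. Returns [] if there are no segments.
function confidentSegments(labelAnnotation, threshold) {
  return (labelAnnotation.segments || []).filter(s => s.confidence >= threshold);
}

// Usage: confidentSegments(annotation, 0.5)
```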
GoogleCloudVideointelligenceV1_LogoRecognitionAnnotation
- GoogleCloudVideointelligenceV1_LogoRecognitionAnnotation
object
: Annotation corresponding to one detected, tracked and recognized logo class.
- entity GoogleCloudVideointelligenceV1_Entity
- segments
array
: All video segments where the recognized logo appears. There might be multiple instances of the same logo class appearing in one VideoSegment.
- tracks
array
: All logo tracks where the recognized logo appears. Each track corresponds to one logo instance appearing in consecutive frames.
GoogleCloudVideointelligenceV1_NormalizedBoundingBox
- GoogleCloudVideointelligenceV1_NormalizedBoundingBox
object
: Normalized bounding box. The normalized vertex coordinates are relative to the original image. Range: [0, 1].
- bottom
number
: Bottom Y coordinate.
- left
number
: Left X coordinate.
- right
number
: Right X coordinate.
- top
number
: Top Y coordinate.
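Because the coordinates are normalized to [0, 1], drawing a box on an actual frame requires scaling by the frame dimensions. A minimal sketch (the clamping guards against the slight out-of-range values the API can return):

```javascript
// Sketch: convert a NormalizedBoundingBox ([0, 1] coordinates) into pixel
// coordinates for a frame of known size. Values are clamped to the frame,
// since normalized coordinates can occasionally fall slightly outside [0, 1].
function toPixelBox(box, frameWidth, frameHeight) {
  const clamp = v => Math.min(1, Math.max(0, v));
  return {
    left: Math.round(clamp(box.left) * frameWidth),
    top: Math.round(clamp(box.top) * frameHeight),
    right: Math.round(clamp(box.right) * frameWidth),
    bottom: Math.round(clamp(box.bottom) * frameHeight)
  };
}
```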
GoogleCloudVideointelligenceV1_NormalizedBoundingPoly
- GoogleCloudVideointelligenceV1_NormalizedBoundingPoly
object
: Normalized bounding polygon for text (that might not be aligned with axis). Contains a list of the corner points in clockwise order starting from the top-left corner. For example, for a rectangular bounding box: when the text is horizontal it might look like 0----1 | | 3----2. When it is rotated 180 degrees clockwise around the top-left corner it becomes 2----3 | | 1----0, and the vertex order will still be (0, 1, 2, 3). Note that values can be less than 0, or greater than 1, due to trigonometric calculations for the location of the box.
- vertices
array
: Normalized vertices of the bounding polygon.
GoogleCloudVideointelligenceV1_NormalizedVertex
- GoogleCloudVideointelligenceV1_NormalizedVertex
object
: A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
- x
number
: X coordinate.
- y
number
: Y coordinate.
GoogleCloudVideointelligenceV1_ObjectTrackingAnnotation
- GoogleCloudVideointelligenceV1_ObjectTrackingAnnotation
object
: Annotations corresponding to one tracked object.
- confidence
number
: Object category's labeling confidence of this track.
- entity GoogleCloudVideointelligenceV1_Entity
- frames
array
: Information corresponding to all frames where this object track appears. Non-streaming batch mode: it may be one or multiple ObjectTrackingFrame messages in frames. Streaming mode: it can only be one ObjectTrackingFrame message in frames.
- segment GoogleCloudVideointelligenceV1_VideoSegment
- trackId
string
: Streaming mode ONLY. In streaming mode, we do not know the end time of a tracked object before it is completed. Hence, there is no VideoSegment info returned. Instead, we provide a unique, identifiable integer track_id so that customers can correlate the results of the ongoing ObjectTrackAnnotation of the same track_id over time.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1_ObjectTrackingFrame
- GoogleCloudVideointelligenceV1_ObjectTrackingFrame
object
: Video frame level annotations for object detection and tracking. This field stores per-frame location, time offset, and confidence.
- normalizedBoundingBox GoogleCloudVideointelligenceV1_NormalizedBoundingBox
- timeOffset
string
: The timestamp of the frame in microseconds.
GoogleCloudVideointelligenceV1_PersonDetectionAnnotation
- GoogleCloudVideointelligenceV1_PersonDetectionAnnotation
object
: Person detection annotation per video.
- tracks
array
: The detected tracks of a person.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1_SpeechRecognitionAlternative
- GoogleCloudVideointelligenceV1_SpeechRecognitionAlternative
object
: Alternative hypotheses (a.k.a. n-best list).
- confidence
number
: Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to always be provided. The default of 0.0 is a sentinel value indicating confidence was not set.
- transcript
string
: Transcript text representing the words that the user spoke.
- words
array
: Output only. A list of word-specific information for each recognized word. Note: when enable_speaker_diarization is set to true, you will see all the words from the beginning of the audio.
GoogleCloudVideointelligenceV1_SpeechTranscription
- GoogleCloudVideointelligenceV1_SpeechTranscription
object
: A speech recognition result corresponding to a portion of the audio.
- alternatives
array
: May contain one or more recognition hypotheses (up to the maximum specified in max_alternatives). These alternatives are ordered in terms of accuracy, with the top (first) alternative being the most probable, as ranked by the recognizer.
- languageCode
string
: Output only. The BCP-47 language tag of the language in this result. This language code was detected as the most likely language spoken in the audio.
GoogleCloudVideointelligenceV1_TextAnnotation
- GoogleCloudVideointelligenceV1_TextAnnotation
object
: Annotations related to one detected OCR text snippet. This will contain the corresponding text, confidence value, and frame level information for each detection.
- segments
array
: All video segments where OCR detected text appears.
- text
string
: The detected text.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1_TextFrame
- GoogleCloudVideointelligenceV1_TextFrame
object
: Video frame level annotation results for text annotation (OCR). Contains information regarding timestamp and bounding box locations for the frames containing detected OCR text snippets.
- rotatedBoundingBox GoogleCloudVideointelligenceV1_NormalizedBoundingPoly
- timeOffset
string
: Timestamp of this frame.
GoogleCloudVideointelligenceV1_TextSegment
- GoogleCloudVideointelligenceV1_TextSegment
object
: Video segment level annotation results for text detection.
- confidence
number
: Confidence for the track of detected text. It is calculated as the highest over all frames where OCR detected text appears.
- frames
array
: Information related to the frames where OCR detected text appears.
- segment GoogleCloudVideointelligenceV1_VideoSegment
GoogleCloudVideointelligenceV1_TimestampedObject
- GoogleCloudVideointelligenceV1_TimestampedObject
object
: For tracking related features. An object at time_offset with attributes, and located with normalized_bounding_box.
- attributes
array
: Optional. The attributes of the object in the bounding box.
- landmarks
array
: Optional. The detected landmarks.
- normalizedBoundingBox GoogleCloudVideointelligenceV1_NormalizedBoundingBox
- timeOffset
string
: Time-offset, relative to the beginning of the video, corresponding to the video frame for this object.
GoogleCloudVideointelligenceV1_Track
- GoogleCloudVideointelligenceV1_Track
object
: A track of an object instance.
- attributes
array
: Optional. Attributes at the track level.
- confidence
number
: Optional. The confidence score of the tracked object.
- segment GoogleCloudVideointelligenceV1_VideoSegment
- timestampedObjects
array
: The object with timestamp and attributes per frame in the track.
GoogleCloudVideointelligenceV1_VideoAnnotationProgress
- GoogleCloudVideointelligenceV1_VideoAnnotationProgress
object
: Annotation progress for a single video.
- feature
string
(values: FEATURE_UNSPECIFIED, LABEL_DETECTION, SHOT_CHANGE_DETECTION, EXPLICIT_CONTENT_DETECTION, FACE_DETECTION, SPEECH_TRANSCRIPTION, TEXT_DETECTION, OBJECT_TRACKING, LOGO_RECOGNITION, PERSON_DETECTION): Specifies which feature is being tracked if the request contains more than one feature.
- inputUri
string
: Video file location in Cloud Storage.
- progressPercent
integer
: Approximate percentage processed thus far. Guaranteed to be 100 when fully processed.
- segment GoogleCloudVideointelligenceV1_VideoSegment
- startTime
string
: Time when the request was received.
- updateTime
string
: Time of the most recent update.
GoogleCloudVideointelligenceV1_VideoAnnotationResults
- GoogleCloudVideointelligenceV1_VideoAnnotationResults
object
: Annotation results for a single video.
- error GoogleRpc_Status
- explicitAnnotation GoogleCloudVideointelligenceV1_ExplicitContentAnnotation
- faceAnnotations
array
: Deprecated. Please use face_detection_annotations instead.
- faceDetectionAnnotations
array
: Face detection annotations.
- frameLabelAnnotations
array
: Label annotations on frame level. There is exactly one element for each unique label.
- inputUri
string
: Video file location in Cloud Storage.
- logoRecognitionAnnotations
array
: Annotations for the list of logos detected, tracked and recognized in the video.
- objectAnnotations
array
: Annotations for the list of objects detected and tracked in the video.
- personDetectionAnnotations
array
: Person detection annotations.
- segment GoogleCloudVideointelligenceV1_VideoSegment
- segmentLabelAnnotations
array
: Topical label annotations on video level or user-specified segment level. There is exactly one element for each unique label.
- segmentPresenceLabelAnnotations
array
: Presence label annotations on video level or user-specified segment level. There is exactly one element for each unique label. Compared to the existing topical segment_label_annotations, this field presents more fine-grained, segment-level labels detected in video content and is made available only when the client sets LabelDetectionConfig.model to "builtin/latest" in the request.
- shotAnnotations
array
: Shot annotations. Each shot is represented as a video segment.
- shotLabelAnnotations
array
: Topical label annotations on shot level. There is exactly one element for each unique label.
- shotPresenceLabelAnnotations
array
: Presence label annotations on shot level. There is exactly one element for each unique label. Compared to the existing topical shot_label_annotations, this field presents more fine-grained, shot-level labels detected in video content and is made available only when the client sets LabelDetectionConfig.model to "builtin/latest" in the request.
- speechTranscriptions
array
: Speech transcriptions.
- textAnnotations
array
: OCR text detection and tracking. Annotations for the list of detected text snippets. Each will have a list of frame information associated with it.
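Since shotAnnotations is a list of VideoSegment objects, a typical consumer flattens it into numeric shot boundaries. A sketch, assuming the time-offsets arrive as JSON Duration strings such as "3.5s":

```javascript
// Sketch: turn a VideoAnnotationResults object's shotAnnotations (each a
// VideoSegment) into [startSeconds, endSeconds] pairs.
function shotBoundaries(results) {
  const toSec = d => parseFloat(String(d).replace(/s$/, ""));
  return (results.shotAnnotations || []).map(s =>
    [toSec(s.startTimeOffset), toSec(s.endTimeOffset)]);
}
```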
GoogleCloudVideointelligenceV1_VideoSegment
- GoogleCloudVideointelligenceV1_VideoSegment
object
: Video segment.
- endTimeOffset
string
: Time-offset, relative to the beginning of the video, corresponding to the end of the segment (inclusive).
- startTimeOffset
string
: Time-offset, relative to the beginning of the video, corresponding to the start of the segment (inclusive).
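In the JSON response these offsets are serialized as Duration strings such as "12.5s" (seconds with an "s" suffix), so computing a segment's length means parsing both ends. A minimal sketch:

```javascript
// Sketch: parse a JSON Duration string like "12.5s" into seconds,
// then compute a VideoSegment's length.
function durationToSeconds(d) {
  return parseFloat(String(d).replace(/s$/, ""));
}

function segmentLengthSeconds(segment) {
  return durationToSeconds(segment.endTimeOffset) -
         durationToSeconds(segment.startTimeOffset);
}
```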
GoogleCloudVideointelligenceV1_WordInfo
- GoogleCloudVideointelligenceV1_WordInfo
object
: Word-specific information for recognized words. Word information is only included in the response when certain request parameters are set, such as enable_word_time_offsets.
- confidence
number
: Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to always be provided. The default of 0.0 is a sentinel value indicating confidence was not set.
- endTime
string
: Time offset relative to the beginning of the audio, corresponding to the end of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.
- speakerTag
integer
: Output only. A distinct integer value is assigned for every speaker within the audio. This field specifies which one of those speakers was detected to have spoken this word. Value ranges from 1 up to diarization_speaker_count, and is only set if speaker diarization is enabled.
- startTime
string
: Time offset relative to the beginning of the audio, corresponding to the start of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.
- word
string
: The word corresponding to this set of information.
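When speaker diarization is enabled, each WordInfo carries a speakerTag, so a per-speaker transcript can be assembled from the word list. A sketch (words without a tag are grouped under key 0):

```javascript
// Sketch: group recognized words by speakerTag. speakerTag is only set
// when speaker diarization is enabled; untagged words fall into group 0.
function wordsBySpeaker(words) {
  const groups = {};
  for (const w of words) {
    const tag = w.speakerTag || 0;
    (groups[tag] = groups[tag] || []).push(w.word);
  }
  return groups;
}
```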
GoogleCloudVideointelligenceV1beta2_AnnotateVideoProgress
- GoogleCloudVideointelligenceV1beta2_AnnotateVideoProgress
object
: Video annotation progress. Included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
- annotationProgress
array
: Progress metadata for all videos specified in AnnotateVideoRequest.
GoogleCloudVideointelligenceV1beta2_AnnotateVideoResponse
- GoogleCloudVideointelligenceV1beta2_AnnotateVideoResponse
object
: Video annotation response. Included in the response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
- annotationResults
array
: Annotation results for all videos specified in AnnotateVideoRequest.
GoogleCloudVideointelligenceV1beta2_DetectedAttribute
- GoogleCloudVideointelligenceV1beta2_DetectedAttribute
object
: A generic detected attribute represented by name in string format.
- confidence
number
: Detected attribute confidence. Range [0, 1].
- name
string
: The name of the attribute, for example, glasses, dark_glasses, mouth_open. A full list of supported type names will be provided in the document.
- value
string
: Text value of the detection result. For example, the value for "HairColor" can be "black", "blonde", etc.
GoogleCloudVideointelligenceV1beta2_DetectedLandmark
- GoogleCloudVideointelligenceV1beta2_DetectedLandmark
object
: A generic detected landmark represented by name in string format and a 2D location.
- confidence
number
: The confidence score of the detected landmark. Range [0, 1].
- name
string
: The name of this landmark, for example, left_hand, right_shoulder.
- point GoogleCloudVideointelligenceV1beta2_NormalizedVertex
GoogleCloudVideointelligenceV1beta2_Entity
- GoogleCloudVideointelligenceV1beta2_Entity
object
: Detected entity from video analysis.
- description
string
: Textual description, e.g., Fixed-gear bicycle.
- entityId
string
: Opaque entity ID. Some IDs may be available in the Google Knowledge Graph Search API.
- languageCode
string
: Language code for description in BCP-47 format.
GoogleCloudVideointelligenceV1beta2_ExplicitContentAnnotation
- GoogleCloudVideointelligenceV1beta2_ExplicitContentAnnotation
object
: Explicit content annotation (based on per-frame visual signals only). If no explicit content has been detected in a frame, no annotations are present for that frame.
- frames
array
: All video frames where explicit content was detected.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1beta2_ExplicitContentFrame
- GoogleCloudVideointelligenceV1beta2_ExplicitContentFrame
object
: Video frame level annotation results for explicit content.
- pornographyLikelihood
string
(values: LIKELIHOOD_UNSPECIFIED, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY): Likelihood of pornographic content.
- timeOffset
string
: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.
GoogleCloudVideointelligenceV1beta2_FaceAnnotation
- GoogleCloudVideointelligenceV1beta2_FaceAnnotation
object
: Deprecated. No effect.
- frames
array
: All video frames where a face was detected.
- segments
array
: All video segments where a face was detected.
- thumbnail
string
: Thumbnail of a representative face view (in JPEG format).
GoogleCloudVideointelligenceV1beta2_FaceDetectionAnnotation
- GoogleCloudVideointelligenceV1beta2_FaceDetectionAnnotation
object
: Face detection annotation.
- thumbnail
string
: The thumbnail of a person's face.
- tracks
array
: The face tracks with attributes.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1beta2_FaceFrame
- GoogleCloudVideointelligenceV1beta2_FaceFrame
object
: Deprecated. No effect.
- normalizedBoundingBoxes
array
: Normalized bounding boxes in a frame. There can be more than one box if the same face is detected in multiple locations within the current frame.
- timeOffset
string
: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.
GoogleCloudVideointelligenceV1beta2_FaceSegment
- GoogleCloudVideointelligenceV1beta2_FaceSegment
object
: Video segment level annotation results for face detection.
GoogleCloudVideointelligenceV1beta2_LabelAnnotation
- GoogleCloudVideointelligenceV1beta2_LabelAnnotation
object
: Label annotation.
- categoryEntities
array
: Common categories for the detected entity. For example, when the label is Terrier, the category is likely dog. In some cases there might be more than one category, e.g., Terrier could also be a pet.
- entity GoogleCloudVideointelligenceV1beta2_Entity
- frames
array
: All video frames where a label was detected.
- segments
array
: All video segments where a label was detected.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1beta2_LabelFrame
- GoogleCloudVideointelligenceV1beta2_LabelFrame
object
: Video frame level annotation results for label detection.
- confidence
number
: Confidence that the label is accurate. Range: [0, 1].
- timeOffset
string
: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.
GoogleCloudVideointelligenceV1beta2_LabelSegment
- GoogleCloudVideointelligenceV1beta2_LabelSegment
object
: Video segment level annotation results for label detection.
- confidence
number
: Confidence that the label is accurate. Range: [0, 1].
- segment GoogleCloudVideointelligenceV1beta2_VideoSegment
GoogleCloudVideointelligenceV1beta2_LogoRecognitionAnnotation
- GoogleCloudVideointelligenceV1beta2_LogoRecognitionAnnotation
object
: Annotation corresponding to one detected, tracked and recognized logo class.
- entity GoogleCloudVideointelligenceV1beta2_Entity
- segments
array
: All video segments where the recognized logo appears. There might be multiple instances of the same logo class appearing in one VideoSegment.
- tracks
array
: All logo tracks where the recognized logo appears. Each track corresponds to one logo instance appearing in consecutive frames.
GoogleCloudVideointelligenceV1beta2_NormalizedBoundingBox
- GoogleCloudVideointelligenceV1beta2_NormalizedBoundingBox
object
: Normalized bounding box. The normalized vertex coordinates are relative to the original image. Range: [0, 1].
- bottom
number
: Bottom Y coordinate.
- left
number
: Left X coordinate.
- right
number
: Right X coordinate.
- top
number
: Top Y coordinate.
GoogleCloudVideointelligenceV1beta2_NormalizedBoundingPoly
- GoogleCloudVideointelligenceV1beta2_NormalizedBoundingPoly
object
: Normalized bounding polygon for text (that might not be aligned with axis). Contains a list of the corner points in clockwise order starting from the top-left corner. For example, for a rectangular bounding box: when the text is horizontal it might look like 0----1 | | 3----2. When it is rotated 180 degrees clockwise around the top-left corner it becomes 2----3 | | 1----0, and the vertex order will still be (0, 1, 2, 3). Note that values can be less than 0, or greater than 1, due to trigonometric calculations for the location of the box.
- vertices
array
: Normalized vertices of the bounding polygon.
GoogleCloudVideointelligenceV1beta2_NormalizedVertex
- GoogleCloudVideointelligenceV1beta2_NormalizedVertex
object
: A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
- x
number
: X coordinate.
- y
number
: Y coordinate.
GoogleCloudVideointelligenceV1beta2_ObjectTrackingAnnotation
- GoogleCloudVideointelligenceV1beta2_ObjectTrackingAnnotation
object
: Annotations corresponding to one tracked object.
- confidence
number
: Object category's labeling confidence of this track.
- entity GoogleCloudVideointelligenceV1beta2_Entity
- frames
array
: Information corresponding to all frames where this object track appears. Non-streaming batch mode: it may be one or multiple ObjectTrackingFrame messages in frames. Streaming mode: it can only be one ObjectTrackingFrame message in frames.
- segment GoogleCloudVideointelligenceV1beta2_VideoSegment
- trackId
string
: Streaming mode ONLY. In streaming mode, we do not know the end time of a tracked object before it is completed. Hence, there is no VideoSegment info returned. Instead, we provide a unique, identifiable integer track_id so that customers can correlate the results of the ongoing ObjectTrackAnnotation of the same track_id over time.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1beta2_ObjectTrackingFrame
- GoogleCloudVideointelligenceV1beta2_ObjectTrackingFrame
object
: Video frame level annotations for object detection and tracking. This field stores per-frame location, time offset, and confidence.
- normalizedBoundingBox GoogleCloudVideointelligenceV1beta2_NormalizedBoundingBox
- timeOffset
string
: The timestamp of the frame in microseconds.
GoogleCloudVideointelligenceV1beta2_PersonDetectionAnnotation
- GoogleCloudVideointelligenceV1beta2_PersonDetectionAnnotation
object
: Person detection annotation per video.
- tracks
array
: The detected tracks of a person.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1beta2_SpeechRecognitionAlternative
- GoogleCloudVideointelligenceV1beta2_SpeechRecognitionAlternative
object
: Alternative hypotheses (a.k.a. n-best list).
- confidence
number
: Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to always be provided. The default of 0.0 is a sentinel value indicating confidence was not set.
- transcript
string
: Transcript text representing the words that the user spoke.
- words
array
: Output only. A list of word-specific information for each recognized word. Note: when enable_speaker_diarization is set to true, you will see all the words from the beginning of the audio.
GoogleCloudVideointelligenceV1beta2_SpeechTranscription
- GoogleCloudVideointelligenceV1beta2_SpeechTranscription
object
: A speech recognition result corresponding to a portion of the audio.
- alternatives
array
: May contain one or more recognition hypotheses (up to the maximum specified in max_alternatives). These alternatives are ordered in terms of accuracy, with the top (first) alternative being the most probable, as ranked by the recognizer.
- languageCode
string
: Output only. The BCP-47 language tag of the language in this result. This language code was detected as the most likely language spoken in the audio.
GoogleCloudVideointelligenceV1beta2_TextAnnotation
- GoogleCloudVideointelligenceV1beta2_TextAnnotation
object
: Annotations related to one detected OCR text snippet. This will contain the corresponding text, confidence value, and frame level information for each detection.
- segments
array
: All video segments where OCR detected text appears.
- text
string
: The detected text.
- version
string
: Feature version.
GoogleCloudVideointelligenceV1beta2_TextFrame
- GoogleCloudVideointelligenceV1beta2_TextFrame
object
: Video frame level annotation results for text annotation (OCR). Contains information regarding timestamp and bounding box locations for the frames containing detected OCR text snippets.- rotatedBoundingBox GoogleCloudVideointelligenceV1beta2_NormalizedBoundingPoly
- timeOffset
string
: Timestamp of this frame.
GoogleCloudVideointelligenceV1beta2_TextSegment
- GoogleCloudVideointelligenceV1beta2_TextSegment
object
: Video segment level annotation results for text detection.- confidence
number
: Confidence for the track of detected text. It is calculated as the highest over all frames where OCR detected text appears. - frames
array
: Information related to the frames where OCR detected text appears. - segment GoogleCloudVideointelligenceV1beta2_VideoSegment
- confidence
GoogleCloudVideointelligenceV1beta2_TimestampedObject
- GoogleCloudVideointelligenceV1beta2_TimestampedObject
object
: For tracking related features. An object at time_offset with attributes, and located with normalized_bounding_box.- attributes
array
: Optional. The attributes of the object in the bounding box. - landmarks
array
: Optional. The detected landmarks. - normalizedBoundingBox GoogleCloudVideointelligenceV1beta2_NormalizedBoundingBox
- timeOffset
string
: Time-offset, relative to the beginning of the video, corresponding to the video frame for this object.
- attributes
GoogleCloudVideointelligenceV1beta2_Track
- GoogleCloudVideointelligenceV1beta2_Track
object
: A track of an object instance.- attributes
array
: Optional. Attributes in the track level. - confidence
number
: Optional. The confidence score of the tracked object. - segment GoogleCloudVideointelligenceV1beta2_VideoSegment
- timestampedObjects
array
: The object with timestamp and attributes per frame in the track.
- attributes
GoogleCloudVideointelligenceV1beta2_VideoAnnotationProgress
- GoogleCloudVideointelligenceV1beta2_VideoAnnotationProgress
object
: Annotation progress for a single video.- feature
string
(values: FEATURE_UNSPECIFIED, LABEL_DETECTION, SHOT_CHANGE_DETECTION, EXPLICIT_CONTENT_DETECTION, FACE_DETECTION, SPEECH_TRANSCRIPTION, TEXT_DETECTION, OBJECT_TRACKING, LOGO_RECOGNITION, PERSON_DETECTION): Specifies which feature is being tracked if the request contains more than one feature. - inputUri
string
: Video file location in Cloud Storage. - progressPercent
integer
: Approximate percentage processed thus far. Guaranteed to be 100 when fully processed. - segment GoogleCloudVideointelligenceV1beta2_VideoSegment
- startTime
string
: Time when the request was received. - updateTime
string
: Time of the most recent update.
- feature
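A long-running annotate operation reports one `VideoAnnotationProgress` entry per requested video/feature. A minimal sketch of summarizing them into a single percentage — the field names follow the schema above, but the averaging policy and helper name are our own, not part of the client library:

```javascript
// Average progressPercent across all VideoAnnotationProgress entries.
function overallProgress(annotationProgress) {
  if (!annotationProgress || annotationProgress.length === 0) return 0;
  const total = annotationProgress.reduce(
    (sum, p) => sum + (p.progressPercent || 0), 0);
  return Math.round(total / annotationProgress.length);
}

// Sample payload shaped like Operation.metadata.annotationProgress.
const progress = [
  { feature: 'LABEL_DETECTION', progressPercent: 100 },
  { feature: 'SHOT_CHANGE_DETECTION', progressPercent: 40 },
];
console.log(overallProgress(progress)); // 70
```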
GoogleCloudVideointelligenceV1beta2_VideoAnnotationResults
- GoogleCloudVideointelligenceV1beta2_VideoAnnotationResults `object`: Annotation results for a single video.
  - error GoogleRpc_Status
  - explicitAnnotation GoogleCloudVideointelligenceV1beta2_ExplicitContentAnnotation
  - faceAnnotations `array`: Deprecated. Please use `face_detection_annotations` instead.
  - faceDetectionAnnotations `array`: Face detection annotations.
  - frameLabelAnnotations `array`: Label annotations on frame level. There is exactly one element for each unique label.
  - inputUri `string`: Video file location in Cloud Storage.
  - logoRecognitionAnnotations `array`: Annotations for list of logos detected, tracked and recognized in video.
  - objectAnnotations `array`: Annotations for list of objects detected and tracked in video.
  - personDetectionAnnotations `array`: Person detection annotations.
  - segment GoogleCloudVideointelligenceV1beta2_VideoSegment
  - segmentLabelAnnotations `array`: Topical label annotations on video level or user-specified segment level. There is exactly one element for each unique label.
  - segmentPresenceLabelAnnotations `array`: Presence label annotations on video level or user-specified segment level. There is exactly one element for each unique label. Compared to the existing topical `segment_label_annotations`, this field presents more fine-grained, segment-level labels detected in video content and is made available only when the client sets `LabelDetectionConfig.model` to "builtin/latest" in the request.
  - shotAnnotations `array`: Shot annotations. Each shot is represented as a video segment.
  - shotLabelAnnotations `array`: Topical label annotations on shot level. There is exactly one element for each unique label.
  - shotPresenceLabelAnnotations `array`: Presence label annotations on shot level. There is exactly one element for each unique label. Compared to the existing topical `shot_label_annotations`, this field presents more fine-grained, shot-level labels detected in video content and is made available only when the client sets `LabelDetectionConfig.model` to "builtin/latest" in the request.
  - speechTranscriptions `array`: Speech transcription.
  - textAnnotations `array`: OCR text detection and tracking. Annotations for list of detected text snippets. Each will have list of frame information associated with it.
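A `VideoAnnotationResults` object nests label hits two levels deep (annotation, then segment). A minimal sketch of flattening `segmentLabelAnnotations` into (description, confidence) pairs above a threshold — shapes follow the LabelAnnotation/LabelSegment/Entity schemas in this document, and the sample payload is invented for illustration:

```javascript
// Collect segment-level labels whose confidence meets a minimum.
function topSegmentLabels(results, minConfidence) {
  const labels = [];
  for (const ann of results.segmentLabelAnnotations || []) {
    for (const seg of ann.segments || []) {
      if ((seg.confidence || 0) >= minConfidence) {
        labels.push({
          description: ann.entity.description,
          confidence: seg.confidence,
        });
      }
    }
  }
  return labels;
}

// Invented sample shaped like a VideoAnnotationResults payload.
const results = {
  segmentLabelAnnotations: [
    { entity: { description: 'dog' }, segments: [{ confidence: 0.92 }] },
    { entity: { description: 'park' }, segments: [{ confidence: 0.41 }] },
  ],
};
console.log(topSegmentLabels(results, 0.5)); // one match: dog at 0.92
```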
GoogleCloudVideointelligenceV1beta2_VideoSegment
- GoogleCloudVideointelligenceV1beta2_VideoSegment `object`: Video segment.
  - endTimeOffset `string`: Time-offset, relative to the beginning of the video, corresponding to the end of the segment (inclusive).
  - startTimeOffset `string`: Time-offset, relative to the beginning of the video, corresponding to the start of the segment (inclusive).
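The offset fields above arrive as duration strings such as "4.500s" (the JSON serialization of google.protobuf.Duration). A minimal sketch of turning a VideoSegment into a length in seconds — the helper names are ours, not part of the client library:

```javascript
// Parse a protobuf-JSON duration string like "4.500s" into a number of seconds.
function durationToSeconds(offset) {
  if (!offset) return 0;
  return parseFloat(offset.replace(/s$/, ''));
}

// Length of a VideoSegment in seconds.
function segmentLengthSeconds(segment) {
  return durationToSeconds(segment.endTimeOffset) -
         durationToSeconds(segment.startTimeOffset);
}

const segment = { startTimeOffset: '1.500s', endTimeOffset: '4.500s' };
console.log(segmentLengthSeconds(segment)); // 3
```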
GoogleCloudVideointelligenceV1beta2_WordInfo
- GoogleCloudVideointelligenceV1beta2_WordInfo `object`: Word-specific information for recognized words. Word information is only included in the response when certain request parameters are set, such as `enable_word_time_offsets`.
  - confidence `number`: Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to be always provided. The default of 0.0 is a sentinel value indicating `confidence` was not set.
  - endTime `string`: Time offset relative to the beginning of the audio, and corresponding to the end of the spoken word. This field is only set if `enable_word_time_offsets=true` and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.
  - speakerTag `integer`: Output only. A distinct integer value is assigned for every speaker within the audio. This field specifies which one of those speakers was detected to have spoken this word. Value ranges from 1 up to diarization_speaker_count, and is only set if speaker diarization is enabled.
  - startTime `string`: Time offset relative to the beginning of the audio, and corresponding to the start of the spoken word. This field is only set if `enable_word_time_offsets=true` and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.
  - word `string`: The word corresponding to this set of information.
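Since `speakerTag` is only populated on the words of the top alternative, separating a transcript by speaker means filtering that alternative's `words` list. A minimal sketch under that assumption — data shapes follow the WordInfo and SpeechRecognitionAlternative schemas above, and the sample transcript is made up for illustration:

```javascript
// Return the words attributed to one speakerTag, reading the top (first)
// alternative, which is where diarization info appears per the docs above.
function wordsForSpeaker(transcription, speakerTag) {
  const top = (transcription.alternatives || [])[0];
  if (!top || !top.words) return [];
  return top.words
    .filter(w => w.speakerTag === speakerTag)
    .map(w => w.word);
}

// Invented sample shaped like a SpeechTranscription payload.
const transcription = {
  alternatives: [{
    transcript: 'hello there general',
    words: [
      { word: 'hello', speakerTag: 1, startTime: '0s', endTime: '0.400s' },
      { word: 'there', speakerTag: 1, startTime: '0.400s', endTime: '0.700s' },
      { word: 'general', speakerTag: 2, startTime: '0.900s', endTime: '1.400s' },
    ],
  }],
};
console.log(wordsForSpeaker(transcription, 1).join(' ')); // "hello there"
```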
GoogleCloudVideointelligenceV1p1beta1_AnnotateVideoProgress
- GoogleCloudVideointelligenceV1p1beta1_AnnotateVideoProgress `object`: Video annotation progress. Included in the `metadata` field of the `Operation` returned by the `GetOperation` call of the `google::longrunning::Operations` service.
  - annotationProgress `array`: Progress metadata for all videos specified in `AnnotateVideoRequest`.

GoogleCloudVideointelligenceV1p1beta1_AnnotateVideoResponse
- GoogleCloudVideointelligenceV1p1beta1_AnnotateVideoResponse `object`: Video annotation response. Included in the `response` field of the `Operation` returned by the `GetOperation` call of the `google::longrunning::Operations` service.
  - annotationResults `array`: Annotation results for all videos specified in `AnnotateVideoRequest`.

GoogleCloudVideointelligenceV1p1beta1_DetectedAttribute
- GoogleCloudVideointelligenceV1p1beta1_DetectedAttribute `object`: A generic detected attribute represented by name in string format.
  - confidence `number`: Detected attribute confidence. Range [0, 1].
  - name `string`: The name of the attribute, for example, glasses, dark_glasses, mouth_open. A full list of supported type names will be provided in the document.
  - value `string`: Text value of the detection result. For example, the value for "HairColor" can be "black", "blonde", etc.

GoogleCloudVideointelligenceV1p1beta1_DetectedLandmark
- GoogleCloudVideointelligenceV1p1beta1_DetectedLandmark `object`: A generic detected landmark represented by name in string format and a 2D location.
  - confidence `number`: The confidence score of the detected landmark. Range [0, 1].
  - name `string`: The name of this landmark, for example, left_hand, right_shoulder.
  - point GoogleCloudVideointelligenceV1p1beta1_NormalizedVertex

GoogleCloudVideointelligenceV1p1beta1_Entity
- GoogleCloudVideointelligenceV1p1beta1_Entity `object`: Detected entity from video analysis.
  - description `string`: Textual description, e.g., `Fixed-gear bicycle`.
  - entityId `string`: Opaque entity ID. Some IDs may be available in Google Knowledge Graph Search API.
  - languageCode `string`: Language code for `description` in BCP-47 format.

GoogleCloudVideointelligenceV1p1beta1_ExplicitContentAnnotation
- GoogleCloudVideointelligenceV1p1beta1_ExplicitContentAnnotation `object`: Explicit content annotation (based on per-frame visual signals only). If no explicit content has been detected in a frame, no annotations are present for that frame.
  - frames `array`: All video frames where explicit content was detected.
  - version `string`: Feature version.

GoogleCloudVideointelligenceV1p1beta1_ExplicitContentFrame
- GoogleCloudVideointelligenceV1p1beta1_ExplicitContentFrame `object`: Video frame level annotation results for explicit content.
  - pornographyLikelihood `string` (values: LIKELIHOOD_UNSPECIFIED, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY): Likelihood of the pornography content.
  - timeOffset `string`: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.
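`pornographyLikelihood` is an enum string, so threshold checks need the enum's declared order. A minimal sketch of filtering frames at or above a likelihood — the ordinal mapping mirrors the values list above, while the helper itself is our own, not part of the client library:

```javascript
// Enum values in ascending order of likelihood, as declared in the schema.
const LIKELIHOOD_ORDER = [
  'LIKELIHOOD_UNSPECIFIED', 'VERY_UNLIKELY', 'UNLIKELY',
  'POSSIBLE', 'LIKELY', 'VERY_LIKELY',
];

// Keep ExplicitContentFrame entries at or above a likelihood threshold.
function framesAtLeast(frames, threshold) {
  const min = LIKELIHOOD_ORDER.indexOf(threshold);
  return frames.filter(
    f => LIKELIHOOD_ORDER.indexOf(f.pornographyLikelihood) >= min);
}

const frames = [
  { timeOffset: '1s', pornographyLikelihood: 'VERY_UNLIKELY' },
  { timeOffset: '2s', pornographyLikelihood: 'LIKELY' },
];
console.log(framesAtLeast(frames, 'POSSIBLE').length); // 1
```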
GoogleCloudVideointelligenceV1p1beta1_FaceAnnotation
- GoogleCloudVideointelligenceV1p1beta1_FaceAnnotation `object`: Deprecated. No effect.
  - frames `array`: All video frames where a face was detected.
  - segments `array`: All video segments where a face was detected.
  - thumbnail `string`: Thumbnail of a representative face view (in JPEG format).

GoogleCloudVideointelligenceV1p1beta1_FaceDetectionAnnotation
- GoogleCloudVideointelligenceV1p1beta1_FaceDetectionAnnotation `object`: Face detection annotation.
  - thumbnail `string`: The thumbnail of a person's face.
  - tracks `array`: The face tracks with attributes.
  - version `string`: Feature version.

GoogleCloudVideointelligenceV1p1beta1_FaceFrame
- GoogleCloudVideointelligenceV1p1beta1_FaceFrame `object`: Deprecated. No effect.
  - normalizedBoundingBoxes `array`: Normalized bounding boxes in a frame. There can be more than one box if the same face is detected in multiple locations within the current frame.
  - timeOffset `string`: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1p1beta1_FaceSegment
- GoogleCloudVideointelligenceV1p1beta1_FaceSegment `object`: Video segment level annotation results for face detection.

GoogleCloudVideointelligenceV1p1beta1_LabelAnnotation
- GoogleCloudVideointelligenceV1p1beta1_LabelAnnotation `object`: Label annotation.
  - categoryEntities `array`: Common categories for the detected entity. For example, when the label is `Terrier`, the category is likely `dog`. In some cases there might be more than one category, e.g., `Terrier` could also be a `pet`.
  - entity GoogleCloudVideointelligenceV1p1beta1_Entity
  - frames `array`: All video frames where a label was detected.
  - segments `array`: All video segments where a label was detected.
  - version `string`: Feature version.

GoogleCloudVideointelligenceV1p1beta1_LabelFrame
- GoogleCloudVideointelligenceV1p1beta1_LabelFrame `object`: Video frame level annotation results for label detection.
  - confidence `number`: Confidence that the label is accurate. Range: [0, 1].
  - timeOffset `string`: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1p1beta1_LabelSegment
- GoogleCloudVideointelligenceV1p1beta1_LabelSegment `object`: Video segment level annotation results for label detection.
  - confidence `number`: Confidence that the label is accurate. Range: [0, 1].
  - segment GoogleCloudVideointelligenceV1p1beta1_VideoSegment

GoogleCloudVideointelligenceV1p1beta1_LogoRecognitionAnnotation
- GoogleCloudVideointelligenceV1p1beta1_LogoRecognitionAnnotation `object`: Annotation corresponding to one detected, tracked and recognized logo class.
  - entity GoogleCloudVideointelligenceV1p1beta1_Entity
  - segments `array`: All video segments where the recognized logo appears. There might be multiple instances of the same logo class appearing in one VideoSegment.
  - tracks `array`: All logo tracks where the recognized logo appears. Each track corresponds to one logo instance appearing in consecutive frames.

GoogleCloudVideointelligenceV1p1beta1_NormalizedBoundingBox
- GoogleCloudVideointelligenceV1p1beta1_NormalizedBoundingBox `object`: Normalized bounding box. The normalized vertex coordinates are relative to the original image. Range: [0, 1].
  - bottom `number`: Bottom Y coordinate.
  - left `number`: Left X coordinate.
  - right `number`: Right X coordinate.
  - top `number`: Top Y coordinate.
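Because NormalizedBoundingBox coordinates are fractions of the frame in [0, 1], drawing or cropping requires scaling by the frame size. A minimal sketch — field names match the schema above, and the 640x360 frame size is just an example value:

```javascript
// Convert a NormalizedBoundingBox into an integer pixel rectangle
// for a frame of known dimensions.
function toPixelRect(box, frameWidth, frameHeight) {
  return {
    x: Math.round(box.left * frameWidth),
    y: Math.round(box.top * frameHeight),
    width: Math.round((box.right - box.left) * frameWidth),
    height: Math.round((box.bottom - box.top) * frameHeight),
  };
}

const box = { left: 0.25, top: 0.5, right: 0.75, bottom: 1.0 };
console.log(toPixelRect(box, 640, 360));
// { x: 160, y: 180, width: 320, height: 180 }
```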
GoogleCloudVideointelligenceV1p1beta1_NormalizedBoundingPoly
- GoogleCloudVideointelligenceV1p1beta1_NormalizedBoundingPoly `object`: Normalized bounding polygon for text (that might not be aligned with axis). Contains the list of corner points in clockwise order starting from the top-left corner. For example, for a rectangular bounding box: When the text is horizontal it might look like: 0----1 | | 3----2 When it's clockwise rotated 180 degrees around the top-left corner it becomes: 2----3 | | 1----0 and the vertex order will still be (0, 1, 2, 3). Note that values can be less than 0, or greater than 1 due to trigonometric calculations for location of the box.
  - vertices `array`: Normalized vertices of the bounding polygon.

GoogleCloudVideointelligenceV1p1beta1_NormalizedVertex
- GoogleCloudVideointelligenceV1p1beta1_NormalizedVertex `object`: A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
  - x `number`: X coordinate.
  - y `number`: Y coordinate.

GoogleCloudVideointelligenceV1p1beta1_ObjectTrackingAnnotation
- GoogleCloudVideointelligenceV1p1beta1_ObjectTrackingAnnotation `object`: Annotations corresponding to one tracked object.
  - confidence `number`: Object category's labeling confidence of this track.
  - entity GoogleCloudVideointelligenceV1p1beta1_Entity
  - frames `array`: Information corresponding to all frames where this object track appears. Non-streaming batch mode: it may be one or multiple ObjectTrackingFrame messages in frames. Streaming mode: it can only be one ObjectTrackingFrame message in frames.
  - segment GoogleCloudVideointelligenceV1p1beta1_VideoSegment
  - trackId `string`: Streaming mode ONLY. In streaming mode, we do not know the end time of a tracked object before it is completed. Hence, there is no VideoSegment info returned. Instead, we provide a unique identifiable integer track_id so that the customers can correlate the results of the ongoing ObjectTrackAnnotation of the same track_id over time.
  - version `string`: Feature version.
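In streaming mode, each response carries at most one ObjectTrackingFrame per annotation and the same `trackId` recurs across responses, so results must be stitched together client-side. A minimal sketch of that merge — field names follow the schema above, while the merge policy and helper name are our own:

```javascript
// Group incoming ObjectTrackingAnnotation messages by trackId,
// concatenating their frames into one track each.
function mergeStreamingTracks(annotations) {
  const tracks = new Map();
  for (const ann of annotations) {
    const existing = tracks.get(ann.trackId);
    if (existing) {
      existing.frames.push(...(ann.frames || []));
    } else {
      tracks.set(ann.trackId, { ...ann, frames: [...(ann.frames || [])] });
    }
  }
  return [...tracks.values()];
}

// Two streaming responses for the same track, invented for illustration.
const merged = mergeStreamingTracks([
  { trackId: '7', entity: { description: 'car' }, frames: [{ timeOffset: '0s' }] },
  { trackId: '7', entity: { description: 'car' }, frames: [{ timeOffset: '0.100s' }] },
]);
console.log(merged.length, merged[0].frames.length); // 1 2
```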
GoogleCloudVideointelligenceV1p1beta1_ObjectTrackingFrame
- GoogleCloudVideointelligenceV1p1beta1_ObjectTrackingFrame `object`: Video frame level annotations for object detection and tracking. This field stores per frame location, time offset, and confidence.
  - normalizedBoundingBox GoogleCloudVideointelligenceV1p1beta1_NormalizedBoundingBox
  - timeOffset `string`: The timestamp of the frame in microseconds.

GoogleCloudVideointelligenceV1p1beta1_PersonDetectionAnnotation
- GoogleCloudVideointelligenceV1p1beta1_PersonDetectionAnnotation `object`: Person detection annotation per video.
  - tracks `array`: The detected tracks of a person.
  - version `string`: Feature version.

GoogleCloudVideointelligenceV1p1beta1_SpeechRecognitionAlternative
- GoogleCloudVideointelligenceV1p1beta1_SpeechRecognitionAlternative `object`: Alternative hypotheses (a.k.a. n-best list).
  - confidence `number`: Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to be always provided. The default of 0.0 is a sentinel value indicating `confidence` was not set.
  - transcript `string`: Transcript text representing the words that the user spoke.
  - words `array`: Output only. A list of word-specific information for each recognized word. Note: When `enable_speaker_diarization` is set to true, you will see all the words from the beginning of the audio.

GoogleCloudVideointelligenceV1p1beta1_SpeechTranscription
- GoogleCloudVideointelligenceV1p1beta1_SpeechTranscription `object`: A speech recognition result corresponding to a portion of the audio.
  - alternatives `array`: May contain one or more recognition hypotheses (up to the maximum specified in `max_alternatives`). These alternatives are ordered in terms of accuracy, with the top (first) alternative being the most probable, as ranked by the recognizer.
  - languageCode `string`: Output only. The BCP-47 language tag of the language in this result. This language code was detected to have the most likelihood of being spoken in the audio.

GoogleCloudVideointelligenceV1p1beta1_TextAnnotation
- GoogleCloudVideointelligenceV1p1beta1_TextAnnotation `object`: Annotations related to one detected OCR text snippet. This will contain the corresponding text, confidence value, and frame level information for each detection.
  - segments `array`: All video segments where OCR detected text appears.
  - text `string`: The detected text.
  - version `string`: Feature version.

GoogleCloudVideointelligenceV1p1beta1_TextFrame
- GoogleCloudVideointelligenceV1p1beta1_TextFrame `object`: Video frame level annotation results for text annotation (OCR). Contains information regarding timestamp and bounding box locations for the frames containing detected OCR text snippets.
  - rotatedBoundingBox GoogleCloudVideointelligenceV1p1beta1_NormalizedBoundingPoly
  - timeOffset `string`: Timestamp of this frame.

GoogleCloudVideointelligenceV1p1beta1_TextSegment
- GoogleCloudVideointelligenceV1p1beta1_TextSegment `object`: Video segment level annotation results for text detection.
  - confidence `number`: Confidence for the track of detected text. It is calculated as the highest over all frames where OCR detected text appears.
  - frames `array`: Information related to the frames where OCR detected text appears.
  - segment GoogleCloudVideointelligenceV1p1beta1_VideoSegment