
@datafire/google_videointelligence

v6.0.0

Published

DataFire integration for Cloud Video Intelligence API

Downloads

6

Readme

@datafire/google_videointelligence

Client library for Cloud Video Intelligence API

Installation and Usage

npm install --save @datafire/google_videointelligence

let google_videointelligence = require('@datafire/google_videointelligence').create({
  access_token: "",
  refresh_token: "",
  client_id: "",
  client_secret: "",
  redirect_uri: ""
});

google_videointelligence.videointelligence.videos.annotate({}).then(data => {
  console.log(data);
});

Description

Detects objects, explicit content, and scene changes in videos. It also specifies the region for annotation and transcribes speech to text. Supports both asynchronous API and streaming API.

Actions

oauthCallback

Exchange the code passed to your redirect URI for an access_token

google_videointelligence.oauthCallback({
  "code": ""
}, context)

Input

  • input object
    • code required string

Output

  • output object
    • access_token string
    • refresh_token string
    • token_type string
    • scope string
    • expiration string
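Before `oauthCallback` can run, the user has to be sent through Google's consent screen so that a `code` arrives at your redirect URI. As a hedged sketch (the endpoint and scope below come from Google's OAuth 2.0 documentation, not from this package, and `buildConsentUrl` is our own helper name):

```javascript
// Build the Google OAuth consent URL whose redirect delivers the `code`
// that oauthCallback exchanges for an access_token.
function buildConsentUrl(clientId, redirectUri) {
  const params = new URLSearchParams({
    client_id: clientId,
    redirect_uri: redirectUri,
    response_type: 'code',
    scope: 'https://www.googleapis.com/auth/cloud-platform',
    access_type: 'offline', // request a refresh_token as well
  });
  return `https://accounts.google.com/o/oauth2/v2/auth?${params}`;
}
```

The `code` query parameter on the resulting redirect is what you pass as `oauthCallback({ "code": ... })`.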

oauthRefresh

Exchange a refresh_token for an access_token

google_videointelligence.oauthRefresh(null, context)

Input

This action has no parameters

Output

  • output object
    • access_token string
    • refresh_token string
    • token_type string
    • scope string
    • expiration string

videointelligence.videos.annotate

Performs asynchronous video annotation. Progress and results can be retrieved through the google.longrunning.Operations interface. Operation.metadata contains AnnotateVideoProgress (progress). Operation.response contains AnnotateVideoResponse (results).

google_videointelligence.videointelligence.videos.annotate({}, context)

Input

  • input object
    • body GoogleCloudVideointelligenceV1p3beta1_AnnotateVideoRequest
    • $.xgafv string (values: 1, 2): V1 error format.
    • access_token string: OAuth access token.
    • alt string (values: json, media, proto): Data format for response.
    • callback string: JSONP
    • fields string: Selector specifying which fields to include in a partial response.
    • key string: API key. Your API key identifies your project and provides you with API access, quota, and reports. Required unless you provide an OAuth 2.0 token.
    • oauth_token string: OAuth 2.0 token for the current user.
    • prettyPrint boolean: Returns response with indentations and line breaks.
    • quotaUser string: Available to use for quota purposes for server-side applications. Can be any arbitrary string assigned to a user, but should not exceed 40 characters.
    • upload_protocol string: Upload protocol for media (e.g. "raw", "multipart").
    • uploadType string: Legacy upload protocol for media (e.g. "media", "multipart").
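The request-body message (`GoogleCloudVideointelligenceV1p3beta1_AnnotateVideoRequest`) is not reproduced in this README, but as a sketch, a minimal body might look like the following — the `inputUri` and `features` field names follow Google's `AnnotateVideoRequest` message, and the bucket path is a placeholder:

```javascript
// Hypothetical minimal request body for videointelligence.videos.annotate.
// The feature names match the Feature enum listed under
// VideoAnnotationProgress below.
const request = {
  body: {
    inputUri: 'gs://my-bucket/my-video.mp4', // placeholder Cloud Storage URI
    features: ['LABEL_DETECTION', 'SHOT_CHANGE_DETECTION'],
  },
};
// google_videointelligence.videointelligence.videos.annotate(request, context)
// returns a long-running Operation: its metadata carries
// AnnotateVideoProgress, its response carries AnnotateVideoResponse.
```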

Output

Definitions

GoogleCloudVideointelligenceV1_AnnotateVideoProgress

  • GoogleCloudVideointelligenceV1_AnnotateVideoProgress object: Video annotation progress. Included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

GoogleCloudVideointelligenceV1_AnnotateVideoResponse

  • GoogleCloudVideointelligenceV1_AnnotateVideoResponse object: Video annotation response. Included in the response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

GoogleCloudVideointelligenceV1_DetectedAttribute

  • GoogleCloudVideointelligenceV1_DetectedAttribute object: A generic detected attribute represented by name in string format.
    • confidence number: Detected attribute confidence. Range [0, 1].
    • name string: The name of the attribute, for example, glasses, dark_glasses, mouth_open. A full list of supported type names will be provided in the document.
    • value string: Text value of the detection result. For example, the value for "HairColor" can be "black", "blonde", etc.

GoogleCloudVideointelligenceV1_DetectedLandmark

  • GoogleCloudVideointelligenceV1_DetectedLandmark object: A generic detected landmark represented by name in string format and a 2D location.

GoogleCloudVideointelligenceV1_Entity

  • GoogleCloudVideointelligenceV1_Entity object: Detected entity from video analysis.
    • description string: Textual description, e.g., Fixed-gear bicycle.
    • entityId string: Opaque entity ID. Some IDs may be available in Google Knowledge Graph Search API.
    • languageCode string: Language code for description in BCP-47 format.

GoogleCloudVideointelligenceV1_ExplicitContentAnnotation

  • GoogleCloudVideointelligenceV1_ExplicitContentAnnotation object: Explicit content annotation (based on per-frame visual signals only). If no explicit content has been detected in a frame, no annotations are present for that frame.

GoogleCloudVideointelligenceV1_ExplicitContentFrame

  • GoogleCloudVideointelligenceV1_ExplicitContentFrame object: Video frame level annotation results for explicit content.
    • pornographyLikelihood string (values: LIKELIHOOD_UNSPECIFIED, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY): Likelihood of pornographic content.
    • timeOffset string: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1_FaceAnnotation

GoogleCloudVideointelligenceV1_FaceDetectionAnnotation

  • GoogleCloudVideointelligenceV1_FaceDetectionAnnotation object: Face detection annotation.

GoogleCloudVideointelligenceV1_FaceFrame

  • GoogleCloudVideointelligenceV1_FaceFrame object: Deprecated. No effect.
    • normalizedBoundingBoxes array: Normalized bounding boxes in a frame. There can be more than one box if the same face is detected in multiple locations within the current frame.
    • timeOffset string: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1_FaceSegment

GoogleCloudVideointelligenceV1_LabelAnnotation

GoogleCloudVideointelligenceV1_LabelFrame

  • GoogleCloudVideointelligenceV1_LabelFrame object: Video frame level annotation results for label detection.
    • confidence number: Confidence that the label is accurate. Range: [0, 1].
    • timeOffset string: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1_LabelSegment

  • GoogleCloudVideointelligenceV1_LabelSegment object: Video segment level annotation results for label detection.

GoogleCloudVideointelligenceV1_LogoRecognitionAnnotation

GoogleCloudVideointelligenceV1_NormalizedBoundingBox

  • GoogleCloudVideointelligenceV1_NormalizedBoundingBox object: Normalized bounding box. The normalized vertex coordinates are relative to the original image. Range: [0, 1].
    • bottom number: Bottom Y coordinate.
    • left number: Left X coordinate.
    • right number: Right X coordinate.
    • top number: Top Y coordinate.
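Because the coordinates are fractions of the frame size, converting a NormalizedBoundingBox back to pixels is a simple scale. A minimal sketch (the helper name is ours):

```javascript
// Scale a NormalizedBoundingBox (all fields in [0, 1]) to pixel coordinates
// for a frame of the given width and height.
function toPixelBox(box, width, height) {
  return {
    left: Math.round(box.left * width),
    top: Math.round(box.top * height),
    right: Math.round(box.right * width),
    bottom: Math.round(box.bottom * height),
  };
}
```

For example, `toPixelBox({left: 0.25, top: 0.5, right: 0.75, bottom: 1}, 400, 200)` yields `{left: 100, top: 100, right: 300, bottom: 200}`.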

GoogleCloudVideointelligenceV1_NormalizedBoundingPoly

  • GoogleCloudVideointelligenceV1_NormalizedBoundingPoly object: Normalized bounding polygon for text (that might not be aligned with axis). Contains a list of the corner points in clockwise order starting from the top-left corner. For example, for a rectangular bounding box: When the text is horizontal it might look like: 0----1 | | 3----2 When it's clockwise rotated 180 degrees around the top-left corner it becomes: 2----3 | | 1----0 and the vertex order will still be (0, 1, 2, 3). Note that values can be less than 0, or greater than 1 due to trigonometric calculations for location of the box.

GoogleCloudVideointelligenceV1_NormalizedVertex

  • GoogleCloudVideointelligenceV1_NormalizedVertex object: A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
    • x number: X coordinate.
    • y number: Y coordinate.

GoogleCloudVideointelligenceV1_ObjectTrackingAnnotation

  • GoogleCloudVideointelligenceV1_ObjectTrackingAnnotation object: Annotations corresponding to one tracked object.
    • confidence number: Object category's labeling confidence of this track.
    • entity GoogleCloudVideointelligenceV1_Entity
    • frames array: Information corresponding to all frames where this object track appears. Non-streaming batch mode: it may be one or multiple ObjectTrackingFrame messages in frames. Streaming mode: it can only be one ObjectTrackingFrame message in frames.
    • segment GoogleCloudVideointelligenceV1_VideoSegment
    • trackId string: Streaming mode ONLY. In streaming mode, we do not know the end time of a tracked object before it is completed. Hence, there is no VideoSegment info returned. Instead, we provide a uniquely identifiable integer track_id so that customers can correlate the results of the ongoing ObjectTrackAnnotation of the same track_id over time.
    • version string: Feature version.

GoogleCloudVideointelligenceV1_ObjectTrackingFrame

  • GoogleCloudVideointelligenceV1_ObjectTrackingFrame object: Video frame level annotations for object detection and tracking. This field stores per frame location, time offset, and confidence.

GoogleCloudVideointelligenceV1_PersonDetectionAnnotation

  • GoogleCloudVideointelligenceV1_PersonDetectionAnnotation object: Person detection annotation per video.

GoogleCloudVideointelligenceV1_SpeechRecognitionAlternative

  • GoogleCloudVideointelligenceV1_SpeechRecognitionAlternative object: Alternative hypotheses (a.k.a. n-best list).
    • confidence number: Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to be always provided. The default of 0.0 is a sentinel value indicating confidence was not set.
    • transcript string: Transcript text representing the words that the user spoke.
    • words array: Output only. A list of word-specific information for each recognized word. Note: When enable_speaker_diarization is set to true, you will see all the words from the beginning of the audio.

GoogleCloudVideointelligenceV1_SpeechTranscription

  • GoogleCloudVideointelligenceV1_SpeechTranscription object: A speech recognition result corresponding to a portion of the audio.
    • alternatives array: May contain one or more recognition hypotheses (up to the maximum specified in max_alternatives). These alternatives are ordered in terms of accuracy, with the top (first) alternative being the most probable, as ranked by the recognizer.
    • languageCode string: Output only. The BCP-47 language tag of the language in this result. This language code was detected to have the most likelihood of being spoken in the audio.
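Since the alternatives are ordered most-probable first, the usual pattern for extracting a transcript is to take `alternatives[0]` of each transcription and join the pieces. A small sketch (the helper name is ours):

```javascript
// Join the top-ranked transcript of each SpeechTranscription into one string.
function bestTranscript(speechTranscriptions) {
  return speechTranscriptions
    .map(t => (t.alternatives && t.alternatives[0] ? t.alternatives[0].transcript : ''))
    .join(' ')
    .trim();
}
```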

GoogleCloudVideointelligenceV1_TextAnnotation

  • GoogleCloudVideointelligenceV1_TextAnnotation object: Annotations related to one detected OCR text snippet. This will contain the corresponding text, confidence value, and frame level information for each detection.

GoogleCloudVideointelligenceV1_TextFrame

  • GoogleCloudVideointelligenceV1_TextFrame object: Video frame level annotation results for text annotation (OCR). Contains information regarding timestamp and bounding box locations for the frames containing detected OCR text snippets.

GoogleCloudVideointelligenceV1_TextSegment

  • GoogleCloudVideointelligenceV1_TextSegment object: Video segment level annotation results for text detection.

GoogleCloudVideointelligenceV1_TimestampedObject

GoogleCloudVideointelligenceV1_Track

GoogleCloudVideointelligenceV1_VideoAnnotationProgress

  • GoogleCloudVideointelligenceV1_VideoAnnotationProgress object: Annotation progress for a single video.
    • feature string (values: FEATURE_UNSPECIFIED, LABEL_DETECTION, SHOT_CHANGE_DETECTION, EXPLICIT_CONTENT_DETECTION, FACE_DETECTION, SPEECH_TRANSCRIPTION, TEXT_DETECTION, OBJECT_TRACKING, LOGO_RECOGNITION, PERSON_DETECTION): Specifies which feature is being tracked if the request contains more than one feature.
    • inputUri string: Video file location in Cloud Storage.
    • progressPercent integer: Approximate percentage processed thus far. Guaranteed to be 100 when fully processed.
    • segment GoogleCloudVideointelligenceV1_VideoSegment
    • startTime string: Time when the request was received.
    • updateTime string: Time of the most recent update.

GoogleCloudVideointelligenceV1_VideoAnnotationResults

GoogleCloudVideointelligenceV1_VideoSegment

  • GoogleCloudVideointelligenceV1_VideoSegment object: Video segment.
    • endTimeOffset string: Time-offset, relative to the beginning of the video, corresponding to the end of the segment (inclusive).
    • startTimeOffset string: Time-offset, relative to the beginning of the video, corresponding to the start of the segment (inclusive).
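In the JSON API these time offsets arrive as protobuf Duration strings such as "12.500s", so computing a segment's length means stripping the trailing `s` and subtracting. A minimal sketch (helper names are ours):

```javascript
// Parse a protobuf Duration string like "12.500s" into seconds.
function parseOffsetSeconds(offset) {
  return parseFloat(offset.replace(/s$/, ''));
}

// Length of a VideoSegment in seconds.
function segmentDurationSeconds(segment) {
  return parseOffsetSeconds(segment.endTimeOffset) - parseOffsetSeconds(segment.startTimeOffset);
}
```

For example, a segment from "1.500s" to "4.000s" is 2.5 seconds long.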

GoogleCloudVideointelligenceV1_WordInfo

  • GoogleCloudVideointelligenceV1_WordInfo object: Word-specific information for recognized words. Word information is only included in the response when certain request parameters are set, such as enable_word_time_offsets.
    • confidence number: Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to be always provided. The default of 0.0 is a sentinel value indicating confidence was not set.
    • endTime string: Time offset relative to the beginning of the audio, and corresponding to the end of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.
    • speakerTag integer: Output only. A distinct integer value is assigned for every speaker within the audio. This field specifies which one of those speakers was detected to have spoken this word. Value ranges from 1 up to diarization_speaker_count, and is only set if speaker diarization is enabled.
    • startTime string: Time offset relative to the beginning of the audio, and corresponding to the start of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.
    • word string: The word corresponding to this set of information.
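When speaker diarization is enabled, each WordInfo carries a `speakerTag`, and grouping consecutive words by tag reconstructs per-speaker lines. A sketch (the helper name is ours):

```javascript
// Merge consecutive WordInfo entries that share a speakerTag into lines.
function groupBySpeaker(words) {
  const lines = [];
  for (const w of words) {
    const last = lines[lines.length - 1];
    if (last && last.speakerTag === w.speakerTag) {
      last.text += ' ' + w.word; // same speaker keeps talking
    } else {
      lines.push({ speakerTag: w.speakerTag, text: w.word });
    }
  }
  return lines;
}
```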

GoogleCloudVideointelligenceV1beta2_AnnotateVideoProgress

  • GoogleCloudVideointelligenceV1beta2_AnnotateVideoProgress object: Video annotation progress. Included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

GoogleCloudVideointelligenceV1beta2_AnnotateVideoResponse

  • GoogleCloudVideointelligenceV1beta2_AnnotateVideoResponse object: Video annotation response. Included in the response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

GoogleCloudVideointelligenceV1beta2_DetectedAttribute

  • GoogleCloudVideointelligenceV1beta2_DetectedAttribute object: A generic detected attribute represented by name in string format.
    • confidence number: Detected attribute confidence. Range [0, 1].
    • name string: The name of the attribute, for example, glasses, dark_glasses, mouth_open. A full list of supported type names will be provided in the document.
    • value string: Text value of the detection result. For example, the value for "HairColor" can be "black", "blonde", etc.

GoogleCloudVideointelligenceV1beta2_DetectedLandmark

  • GoogleCloudVideointelligenceV1beta2_DetectedLandmark object: A generic detected landmark represented by name in string format and a 2D location.

GoogleCloudVideointelligenceV1beta2_Entity

  • GoogleCloudVideointelligenceV1beta2_Entity object: Detected entity from video analysis.
    • description string: Textual description, e.g., Fixed-gear bicycle.
    • entityId string: Opaque entity ID. Some IDs may be available in Google Knowledge Graph Search API.
    • languageCode string: Language code for description in BCP-47 format.

GoogleCloudVideointelligenceV1beta2_ExplicitContentAnnotation

  • GoogleCloudVideointelligenceV1beta2_ExplicitContentAnnotation object: Explicit content annotation (based on per-frame visual signals only). If no explicit content has been detected in a frame, no annotations are present for that frame.

GoogleCloudVideointelligenceV1beta2_ExplicitContentFrame

  • GoogleCloudVideointelligenceV1beta2_ExplicitContentFrame object: Video frame level annotation results for explicit content.
    • pornographyLikelihood string (values: LIKELIHOOD_UNSPECIFIED, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY): Likelihood of pornographic content.
    • timeOffset string: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1beta2_FaceAnnotation

GoogleCloudVideointelligenceV1beta2_FaceDetectionAnnotation

  • GoogleCloudVideointelligenceV1beta2_FaceDetectionAnnotation object: Face detection annotation.

GoogleCloudVideointelligenceV1beta2_FaceFrame

  • GoogleCloudVideointelligenceV1beta2_FaceFrame object: Deprecated. No effect.
    • normalizedBoundingBoxes array: Normalized bounding boxes in a frame. There can be more than one box if the same face is detected in multiple locations within the current frame.
    • timeOffset string: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1beta2_FaceSegment

GoogleCloudVideointelligenceV1beta2_LabelAnnotation

GoogleCloudVideointelligenceV1beta2_LabelFrame

  • GoogleCloudVideointelligenceV1beta2_LabelFrame object: Video frame level annotation results for label detection.
    • confidence number: Confidence that the label is accurate. Range: [0, 1].
    • timeOffset string: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1beta2_LabelSegment

  • GoogleCloudVideointelligenceV1beta2_LabelSegment object: Video segment level annotation results for label detection.

GoogleCloudVideointelligenceV1beta2_LogoRecognitionAnnotation

GoogleCloudVideointelligenceV1beta2_NormalizedBoundingBox

  • GoogleCloudVideointelligenceV1beta2_NormalizedBoundingBox object: Normalized bounding box. The normalized vertex coordinates are relative to the original image. Range: [0, 1].
    • bottom number: Bottom Y coordinate.
    • left number: Left X coordinate.
    • right number: Right X coordinate.
    • top number: Top Y coordinate.

GoogleCloudVideointelligenceV1beta2_NormalizedBoundingPoly

  • GoogleCloudVideointelligenceV1beta2_NormalizedBoundingPoly object: Normalized bounding polygon for text (that might not be aligned with axis). Contains a list of the corner points in clockwise order starting from the top-left corner. For example, for a rectangular bounding box: When the text is horizontal it might look like: 0----1 | | 3----2 When it's clockwise rotated 180 degrees around the top-left corner it becomes: 2----3 | | 1----0 and the vertex order will still be (0, 1, 2, 3). Note that values can be less than 0, or greater than 1 due to trigonometric calculations for location of the box.

GoogleCloudVideointelligenceV1beta2_NormalizedVertex

  • GoogleCloudVideointelligenceV1beta2_NormalizedVertex object: A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
    • x number: X coordinate.
    • y number: Y coordinate.

GoogleCloudVideointelligenceV1beta2_ObjectTrackingAnnotation

  • GoogleCloudVideointelligenceV1beta2_ObjectTrackingAnnotation object: Annotations corresponding to one tracked object.
    • confidence number: Object category's labeling confidence of this track.
    • entity GoogleCloudVideointelligenceV1beta2_Entity
    • frames array: Information corresponding to all frames where this object track appears. Non-streaming batch mode: it may be one or multiple ObjectTrackingFrame messages in frames. Streaming mode: it can only be one ObjectTrackingFrame message in frames.
    • segment GoogleCloudVideointelligenceV1beta2_VideoSegment
    • trackId string: Streaming mode ONLY. In streaming mode, we do not know the end time of a tracked object before it is completed. Hence, there is no VideoSegment info returned. Instead, we provide a uniquely identifiable integer track_id so that customers can correlate the results of the ongoing ObjectTrackAnnotation of the same track_id over time.
    • version string: Feature version.

GoogleCloudVideointelligenceV1beta2_ObjectTrackingFrame

  • GoogleCloudVideointelligenceV1beta2_ObjectTrackingFrame object: Video frame level annotations for object detection and tracking. This field stores per frame location, time offset, and confidence.

GoogleCloudVideointelligenceV1beta2_PersonDetectionAnnotation

  • GoogleCloudVideointelligenceV1beta2_PersonDetectionAnnotation object: Person detection annotation per video.

GoogleCloudVideointelligenceV1beta2_SpeechRecognitionAlternative

  • GoogleCloudVideointelligenceV1beta2_SpeechRecognitionAlternative object: Alternative hypotheses (a.k.a. n-best list).
    • confidence number: Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to be always provided. The default of 0.0 is a sentinel value indicating confidence was not set.
    • transcript string: Transcript text representing the words that the user spoke.
    • words array: Output only. A list of word-specific information for each recognized word. Note: When enable_speaker_diarization is set to true, you will see all the words from the beginning of the audio.

GoogleCloudVideointelligenceV1beta2_SpeechTranscription

  • GoogleCloudVideointelligenceV1beta2_SpeechTranscription object: A speech recognition result corresponding to a portion of the audio.
    • alternatives array: May contain one or more recognition hypotheses (up to the maximum specified in max_alternatives). These alternatives are ordered in terms of accuracy, with the top (first) alternative being the most probable, as ranked by the recognizer.
    • languageCode string: Output only. The BCP-47 language tag of the language in this result. This language code was detected to have the most likelihood of being spoken in the audio.

GoogleCloudVideointelligenceV1beta2_TextAnnotation

  • GoogleCloudVideointelligenceV1beta2_TextAnnotation object: Annotations related to one detected OCR text snippet. This will contain the corresponding text, confidence value, and frame level information for each detection.

GoogleCloudVideointelligenceV1beta2_TextFrame

  • GoogleCloudVideointelligenceV1beta2_TextFrame object: Video frame level annotation results for text annotation (OCR). Contains information regarding timestamp and bounding box locations for the frames containing detected OCR text snippets.

GoogleCloudVideointelligenceV1beta2_TextSegment

GoogleCloudVideointelligenceV1beta2_TimestampedObject

GoogleCloudVideointelligenceV1beta2_Track

GoogleCloudVideointelligenceV1beta2_VideoAnnotationProgress

  • GoogleCloudVideointelligenceV1beta2_VideoAnnotationProgress object: Annotation progress for a single video.
    • feature string (values: FEATURE_UNSPECIFIED, LABEL_DETECTION, SHOT_CHANGE_DETECTION, EXPLICIT_CONTENT_DETECTION, FACE_DETECTION, SPEECH_TRANSCRIPTION, TEXT_DETECTION, OBJECT_TRACKING, LOGO_RECOGNITION, PERSON_DETECTION): Specifies which feature is being tracked if the request contains more than one feature.
    • inputUri string: Video file location in Cloud Storage.
    • progressPercent integer: Approximate percentage processed thus far. Guaranteed to be 100 when fully processed.
    • segment GoogleCloudVideointelligenceV1beta2_VideoSegment
    • startTime string: Time when the request was received.
    • updateTime string: Time of the most recent update.

GoogleCloudVideointelligenceV1beta2_VideoAnnotationResults

GoogleCloudVideointelligenceV1beta2_VideoSegment

  • GoogleCloudVideointelligenceV1beta2_VideoSegment object: Video segment.
    • endTimeOffset string: Time-offset, relative to the beginning of the video, corresponding to the end of the segment (inclusive).
    • startTimeOffset string: Time-offset, relative to the beginning of the video, corresponding to the start of the segment (inclusive).

GoogleCloudVideointelligenceV1beta2_WordInfo

  • GoogleCloudVideointelligenceV1beta2_WordInfo object: Word-specific information for recognized words. Word information is only included in the response when certain request parameters are set, such as enable_word_time_offsets.
    • confidence number: Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to be always provided. The default of 0.0 is a sentinel value indicating confidence was not set.
    • endTime string: Time offset relative to the beginning of the audio, and corresponding to the end of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.
    • speakerTag integer: Output only. A distinct integer value is assigned for every speaker within the audio. This field specifies which one of those speakers was detected to have spoken this word. Value ranges from 1 up to diarization_speaker_count, and is only set if speaker diarization is enabled.
    • startTime string: Time offset relative to the beginning of the audio, and corresponding to the start of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.
    • word string: The word corresponding to this set of information.

GoogleCloudVideointelligenceV1p1beta1_AnnotateVideoProgress

  • GoogleCloudVideointelligenceV1p1beta1_AnnotateVideoProgress object: Video annotation progress. Included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

GoogleCloudVideointelligenceV1p1beta1_AnnotateVideoResponse

  • GoogleCloudVideointelligenceV1p1beta1_AnnotateVideoResponse object: Video annotation response. Included in the response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

GoogleCloudVideointelligenceV1p1beta1_DetectedAttribute

  • GoogleCloudVideointelligenceV1p1beta1_DetectedAttribute object: A generic detected attribute represented by name in string format.
    • confidence number: Detected attribute confidence. Range [0, 1].
    • name string: The name of the attribute, for example, glasses, dark_glasses, mouth_open. A full list of supported type names will be provided in the document.
    • value string: Text value of the detection result. For example, the value for "HairColor" can be "black", "blonde", etc.

GoogleCloudVideointelligenceV1p1beta1_DetectedLandmark

  • GoogleCloudVideointelligenceV1p1beta1_DetectedLandmark object: A generic detected landmark represented by name in string format and a 2D location.

GoogleCloudVideointelligenceV1p1beta1_Entity

  • GoogleCloudVideointelligenceV1p1beta1_Entity object: Detected entity from video analysis.
    • description string: Textual description, e.g., Fixed-gear bicycle.
    • entityId string: Opaque entity ID. Some IDs may be available in Google Knowledge Graph Search API.
    • languageCode string: Language code for description in BCP-47 format.

GoogleCloudVideointelligenceV1p1beta1_ExplicitContentAnnotation

  • GoogleCloudVideointelligenceV1p1beta1_ExplicitContentAnnotation object: Explicit content annotation (based on per-frame visual signals only). If no explicit content has been detected in a frame, no annotations are present for that frame.

GoogleCloudVideointelligenceV1p1beta1_ExplicitContentFrame

  • GoogleCloudVideointelligenceV1p1beta1_ExplicitContentFrame object: Video frame level annotation results for explicit content.
    • pornographyLikelihood string (values: LIKELIHOOD_UNSPECIFIED, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY): Likelihood of pornographic content.
    • timeOffset string: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1p1beta1_FaceAnnotation

GoogleCloudVideointelligenceV1p1beta1_FaceDetectionAnnotation

  • GoogleCloudVideointelligenceV1p1beta1_FaceDetectionAnnotation object: Face detection annotation.

GoogleCloudVideointelligenceV1p1beta1_FaceFrame

  • GoogleCloudVideointelligenceV1p1beta1_FaceFrame object: Deprecated. No effect.
    • normalizedBoundingBoxes array: Normalized bounding boxes in a frame. There can be more than one box if the same face is detected in multiple locations within the current frame.
    • timeOffset string: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1p1beta1_FaceSegment

GoogleCloudVideointelligenceV1p1beta1_LabelAnnotation

GoogleCloudVideointelligenceV1p1beta1_LabelFrame

  • GoogleCloudVideointelligenceV1p1beta1_LabelFrame object: Video frame level annotation results for label detection.
    • confidence number: Confidence that the label is accurate. Range: [0, 1].
    • timeOffset string: Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

GoogleCloudVideointelligenceV1p1beta1_LabelSegment

GoogleCloudVideointelligenceV1p1beta1_LogoRecognitionAnnotation

GoogleCloudVideointelligenceV1p1beta1_NormalizedBoundingBox

  • GoogleCloudVideointelligenceV1p1beta1_NormalizedBoundingBox object: Normalized bounding box. The normalized vertex coordinates are relative to the original image. Range: [0, 1].
    • bottom number: Bottom Y coordinate.
    • left number: Left X coordinate.
    • right number: Right X coordinate.
    • top number: Top Y coordinate.
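
Since the box coordinates are normalized to [0, 1], mapping them onto a frame of known dimensions is a simple scale. A minimal sketch, with a hypothetical box and frame size:

```javascript
// Convert a NormalizedBoundingBox ([0, 1] coordinates) into pixel
// coordinates for a frame of the given width and height.
function toPixelBox(box, width, height) {
  return {
    left: Math.round(box.left * width),
    top: Math.round(box.top * height),
    right: Math.round(box.right * width),
    bottom: Math.round(box.bottom * height),
  };
}

// Hypothetical NormalizedBoundingBox for a 1920x1080 frame.
const normalized = { left: 0.1, top: 0.2, right: 0.5, bottom: 0.8 };
console.log(toPixelBox(normalized, 1920, 1080));
// { left: 192, top: 216, right: 960, bottom: 864 }
```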

GoogleCloudVideointelligenceV1p1beta1_NormalizedBoundingPoly

  • GoogleCloudVideointelligenceV1p1beta1_NormalizedBoundingPoly object: Normalized bounding polygon for text (which might not be axis-aligned). Contains a list of the corner points in clockwise order starting from the top-left corner. For example, for a rectangular bounding box: when the text is horizontal it might look like: 0----1 | | 3----2 When it is rotated 180 degrees clockwise around the top-left corner it becomes: 2----3 | | 1----0 and the vertex order will still be (0, 1, 2, 3). Note that values can be less than 0 or greater than 1 due to trigonometric calculations for the location of the box.

GoogleCloudVideointelligenceV1p1beta1_NormalizedVertex

  • GoogleCloudVideointelligenceV1p1beta1_NormalizedVertex object: A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
    • x number: X coordinate.
    • y number: Y coordinate.
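
Because the polygon may be rotated and its vertices may fall outside [0, 1], a common first step is to compute its axis-aligned extent. A sketch under those assumptions; the helper name and sample vertices are illustrative:

```javascript
// Compute the axis-aligned extent of a NormalizedBoundingPoly.
// Vertices are clockwise from the top-left corner; no clamping is
// done, since rotated text can produce values outside [0, 1].
function polyExtent(vertices) {
  const xs = vertices.map((v) => v.x);
  const ys = vertices.map((v) => v.y);
  return {
    left: Math.min(...xs),
    top: Math.min(...ys),
    right: Math.max(...xs),
    bottom: Math.max(...ys),
  };
}

// Hypothetical NormalizedVertex list for a horizontal text box.
const poly = [
  { x: 0.1, y: 0.1 }, { x: 0.4, y: 0.1 },
  { x: 0.4, y: 0.2 }, { x: 0.1, y: 0.2 },
];
console.log(polyExtent(poly));
// { left: 0.1, top: 0.1, right: 0.4, bottom: 0.2 }
```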

GoogleCloudVideointelligenceV1p1beta1_ObjectTrackingAnnotation

  • GoogleCloudVideointelligenceV1p1beta1_ObjectTrackingAnnotation object: Annotations corresponding to one tracked object.
    • confidence number: Object category's labeling confidence of this track.
    • entity GoogleCloudVideointelligenceV1p1beta1_Entity
    • frames array: Information corresponding to all frames where this object track appears. Non-streaming batch mode: it may be one or multiple ObjectTrackingFrame messages in frames. Streaming mode: it can only be one ObjectTrackingFrame message in frames.
    • segment GoogleCloudVideointelligenceV1p1beta1_VideoSegment
    • trackId string: Streaming mode ONLY. In streaming mode, we do not know the end time of a tracked object before it is completed, so no VideoSegment info is returned. Instead, we provide a unique integer track_id so that customers can correlate the results of the ongoing ObjectTrackingAnnotation with the same track_id over time.
    • version string: Feature version.
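
In streaming mode each response carries at most one frame per annotation, so correlating a track means grouping incoming annotations by trackId. A minimal sketch; the sample annotations are hypothetical:

```javascript
// Group streaming ObjectTrackingAnnotation results by trackId,
// accumulating the frames observed for each track over time.
function groupByTrack(annotations) {
  const tracks = new Map();
  for (const ann of annotations) {
    const frames = tracks.get(ann.trackId) || [];
    frames.push(...ann.frames);
    tracks.set(ann.trackId, frames);
  }
  return tracks;
}

// Two hypothetical streaming responses for the same track.
const streamed = [
  { trackId: '17', frames: [{ timeOffset: '0s' }] },
  { trackId: '17', frames: [{ timeOffset: '0.1s' }] },
];
console.log(groupByTrack(streamed).get('17').length); // 2
```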

GoogleCloudVideointelligenceV1p1beta1_ObjectTrackingFrame

  • GoogleCloudVideointelligenceV1p1beta1_ObjectTrackingFrame object: Video frame level annotations for object detection and tracking. This field stores per frame location, time offset, and confidence.

GoogleCloudVideointelligenceV1p1beta1_PersonDetectionAnnotation

  • GoogleCloudVideointelligenceV1p1beta1_PersonDetectionAnnotation object: Person detection annotation per video.

GoogleCloudVideointelligenceV1p1beta1_SpeechRecognitionAlternative

  • GoogleCloudVideointelligenceV1p1beta1_SpeechRecognitionAlternative object: Alternative hypotheses (a.k.a. n-best list).
    • confidence number: Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to be always provided. The default of 0.0 is a sentinel value indicating confidence was not set.
    • transcript string: Transcript text representing the words that the user spoke.
    • words array: Output only. A list of word-specific information for each recognized word. Note: When enable_speaker_diarization is set to true, you will see all the words from the beginning of the audio.

GoogleCloudVideointelligenceV1p1beta1_SpeechTranscription

  • GoogleCloudVideointelligenceV1p1beta1_SpeechTranscription object: A speech recognition result corresponding to a portion of the audio.
    • alternatives array: May contain one or more recognition hypotheses (up to the maximum specified in max_alternatives). These alternatives are ordered in terms of accuracy, with the top (first) alternative being the most probable, as ranked by the recognizer.
    • languageCode string: Output only. The BCP-47 language tag of the language in this result. This language code was detected to have the most likelihood of being spoken in the audio.
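
Since alternatives are ordered by accuracy, taking the first entry of each transcription yields the recognizer's top hypothesis. A sketch with a made-up result; the helper name is not part of the API:

```javascript
// Pull the most probable transcript from each SpeechTranscription,
// skipping any transcription with no alternatives.
function topTranscripts(transcriptions) {
  return transcriptions
    .filter((t) => t.alternatives && t.alternatives.length > 0)
    .map((t) => t.alternatives[0].transcript);
}

// Hypothetical SpeechTranscription result.
const results = [
  {
    alternatives: [{ transcript: 'hello world', confidence: 0.92 }],
    languageCode: 'en-US',
  },
];
console.log(topTranscripts(results)); // [ 'hello world' ]
```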

GoogleCloudVideointelligenceV1p1beta1_TextAnnotation

  • GoogleCloudVideointelligenceV1p1beta1_TextAnnotation object: Annotations related to one detected OCR text snippet. This will contain the corresponding text, confidence value, and frame level information for each detection.

GoogleCloudVideointelligenceV1p1beta1_TextFrame

  • GoogleCloudVideointelligenceV1p1beta1_TextFrame object: Video frame level annotation results for text annotation (OCR). Contains information regarding timestamp and bounding box locations for the frames containing detected OCR text snippets.

GoogleCloudVideointelligenceV1p1beta1_TextSegment

  • GoogleCloudVideointelligenceV1p1beta1_TextSegment object: Video segment level annotation results for text detection.
    • confidence number: Confidence for the track of detected text. It is calculated as the highest confidence over all frames in which the detected text appears.
    • frames array: Information related to the frames where OCR detected text appears.
    • segme