Bulk analysis
Amazon Rekognition Bulk Analysis lets you process a large collection of images asynchronously by using a manifest file with the StartMediaAnalysisJob operation. The output for each individual image matches the output returned by the operation that you use for analysis.
Currently, Rekognition supports analysis with the DetectModerationLabels operation.
You are charged for the number of images that the job successfully processes. The results of a finished job are written to an Amazon S3 bucket that you specify.
Note that Bulk Analysis does not support the Amazon A2I integration.
The API can detect animated or illustrated content types, and information about the detected content type is returned as part of the response.
Processing images in bulk
You can start a new bulk analysis job by submitting a manifest file and calling the StartMediaAnalysisJob operation. The input manifest file contains references to images in an Amazon S3 bucket, one JSON object per line, formatted as follows:
{"source-ref": "s3://foo/bar/1.jpg"}
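As an illustration of the manifest format above, the following is a minimal Python sketch that builds the manifest text from a bucket name and a list of object keys. The function name build_manifest and the sample keys are hypothetical. Dropping repeated URIs is an optional choice made in this sketch, not something the service requires.

```python
import json


def build_manifest(bucket, keys):
    """Return manifest text with one {"source-ref": ...} JSON object per line."""
    seen = set()
    lines = []
    for key in keys:
        uri = f"s3://{bucket}/{key}"
        if uri in seen:
            # Optional: skip repeats so the same image is listed only once.
            continue
        seen.add(uri)
        lines.append(json.dumps({"source-ref": uri}))
    return "\n".join(lines) + "\n"


# Hypothetical bucket and keys, used only for illustration.
manifest_text = build_manifest("foo", ["bar/1.jpg", "bar/2.jpg", "bar/1.jpg"])
print(manifest_text, end="")
```

The resulting text can be saved to a file and uploaded to the S3 bucket that the job reads from.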
To create a bulk analysis job (CLI)
- If you haven't already:
  - Create or update a user with AmazonRekognitionFullAccess and AmazonS3ReadOnlyAccess permissions. For more information, see Step 1: Set up an AWS account and create a User.
  - Install and configure the AWS CLI and the AWS SDKs. For more information, see Step 2: Set up the AWS CLI and AWS SDKs.
- Upload images to your S3 bucket. For instructions, see Uploading Objects into Amazon S3 in the Amazon Simple Storage Service User Guide.
- Use the following commands to create and retrieve bulk analysis jobs.
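For readers using the AWS SDK for Python (Boto3) rather than the CLI, the following is a minimal sketch of starting a job and polling until it finishes. It is not the guide's own command listing. The helper names, the bucket and key names, the default MinConfidence of 50, and the polling interval are all assumptions made for illustration. The helpers take an already-created Rekognition client as an argument.

```python
import time


def start_bulk_moderation_job(client, input_bucket, manifest_key,
                              output_bucket, output_prefix, min_confidence=50.0):
    """Start a bulk analysis job for DetectModerationLabels and return its JobId."""
    response = client.start_media_analysis_job(
        OperationsConfig={"DetectModerationLabels": {"MinConfidence": min_confidence}},
        Input={"S3Object": {"Bucket": input_bucket, "Name": manifest_key}},
        OutputConfig={"S3Bucket": output_bucket, "S3KeyPrefix": output_prefix},
    )
    return response["JobId"]


def wait_for_job(client, job_id, poll_seconds=30, max_polls=120):
    """Poll GetMediaAnalysisJob until the job reports SUCCEEDED or FAILED."""
    for _ in range(max_polls):
        job = client.get_media_analysis_job(JobId=job_id)
        if job["Status"] in ("SUCCEEDED", "FAILED"):
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")


# Example usage (requires boto3 and AWS credentials; all names are placeholders):
#   import boto3
#   client = boto3.client("rekognition")
#   job_id = start_bulk_moderation_job(
#       client, "input-bucket", "input-manifest.jsonl", "output-bucket", "results/")
#   job = wait_for_job(client, job_id)
#   print(job["Status"])
```

Passing the client in, instead of creating it inside the helpers, keeps the sketch easy to exercise with a stub client before running it against a real account.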
StartMediaAnalysisJob output manifests
The bulk analysis job generates an output manifest file that contains the job results, as well as a manifest summary that contains statistics and details on any errors encountered when processing the input manifest entries.
If the input manifest contains duplicate entries, the job does not filter them out and instead processes every entry provided.
The output manifest file is formatted as follows:
// Output manifest for content moderation
{"source-ref":"s3://foo/bar/1.jpg", "detect-moderation-labels": {"ModerationLabels":[],"ModerationModelVersion":"7.0","ContentTypes":[{"Confidence":72.7257,"Name":"Animated"}]}}
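To show how the output manifest might be consumed, the following minimal Python sketch reads it one JSON object per line and collects the images that have at least one moderation label. The function name flagged_images is hypothetical, and the second sample record (including its label name and confidence) is invented for illustration. Only the first record comes from the example above.

```python
import json


def flagged_images(manifest_text):
    """Map each source-ref that has moderation labels to its list of label names."""
    flagged = {}
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line:
            continue  # Ignore blank lines.
        entry = json.loads(line)
        result = entry.get("detect-moderation-labels", {})
        names = [label["Name"] for label in result.get("ModerationLabels", [])]
        if names:
            flagged[entry["source-ref"]] = names
    return flagged


sample = "\n".join([
    # Record taken from the example output manifest.
    '{"source-ref":"s3://foo/bar/1.jpg","detect-moderation-labels":'
    '{"ModerationLabels":[],"ModerationModelVersion":"7.0",'
    '"ContentTypes":[{"Confidence":72.7257,"Name":"Animated"}]}}',
    # Invented record, for illustration only.
    '{"source-ref":"s3://foo/bar/2.jpg","detect-moderation-labels":'
    '{"ModerationLabels":[{"Name":"Violence","Confidence":91.2}],'
    '"ModerationModelVersion":"7.0","ContentTypes":[]}}',
])
print(flagged_images(sample))
```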
The output manifest summary is formatted as follows:
{
  "version": "1.0",  # Schema version, 1.0 for GA.
  "statistics": {
    "total-json-lines": Number,  # Total number of JSON Lines (images) in the input manifest.
    "valid-json-lines": Number,  # Total number of JSON Lines (images) that contain references to valid images.
    "invalid-json-lines": Number  # Total number of invalid JSON Lines. These lines were not handled.
  },
  "errors": [
    {
      "line-number": Number,  # The number of the line in the manifest where the error occurred.
      "source-ref": "String",  # Optional. Name of the file if it was parsed.
      "code": "String",  # Error code.
      "message": "String"  # Description of the error.
    }
  ]
}
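As a sketch of how the summary could be checked after a job finishes, the following Python function turns a parsed summary into a short report. The function name summarize, the sample values, and the placeholder error code are hypothetical. The sketch also assumes that the error line key is spelled line-number.

```python
def summarize(summary):
    """Return report lines: one for the statistics, then one per error."""
    stats = summary["statistics"]
    report = [
        f'{stats["valid-json-lines"]}/{stats["total-json-lines"]} manifest lines valid, '
        f'{stats["invalid-json-lines"]} invalid'
    ]
    for err in summary.get("errors", []):
        # Assumption: the key is "line-number"; "source-ref" is optional.
        source = err.get("source-ref", "<unparsed>")
        report.append(
            f'line {err.get("line-number")}: {source}: {err["code"]}: {err["message"]}'
        )
    return report


# Invented values, for illustration only.
sample_summary = {
    "version": "1.0",
    "statistics": {
        "total-json-lines": 3,
        "valid-json-lines": 2,
        "invalid-json-lines": 1,
    },
    "errors": [
        {"line-number": 3, "code": "EXAMPLE_ERROR_CODE", "message": "Example message."}
    ],
}
for row in summarize(sample_summary):
    print(row)
```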
Content type
Information on the type of media content analyzed by the StartMediaAnalysisJob operation is returned by the GetMediaAnalysisJob operation. ContentType can be one of two different categories:
Animated content, which includes video games and animation (for example, cartoons, comics, manga, and anime).
Illustrated content, which includes drawings, paintings, and sketches.
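One way to use the content type information is to tally how many images fall into each category. The following minimal Python sketch does this over a list of per-image results shaped like the detect-moderation-labels object in the output manifest. The function name, the confidence threshold of 50, and the sample values are hypothetical, and the name "Illustrated" is assumed by analogy with the "Animated" value shown in the output manifest example.

```python
from collections import Counter


def count_content_types(results, min_confidence=50.0):
    """Count content type names whose confidence meets the threshold."""
    counts = Counter()
    for result in results:
        for content_type in result.get("ContentTypes", []):
            if content_type["Confidence"] >= min_confidence:
                counts[content_type["Name"]] += 1
    return counts


# Invented per-image results, for illustration only.
sample_results = [
    {"ModerationLabels": [], "ContentTypes": [{"Confidence": 72.7257, "Name": "Animated"}]},
    {"ModerationLabels": [], "ContentTypes": [{"Confidence": 88.0, "Name": "Illustrated"}]},
    {"ModerationLabels": [], "ContentTypes": [{"Confidence": 12.0, "Name": "Illustrated"}]},
    {"ModerationLabels": [], "ContentTypes": []},
]
print(count_content_types(sample_results))
```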
Prediction verification and adapter training
Bulk Analysis can also be used through the Rekognition console.
Currently, you can create adapters for use with the Rekognition Custom Moderation feature. You create an adapter and provide it to the DetectModerationLabels operation.
For more information about Custom Moderation, see Enhancing accuracy with Custom Moderation. See Bulk analysis and verification for an explanation of how to verify predictions made with Bulk analysis. For a tutorial covering how to use the Rekognition console to verify predictions and create an adapter, see Custom Moderation adapter tutorial.