Using Custom S3 Resource in Spring Batch Application

In the previous article, Simple S3 ItemReader for Spring Batch Application, I explained how to write a custom FileItemReader for S3. As you might have noticed, that solution works, but you have to restart your application or server for any change to the S3 file to take effect.

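Here, we will try another approach: we extend the org.springframework.core.io.AbstractResource class and implement an S3 resource provider. Only getDescription() and getInputStream() are abstract in AbstractResource; the other methods listed below override its defaults so that they work against S3: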

public String getDescription()
public InputStream getInputStream()
public boolean exists()
public long contentLength()
public long lastModified()
public String getFilename()
public URL getURL()

Let's implement these methods.

Method: getDescription()

This should return a short description of the AWS resource, including the bucket name and the object key.

StringBuilder builder = new StringBuilder("S3 resource [bucket='");
builder.append(this.bucketName);
builder.append("' and key='");
builder.append(this.key);
builder.append("']");
return builder.toString();
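
To make the result concrete, here is a standalone sketch of the string this method builds. The class name DescriptionDemo and the sample bucket and key are made up for the example; the logic mirrors the snippet above.

```java
// Standalone sketch; DescriptionDemo and the sample values are illustrative only.
class DescriptionDemo {

    // Mirrors the getDescription() logic for a given bucket name and key.
    static String describe(String bucketName, String key) {
        StringBuilder builder = new StringBuilder("S3 resource [bucket='");
        builder.append(bucketName);
        builder.append("' and key='");
        builder.append(key);
        builder.append("']");
        return builder.toString();
    }

    public static void main(String[] args) {
        // prints: S3 resource [bucket='my-bucket' and key='data/input.csv']
        System.out.println(describe("my-bucket", "data/input.csv"));
    }
}
```

A description like this is mainly useful in log and error messages, where it tells you which bucket and key a failing resource points to.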

Method: getInputStream()

This method should return an S3ObjectInputStream, as shown below.

GetObjectRequest getObjectRequest = new GetObjectRequest(this.bucketName, this.key);
return this.amazonS3.getObject(getObjectRequest)
        .getObjectContent();

Method: exists()

This method checks whether the specified resource exists. We can use AmazonS3.getObjectMetadata() for this: the call succeeds if the object exists and throws an AmazonS3Exception if it does not.

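GetObjectMetadataRequest metadataRequest = new GetObjectMetadataRequest(this.bucketName, this.key);
try {
    return this.amazonS3.getObjectMetadata(metadataRequest) != null;
} catch (AmazonS3Exception e) {
    return false; // e.g. HTTP 404 when the key does not exist
}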
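
getObjectMetadata() reports a missing key by throwing an exception rather than by returning null, so exists() has to translate that failure into a boolean. The sketch below shows one way to do the translation while still letting other errors (for example, a permissions problem) reach the caller. StoreException and MetadataLookup are hypothetical stand-ins for AmazonS3Exception and the S3 client, used only so the example runs without the AWS SDK; in the real class you would catch AmazonS3Exception and check getStatusCode().

```java
import java.util.Set;

// Hypothetical stand-in for the SDK exception; only the HTTP status code matters here.
class StoreException extends RuntimeException {
    final int statusCode;

    StoreException(int statusCode) {
        super("status " + statusCode);
        this.statusCode = statusCode;
    }
}

// Hypothetical stand-in for the metadata call: returns normally if the key exists, throws otherwise.
interface MetadataLookup {
    void lookup(String bucketName, String key);
}

class ExistsDemo {

    // Maps "not found" to false and lets every other failure propagate.
    static boolean exists(MetadataLookup store, String bucketName, String key) {
        try {
            store.lookup(bucketName, key);
            return true;
        } catch (StoreException e) {
            if (e.statusCode == 404) {
                return false;
            }
            throw e; // anything else is a real error, not a missing object
        }
    }

    // Fake store that only knows a fixed set of keys.
    static MetadataLookup fakeStore(Set<String> knownKeys) {
        return (bucketName, key) -> {
            if (!knownKeys.contains(key)) {
                throw new StoreException(404);
            }
        };
    }
}
```

Whether to return false for every exception or only for a 404 is a design choice; the stricter version avoids reporting "file not found" when the actual problem is access or connectivity.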

Method: contentLength()

We can derive this from the ObjectMetadata returned by getObjectMetadata(), held here in objectMetadata.

return objectMetadata.getContentLength();

Method: lastModified()

This also comes from the ObjectMetadata, as in the method above.

return objectMetadata.getLastModified().getTime();

Method: getFilename()

This is simply the object key.

return this.key;
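
Returning the whole key is fine when objects sit at the top level of the bucket. If your keys contain prefixes, such as reports/2024/input.csv, you may prefer to return only the last segment, which is closer to what Spring's Resource contract usually means by a filename. A minimal sketch of that variant follows; FilenameDemo and filenameOf are illustrative names, not part of the original class.

```java
class FilenameDemo {

    // Returns the part of the key after the last '/', or the whole key if it has no '/'.
    static String filenameOf(String key) {
        int lastSlash = key.lastIndexOf('/');
        return lastSlash >= 0 ? key.substring(lastSlash + 1) : key;
    }

    public static void main(String[] args) {
        System.out.println(filenameOf("reports/2024/input.csv")); // prints: input.csv
        System.out.println(filenameOf("input.csv"));              // prints: input.csv
    }
}
```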

Method: getURL()

The URL can be constructed from the region's S3 endpoint, the bucket name and the object key.

Region region = this.amazonS3.getRegion()
        .toAWSRegion();
return new URL("https", region.getServiceEndpoint(AmazonS3Client.S3_SERVICE_NAME),
        "/" + this.bucketName + "/" + this.key);

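To show what getURL() produces, here is a standalone sketch that builds the same kind of path-style URL with plain java.net.URL. The endpoint s3.amazonaws.com, the class name S3UrlDemo and the sample bucket and key are assumptions for the example; in the real method the endpoint comes from the client's region.

```java
import java.net.MalformedURLException;
import java.net.URL;

class S3UrlDemo {

    // Builds a path-style URL of the form https://<endpoint>/<bucket>/<key>.
    static URL pathStyleUrl(String endpoint, String bucketName, String key)
            throws MalformedURLException {
        return new URL("https", endpoint, "/" + bucketName + "/" + key);
    }

    public static void main(String[] args) throws MalformedURLException {
        // prints: https://s3.amazonaws.com/my-bucket/data/input.csv
        System.out.println(pathStyleUrl("s3.amazonaws.com", "my-bucket", "data/input.csv"));
    }
}
```

Note that this constructor does not escape anything, so a key containing spaces or other reserved characters would need to be URL-encoded first.
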
Integrating S3 Resource with FlatFileItemReader

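We call the setResource() method of FlatFileItemReader and pass it an S3Resource instance. S3Resource here is the custom AbstractResource subclass built from the methods above, and s3Client() stands for whatever method in your configuration returns the AmazonS3 client.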

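public ItemReader<FieldSet> reader() throws IOException {
    FlatFileItemReader<FieldSet> reader = new FlatFileItemReader<>();
    // S3Resource is the custom class built above; s3Client() returns your AmazonS3 client.
    reader.setResource(new S3Resource(s3Client(), "bucketName", "fileName"));
    DefaultLineMapper<FieldSet> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(new DelimitedLineTokenizer()); // or your own tokenizer
    lineMapper.setFieldSetMapper(new PassThroughFieldSetMapper()); // or your own field set mapper
    reader.setLineMapper(lineMapper);
    return reader;
}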
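
Conceptually, the reader asks the resource for its input stream when it is opened, reads that stream line by line, and hands each line to the line mapper. The stdlib-only sketch below imitates that flow to show why getInputStream() is the method that matters most; it is not Spring Batch's actual implementation. LineReadingDemo is an illustrative name, and an in-memory stream stands in for the one S3Resource would return.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LineReadingDemo {

    // Reads every line from the stream and splits it on commas (a very simple "tokenizer").
    static List<String[]> readRecords(InputStream in) throws IOException {
        List<String[]> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                records.add(line.split(","));
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the stream that S3Resource.getInputStream() would return.
        InputStream in = new ByteArrayInputStream(
                "1,alice\n2,bob\n".getBytes(StandardCharsets.UTF_8));
        for (String[] fields : readRecords(in)) {
            System.out.println(String.join(" | ", fields));
        }
    }
}
```

Because the stream is requested from the resource when the reader is opened, rather than once at application startup, a later run should pick up the current contents of the S3 object.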

Conclusion

So, we built a resource class for AWS S3 and set it as the resource for FlatFileItemReader. You can implement an S3 writer in a similar way; let me know if you need any help with that.

Note: Spring Cloud has built-in support for reading and writing files in AWS S3. Refer to the Spring Cloud AWS documentation for details.

2 Comments


Anonymous, 5 June 2023 at 14:53
Hello,
I am trying to read files directly from S3 ,
reader.setResource(new S3Resource(s3Client(), "bucketName", "fileName"));
not sure the how the above line works. S3Resource is an interface.
It would be great if you can post working solution !

Anonymous, 25 April 2024 at 02:02
reader.setResource(new S3Resource(s3Client(), "bucketName", "fileName"));

Couple of questions for the above line:
1- is S3Resource a custom class or imported
2- s3Client() - what goes inside this method
