
Simple S3 ItemReader for Spring Batch Application - Part 1

Spring Batch is one of the most popular open source batch processing frameworks available today. It also supports advanced features such as optimization and partitioning techniques, which makes it well suited to high-volume, high-performance enterprise applications.

In this article, we will discuss using Spring Batch to process files from AWS S3 (Simple Storage Service).


The lifecycle of a batch process is: read a large chunk of data, process it, and then write the transformed data back to some storage. So the main components of a batch process are a reader, a processor and a writer.
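The read, process and write cycle can be sketched in plain Java as follows. This is only an illustration of the idea, not Spring Batch's API: the class name ChunkLoopSketch, the run method and the use of an Iterator as the "reader" and a list as the "writer" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a batch step; this is not Spring Batch's API.
class ChunkLoopSketch {

    // Reads up to chunkSize items at a time, transforms each one,
    // and hands the transformed chunk to the sink (here, a list of chunks).
    // Returns the number of chunks written.
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          List<List<O>> writtenChunks, int chunkSize) {
        int chunks = 0;
        while (reader.hasNext()) {
            List<O> chunk = new ArrayList<>();
            while (chunk.size() < chunkSize && reader.hasNext()) {
                chunk.add(processor.apply(reader.next())); // read + process
            }
            writtenChunks.add(chunk); // "write" the whole chunk at once
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<List<String>> out = new ArrayList<>();
        int chunks = run(List.of("a", "b", "c").iterator(), String::toUpperCase, out, 2);
        System.out.println(chunks + " chunks: " + out);
    }
}
```

With three input items and a chunk size of 2, the sketch writes two chunks: the first holds two items and the second holds the remaining one.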

Batch Reader

Spring Batch provides various item readers such as:
  • FlatFileItemReader
  • HibernatePagingItemReader
  • IbatisPagingItemReader
  • JdbcPagingItemReader
  • JmsItemReader
  • MongoItemReader
As you may know, there is no built-in reader for S3. You can write your own item reader by implementing the ItemReader interface. But here, I will show you how to build an item reader for S3 in a few simple steps!

The approach

Here I will use FlatFileItemReader as the ItemReader implementation with a custom resource. The resource will be a ByteArrayResource whose input is the bytes read from S3. Simple, isn't it?

The code to read bytes from S3 will look like:
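// Assumes the AWS SDK for Java v1 (S3Object, GetObjectRequest) and an IOUtils.copy(InputStream, OutputStream)
// helper such as the one in Apache Commons IO; getClient() returns your configured AmazonS3 client.
public byte[] getBytes() throws IOException {
    // "bucket" and "file" are placeholders for your bucket name and object key
    S3Object object = getClient().getObject(new GetObjectRequest("bucket", "file"));
    // try-with-resources closes the S3 content stream once it has been copied
    try (InputStream is = object.getObjectContent()) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        IOUtils.copy(is, out);
        return out.toByteArray();
    }
}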
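If you want to try the copy step without an S3 bucket or extra libraries, a rough JDK-only sketch could look like the one below. It assumes Java 9 or later for InputStream.readAllBytes(), and it uses a ByteArrayInputStream as a stand-in for the S3 content stream. The class name StreamToBytesSketch and the sample data are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; it only demonstrates draining a stream into memory.
class StreamToBytesSketch {

    // Reads the whole stream into a byte array and closes it afterwards.
    static byte[] toBytes(InputStream in) {
        try (InputStream is = in) {
            return is.readAllBytes(); // available since JDK 9
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for object.getObjectContent(); no S3 call is made here.
        InputStream fake = new ByteArrayInputStream(
                "id,name\n1,alice\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(toBytes(fake).length + " bytes read");
    }
}
```

Either way, the whole object ends up in memory, so this style only suits files that comfortably fit in the heap.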
And here goes the code for building the ItemReader:
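public ItemReader<FieldSet> reader() throws IOException {
    FlatFileItemReader<FieldSet> reader = new FlatFileItemReader<>();
    // Wrap the S3 bytes in an in-memory resource; the second argument is only a description
    reader.setResource(new ByteArrayResource(getBytes(), "s3 bytes"));
    // Example choices only: swap in your own tokenizer and field set mapper
    DefaultLineMapper<FieldSet> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(new DelimitedLineTokenizer());
    lineMapper.setFieldSetMapper(new PassThroughFieldSetMapper());
    reader.setLineMapper(lineMapper);
    return reader;
}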
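To see what reading items from in-memory bytes amounts to, here is a dependency-free sketch. It is not FlatFileItemReader and does not use any Spring classes. The class ByteArrayLineReaderSketch is hypothetical, and it assumes UTF-8 text with comma-separated fields. It only mirrors the ItemReader convention that read() returns the next item, or null once the input is exhausted.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, dependency-free analogue of a flat-file reader over in-memory bytes.
class ByteArrayLineReaderSketch {
    private final BufferedReader lines;

    ByteArrayLineReaderSketch(byte[] bytes) {
        this.lines = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8));
    }

    // Returns the next line split into comma-separated fields,
    // or null when there are no more lines.
    String[] read() {
        try {
            String line = lines.readLine();
            // limit -1 keeps trailing empty fields instead of dropping them
            return line == null ? null : line.split(",", -1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "1,alice\n2,bob\n".getBytes(StandardCharsets.UTF_8);
        ByteArrayLineReaderSketch reader = new ByteArrayLineReaderSketch(bytes);
        String[] item;
        while ((item = reader.read()) != null) {
            System.out.println(Arrays.toString(item));
        }
    }
}
```

The loop in main is the same read-until-null pattern a batch step applies to its reader.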
That's it! Now we have an S3 item reader which can be used in your Spring Batch application. But there are some issues with this approach. Continue to part 2 of this article, where I will show you a better way to implement an S3 file reader.

I will also be writing a detailed article on how to build an S3 item writer. Stay tuned!
