Simple S3 ItemReader for Spring Batch Application - Part 1

Spring Batch is one of the most popular open source batch processing frameworks available today. It also supports advanced features such as optimization and partitioning techniques, which make it well suited to high-volume, high-performance enterprise applications.

In this article, we will discuss using Spring Batch to process files from AWS S3 (Simple Storage Service).


The lifecycle of a batch process is: read a large chunk of data, process it, and then write the transformed data back to some storage. So the main components of a batch process are a reader, a processor and a writer.
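To make the read, process, write cycle concrete, here is a minimal plain-Java sketch of a chunk-oriented loop. It is only an illustration and not Spring Batch's actual implementation: the class name ChunkLoopSketch, the run method and the use of Supplier, Function and Consumer as stand-ins for a reader, a processor and a writer are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch of a chunk-oriented batch loop (not Spring Batch code).
class ChunkLoopSketch {

    /**
     * Reads items until the reader returns null, processes each one, and
     * hands them to the writer in chunks of at most chunkSize items.
     * Returns the number of items written.
     */
    static <I, O> int run(Supplier<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        int written = 0;
        List<O> chunk = new ArrayList<>();
        I item;
        while ((item = reader.get()) != null) {        // null signals end of input
            chunk.add(processor.apply(item));
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk)); // write a full chunk
                written += chunk.size();
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {                        // flush the final partial chunk
            writer.accept(new ArrayList<>(chunk));
            written += chunk.size();
        }
        return written;
    }

    public static void main(String[] args) {
        Iterator<String> source = List.of("a", "b", "c", "d", "e").iterator();
        List<List<String>> sink = new ArrayList<>();

        Supplier<String> reader = () -> source.hasNext() ? source.next() : null;
        Function<String, String> processor = s -> s.toUpperCase();
        Consumer<List<String>> writer = sink::add;

        int n = run(reader, processor, writer, 2);
        System.out.println(n + " items written in " + sink.size() + " chunks");
    }
}
```

In a real Spring Batch job the framework drives this loop for you; you only supply the reader, processor and writer.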

Batch Reader

Spring Batch provides various item readers such as:
  • FlatFileItemReader
  • HibernatePagingItemReader
  • IbatisPagingItemReader
  • JdbcPagingItemReader
  • JmsItemReader
  • MongoItemReader
As you may know, there is no built-in reader for S3. You can write your own item reader by implementing the ItemReader interface. But here, I will show you how to build an item reader for S3 in a few simple steps!
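For comparison, this is roughly what a hand-written reader could look like. The sketch below is plain Java so that it runs without Spring on the classpath: SimpleItemReader is a hypothetical stand-in that mirrors the read() contract of Spring Batch's ItemReader (return the next item, or null when the input is exhausted), and LineStreamItemReader is an illustrative name. A ByteArrayInputStream stands in for the content stream of an S3 object.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in that mirrors the read() contract of Spring Batch's ItemReader. */
interface SimpleItemReader<T> {
    T read() throws Exception; // returns null when there is no more input
}

/** Returns one line per read() call from any InputStream. */
class LineStreamItemReader implements SimpleItemReader<String>, AutoCloseable {
    private final BufferedReader lines;

    LineStreamItemReader(InputStream in) {
        // UTF-8 is an assumption; use the charset your files are written in.
        this.lines = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    @Override
    public String read() throws IOException {
        return lines.readLine(); // readLine() already returns null at end of stream
    }

    @Override
    public void close() throws IOException {
        lines.close();
    }

    public static void main(String[] args) throws Exception {
        // In-memory bytes stand in for the stream returned by S3.
        InputStream in = new ByteArrayInputStream(
                "id,name\n1,alice\n".getBytes(StandardCharsets.UTF_8));
        try (LineStreamItemReader reader = new LineStreamItemReader(in)) {
            String line;
            while ((line = reader.read()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

A custom reader like this leaves line parsing and field mapping to you, which is the work FlatFileItemReader already does.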

The approach

Here I will use FlatFileItemReader as the ItemReader implementation, with a custom resource. The resource will be a ByteArrayResource whose input is the bytes read from S3. Simple, isn't it?

The code to read bytes from S3 will look like:
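import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;
import org.apache.commons.io.IOUtils; // assumed: Apache Commons IO
public byte[] getBytes() throws IOException {
    // getClient() is your own helper returning an AmazonS3 client; "bucket" and "file" are placeholders.
    S3Object object = getClient().getObject(new GetObjectRequest("bucket", "file"));
    try (InputStream is = object.getObjectContent()) { // closing the stream releases the connection
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        IOUtils.copy(is, out); // buffers the whole object in memory
        return out.toByteArray();
    }
}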
And here goes the code for building the ItemReader:
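import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.mapping.PassThroughFieldSetMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.core.io.ByteArrayResource;
public ItemReader<FieldSet> reader() throws IOException {
    DefaultLineMapper<FieldSet> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(new DelimitedLineTokenizer());     // placeholder: use your tokenizer
    lineMapper.setFieldSetMapper(new PassThroughFieldSetMapper()); // placeholder: use your field mapper
    FlatFileItemReader<FieldSet> reader = new FlatFileItemReader<>();
    reader.setResource(new ByteArrayResource(getBytes(), "s3 bytes")); // bytes downloaded from S3
    reader.setLineMapper(lineMapper);
    return reader;
}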
That's it! Now we have an S3 item reader that can be used in your Spring Batch application. But there are some issues with this approach; for one, the whole file is loaded into memory before the first item is read. Continue to part 2 of this article, where I will show you a better way to implement an S3 file reader.

I will also be writing a detailed article on how to build an S3 item writer. Stay tuned!
