In this post I will show how to use Spring Batch in a web container (Tomcat). I will upload vacancy-related data from a flat file to a database using Spring Batch. Before I show how I have done this, a brief introduction to Spring Batch is in order.
Spring Batch - An Introduction
Spring Batch is a lightweight batch processing framework designed for the bulk processing that business operations require. It also provides logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management. The diagram below shows the processing strategy provided by Spring Batch (source: http://static.springsource.org/spring-batch/reference/html/whatsNew.html).
A batch Job has one or more step(s).
A JobInstance represents a logical run of a Job. JobInstances are distinguished from each other by their JobParameters, the set of parameters used to start a batch job. Each run of a JobInstance is a JobExecution.
A Step contains all of the information necessary to define and control the actual batch processing. In our case the "vacancy_step" is responsible for uploading vacancy data from a flat file to the database.
ItemReader is responsible for retrieving the input for a Step, one item at a time, whereas ItemWriter represents the output of a Step, one batch or chunk of items at a time.
JobLauncher is used to launch a Job with a given set of JobParameters.
JobRepository is used to store runtime information related to the batch execution.
A tasklet is an object containing any custom logic to be executed as a part of a job.
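The difference between reading one item at a time and writing one chunk at a time can be sketched in plain Java. This is only an illustration of the idea, not Spring Batch code: the class ChunkSketch, its process method, and the sample reference numbers are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative model only; Spring Batch's own chunk handling also deals with
// transactions, skips, and restarts, which are left out here.
class ChunkSketch {

    // Reads items one at a time and "writes" them in chunks of at most
    // commitInterval items. Returns the chunks in the order they were written.
    static <T> List<List<T>> process(Iterator<T> reader, int commitInterval) {
        if (commitInterval < 1) {
            throw new IllegalArgumentException("commitInterval must be at least 1");
        }
        List<List<T>> written = new ArrayList<List<T>>();
        List<T> chunk = new ArrayList<T>();
        while (reader.hasNext()) {
            chunk.add(reader.next());            // read: one item at a time
            if (chunk.size() == commitInterval) {
                written.add(chunk);              // write: one chunk at a time
                chunk = new ArrayList<T>();
            }
        }
        if (!chunk.isEmpty()) {
            written.add(chunk);                  // last chunk may be smaller
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("V001", "V002", "V003", "V004", "V005");
        // prints [[V001, V002], [V003, V004], [V005]]
        System.out.println(process(items.iterator(), 2));
    }
}
```

With a chunk size of 1, every item is written on its own, which is the simplest setting to reason about but also the slowest for large files.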
I have used SpringSource Tool Suite (STS) and Spring Roo to develop a simple web application that initiates the batch processing when it receives a request from a user. The figure below shows how batch processing is started upon receiving the request (source: http://static.springsource.org/spring-batch/reference/html/).
Spring Roo is well suited to developing a prototype application in a short period of time using Spring best practices. You can also use Eclipse to implement this.
If you have STS, open it and create a Spring Roo project:
File -> New -> Spring Roo Project.
Give the project a name and a top-level package name.
Now open the Roo shell in STS and execute the commands below:
roo > persistence setup --database MYSQL --provider HIBERNATE
roo > entity --class ~.model.Vacancy --testAutomatically
roo > field string --fieldName referenceNo
roo > field string --fieldName title
roo > field string --fieldName salary
Here is my Vacancy entity class:
@RooJavaBean
@RooToString
@RooEntity
public class Vacancy {
    private String referenceNo;
    private String title;
    private String salary;
}
I have used MySQL as my backend database (you can use any database) and created a database called "batchsample". Please create a database and enter the details below in the "database.properties" file:
database.password=admin
database.url=jdbc\:mysql\://localhost\:3306/batchsample
database.username=root
database.driverClassName=com.mysql.jdbc.Driver
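The backslashes before the colons in database.url are escapes from the Java properties file format, not part of the JDBC URL itself. A small sketch using only the JDK shows that the value loads as a plain URL. The class name PropertiesEscapeDemo and its load helper are made up for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesEscapeDemo {

    // Parses properties-format text and returns the value stored under key.
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // "\\:" in Java source is the two characters "\:" as written in the file.
        String line = "database.url=jdbc\\:mysql\\://localhost\\:3306/batchsample";
        // prints jdbc:mysql://localhost:3306/batchsample
        System.out.println(load(line, "database.url"));
    }
}
```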
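I have also written a simple integration test to check that my database configuration works.
import javax.sql.DataSource;
import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.simple.SimpleJdbcTemplate;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
import org.springframework.transaction.annotation.Transactional;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = "classpath:/META-INF/spring/applicationContext.xml")
@Transactional
public class VacancyIntegrationTest {
    private SimpleJdbcTemplate jdbcTemplate;

    @Autowired
    public void initializeJdbcTemplate(DataSource ds) {
        jdbcTemplate = new SimpleJdbcTemplate(ds);
    }

    // The vacancy table exists and is still empty before any batch run.
    @Test
    public void testBatchDbConfig() {
        Assert.assertEquals(0, jdbcTemplate.queryForInt("select count(0) from vacancy"));
    }
}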
Run this test. If the test passes, execute the Roo command below to create the web infrastructure for this application.
roo > controller all --package ~.web
Roo will create the necessary web structure, including a controller called "VacancyController" to handle the requests.
I have slightly modified the VacancyController to meet my needs. Here is the controller:
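import java.util.Date;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.JobParametersInvalidException;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.repository.JobExecutionAlreadyRunningException;
import org.springframework.batch.core.repository.JobInstanceAlreadyCompleteException;
import org.springframework.batch.core.repository.JobRestartException;
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.RequestMapping;
// The Vacancy entity (created in the ~.model package) must be imported as well.

@Controller
@RequestMapping("/vacancy/*")
public class VacancyController {

    private static final Log log = LogFactory.getLog(VacancyController.class);

    @Autowired
    private ApplicationContext context;

    @RequestMapping("list")
    public String list(Model model) {
        model.addAttribute("vacancies", Vacancy.findAllVacancys());
        return "vacancy/list";
    }

    @RequestMapping("handle")
    public String jobLauncherHandle() {
        try {
            // Look the beans up inside the try block so that a missing bean
            // is caught by the BeansException handler below.
            JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
            Job job = (Job) context.getBean("vacancyjob");
            log.info(jobLauncher);
            log.info(job);
            // The current date makes the job parameters different on every request.
            JobExecution jobExecution = jobLauncher.run(
                    job,
                    new JobParametersBuilder()
                            .addDate("date", new Date())
                            .toJobParameters());
            // With an asynchronous task executor the job may still be running here.
            log.info(jobExecution.getExitStatus().getExitCode());
        } catch (JobExecutionAlreadyRunningException e) {
            log.error("Job execution is already running.", e);
        } catch (JobRestartException e) {
            log.error("Job could not be restarted.", e);
        } catch (JobInstanceAlreadyCompleteException e) {
            log.error("Job instance is already completed.", e);
        } catch (JobParametersInvalidException e) {
            log.error("Job parameters are invalid.", e);
        } catch (BeansException e) {
            log.error("Bean is not found.", e);
        }
        return "vacancy/handle";
    }
}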
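The handler adds the current date as a job parameter, so every request starts the job with different JobParameters and therefore as a new JobInstance. Launching a job again with parameters identical to a run that has already completed is what the JobInstanceAlreadyCompleteException branch is there for. The following plain-Java model illustrates that identity rule. It is not Spring Batch code: JobInstanceModel and its launch method are hypothetical, and the model simply assumes that every accepted launch completes successfully.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: a job instance is identified by job name plus parameters.
class JobInstanceModel {

    private final Set<List<Object>> completedInstances = new HashSet<List<Object>>();

    // Returns true if the launch is accepted, false if an instance with the
    // same job name and the same parameters has already completed.
    boolean launch(String jobName, Map<String, ?> parameters) {
        Map<String, Object> copy = new HashMap<String, Object>(parameters);
        List<Object> instanceKey = Arrays.<Object>asList(jobName, copy);
        return completedInstances.add(instanceKey);
    }

    public static void main(String[] args) {
        JobInstanceModel model = new JobInstanceModel();
        Map<String, Object> first = new HashMap<String, Object>();
        first.put("date", new Date(1000L));
        Map<String, Object> second = new HashMap<String, Object>();
        second.put("date", new Date(2000L));

        System.out.println(model.launch("vacancyjob", first));   // prints true
        System.out.println(model.launch("vacancyjob", first));   // prints false
        System.out.println(model.launch("vacancyjob", second));  // prints true
    }
}
```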
Now it is time to include the batch configuration in applicationContext.xml.
applicationContext.xml
<context:property-placeholder location="classpath*:META-INF/spring/*.properties"/>
<context:spring-configured/>
<context:component-scan base-package="com.mega">
    <context:exclude-filter expression=".*_Roo_.*" type="regex"/>
    <context:exclude-filter expression="org.springframework.stereotype.Controller" type="annotation"/>
</context:component-scan>
<bean class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close" id="dataSource">
    <property name="driverClassName" value="${database.driverClassName}"/>
    <property name="url" value="${database.url}"/>
    <property name="username" value="${database.username}"/>
    <property name="password" value="${database.password}"/>
    <property name="validationQuery" value="SELECT 1 FROM DUAL"/>
    <property name="testOnBorrow" value="true"/>
</bean>
<bean class="org.springframework.orm.jpa.JpaTransactionManager" id="transactionManager">
    <property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>
<tx:annotation-driven mode="aspectj" transaction-manager="transactionManager"/>
<bean class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean" id="entityManagerFactory">
    <property name="dataSource" ref="dataSource"/>
</bean>
<import resource="classpath:/META-INF/spring/batch-context.xml"/>
<bean class="org.springframework.batch.core.launch.support.SimpleJobLauncher" id="jobLauncher">
    <property name="jobRepository" ref="jobRepository"/>
    <property name="taskExecutor">
        <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>
    </property>
</bean>
<bean class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean" id="jobRepository" p:dataSource-ref="dataSource" p:tablePrefix="BATCH_" p:transactionManager-ref="transactionManager">
    <property name="isolationLevelForCreate" value="ISOLATION_DEFAULT"/>
</bean>
I have kept the batch job configuration in a separate file, "batch-context.xml".
batch-context.xml
<description>Batch Job Configuration</description>
<job id="vacancyjob" xmlns="http://www.springframework.org/schema/batch">
    <step id="vacancy_step" parent="simpleStep">
        <tasklet>
            <chunk reader="vacancy_reader" writer="vacancy_writer"/>
        </tasklet>
    </step>
</job>
<bean id="vacancy_reader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource" value="classpath:META-INF/data/vacancies.csv"/>
    <property name="linesToSkip" value="1" />
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <property name="lineTokenizer">
                <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                    <property name="names" value="reference,title,salary"/>
                </bean>
            </property>
            <property name="fieldSetMapper">
                <bean class="com.mega.batch.fieldsetmapper.VacancyMapper"/>
            </property>
        </bean>
    </property>
</bean>
<bean id="vacancy_writer" class="com.mega.batch.item.VacancyItemWriter" />
<bean id="simpleStep"
      class="org.springframework.batch.core.step.item.SimpleStepFactoryBean"
      abstract="true">
    <property name="transactionManager" ref="transactionManager" />
    <property name="jobRepository" ref="jobRepository" />
    <property name="startLimit" value="100" />
    <property name="commitInterval" value="1" />
</bean>
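The reader configuration above skips the header line (linesToSkip is 1), splits each remaining line on commas, names the tokens reference, title and salary, and hands them to a mapper that builds the item. The real VacancyMapper is part of the download and is not reproduced here. The following plain-Java sketch only mirrors that flow without using Spring Batch: CsvSketch, VacancyRow and the sample line are assumptions made up for this illustration, and quoted fields are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvSketch {

    // Hypothetical stand-in for the Vacancy entity.
    static final class VacancyRow {
        final String referenceNo;
        final String title;
        final String salary;

        VacancyRow(String referenceNo, String title, String salary) {
            this.referenceNo = referenceNo;
            this.title = title;
            this.salary = salary;
        }
    }

    // Skips the first linesToSkip lines, then maps each comma-separated line
    // (reference, title, salary) to a VacancyRow.
    static List<VacancyRow> parse(List<String> lines, int linesToSkip) {
        List<VacancyRow> rows = new ArrayList<VacancyRow>();
        for (int i = linesToSkip; i < lines.size(); i++) {
            String[] tokens = lines.get(i).split(",", -1);
            if (tokens.length != 3) {
                throw new IllegalArgumentException(
                        "Line " + (i + 1) + ": expected 3 fields but found " + tokens.length);
            }
            rows.add(new VacancyRow(tokens[0], tokens[1], tokens[2]));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Invented sample data: a header line followed by one vacancy.
        List<String> lines = Arrays.asList(
                "reference,title,salary",
                "V001,Java Developer,45000");
        for (VacancyRow row : parse(lines, 1)) {
            System.out.println(row.referenceNo + " | " + row.title + " | " + row.salary);
        }
    }
}
```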
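I have written VacancyItemWriter to save the vacancy data to the database.
import java.util.List;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.batch.item.ItemWriter;
// The Vacancy entity (created in the ~.model package) must be imported as well.

public class VacancyItemWriter implements ItemWriter<Vacancy> {

    private static final Log log = LogFactory.getLog(VacancyItemWriter.class);

    /**
     * Persists every vacancy in the chunk handed over by the step.
     *
     * @see ItemWriter#write(List)
     */
    public void write(List<? extends Vacancy> vacancies) throws Exception {
        for (Vacancy vacancy : vacancies) {
            log.info(vacancy);
            vacancy.persist();
            log.info("Vacancy is saved.");
        }
    }
}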
You will find the additional helper classes, such as VacancyMapper, ProcessorLogAdvice, and SimpleMessageApplicationEvent, in the attached ZIP file. Once the configuration is complete, run the application on your tc Server or Tomcat server.
In this article I have demonstrated Spring Batch in a web container by building a simple Spring application. Additional information is available in the Spring Batch reference documentation. Please download the application by clicking the link below, and have fun!
Note: The Spring Batch monitoring tables can be created by running the statements in the "schema-mysql.sql" file, which is packaged in spring-batch-core-2.1.1.RELEASE.jar, from your MySQL command prompt.
References:
1. http://static.springsource.org/spring-batch/reference/html/
2. http://java.dzone.com/news/spring-batch-hello-world-1
3. http://static.springsource.org/spring-roo/reference/html/