Authors: Farzad Kamrani, Pontus Hörling, Thomas Jansson, Pontus Svenson
Nowadays, enormous volumes of goods are transported around the world in containers, mostly on container ships. Ensuring that no illegal activities, such as the smuggling of prohibited goods in violation of customs rules and regulations, take place is a formidable task. Many indications of potential smuggling can be found simply by scrutinizing the documentation that accompanies these containers and the goods within them. In the EU's container security project (CONTAIN), one of many important tasks has been to present a prototype document analysis system for customs, called CustAware, that finds these indications. Today, the standard way to submit the necessary information to customs is via Entry Summary Declaration (ENS) messages, normally in electronic form. We have developed a simulator (the ENS-Simulator) that generates a high flow of realistic customs-oriented data of the kind contained in such ENS messages. By adding a much smaller number of ENS messages that could indicate peculiar shipments, the data is used to train CustAware to find the needles (indications of strange activities) in the haystack (all normal ENS messages). In this paper, we describe the process of generating normal messages at a high rate and in a realistic way, to represent a typical message inflow at a large customs risk assessment and management centre.