Call for Grand Challenge solutions

Join the 2018 DEBS Grand Challenge and use machine learning to make maritime transportation more reliable! Explore multiple gigabytes of real maritime spatio-temporal streaming data and compete with peers from academia and industry for the Grand Challenge prize of 1000 USD.

Challenge start: ~~15th of January 2018~~ (i.e. HOBBIT platform is available for testing).
Submission deadline: ~~May 7th, 2018~~

The Grand Challenge data is provided by MarineTraffic and hosted by the BigDataOcean project, which has received funding from the European Union’s H2020 research and innovation action program under grant agreement No 732310. The evaluation platform is provided by HOBBIT, an EU Horizon 2020 project.

General Description

The DEBS Grand Challenge is a series of competitions, started in 2010, in which academics and professionals compete to build faster and more accurate distributed and event-based systems. Every year, the DEBS Grand Challenge participants have a chance to explore a new data set and a new problem, and can compare their results based on common evaluation criteria.

The 2018 DEBS Grand Challenge focuses on the application of machine learning to spatio-temporal streaming data. The goal of the challenge is to make the naval transportation industry more reliable by providing predictions for vessels' destinations and arrival times. Predicting the correct destinations and arrival times of vessels are relevant problems that, once solved, will boost the efficiency of overall supply chain management.

The Grand Challenge data is provided by the MarineTraffic company and hosted by BigDataOcean, an EU Horizon 2020 project. The evaluation platform is provided by the EU Horizon 2020 project HOBBIT, represented by AGT International (http://www.agtinternational.com/). The HOBBIT project has received funding from the European Union’s H2020 research and innovation action program under grant agreement number 688227.

Awards

Participants of the challenge compete for two awards: (1) the performance award and (2) the audience award. The winner of the performance award will be determined through the automated evaluation of the HOBBIT platform, according to the evaluation criteria. These criteria factor in speed as well as accuracy of the solution. The winning team will receive 1000 USD as prize money.

The winner of the audience award will be determined amongst the finalists who present in the Grand Challenge session of the DEBS conference. In this session, the audience will be asked to vote for the solution with the most interesting concepts (highest number of votes wins). The intention is to reward qualities of the solutions that are not tied to performance. Specifically, the audience will be encouraged to pay attention to the following aspects:

Novelty/originality of the solution
Quality of the solution architecture (e.g. flexibility, reusability, extensibility, generality, …)

Teams can become finalists and get a presentation slot in the Grand Challenge session in two ways. (1) The two teams with the best performance (according to the HOBBIT platform) will be nominated. (2) The Grand Challenge organizers will review the submitted papers for each solution and nominate additional teams with the most interesting concepts.

All submissions of sufficient quality that do not make it to the finals will get a chance to present their solution as a poster. (Whether the quality is sufficient will be determined through the review of the papers.)

How to Participate

Register at EasyChair: The first step is to register your submission in the EasyChair Grand Challenge Track. At this point, this is only to state your intent to participate and to establish communication with the organizers. Therefore, it is sufficient to submit an interim title for your work.
Submit a solution to HOBBIT: You need to submit your solution to the HOBBIT platform in order to get it benchmarked in the challenge. The platform gives you feedback and allows you to update your solution, so you can continuously improve your system until the closing date (t.b.d.). We will evaluate the latest solution that you uploaded before the closing date.
Submit a short paper: Finally, you need to upload a short paper (2 pages, plus optional appendix) about your solution to EasyChair. The paper will be reviewed to assess the merit and originality of your solution. All solutions of sufficient quality will at least get the chance to present a poster at the DEBS conference.

Data Description

Static information: The queries require knowledge about the location of ports around the world. The locations are specified via bounding boxes that are defined through coordinates. You can find the complete list of ports here

Data Stream: We provide a stream of comma separated tuples that are ordered by time. A ship sends a tuple according to its behaviour based on the AIS specifications. The schema of the tuples is provided below

Schema <SHIP_ID, SHIPTYPE, SPEED, LON, LAT, COURSE, HEADING, TIMESTAMP, DEPARTURE_PORT_NAME, REPORTED_DRAUGHT>

SHIP_ID is the anonymized id of the ship
SHIPTYPE is defined according to the reference
SPEED is given in tenths of a knot (divide the value by 10 to obtain knots)
LON is the longitude of the current ship position
LAT is the latitude of the current ship position
COURSE is the direction in which the ship moves (see: https://en.wikipedia.org/wiki/Course_(navigation))
HEADING is the direction in which the ship's bow points (see: https://en.wikipedia.org/wiki/Course_(navigation))
TIMESTAMP is the time at which the message was sent (UTC)
DEPARTURE_PORT_NAME is the name of the last port visited by the vessel
REPORTED_DRAUGHT is the reported draught of the ship, i.e. the vertical distance between the waterline and the bottom of the hull (keel) (see: https://en.wikipedia.org/wiki/Draft_(hull))
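
As an illustration of how such a tuple might be read, here is a minimal Python sketch. It assumes the ten fields arrive in the order of the field list above (SHIP_ID, SHIPTYPE, SPEED, ...); the class name `ShipTuple`, the function `parse_tuple`, and the sample line are hypothetical and not taken from the challenge data. The timestamp and draught are kept as raw text because their exact formats are not specified here.

```python
import csv
from dataclasses import dataclass
from io import StringIO


@dataclass
class ShipTuple:
    ship_id: str
    ship_type: str
    speed_knots: float   # raw SPEED value divided by 10
    lon: float
    lat: float
    course: float
    heading: float
    timestamp: str       # raw text; format is an open assumption
    departure_port: str
    draught: str         # raw text; may be empty in real data


def parse_tuple(line: str) -> ShipTuple:
    """Parse one comma-separated tuple, assuming the field order listed above."""
    row = next(csv.reader(StringIO(line)))
    if len(row) != 10:
        raise ValueError(f"expected 10 fields, got {len(row)}")
    return ShipTuple(
        ship_id=row[0],
        ship_type=row[1],
        speed_knots=float(row[2]) / 10.0,
        lon=float(row[3]),
        lat=float(row[4]),
        course=float(row[5]),
        heading=float(row[6]),
        timestamp=row[7],
        departure_port=row[8],
        draught=row[9],
    )


# Made-up sample line, for illustration only.
sample = "0xabc,70,123,14.5,35.9,180,181,2015-03-14 12:00,PORT_A,55"
print(parse_tuple(sample))
```

Check the sample files for the actual column order, timestamp format, and handling of missing values before relying on a parser like this.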

Two sample files are provided here and here.

A training dataset can be found here. Please note that the compressed zip file is password protected. In order to obtain the password, please read the document BDO-Data Access and License agreement for academic purposes 2018.pdf (available at the given address) and send a mail to the organizers stating that you agree with the given terms. For the challenge we define a set of ports to consider. Each port is specified by a circle around its coordinates, with a radius that is defined per port (see ports.csv in the provided data).
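
A port defined by a circle can be tested for membership with a great-circle distance check. The following Python sketch assumes the radius is given in kilometres and uses a hypothetical in-memory `ports` dictionary; the actual units and column layout of ports.csv are not stated here and must be checked against the file.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points (degrees)."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def port_at(lon, lat, ports):
    """Return the name of the first port whose circle contains the position, else None."""
    for name, (port_lon, port_lat, radius_km) in ports.items():
        if haversine_km(lon, lat, port_lon, port_lat) <= radius_km:
            return name
    return None


# Hypothetical port: name -> (lon, lat, radius in km, assumed unit).
ports = {"PORT_A": (14.5, 35.9, 10.0)}
print(port_at(14.52, 35.91, ports))  # position close to the port centre
print(port_at(0.0, 0.0, ports))      # position far away from any port
```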

Query Description

Query 1: Predicting destinations of ships

Predicting the correct destination of a vessel is a relevant problem for a wide range of stakeholders including port authorities, vessel operators and many more. The prediction problem is to generate a continuous stream of predictions for the destination port of any vessel given the following information: (1) unique ID of the ship, (2) actual position of the ship, (3) name of the port of departure, (4) timestamp, and (5) vessel’s draught. The above data is provided as a continuous stream of tuples, and the goal of the system is to provide one output tuple containing the name of the destination port for every input tuple. A solution is considered correct at timestamp T if the predicted destination port matches the actual destination port for the tuple with this timestamp as well as for all subsequent tuples. The goal of any solution is not only to predict the correct destination port but also to predict it as soon as possible, counting from the moment when a new port of origin appears for a given vessel. After port departure and until arrival, the solution must emit one prediction per position update. The input and output format of the data for Query 1 is specified by the example implementation that can be viewed on GitHub under the following link (line 50).

Evaluation for Query 1

The evaluation takes into account how early the correct predictions are made (Rank A1) and the total runtime of the system (Rank B1).

Rank A1 ranks according to the prediction time (the average time span between a prediction and the arrival at the port). Only correct predictions are considered. The arrival at a port is defined by the first event that is reported from within the respective bounding box. More formally:

Score A1 = offset / total_trip_duration, where offset = tripEndTimestamp - firstCorrectTupleTimestamp, and firstCorrectTupleTimestamp is the timestamp of the first tuple of the last correctly predicted sequence before the trip ends.

Example Score A1:

Time:01, Predicted Dest: A (Start of Trip)
Time:02, Predicted Dest: B
Time:03, Predicted Dest: A
Time:04, Predicted Dest: B
Time:05, Predicted Dest: B
Time:06, Predicted Dest: B (Arrival at B)

Score A1: (06-04)/(06-01) [higher is better]
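
The Score A1 example above can be reproduced with a short Python sketch. The function name `score_a1` and the input layout (a time-ordered list of `(timestamp, predicted_port)` pairs for one trip, from trip start to arrival) are assumptions made for illustration. Returning 0 when the final prediction is wrong is also an assumption; the official benchmark may treat that case differently.

```python
def score_a1(predictions, actual_destination):
    """Score A1 for one trip.

    predictions: time-ordered list of (timestamp, predicted_port);
    the first entry marks the trip start, the last entry the arrival.
    """
    trip_start = predictions[0][0]
    trip_end = predictions[-1][0]
    if trip_end == trip_start:
        return 0.0  # degenerate trip; assumed to score zero

    # Walk backwards to find the start of the trailing run of correct predictions.
    first_correct = None
    for timestamp, port in reversed(predictions):
        if port != actual_destination:
            break
        first_correct = timestamp

    if first_correct is None:
        return 0.0  # last prediction was wrong; assumed to score zero
    return (trip_end - first_correct) / (trip_end - trip_start)


trip = [(1, "A"), (2, "B"), (3, "A"), (4, "B"), (5, "B"), (6, "B")]
print(score_a1(trip, "B"))  # (6 - 4) / (6 - 1) = 0.4
```

Note that the correct prediction at time 02 does not count, because it is followed by an incorrect one at time 03; only the final uninterrupted run of correct predictions matters.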

Rank A1 is calculated as the position in the list of all participants sorted by the Score A1 in decreasing order.

The overall ranking for query 1 (Rank Q1) is then computed as Rank Q1 = 0.75*Rank A1 + 0.25*Rank B1.

At any point in time there is only one tuple per ship in the queue.

Query 2: Predicting arrival times of ships

There is a set of ports defined by respective bounding boxes of coordinates. Once a ship leaves a port (i.e. the respective bounding box), the task is to predict the arrival time at its destination port (i.e. when the next defined bounding box will be entered). As for Query 1, after port departure and until arrival, the solution must emit one prediction per position update. The input and output format of the data for Query 2 is specified by the example implementation that can be viewed on GitHub under the following link (line 52).

Evaluation for Query 2

The evaluation takes into account the accuracy of predictions (Rank A2) and the total runtime (Rank B2).

Score A2 is defined by the prediction accuracy (i.e. mean absolute error of all predicted arrival times) while Rank B2 ranks according to the total runtime. Rank A2 ranks systems according to the Score A2 in increasing order.
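
One plausible reading of Score A2 is sketched below in Python. It assumes that the error of a single prediction is the absolute difference between predicted and actual arrival time, that timestamps are numeric (for example seconds since the epoch), and that the average is taken over all predictions; the helper name `mean_absolute_error` is hypothetical and the official benchmark may aggregate differently.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual arrival times."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Made-up arrival times in seconds, for illustration only.
predicted_arrivals = [100, 200, 300]
actual_arrivals = [110, 190, 300]
print(mean_absolute_error(predicted_arrivals, actual_arrivals))  # (10 + 10 + 0) / 3
```

Because Rank A2 sorts by Score A2 in increasing order, a lower error yields a better rank.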

The overall ranking for query 2 (Rank Q2) is then computed as Rank Q2 = 0.75*Rank A2 + 0.25*Rank B2. The final ranking is given by the sum of ranks Rank Q1 and Rank Q2.
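
The rank combination can be illustrated with a small Python sketch. The function names, the team names, and all numbers are made up. The sketch assumes that a shorter runtime gives a better runtime rank, that a lower rank sum is better overall, and that ties are broken arbitrarily by sort order; none of these details are confirmed by the description above.

```python
def ranks(scores, higher_is_better):
    """Map each participant to a 1-based rank (ties broken by sort order)."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {name: position + 1 for position, name in enumerate(ordered)}


def final_ranking(score_a1, runtime_q1, score_a2, runtime_q2):
    """Return participant -> Rank Q1 + Rank Q2 (lower assumed to be better)."""
    a1 = ranks(score_a1, higher_is_better=True)     # Score A1: decreasing order
    b1 = ranks(runtime_q1, higher_is_better=False)  # runtime: shorter assumed better
    a2 = ranks(score_a2, higher_is_better=False)    # Score A2: increasing order
    b2 = ranks(runtime_q2, higher_is_better=False)
    return {
        name: (0.75 * a1[name] + 0.25 * b1[name]) + (0.75 * a2[name] + 0.25 * b2[name])
        for name in score_a1
    }


result = final_ranking(
    score_a1={"team_x": 0.4, "team_y": 0.6},
    runtime_q1={"team_x": 10.0, "team_y": 20.0},
    score_a2={"team_x": 5.0, "team_y": 8.0},
    runtime_q2={"team_x": 10.0, "team_y": 20.0},
)
print(result)  # team_x: 1.75 + 1.0 = 2.75, team_y: 1.25 + 2.0 = 3.25
```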

Platform Overview

Submitted solutions will be benchmarked with the HOBBIT platform deployed online at http://master.project-hobbit.eu/. A detailed description of the platform is available here.

The evaluation cluster of the online platform has three worker nodes allocated for solutions. Each node has 2× 64-bit Intel Xeon E5-2630v3 processors (8 cores, 2.4 GHz, Hyperthreading, 20 MB cache each), 256 GB RAM, and 1 Gb Ethernet.

Hobbit How-To

To participate in the challenge, participants need to:

Develop a system adapter connecting their system to the HOBBIT platform
Upload the system to the HOBBIT platform so that it can be benchmarked
Register the system for the DEBS 2018 Grand Challenge for final evaluation

Instructions for developing a HOBBIT system adapter are available at the HOBBIT Wiki. A simple Hello World example for this challenge is available here. The hobbit-java-sdk and published sources (to be updated) should help participants to debug their system locally and to prepare a Docker image for upload to the online platform. The upload procedure is documented in detail here. After submitting your system to the HOBBIT platform, you can use the DEBS 2018 Benchmark (to be published) to test the correctness of your implementation.

In order to register your system for the Challenge you have to use the “DEBS 2018 Grand Challenge” item under the “Challenges” tab in the platform GUI. The registration procedure is described in detail here. At the moment, participants need to register their systems for all tasks defined in the DEBS 2018 Grand Challenge.

FAQ

The Frequently Asked Questions will appear here. Please note that an issue tracker is available here.

Updated deadline: 7th of May (AOE)

We decided to extend the submission deadline to the 7th of May (AOE). Please be reminded that this year's challenge offers the opportunity to qualify for presentation at the conference based on both originality (determined by the peer review results) and performance (as determined by the benchmark results). Ideal solutions convince in both respects, but interesting solutions that do not participate in the benchmark can be presented at the conference as well. Please note that only solutions that participate in the benchmark are considered for the performance award.

The final version of your paper needs to be uploaded to EasyChair by the 7th of May (AOE). Please just update your initial submission. We will review the latest version that you uploaded.

If you have an implementation that is ready to participate in the benchmark (required to win the performance award), your solution must be uploaded to the HOBBIT system by the 7th of May (AOE). We will take your latest upload for the final benchmark.

Organization

Vincenzo Gulisano, Chalmers University of Technology, Sweden - vincenzo.gulisano@chalmers.se
Zbigniew Jerzak, SAP SE, Germany - zbigniew.jerzak@sap.com
Pavel Smirnov, AGT International, Germany - PSmirnov@agtinternational.com
Martin Strohbach, AGT International, Germany - MStrohbach@agtinternational.com
Holger Ziekow, Furtwangen University, Germany - zie@hs-furtwangen.de
Dimitris Zissis, University of the Aegean, Greece - dzissis@marinetraffic.com