Data Collection and Transformation

Data Collection and Cleaning

The data for this project came from Supercell's official Clash Royale API, which returns JSON. Collection began with the top 200 clans in Germany. From each clan record we kept only the tag, which is the clan's identifier. With the clan tags we then retrieved the members of each clan. A clan can hold up to 50 players, so this gave us up to 10,000 players. From these we kept only the players with at least 8,500 trophies and saved their player tags. Trophies are Clash Royale's points system, which players use to climb the ranks. This left 4,868 players for the project.
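
The trophy filter described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the clan members have already been downloaded into a dictionary keyed by clan tag, and the field names "tag" and "trophies" as well as the function name select_player_tags are assumptions made for this example.

```python
MIN_TROPHIES = 8500  # threshold stated in the text


def select_player_tags(members_by_clan, min_trophies=MIN_TROPHIES):
    """Return the sorted, unique tags of all players at or above the threshold.

    members_by_clan maps a clan tag to a list of member records
    (hypothetical layout: {"tag": str, "trophies": int}).
    """
    tags = set()
    for members in members_by_clan.values():
        for member in members:
            if member.get("trophies", 0) >= min_trophies:
                tags.add(member["tag"])
    return sorted(tags)


# Upper bound on the number of players before filtering: 200 clans x 50 members.
max_players = 200 * 50
print(max_players)  # 10000

# Tiny made-up sample, only to show the behaviour of the filter.
sample = {
    "#CLAN1": [
        {"tag": "#P1", "trophies": 9000},
        {"tag": "#P2", "trophies": 8499},
    ],
    "#CLAN2": [{"tag": "#P3", "trophies": 8500}],
}
print(select_player_tags(sample))  # ['#P1', '#P3']
```

Collecting the tags in a set guards against a player being counted twice, for example after switching clans between two requests.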


With the player tags we retrieved each player's most recent battles, up to 30 per player. The first acquisition therefore yielded around 125,000 battles. We pulled this amount of raw data every day, reduced each battle to the attributes relevant to the analysis, and removed duplicates. After cleaning, around 40,000 new battles remained per day.
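
One way to remove duplicates is sketched below. The text does not say how duplicates were identified, so the key used here (battle time plus the sorted tags of everyone involved) is an assumption, as are the field names "battleTime", "team", "opponent" and "tag" and the helper names battle_key and deduplicate. The idea is that the same battle can show up in the logs of both participants and again in the next daily pull.

```python
def battle_key(battle):
    """Build a key that is identical no matter whose log the battle came from."""
    tags = sorted(
        player["tag"]
        for side in ("team", "opponent")
        for player in battle[side]
    )
    return (battle["battleTime"], tuple(tags))


def deduplicate(battles, seen=None):
    """Return battles whose key has not been seen before.

    Passing the same `seen` set on every daily run also filters out
    battles that were already stored on an earlier day.
    """
    seen = set() if seen is None else seen
    unique = []
    for battle in battles:
        key = battle_key(battle)
        if key not in seen:
            seen.add(key)
            unique.append(battle)
    return unique


# Made-up sample: the first two entries are the same battle from both logs.
from_p1 = {"battleTime": "20240101T120000.000Z",
           "team": [{"tag": "#P1"}], "opponent": [{"tag": "#P2"}]}
from_p2 = {"battleTime": "20240101T120000.000Z",
           "team": [{"tag": "#P2"}], "opponent": [{"tag": "#P1"}]}
other = {"battleTime": "20240101T130000.000Z",
         "team": [{"tag": "#P1"}], "opponent": [{"tag": "#P3"}]}

seen_keys = set()
unique_battles = deduplicate([from_p1, from_p2, other], seen_keys)
print(len(unique_battles))  # 2
```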


We then split the cleaned data into three groups: winning decks of German players, losing decks of German players, and winning decks of the opponents of those German players.
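
The three-way split can be sketched like this. The sketch assumes one-on-one battles, that the winner is the side with more crowns, and that draws are skipped. None of these rules is stated in the text, and the field names "crowns", "cards" and "name" and the helpers deck_of and split_decks are hypothetical.

```python
def deck_of(player):
    """Represent a deck as a sorted tuple of card names, so card order does not matter."""
    return tuple(sorted(card["name"] for card in player["cards"]))


def split_decks(battles):
    """Sort decks into the three groups named in the text."""
    groups = {"german_wins": [], "german_losses": [], "opponent_wins": []}
    for battle in battles:
        german = battle["team"][0]        # the tracked German player
        opponent = battle["opponent"][0]
        if german["crowns"] > opponent["crowns"]:
            groups["german_wins"].append(deck_of(german))
        elif german["crowns"] < opponent["crowns"]:
            groups["german_losses"].append(deck_of(german))
            groups["opponent_wins"].append(deck_of(opponent))
        # Equal crowns: treated as a draw and left out (assumption).
    return groups


def make_side(crowns, names):
    """Helper for the made-up sample; decks are shortened to two cards."""
    return [{"crowns": crowns, "cards": [{"name": n} for n in names]}]


sample_battles = [
    {"team": make_side(3, ["Knight", "Archers"]),
     "opponent": make_side(1, ["Giant", "Musketeer"])},
    {"team": make_side(0, ["Knight", "Archers"]),
     "opponent": make_side(2, ["Musketeer", "Giant"])},
    {"team": make_side(1, ["Knight", "Archers"]),
     "opponent": make_side(1, ["Giant", "Musketeer"])},
]

groups = split_decks(sample_battles)
print({name: len(decks) for name, decks in groups.items()})
# {'german_wins': 1, 'german_losses': 1, 'opponent_wins': 1}
```

Storing decks as sorted tuples makes identical decks compare equal, which simplifies counting how often each deck wins or loses.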


By the end of data acquisition we had 445,974 distinct cleaned battles, which we used to create the graphics on this website and to answer the research questions.


Data Transformation

We then transformed this cleaned and selected data as each research question required, so that the questions could be answered.