# Drone Simulations with the help of headless Chromium servers and Wendelin\n
Given the ground-truth values of a drone flight, including the coordinates at each timestep, the ground speed, the climb rate and the ASML (altitude above mean sea level), we want to find the parameters such that our simulated drones show a similar flight pattern.\n
To achieve this we have two parts working together:\n
\n
1. Headless Chromium servers: this part consists of a Python script and a fluentd agent. The script generates the parameters, starts the headless Chromium servers, retrieves the generated simulation data and writes it to files (the headless Chromium servers do not write anything themselves; the script reads their output and writes the files). The fluentd agent then sends the data to our Wendelin instance on another SlapOS instance.\n
2. Wendelin: given the simulation data, create data arrays, use these data arrays to calculate the score (see the Score Plot for the formula) and plot the data in this notebook.\n
\n
\n
The next few sections display information about the best 5 parameter sets and their simulations. The following section contains a flowchart describing the whole process of gathering and processing the simulation data.\n
This notebook centers on the Wendelin instance (SlapOS 2). The concrete implementation of that instance may change in the future, but the overall workflow is expected to remain largely the same. As the author of this notebook was not directly involved in the core functionality of the SlapOS 1 instance, detailed explanations of its inner workings are beyond the scope of this document. Instead, we give a brief overview of the Python script and the headless Chromium servers within SlapOS 1; the author's part of the work simply assumes that the data is available and transmitted to Wendelin.\n
\n
1. We start with a Python script inside the first SlapOS instance. This script is responsible for requesting and starting the Selenium headless Chromium servers. The script also generates the parameters for each simulation. Currently we can only set the drone speed and drone max roll parameters, while drone min pitch and drone max pitch are not used, or at least cannot be chosen.\n
After the headless Chromium servers have generated the simulation data, the script processes the data and saves it locally. This data is then read by the fluentd agent, which sends it to Wendelin.\n
\n
2. The Wendelin instance, on another SlapOS instance, receives the raw simulation data, which is formatted as a CSV file. The data is then processed and stored inside a data array. This is done by the \n
<a id="convert_drone_raw_data">Convert Drone Raw Data</a> transformation for the simulated flight data and the <a id="convert_drone_raw_real_data">Convert Drone Raw Real Data</a> transformation for the real flight data.\n
\n
3. The next step is to compare the simulated data to the ground truth. In this step we calculate the score (see [Score Plot](#plot_div5)) and store it in two data arrays: one containing the score data and the other containing additional information, such as the deviation between the simulation and the ground truth at each timestamp. The corresponding transformation is called <a id="create_score_list_array">Create Score List Array</a>.\n
\n
4. In the last step we use the previously calculated data arrays (both the score data array and the additional-information array) to find the best 5 simulations (and with them the best 5 parameter sets). This generates two data arrays: one containing the score data for the best 5 simulations and the other containing additional information, such as the deviation between the best 5 simulations and the ground truth at each timestamp, which is used in the next step to create the plots in this notebook. The corresponding transformation for this step is called <a id="recalculate-score-list-array">Recalculate Score List Array</a>.\n
\n
5. This notebook uses the data created from the last step to display some information, which includes the scores but also the deviations.\n
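Step 1 above can be sketched as follows. This is only an illustration: the parameter names follow the text, but the ranges, the helper name and the number of simulations per batch are assumptions, not taken from the real script.

```python
import random

# Hypothetical parameter ranges; the real script's bounds are not documented here.
SPEED_RANGE = (5.0, 30.0)      # drone speed (assumed unit: m/s)
MAX_ROLL_RANGE = (10.0, 45.0)  # drone max roll (assumed unit: degrees)

def generate_parameters(n_simulations):
    """Draw one random parameter set per headless Chromium simulation."""
    return [
        {
            "drone_speed": random.uniform(*SPEED_RANGE),
            "drone_max_roll": random.uniform(*MAX_ROLL_RANGE),
            # drone_min_pitch / drone_max_pitch exist but cannot be chosen yet.
        }
        for _ in range(n_simulations)
    ]

params = generate_parameters(5)
```

Each dictionary in `params` would then be handed to one headless Chromium server as its simulation configuration.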
\n
\n
The data arrays created in the last step are now used for two things: \n
1. Plot information about the simulations\n
2. Inform the python script from the SlapOS 1 instance that the data has been processed.\n
\n
The plot data can be seen in this notebook.\n
The Python script on the SlapOS 1 instance, after generating the simulation data, periodically performs a GET request on the generated score data array. This array contains a column called "Iteration" which the Python script can use to determine whether the data processing has finished. The script could also use the score data to decide which parameters to choose for the next iteration, but this is currently not used; instead it chooses from a random set of parameters.\n
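The polling loop described above could look roughly like this. The URL, the polling interval and the JSON row format are hypothetical; only the GET request and the check on the "Iteration" column come from the description above.

```python
import json
import time
from urllib.request import urlopen

# Hypothetical endpoint of the generated score data array on Wendelin.
SCORE_ARRAY_URL = "https://wendelin.example.com/score_data_array"

def processing_finished(rows, expected_iteration):
    """True once a row's "Iteration" column shows the expected batch number."""
    return any(row.get("Iteration") == expected_iteration for row in rows)

def wait_for_iteration(expected_iteration, poll_seconds=30):
    """Periodically GET the score data array until processing has finished."""
    while True:
        with urlopen(SCORE_ARRAY_URL) as resp:
            rows = json.load(resp)
        if processing_finished(rows, expected_iteration):
            return rows
        time.sleep(poll_seconds)
```

The returned rows would also carry the score data that a future genetic algorithm could use to pick the next parameter sets.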
\n
%% md\n
# Links to the Data Arrays\n
<div class="array_links" id="array_links"/>\n
\n
%% md\n
# Bar Plot\n
## Reciprocal Score\n
The reciprocal score for the best 5 parameter sets we have found. The higher the score, the better the parameters. The maximal score is 1, which means that the simulation and the real drone flight are identical. The score is calculated by this formula:\n
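The formula itself did not survive in this export. Purely as an illustration, a reciprocal score of the assumed form 1/(1 + D), where D is the accumulated deviation between simulation and ground truth, has the properties described above: it equals 1 for an identical flight and approaches 0 as the deviation grows.

```python
def reciprocal_score(total_deviation):
    # Assumed shape only: 1.0 at zero deviation (identical flight),
    # decreasing toward 0 as the accumulated deviation grows.
    return 1.0 / (1.0 + total_deviation)

reciprocal_score(0.0)  # → 1.0
```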
<div class="plot_div" id="plot_div"/>\n
\n
%% md\n
# Deviation Plot\n
## ASML difference\n
This line plot illustrates the deviation between the simulation data and the ground truth values, focusing on the common timestamps. It's important to note that the simulations may end at different times, resulting in a representation of a shorter timeframe for some simulations compared to others.\n
<div class="plot_div2" id="plot_div2"/>\n
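The "common timestamps" restriction mentioned above can be sketched as follows. This is a simplified stand-in that assumes both series are dicts mapping timestamps to ASML values and that common timestamps match exactly; the real transformation may align or interpolate the series differently.

```python
def deviation_at_common_timestamps(sim, real):
    """Absolute ASML deviation, restricted to timestamps present in both
    series (a simulation may end earlier than the real flight)."""
    common = sorted(set(sim) & set(real))
    return [(t, abs(sim[t] - real[t])) for t in common]

# Example: the simulation ends before the real flight does,
# so only the two shared timestamps are compared.
sim = {0.0: 10.0, 1.0: 12.0}
real = {0.0: 11.0, 1.0: 12.5, 2.0: 13.0}
deviation_at_common_timestamps(sim, real)  # → [(0.0, 1.0), (1.0, 0.5)]
```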
\n
\n
%% md\n
# Deviation Plot\n
## Ground speed difference\n
This line plot illustrates the deviation between the simulation data and the ground truth values, focusing on the common timestamps. It's important to note that the simulations may end at different times, resulting in a representation of a shorter timeframe for some simulations compared to others.\n
<div class="plot_div3" id="plot_div3"/>\n
\n
\n
%% md\n
# Deviation Plot\n
## Climb rate difference\n
This line plot illustrates the deviation between the simulation data and the ground truth values, focusing on the common timestamps. It's important to note that the simulations may end at different times, resulting in a representation of a shorter timeframe for some simulations compared to others.\n
# Take the data from the score list array. Check whether that array is larger than this one. If it is not, do nothing. Otherwise recalculate the scores for every simulation we have seen (i.e. sort the data).
# To save space, we simply overwrite the old array with the new one, which contains the new scores.
# This new array gives us the final overview of the parameter ranking. It can be used by our genetic algorithm to decide when to stop and which parameters worked best.
# The ingestion operation requires not only the message attribute (which fluentd sets automatically) but also the filepath attribute (in the source section of the fluentd conf, set path_key filepath).
# The "filepath" attribute contains the path of the log file. Because we only care about the file name, we extract it and use it as the bucket key for the file.
# If the same "filepath" is used multiple times in different messages, the bucket data is overwritten.
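A minimal fluentd source section satisfying these requirements might look as follows. The paths and tag are hypothetical; only `path_key filepath` is taken from the description above. With the `none` parser, fluentd puts each log line into the `message` attribute automatically, which matches the ingestion requirement.

```
<source>
  @type tail
  path /home/slapuser/simulation_logs/*.csv    # hypothetical location of the script's output files
  pos_file /home/slapuser/simulation_logs.pos  # hypothetical position file
  tag drone.simulation
  path_key filepath                            # adds the tailed file's path to each record
  <parse>
    @type none
  </parse>
</source>
```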
import random

nr_lines = random.randint(10, 100)  # Real files have around 18k lines, but we can also reduce them here to save space.
timestamp = sorted([random.uniform(1, 100) for _ in range(nr_lines)])  # We only go up to 100, because otherwise the smallest timestamp in the mock simulation could be larger than the largest timestamp in the mock real data.

nr_lines = random.randint(10, 100)  # Real files have around 18k lines, but we can also reduce them here to save space.
timestamp = sorted([random.uniform(1, 100) for _ in range(nr_lines)])  # We only go up to 100, because otherwise the smallest timestamp in the mock simulation could be larger than the largest timestamp in the mock real data.