General

Run-with-data is a project that combines two topics I am really passionate about: running, particularly long-distance running, and data science. I started to think about the project during the last year of my PhD. During this time I rediscovered my passion for running, after years of preferring tennis and other racket sports. The data analysis and the writing of my thesis meant long hours in front of my laptop, little free time and few social contacts, and I realized that I needed an activity to balance the exhausting writing process. While I still play (and love) tennis, running, unsurprisingly, turned out to be a more spontaneous activity that requires little planning and can be done in almost any weather. Moreover, during a run I can think about anything or nothing at all, which is quite different from playing tennis. This aspect is particularly important when trying to balance work-related stress. So, in February 2025 I started to go on runs regularly.

I should note that by the end of 2024 I had developed an inflammation in my gut, which affected my digestion throughout the whole of 2025. Honestly, having digestive issues and practicing (long-distance) running can be a risky combination (if you know, you know). In fact, digestive symptoms during running are well known [1]. It is therefore not surprising that my ongoing digestive issues affected my running ability, which, depending on how pronounced the symptoms were, was quite draining (and demotivating).

While I suspected a correlation between my digestive issues and my running performance, this was a hypothesis based purely on my experience rather than backed by data. Being a scientist by training, I wanted to confirm or dismiss this hypothesis based on data, i.e. to analyze quantitatively whether and how running affects my gut issues and vice versa, hoping this would help me understand how to avoid issues during runs. While fitness data is nowadays easily accessible, e.g. through tracking with a fitness watch, tracking health issues in a way that allows for a quantitative analysis is not as straightforward.

This is when I started to think about turning this into a project, i.e. run-with-data. However, as time went on, I got more and more stressed with finishing my PhD and finding a job, so I simply did not have time to work on any other project. I did continue running, though, and finished my first half-marathon in July. While registering for the half-marathon was primarily a means to motivate myself to keep training, it was the race itself that proved to me that long-distance running is my passion. I enjoyed the whole run immensely (although I did have digestive issues on race day). Although it was the longest distance I had run up to that point, I felt amazing afterwards. Above all, I realized that I wanted to continue running. Moreover, I found running the half-marathon not as exhausting as I thought it would be, both physically and mentally. This is when I started to think about signing up for a full marathon.

Summer went on, and while I thought about signing up last-minute for the Munich marathon, I did not do so, mainly because it takes place right after the Oktoberfest, after which I typically get sick (which again proved to be the case). Instead, I signed up for the Madrid marathon (April 2026), with the aim of having a solid reason to train throughout winter. However, during autumn I started to experience severe hip pain, which got worse the more I ran. The hip pain stuck around for way longer than I thought it would, eventually leading me to consult a doctor and start physical therapy, aiming to build up muscle around the hips. The rehab process was (and still is) full of ups and downs. While the pain did get better, it is not gone. Since I have signed up for the marathon, though, understanding how training and running affect the pain in my hips is essential during my preparation. This is when I started to think again about run-with-data. Now, besides tracking my gut issues, I’d also like to track how my training relates to my hip problems (and vice versa).

While there are many philosophies on running, I am somewhere between the two extremes: fully data-backed runners with a Strava subscription who analyze every run, and running purists who don’t even wear a watch. For me, it is sometimes still the pure act of running, spacing out and enjoying the tranquility that gets me motivated. Nevertheless, run-with-data obviously aims at gathering data and analyzing it. The concept is simple. Fitness data is obtained from a fitness watch, while health-related data is provided by me, e.g. before and/or after a run, in a form that is ideally easy to quantify. The fitness data is then analyzed together with the self-monitored health data, with the aim of finding out whether and how the aspects I am interested in correlate. In this way, I can ideally improve my training and lifestyle habits, track my rehab and uncover interesting findings.
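
To make the concept a bit more concrete, here is a minimal sketch in Python of what "analyzing fitness data together with self-monitored health data" could look like. It is not the actual implementation (that is left for later posts): the `Run` fields, the 0 to 10 symptom scale, the helper names and all numbers are assumptions made up for illustration. Each run is paired with the symptom score logged on the same day, and a Pearson correlation between pace and symptom score is computed.

```python
from dataclasses import dataclass
from datetime import date
from math import sqrt


@dataclass
class Run:
    """One run as it might be exported from a fitness watch (hypothetical fields)."""
    day: date
    distance_km: float
    duration_min: float

    @property
    def pace_min_per_km(self) -> float:
        return self.duration_min / self.distance_km


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def join_by_day(runs, symptom_scores):
    """Pair each run with the self-reported score logged for the same day.

    Runs without a logged score are skipped.
    """
    return [(run, symptom_scores[run.day]) for run in runs if run.day in symptom_scores]


# Placeholder values for illustration only, NOT real measurements.
runs = [
    Run(date(2025, 3, 1), 10.0, 55.0),
    Run(date(2025, 3, 3), 8.0, 46.0),
    Run(date(2025, 3, 5), 12.0, 72.0),
    Run(date(2025, 3, 8), 10.0, 58.0),
]
# Hypothetical scale: 0 = no symptoms, 10 = severe symptoms.
symptom_scores = {
    date(2025, 3, 1): 1,
    date(2025, 3, 3): 3,
    date(2025, 3, 5): 6,
    # nothing logged for 2025-03-08, so that run is dropped by the join
}

pairs = join_by_day(runs, symptom_scores)
paces = [run.pace_min_per_km for run, _ in pairs]
scores = [score for _, score in pairs]
r = pearson(paces, scores)
print(f"{len(pairs)} paired runs, Pearson r = {r:.2f}")
```

With only a handful of runs such a coefficient says very little, and even a strong correlation would not show which way the effect goes, so this is a starting point rather than an answer.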

Generally, tracking the health-related data should be as easy and as non-disruptive to my daily life as possible. Otherwise, I simply will not do it. As this blog combines data science content with running content, the implementation of the data gathering and the data analysis will be covered in future blog posts. I will illustrate the analysis using my marathon preparation, covering my training progress and models predicting my finishing time. Stay tuned.

[1] R. Costa, R. Snipe, C. Kitic, P. Gibson, “Systematic review: Exercise-induced gastrointestinal syndrome—Implications for health and intestinal disease.” Aliment. Pharmacol. Therapeut., 46, 246–265 (2017)