Run-with-data is a project that combines two topics I am really passionate about: running, particularly long-distance running, and data science. I started to think about the project during the last year of my PhD. Since writing the thesis involved long hours in front of my laptop as well as a lack of free time and social contact, I needed an activity to balance the work-related stress. In this regard, running is the perfect sport for me: it is a spontaneous activity that can be done in almost any weather. It is one of the best ways to exhaust myself while at the same time relaxing my mind, by thinking about random stuff or just spacing out. This is quite different from other sports I like, such as tennis or squash. While these can be great fun, especially with good partners, I've experienced several occasions where a match that started in a good mood ended with a rather quiet trip home. I'm not a bad loser, and neither are most of the people I play with, so I won't try to analyze whether the problem is me, my partners, or the nature of the sport itself. Instead, I'll only emphasize that this is in complete contrast to my running experience: I can't think of a single run that made me feel mentally worse afterwards, no matter what happened during the run.
Going on regular runs led to me signing up for and finishing my first half-marathon in July 2025. Since the half-marathon was scheduled to start at 8:00 am, I tried to increase the number of morning runs in my preparation. However, I felt like I was running low on energy when training in the morning; in particular, it felt like I was performing worse than when I trained in the afternoon or evening. Every now and then I also experienced digestive issues during my running sessions. These started after an inflammation in my gut, which could luckily be treated well with antibiotics, albeit with the side effects on digestion that come with taking antibiotics. Digestive issues during (long-distance) running are in general quite common [1-5]. However, going for a run while having symptoms of this kind is quite a risky combination (if you know, you know). Whenever this was the case, I got the sense that it affected my running performance considerably. In general, I felt like there was a correlation between my running performance and the aspects I mentioned, such as the time of the run or the presence of digestive issues. However, this was a hypothesis based purely on my perception rather than backed by any sort of data.
Being a scientist by training, I wanted to verify or dismiss this hypothesis by analyzing quantitatively if and how running affects digestive issues and vice versa, and whether I really do perform worse in the morning (e.g. in the form of a higher heart rate or a slower average pace), or whether it was just my personal perception. I also wanted to check whether the exhaustion I felt in the morning is something that gets better the more I get used to running at early hours, or whether I might just not be a morning person. Luckily, fitness data is nowadays easily accessible, e.g. through a fitness watch. However, the default analysis in the respective app covers only a small portion of the data that is being stored, and the analysis I envisioned was not covered (unsurprisingly, as I'm aware that this might be quite niche thinking). Moreover, tracking health issues in a way that allows for a quantitative analysis (e.g. the presence of digestive issues, or the perception of one's energy) is not as straightforward. This is when I started to think about turning this into a project, i.e. run-with-data.
However, as time went on, I got more and more stressed with finishing my PhD and finding a job, and simply did not have time to work on a side project. While I had registered for the half-marathon primarily to motivate myself to keep training, it was the race itself that proved to me that long-distance running is something I want to continue doing even when my thesis writing is finished. Although the half-marathon was the longest distance I had run up to that date, I enjoyed the whole run immensely and even felt quite well afterwards. In particular, I didn't find running the half-marathon as exhausting as I thought it would be, either physically or mentally. Hence, I started to think about signing up for a full marathon. (I'm fully aware that this hints towards a mini quarter-life crisis, which, I'd say, isn't that uncommon among PhD students.)
Summer went on, and while I was thinking about signing up last-minute for the Munich marathon, I didn't do so, mainly because of its timing, which is one week after the Oktoberfest. I typically get sick after attending it, so participating in a marathon a week later seemed like a bad idea. This proved to be a good choice, as the post-Oktoberfest flu got me once more. Instead, I signed up for the Madrid marathon (April 2026), with the aim of having a solid motivation for training throughout winter. Unfortunately, during autumn, I started to experience severe hip pain, which got worse the more I ran. The hip pain stuck around for way longer than I thought it would, eventually leading me to consult a doctor and start physical therapy, aiming to build up muscle around the hips. The rehab process was (and still is) full of ups and downs. While the pain did get better, it is not gone. Since I have signed up for the marathon, though, understanding how training and running affect the pain I feel in my hips is essential during my preparation. This is when I started to think about run-with-data again. Now, I'd like to extend my symptom tracking to be able to analyze how any sort of training relates to hip problems (and vice versa).
While there are many philosophies on running, I am somewhere between the two extremes: fully data-backed runners with a Strava subscription who analyze every run, and running purists who don't even wear a watch. For me, it's still the pure act of running, spacing out and enjoying the tranquility that gets me motivated. Nevertheless, starting a project called 'run-with-data' obviously aims at gathering data and analyzing it. The concept is simple. I obtain fitness data from a fitness watch, while I collect health-related data myself, e.g. before and/or after a run, in a form that is ideally easy to quantify. The fitness data is then analyzed together with the self-monitored health data with the aim of extracting any correlations between the aspects I'm interested in. In this way, I can ideally improve my training and lifestyle habits, track my rehab, and discover other interesting findings that might not be visualized by my fitness watch or its app.
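To make the concept a bit more concrete, here is a minimal sketch in Python of what joining the two data sources and computing a correlation could look like. Everything in it is an assumption for illustration: the record layout, the field names (`start_hour`, `pace_min_per_km`, `gut`, `hip`), the 0 to 3 symptom scale, and all values are made up, and it is not the actual implementation of the project.

```python
from datetime import date
from math import sqrt

# Hypothetical watch export: one record per run (all values are made up).
runs = [
    {"day": date(2026, 1, 5), "start_hour": 7, "pace_min_per_km": 5.9},
    {"day": date(2026, 1, 7), "start_hour": 18, "pace_min_per_km": 5.4},
    {"day": date(2026, 1, 10), "start_hour": 7, "pace_min_per_km": 6.0},
    {"day": date(2026, 1, 12), "start_hour": 18, "pace_min_per_km": 5.5},
    {"day": date(2026, 1, 14), "start_hour": 18, "pace_min_per_km": 5.3},
]

# Hypothetical self-reported log, keyed by day: 0 = no symptoms, 3 = severe.
symptoms = {
    date(2026, 1, 5): {"gut": 2, "hip": 1},
    date(2026, 1, 7): {"gut": 0, "hip": 1},
    date(2026, 1, 10): {"gut": 3, "hip": 2},
    date(2026, 1, 12): {"gut": 1, "hip": 0},
    # No entry for 2026-01-14, so that run is dropped from the joined data.
}


def join_by_day(runs, symptoms):
    """Attach the symptom scores to every run that has a log entry for its day."""
    return [{**run, **symptoms[run["day"]]} for run in runs if run["day"] in symptoms]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.

    Undefined (division by zero) if either sequence is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


joined = join_by_day(runs, symptoms)
r_gut_pace = pearson(
    [row["gut"] for row in joined],
    [row["pace_min_per_km"] for row in joined],
)
print(f"{len(joined)} runs joined, r(gut score, pace) = {r_gut_pace:.2f}")
```

Since pace is given in minutes per kilometre, a positive coefficient here would mean slower runs on days with a higher symptom score. With only a handful of runs such a number says very little, and a correlation alone does not tell which of the two affects the other.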
Generally, tracking the health-related data should be as easy and non-disruptive to my daily life as possible. Otherwise, I simply won't do it. As this blog combines data science content with running content, the implementation of the data gathering and the data analysis will be covered in future blog posts. I plan to illustrate the analysis using my marathon preparation, covering the training progress and models predicting my finishing time.
[1] H.P.F. Peters et al., "Gastrointestinal symptoms in long-distance runners, cyclists, and triathletes: prevalence, medication, and etiology", Am. J. Gastroenterol., 94, 1570–1581 (1999)
[2] E. Karhu et al., "Exercise and gastrointestinal symptoms: running-induced changes in intestinal permeability and markers of gastrointestinal function in asymptomatic and symptomatic runners", Eur. J. Appl. Physiol., 117, 2519–2526 (2017)
[3] R. Costa et al., "Systematic review: Exercise-induced gastrointestinal syndrome - implications for health and intestinal disease", Aliment. Pharmacol. Ther., 46, 246–265 (2017)
[4] E. Ribichini et al., "Exercise-Induced Gastrointestinal Symptoms in Endurance Sports: A Review of Pathophysiology, Symptoms, and Nutritional Management", Dietetics, 2, 289–307 (2023)
[5] X. Zhao et al., "Gastrointestinal symptoms among recreational long distance runners in China: prevalence, severity, and contributing factors", Front. Nutr., 12 (2025)
