Resource to search all marathon swims past and present?
gnome4766
Member
I know how to use Google, but I am aware that not everyone has documented their swim on a webpage, and I would like to see any resource like this.
Comments
Anyone out there with PHP/MySQL coding chops? We'd be glad to host such a database here at marathonswimmers.org, but I don't have the bandwidth to do it right now.
The channel swimming community is a little... "old school." The most comprehensive E.C. database is literally an Excel spreadsheet. Catalina lists their swims in an HTML table. NYC Swim has a great database, but is limited to swims that can be independently verified, and is generally only used by entrants to their events.
http://www.dover.uk.com/channelswimming/swims/
"I never met a shark I didn't like"
loneswimmer.com
Molly Nance, Lincoln, Nebraska
Is it possible to create a site to host the data... in effect, to construct an office building?
Then make suites available to all sanctioning bodies, who would be responsible for assigning someone to enter their own data in a standardized format? This would seem like an easy way to reassign what has been described here as the tedious work... no?
Could this forum dedicate a thread to each sanctioning body with only authorized reps able to enter data?
Thinking out loud... then there is the question of how to pay for it...
...anything worth doing is worth overdoing.
Dave's/Niek's approach(es) would be a decent first step. By doing this at least the data would be accessible.
After that, I would do this:
2) Come up with a common reporting app that would send a standardized report back to the central database in XML or JSON, and distribute the app to all the people/committees/etc. who might input data. They might even use it.
3) Come up with a program that would extract the information from the above form and put it into a standardized database.
4) As time permits, go back and extract the data from the original "suites" of each sanctioning body and put that into the database. This is somewhat time intensive since a custom program or programs would have to be written for each suite's extraction.
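As a rough illustration of steps 2) and 3), a standardized report could be a small JSON document that a loader script checks and inserts into SQLite. This is only a sketch under assumptions: the field names (`swimmer`, `route`, `swim_date`, `duration_minutes`, `sanctioning_body`) and the `swims` table are made up for the example, since no format has been agreed in this thread.

```python
import json
import sqlite3

# Hypothetical field names; the thread has not agreed on a schema.
REQUIRED_FIELDS = (
    "swimmer",
    "route",
    "swim_date",
    "duration_minutes",
    "sanctioning_body",
)


def create_schema(conn):
    """Create the (assumed) swims table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS swims (
            id INTEGER PRIMARY KEY,
            swimmer TEXT NOT NULL,
            route TEXT NOT NULL,
            swim_date TEXT NOT NULL,
            duration_minutes INTEGER NOT NULL,
            sanctioning_body TEXT NOT NULL
        )
        """
    )


def load_report(conn, report_json):
    """Parse one JSON report and insert it; raise ValueError if a field is missing."""
    report = json.loads(report_json)
    missing = [field for field in REQUIRED_FIELDS if field not in report]
    if missing:
        raise ValueError(f"report is missing fields: {missing}")
    # Pass only the known fields so stray keys in the report are ignored.
    values = {field: report[field] for field in REQUIRED_FIELDS}
    cur = conn.execute(
        "INSERT INTO swims "
        "(swimmer, route, swim_date, duration_minutes, sanctioning_body) "
        "VALUES (:swimmer, :route, :swim_date, :duration_minutes, :sanctioning_body)",
        values,
    )
    conn.commit()
    return cur.lastrowid
```

Rejecting incomplete reports at load time keeps the central table clean; a real loader would probably also check dates and durations before inserting.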
As a proof of concept, I'd use the following software (this reflects my own prejudices and did NOT come off the mountain on tablets with Moses):
SQLite as the database. This is much easier to deal with than MySQL. Eventually, MySQL would probably be necessary, but not at first.
R - Can be used for a host of things, like data reporting, conversion, data loading.
Talend or openstructs - data conversion.
Awk or Perl - converting text documents
I'm not a front-end guy, but Drupal seems a reasonable choice for taking some of the pain out of that and administering the database.
I'd be willing to help with the technical end of things, but am not organized enough to actually lead something like this - I couldn't organize a 3 house paper route.
-LBJ
“Moderation is a fatal thing. Nothing succeeds like excess.” - Oscar Wilde
- Find a project manager.
- Install CMS (Drupal or Joomla, with necessary add-on modules) on marathonswimmers.org domain.
- Design form for self-reporting of swim data. Need to distinguish between established swims for which there are official data (EC, CC, MIMS) and non-sanctioned solo swims.
- Connect form to SQLite database with necessary fields. Self-reporting of swims can begin immediately.
- Design search interface so website visitors can search the database and view custom results.
- Design form for batch reporting of official swim data by sanctioning organizations.
- Liaise with representatives of sanctioning bodies of established swims.
- Allow these representatives to choose between (a) submitting data in a standardized format through our batch-reporting form; or (b) sending over whatever data they have, and letting us do the work of converting it into a standardized form.
- For organizations that choose Option B, we must convert original data into format that matches our database (R and Perl).
- Official swim data from sanctioning bodies, once obtained, will "override" self-reported data for those swims.
- Design process for verifying self-reported data.
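One way the "override" rule in the list above might work: identify each swim by swimmer, route and date, and let an official record replace a self-reported one with the same identity. A minimal sketch, assuming hypothetical record fields (`swimmer`, `route`, `swim_date`) and a `source` tag of either "self" or "official":

```python
def merge_swims(self_reported, official):
    """Combine two lists of swim records; official data wins on key clashes."""

    def key(rec):
        # Assumed identity of a swim: same swimmer, same route, same date.
        return (
            rec["swimmer"].strip().lower(),
            rec["route"].strip().lower(),
            rec["swim_date"],
        )

    merged = {}
    for rec in self_reported:
        merged[key(rec)] = {**rec, "source": "self"}
    # Loaded second, so an official record replaces any self-reported match.
    for rec in official:
        merged[key(rec)] = {**rec, "source": "official"}
    return list(merged.values())
```

Keeping the `source` tag on every record would also help the verification step, since anything still marked "self" is a candidate for checking.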
Did I capture everyone's suggestions? Other ideas?
We're all just carbon, water, starlight, oxygen and dreams
http://www.marathonswimmers.org/forum/discussion/284/voluntary-reporting-of-a-swim-the-peer-pressure-of-social-media-inspiring-cleaner-swims
Seems like a good start... but it'll need to be in web format and probably have some conditional routing included.
We have a bunch of interested people here. I was hoping to start moving further in this direction this year, and I think @evmo was also.
I'm happy to volunteer to put my name against a non-coding task such as contacting the various bodies or doing some writing or forms etc.
With Evan's task list can we consider forming a working group?
I have a dream, as I know others do, of information-sharing across the world. This I think is the big task of our swim generation, that can most help swimmers and swim organisations across the globe. Evan and I have always hoped that we can use this forum to achieve further support and community for swimmers.
If anyone is interested in this, PM or email Evan or me and we'll see if we can get it rolling. (This is NOT a claim by myself to be in charge or Project leader.) Or am I jumping the gun? Further discussion can and should of course happen here.
loneswimmer.com
I think we're close to being able to form a working group here. At some point we should probably take this off the public forum. There are certain unscrupulous individuals... [I will leave the rest of that sentence unspoken.]
For now, let's keep this thread going.
If you're interested in being part of the working group, either "Like" this post, or send a PM to me or @loneswimmer.
Molly Nance, Lincoln, Nebraska
Please feel free to add your swims to this list:
Boston Light Swim, Cadiz Freedom Swim, Catalina Channel, Clean Half Marathon Swim, Cook Strait, Ederle Swim, English Channel, Ijsselmeerzwemmarathon, Indian National Long Distance Swim Championships, International Marathon Swimming Beltquerung, International Self-Transcendence Marathon Swim, Irish (North) Channel, Jarak - Šabac Marathon Swim, Jersey Solo, Kalamata - Koroni Marathon, Lake Ontario, Lake Tahoe, Lake Windermere, Loch Lomond, Lough Erne, Manhattan Island Marathon Swim, Maratón Patagones Viedma, Maui Channel, Molokai Channel, Pennock Island Challenge, Rottnest Channel Swim, Santa Barbara Channel, Strait of Gibraltar, Swim Across The Sound, Swim Around Key West, Tampa Bay Marathon Swim, Traversee Internationale du lac St-Jean, Tsugaru Channel, and World 25K Championships.
Steven Munatones
www.worldopenwaterswimmingassociation.com
Huntington Beach, California, U.S.A.
While I don't have any experience with web pages and online databases, I do have experience pushing around data in spreadsheets. I am also willing to do more menial tasks. Bottom line is I think it's a great idea, so count me in and I'll do what I can to help.
Thanks @Munatones, that's really useful for building an association list; we can look at that. We'd have to populate it with appropriate contact details for communication, which should be easy given the combined knowledge here. I can certainly look at that, unless Steve already has something he wouldn't mind sharing with us? I imagine the greater the interest and agreement we have from the various associations early on, the easier things will be further down the road.
loneswimmer.com
We're all just carbon, water, starlight, oxygen and dreams
http://openwaterpedia.com/index.php?title=Kevin_Murphy
http://openwaterpedia.com/index.php?title=Dave_Barra
http://openwaterpedia.com/index.php?title=Donal_Buckley
Anyone can freely report their swims and exploits in Openwaterpedia, along with hyperlinks, photos and videos.
Steven Munatones
www.worldopenwaterswimmingassociation.com
Huntington Beach, California, U.S.A.
- What measures have you taken to verify the accuracy of any of the data in the World Majors database?
- Have you coordinated with the sanctioning bodies of these events, to obtain official swim data, and to verify self-reported data?
- Is there any ongoing process of adding data for the latest/most recent swims, or is that entirely dependent on the motivation of the swimmer to self-report?
I love the World Majors idea as a "bucket list." As a reliable, accurate database... not as much.
I also love Openwaterpedia for many reasons, but wikis are a completely different technology than databases, with different purposes and strengths.
I have had trouble entering/editing data into the ocean's majors database. I am not the most tech savvy individual, so the process is cumbersome for me. Some of the dates and times are incorrect.
...anything worth doing is worth overdoing.
My sense based on the evidence is that when the World Majors database was first launched, it was "pre-populated" with swim data for a few well-known swimmers such as yourself, Penny Palfrey, Kevin Murphy, Alison Streeter, etc. However, they didn't consider what would happen if one of those swimmers actually wanted to edit their own data.
So if you, David Barra, tried to create your own account and edit the pre-populated data, this would actually create a second account associated with your name, meaning your name would show up twice in the database.
When the World Majors database first launched a couple of years ago, I signed up for an account to add my swim data. Then I forgot about it until @Munatones brought it up on this thread. Of course, by now I had forgotten my password. As far as I can tell, there's no way to automatically recover/reset passwords via email, an option that is available on the vast majority of other websites (including this one).
In other words, I'm now blocked from editing my data, unless @Munatones can reset my password manually. @david_barra, you are not alone!
I would be extremely interested in helping out with a project like this. I have worked on data heavy websites/apps etc and also do work on data visualisation so hopefully I could be of help.
I am primarily a UX/UI and visual designer but I have pretty good hands-on skills with PHP/MySQL as well as open source frameworks like WordPress, Bootstrap, jQuery and many of the visualisation tools like Google charts/maps, Kendo Visualisation etc.
Obviously a much smaller data set, but I recently did this website for my local swimmers to track our swims through the winter months. All dynamic content running off a MySQL database.
http://lab.zoho.co.uk/lab/ice-miles/
Regarding data entry, there are ways to simplify this if the information is already online somewhere. I have built page-scraping scripts as well as data bridges that will allow for spreadsheets etc. to be ported over.
So as long as someone else is happy to be in charge I am more than happy to help. :-)
Philip
Recent post from Channel Swimmers Google Group relating to swimmer stats:
Fellow swimmers:
Thanks to Julian Critchlow's efforts we now have an updated solo Channel
swims database on CS&PF website:
http://cspf.co.uk/swims-database
We hope it is the most comprehensive list of Channel swims that includes
both CSA and CS&PF swimmers. And if you like statistics, we have
http://cspf.co.uk/solo-swims-statistics and
http://cspf.co.uk/solo-swims-statistics-2
"I never met a shark I didn't like"
I started a Google Docs spreadsheet of 36 Organisations: Websites, Presidents, Secretaries, Contact info, Rules, Membership, Relevant Swims as a baseline.
There are a few extra swims in there, a few still to add, and a few that are unclear. I figure we'd need this as a first step. I used @Munatones' Openwaterpedia.com, Google and even a few links from here!
What's the minimum distance? (I'd suggest 10 miles and if so I have more to go in).
Right now those with edit access are @Evmo, @Niek, @phodgeszoho, @Mike_Gemelli & @Leonard_Jansen as you've indicated interest in involvement. Anyone else wanting access can request but I'll only grant it to those who post in this thread.
At the least it'll be a useful resource for all forum members here to have in one place (I've gone through a lot of websites so far) once we polish it up.
EDIT: BTW, I haven't included any pro swims. Open to debate but since most of us aren't pros nor ever will be... Where there is a solo amateur option available I've included it (Napoli-Capri, Memphre, St Jean)
EDIT 2: Obviously some Associations regulate more than one swim.
I think once I/we have the contacts populated we could make initial inquiries whether the associations would be willing to participate?
loneswimmer.com
We need some cutoff point. We could have 10k as the minimum but once we drop to below 10 miles the list would become much larger. It's based on practicality for now. We need to concentrate on the large associations. If at some future point we have most of those, then we can look at others?
loneswimmer.com
thanks
Paul
loneswimmer.com
In the case of some swims, participant records are actually available on some websites, but I need to go back through the whole list to check again since I didn't do this first time around.
The list now stands at 43 swims/organisations with data and another 4 that I can't find info for; one of those (Bangla) has been swum by forum member @MvG, who can help. Duplicate swims by the same organisation are not entered since for now the purpose is data collection. We could expand it for those organisations' multiple swims later (e.g. NYCSwim, Kingdom, BLDSA, ILDSA).
loneswimmer.com
If the goal is a centralized database of swims, I'd suggest the following (after re-thinking it):
1) Come up with an Excel or other spreadsheet that can be used to record either individual swims or larger multi-person races/attempts. Lock the format so that fields can't be moved (the only exception might be to allow the first name and last name columns to be switchable). If it is centrally located and will be filled out via a web page, the format doesn't much matter. If it will also be sent to people to fill out, Excel or Open Office Document formats are probably the best bet since most people can read/write these.
2) Make the Spreadsheet publicly available on a website, but you may want to also email it to various people/races to get things kick-started.
3) Pick a database that will run on your server. There are free ones like SQLite or the one that comes with Open Office and ones you pay for like Microsoft Access. Or, the server might already have this capability.
4) Depending on your environment, you could import the data to the database in several ways:
a) In an all-Microsoft environment, you can use PowerShell to extract Excel data and put it in Access (or SQL Server). (I have code for this, BTW.)
b) In a non-Microsoft environment, things might be more (or less) complicated, but at WORST, you could export the data from the spreadsheet into a CSV (comma separated values) file and import it into the database. In most environments, this could be automated. You can also take this approach in a Microsoft environment.
5) Depending on the server environment, extracting the data dynamically and presenting it will be anything from "out of the box" to a custom piece of code. However, it shouldn't be too hard unless you want insanely fancy pages.
6) For bulk downloading, I suggest XML. As much as I prefer JSON, there are a lot more things that work with XML, and people who know it, than JSON.
7) The user interface for queries for 5) and 6) will take a bit of thought since you are not likely to require users to learn SQL query syntax.
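To make the "worst case" in 4b) and the bulk download in 6) a bit more concrete, here is a minimal sketch of importing a CSV export into SQLite and dumping the table back out as XML. It uses Python's standard library rather than any of the tools named above, and the column names and the `swims` table are assumptions for illustration only.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Assumed column names; the real spreadsheet format is still to be decided.
COLUMNS = ("swimmer", "route", "swim_date", "duration_minutes")


def import_csv(conn, csv_file):
    """Load rows from a CSV file object (with a header row) into the swims table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS swims "
        "(swimmer TEXT, route TEXT, swim_date TEXT, duration_minutes INTEGER)"
    )
    reader = csv.DictReader(csv_file)
    rows = [tuple(row[col] for col in COLUMNS) for row in reader]
    conn.executemany("INSERT INTO swims VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def export_xml(conn):
    """Return the whole swims table as one XML string for bulk download."""
    root = ET.Element("swims")
    query = "SELECT swimmer, route, swim_date, duration_minutes FROM swims"
    for record in conn.execute(query):
        swim = ET.SubElement(root, "swim")
        for name, value in zip(COLUMNS, record):
            ET.SubElement(swim, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample row standing in for a spreadsheet exported to CSV.
    sample = io.StringIO(
        "swimmer,route,swim_date,duration_minutes\n"
        "Example Swimmer,Example Channel,2012-08-25,600\n"
    )
    connection = sqlite3.connect(":memory:")
    print(import_csv(connection, sample), "row(s) imported")
    print(export_xml(connection))
```

The same `export_xml` output could double as the occasional full backup, since it dumps every row in one file.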
I will try to dummy up some things by the end of this coming (long) weekend to demo some of this.
Another approach might be to see if Google or some other website has the tools to host & pull this off. I work with highly confidential data (HIV/AIDS clients) so don't go the public route, but for something like this, I don't see a problem. The only thing would be to occasionally be able to dump the entire database locally to always have a backup copy. I'll look into this as well, but if anyone knows of a reliable site with the necessary tools, sing out.
I'm not trying to be the techno-czar here, just making some suggestions and offering to help on the technical side of things - I can pretty well work in any environment (Windows, Unix, Linux, but NOT Macintosh - I don't have one) and with almost any language, database, or package.
-LBJ
“Moderation is a fatal thing. Nothing succeeds like excess.” - Oscar Wilde
...anything worth doing is worth overdoing.
Just to clarify for everyone, as there was some confusion earlier. Currently this project only consists of a basic spreadsheet with a list of marathon swimming organizations & contact info. The eventual goal is a searchable, interactive database of all historical marathon swims - but we are a long, long way from achieving that, as of right now.
To request access, please PM either me or @loneswimmer and include an email address. Thanks.
Not had much of a chance to look at the Google spreadsheet yet (started a new contract recently so bit manic at the moment) but just wanted to quickly post to let you know I am still around and still interested in being involved.
I am primarily a UX/UI and visual designer but I know my way around HTML/CSS and MySQL/PHP enough to put together a site to host this and provide users with an interactive front end to search and view the data.
Looking forward to it. Should be fun and hopefully things calm down enough in the next week or so that I can be more involved.
Philip
1) Menu-based entry and/or data-validation: for example, to avoid Eire and Ireland, Czechoslovakia and Czech Republic (the former no longer exists), Channel Islands (not a country), C.S.A. and CSA, all the variations of CSPF.
2) General style guide: for example, lists of names separated by semi-colons, rather than commas, slashes, 'and' etc...
3) Plenty of fields so the data can be properly, even excessively, parsed at the time of entry: for example, it would be helpful to have a field dedicated to fundraising URLs.
4) The ability for people to request their data be depersonalized, such as removing their age, or name, or ....
5) Open editing, as in StackExchange where anyone can edit anyone's Q or A with the edit going live only after moderation.
6) The ability to annotate a 'cell' with the source(s) of that information, as well as automatically annotating it with the time/date of insertion/edit and the id of the person making the contribution.
7) Precise definitions for the data fields. For example, are the city/state/country fields intended for the swimmer's birth place? current residence? or ???? For example the second EC swimmer was English by birth, but had been living in Paris with a French wife for the previous 20+ years. In the databases, he is entered as UK. The fourth was an Argentine by birth, but was living in Italy at the time. Dissimilarly, he is in the databases as an Italian.
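Points 1) and 2) in the list above could be enforced at entry time with small lookup tables. The sketch below is only an illustration: the alias lists are hypothetical and far from complete, and the function names are made up here. It maps spelling variants to one canonical value, rejects anything unrecognised, and splits name lists on semi-colons.

```python
import re


def _key(value):
    """Drop dots and whitespace and ignore case, so 'C.S.A.' matches 'CSA'."""
    return re.sub(r"[.\s]+", "", value).casefold()


def _table(canonical_to_aliases):
    """Build a lookup from every normalised alias to its canonical spelling."""
    return {
        _key(alias): canonical
        for canonical, aliases in canonical_to_aliases.items()
        for alias in aliases
    }


# Hypothetical alias lists; a real project would have to agree on these.
COUNTRIES = _table({
    "Ireland": ["Ireland", "Eire"],
    "United Kingdom": ["United Kingdom", "UK"],
})
ORGANISATIONS = _table({
    "CSA": ["CSA"],
    "CS&PF": ["CS&PF", "CSPF"],
})


def normalise(value, table):
    """Return the canonical spelling, or raise ValueError for unknown values."""
    try:
        return table[_key(value)]
    except KeyError:
        raise ValueError(f"unrecognised value: {value!r}") from None


def split_names(text):
    """Split a semi-colon separated list of names into clean entries."""
    return [name.strip() for name in text.split(";") if name.strip()]
```

With these tables, `normalise("C.S.A.", ORGANISATIONS)` and `normalise("CSA", ORGANISATIONS)` give the same value, while an entry such as "Channel Islands" in the country field is rejected instead of being stored silently.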
I have used R for static content but if you want to do some very cool interactive data visualisation I would suggest using D3 (http://d3js.org/) and/or NVD3 (http://nvd3.org/).
D3.js is a JavaScript library for manipulating documents based on bound data. Very cool. NVD3 is a library of re-usable charts and chart components for d3.js
We're all just carbon, water, starlight, oxygen and dreams
We know that the 2 EC organisations, for example, don't publish incomplete swims. This is unlikely to change in the near future, since every swim is also a commercial agreement between the swimmer and the pilot. It's perfectly understandable that people want to see the full data set; didn't every one of us ask this question at some point?
It's also perfectly understandable that swimmers who are unsuccessful might not want that published. I think for this to change would require Committee level changes in both EC organisations at least. Some other swims do publish DNFs. That field in the spreadsheet is one that needs to be completed for each organisation.
I'd foresee we continue with partial data in this area (unsuccessful/attempts) from whichever organisation provides that?
I'm writing a first draft contact letter to the organisations asking for an initial expression of interest which we obviously need to know. I'll have another look over it and put it here soon for your input?
loneswimmer.com
The way it reads to me, the scope of this thread, so far, encompasses two projects:
1) metadata: the well-on-its-way spreadsheet of venues/organizations/contact info/rules/officers/etc ...
2) results data: the unborn database comprising all the "so-and-so's" swim across various bodies of water
loneswimmer.com
Some changes were made to the spreadsheet without me knowing what they were, so I've reduced the Edit permissions back to where they were. I'll post the spreadsheet as viewable to all in a separate new thread.
loneswimmer.com
It took a few years, but.... Yes.