[ipv6-hackathon] Project Proposal
Christian Teuschel christian.teuschel at ripe.net
Tue Oct 31 18:23:21 CET 2017
Hello participants,

As promised, here are the details of the project proposal! (It's also on our etherpad, so you can read all projects nicely collected in one place: https://pad.riseup.net/p/ripe_ncc-ipv6-hackathon)

Drivers for IPv6 Deployment

From organizations that have successfully implemented IPv6 we know that IPv6 is technically feasible, and practically nobody questions that IPv6 is going to be the future of the Internet. Yet today's Internet still runs on IPv4, and companies seem reluctant to make the transition. In this project I suggest taking a data-driven approach to identify the factors that encourage companies to use IPv6 (or discourage them from doing so).

Since we will have only limited time to work on it, I'd like to bootstrap this project with the recent work I did on this topic. In a nutshell, I introduced a metric that measures the IPv6 readiness of a company based on routing (BGP) data. The data crunching has been carried out for all ~80,000 global ASNs for the past 10 years, and the results are readily available to the team. The analysis was done at global, regional (RIR), national and industry level.

The project offers multiple interdisciplinary items to work on, but the main ones are:

a) Produce reports based on the data
Analyze and interpret global, regional and/or national developments. This involves finding correlations with corresponding events. No programming skills required.

b) Improve the metric
The current metric associates IPv4 addresses with /48 prefixes in IPv6. This is very simple and does not take the specific requirements of an organization into account.

c) Improve and extend ASN/organization to industry mapping
The analysis on an industry level required a data set that maps an organization to an industry type. This was built by parsing and analyzing 22,000 company webpages for the RIPE NCC region. Both the scraping and the parsing of webpages need improving, as the current precision is roughly 30%.
The outcome would be useful not only for this topic: publishing this data set would benefit a whole group of Internet researchers. Your name will, of course, be published along with the data set :) This item would benefit from a multidisciplinary team with analytical, natural language, web development and data mining skills.

All the best,
Christian