With a project so squarely focused on the R community, it seemed only fitting to kick it off by reaching out to the community for some help:
This worked outstandingly well, and actually drove more engagement than any tweet I’ve ever written. At the time of writing, it has nearly 5,000 impressions and over 100 engagements, with 7 replies!
https://t.co/y7BxaXVS0X may have some items and if you come up with any fav idioms while working on the project feel free to add chapters (and yourself as an author)
— boB Rudis (@hrbrmstr) July 19, 2018
If it doesn't fit into memory, you might want to use a database backend. Just normal relational or special like neo4j (but interface to R is not that mature @_ColinFay creates neo4r )
— Roel (@RMHoge) July 20, 2018
.@thomasp85 gave an excellent talk at #rstudioconf about network wrangling, analysis and viz with tidygraph and ggraph 📦s https://t.co/y4bexZhnOG
— Dan (@TheDanBooth) July 20, 2018
@kearneymw you can tell them a lot about that!
— Pachá 帕夏 (@pachamaltese) July 19, 2018
This has given me quite a solid reading list, which now consists of:
- 21 Recipes for Mining Twitter Data with rtweet by Bob Rudis
- Neo4j - a graph database, in case my data gets too large. It used to have R drivers available, but the RNeo4j package has been removed from CRAN due to failing checks - proceed with caution!
- Tidygraph and ggraph presentation from rstudio::conf 2018 by Thomas Lin Pedersen
- graph-tool - could help if igraph isn’t quick enough
- Exploring the CRAN social network - a blog post by Francois Keck which pretty much does exactly what I wanted to do with the GitHub part of the analysis
In addition to this, I’ll be working my way through these courses on DataCamp: