This is part 2 of a 3-part series meant to help our students take their first steps in data visualization. Make sure to also check out the other two parts: get to know the user, and explore visual designs.
Some things you want to find out:
- How many dimensions are there? What are the types for each dimension: categorical, numeric, geo-spatial, …?
- For each of these dimensions, or at least the most important ones: how are the data distributed?
- Are there any correlations between the dimensions?
- What does a principal component analysis, independent component analysis, or singular value decomposition reveal?
- What does a hierarchical clustering show?
- Are there any local clusters? Have a look at topological data analysis (perhaps using the R TDA package), which can reveal such local clusters in a global context.
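A few of these questions can be answered in a handful of lines. The sketch below is only an illustration in Python with NumPy, run on made-up data: the columns `x`, `y` and `z` and the helper `summarize` are assumptions for the example, not part of any real dataset. It summarizes each distribution, computes pairwise correlations, and runs a principal component analysis by way of a singular value decomposition of the centered data.

```python
import numpy as np

# Synthetic stand-in for a real dataset (an assumption for this sketch):
# 200 rows and 3 numeric dimensions.
rng = np.random.default_rng(seed=0)
x = rng.normal(loc=0.0, scale=1.0, size=200)
y = 2.0 * x + rng.normal(scale=0.5, size=200)  # built to correlate with x
z = rng.exponential(scale=1.0, size=200)       # skewed, unrelated to x and y
data = np.column_stack([x, y, z])
names = ["x", "y", "z"]


def summarize(column):
    """Five-number summary plus the mean: a rough picture of one distribution."""
    q = np.percentile(column, [0, 25, 50, 75, 100])
    return {
        "min": q[0], "q1": q[1], "median": q[2],
        "q3": q[3], "max": q[4], "mean": column.mean(),
    }


# How is each dimension distributed?
summaries = {name: summarize(data[:, i]) for i, name in enumerate(names)}

# Are there correlations between the dimensions? (Pearson, column by column)
corr = np.corrcoef(data, rowvar=False)

# PCA via SVD: center the data, decompose, and convert the singular values
# into the fraction of total variance carried by each component.
centered = data - data.mean(axis=0)
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)
explained = singular_values**2 / np.sum(singular_values**2)

for name in names:
    print(name, {k: round(float(v), 2) for k, v in summaries[name].items()})
print("correlation matrix:\n", np.round(corr, 2))
print("explained variance ratio:", np.round(explained, 3))
```

In this toy example a mean well above the median hints at a skewed dimension, and a first component that carries most of the variance hints that two dimensions are largely redundant. On real data with dimensions in different units, you would probably standardize the columns before the decomposition.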
Create loads of simple plots, and really take your time doing so.
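One way to make "loads of simple plots" cheap is to generate them in a loop. The following is a minimal sketch in Python with matplotlib, again on made-up data: the column names and the helper `quick_plots` are hypothetical. It writes one histogram per dimension and one scatter plot per pair of dimensions to a temporary folder.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render straight to files, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Made-up columns standing in for a real dataset (an assumption for this sketch).
rng = np.random.default_rng(seed=1)
columns = {
    "height": rng.normal(170, 10, size=300),
    "income": rng.lognormal(mean=10, sigma=0.5, size=300),
    "age": rng.integers(18, 80, size=300).astype(float),
}


def quick_plots(columns, out_dir):
    """Save a histogram per column and a scatter plot per pair of columns."""
    paths = []
    names = list(columns)

    # One histogram per dimension: how are the values distributed?
    for name in names:
        fig, ax = plt.subplots(figsize=(4, 3))
        ax.hist(columns[name], bins=30)
        ax.set_title(name)
        path = os.path.join(out_dir, f"hist_{name}.png")
        fig.savefig(path)
        plt.close(fig)
        paths.append(path)

    # One scatter plot per pair of dimensions: is there any relationship?
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            fig, ax = plt.subplots(figsize=(4, 3))
            ax.scatter(columns[a], columns[b], s=5)
            ax.set_xlabel(a)
            ax.set_ylabel(b)
            path = os.path.join(out_dir, f"scatter_{a}_{b}.png")
            fig.savefig(path)
            plt.close(fig)
            paths.append(path)

    return paths


out_dir = tempfile.mkdtemp()
paths = quick_plots(columns, out_dir)
print(len(paths), "plots written to", out_dir)
```

The plots are deliberately plain. At this stage the goal is to look at many views quickly, not to polish any single one.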