Visualization: Measuring Immigration

This project visualizes the impact of immigrants in the United States. Who are they? What do they do? What is their impact?

Using a data set scraped from Wikipedia's curated lists of famous people in the U.S. of foreign descent, we built a parallel coordinates chart and a linked data table. Both support filtering and allow rapid exploration of the impact of immigrants in the United States.

This project was completed for the class 05839: The Data Pipeline: Collecting and Using Data for Interactive Systems and can be found here.


Project Description

Our data set describes notable individuals from 14 source countries. For each individual, we extracted four features: Source Country, Name, Field of Influence, and Description. Using a Naive Bayes classifier, we categorized each individual as either "First Generation" or "Second Generation or Higher". We calculated an additional feature, Influence Score, as the length of the person's Wikipedia page.
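The two derived features can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the Influence Score is measured in characters of page text, and that the classifier is a multinomial Naive Bayes over the words of the Description with add-one smoothing. The names `influence_score` and `NaiveBayes`, and the four training sentences, are hypothetical and made up for the example.

```python
import math
from collections import Counter, defaultdict


def influence_score(page_text: str) -> int:
    """Influence Score, assumed here to be the character length of the page text."""
    return len(page_text)


class NaiveBayes:
    """Multinomial Naive Bayes over lowercase word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training sentences for illustration only; not from the project's data.
texts = [
    "born in Seoul and moved to the United States",
    "emigrated from Hamburg to New York",
    "born in Chicago to Greek parents",
    "American actor of Cuban descent born in Miami",
]
labels = [
    "First Generation",
    "First Generation",
    "Second Generation or Higher",
    "Second Generation or Higher",
]

model = NaiveBayes().fit(texts, labels)
prediction = model.predict("emigrated from Oslo")
```

With so few training sentences the prediction rests on a handful of words such as "emigrated"; a real classifier would need a much larger labeled sample of descriptions.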

List of countries and respective number of individuals:

  1. Taiwan: 21

  2. Vietnam: 86

  3. South Korea: 280

  4. Japan: 321

  5. India: 527

  6. China: 424

  7. France: 567

  8. Germany: 1097

  9. Norway: 500

  10. Greece: 464

  11. Spain: 272

  12. Mexico: 742

  13. Cuba: 352

  14. Iran: 198

We decided that a parallel coordinates chart was the best way to visualize the information. Each line in the chart corresponds to an individual. This allows you to quickly view both trends and outliers in the data.

  • You can filter by Source Country, Field of Influence, Generation, and Influence Score by selecting a range along any of the four vertical axes.

  • You can choose the region (Europe, Asia, Other) using the buttons on the right.
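The filtering behavior described above amounts to keeping only the individuals that satisfy every active axis selection. The sketch below shows that logic in Python rather than in the D3.js code the site uses. The `Person` record, the `apply_filters` function, the field values such as "Science", and the sample rows are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    country: str
    field: str
    generation: str
    influence: int


def apply_filters(people, countries=None, fields=None, generations=None,
                  influence_range=None):
    """Keep people that satisfy every active selection (None means no filter)."""
    kept = []
    for p in people:
        if countries is not None and p.country not in countries:
            continue
        if fields is not None and p.field not in fields:
            continue
        if generations is not None and p.generation not in generations:
            continue
        if influence_range is not None:
            low, high = influence_range
            if not (low <= p.influence <= high):
                continue
        kept.append(p)
    return kept


# Hypothetical rows for illustration only; not taken from the project's data.
people = [
    Person("Person A", "Japan", "Science", "First Generation", 12000),
    Person("Person B", "India", "Arts", "Second Generation or Higher", 3000),
    Person("Person C", "Mexico", "Science", "First Generation", 8000),
    Person("Person D", "India", "Science", "First Generation", 15000),
]

selected = apply_filters(
    people,
    countries={"Japan", "India"},
    influence_range=(5000, 20000),
)
selected_names = [p.name for p in selected]
```

Selections on different axes combine with a logical AND, so each additional selection can only narrow the set of lines that remain visible.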

We linked the data table in the lower portion of the visualization to the parallel coordinates chart, so the table updates in real time as you filter.

  • Individuals are sorted by Influence Score in descending order and by Country.

  • Clicking on an individual entry opens that person's Wikipedia page.

  • Hovering over an individual entry highlights that individual in the parallel coordinates chart.
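The table ordering can be sketched with a compound sort key. This assumes Influence Score is the primary key (highest first) and Country breaks ties alphabetically; the text does not state the priority, so the real site may order differently. The `sort_rows` function and the sample rows are hypothetical.

```python
def sort_rows(rows):
    """Order rows by Influence Score (highest first), breaking ties by country name."""
    return sorted(rows, key=lambda r: (-r["influence"], r["country"]))


# Hypothetical rows for illustration only; not taken from the project's data.
rows = [
    {"name": "Person A", "country": "Norway", "influence": 4000},
    {"name": "Person B", "country": "Cuba", "influence": 9000},
    {"name": "Person C", "country": "Greece", "influence": 4000},
]

ordered = [r["name"] for r in sort_rows(rows)]
```

Negating the score inside the key tuple gives a descending primary order while the country tiebreak stays ascending.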


More Information

The site is hosted on Google App Engine, and the visualization was made using D3.js.

GitHub