Sunday, October 25, 2015

10 Things you must know about Charts, Graphs & Visualization

"In God we trust, all others must bring data," said W. Edwards Deming. But we all know that is easier said than done: data is hard to make sense of when it is actually put in front of you. Often you need a statistician to interpret it, and they are rare! What you need is a plain and simple way to represent data that the brain can interpret relatively easily. This is where charts, graphs and other visualizations come in.

I started exploring various chart and graph libraries to generate insights for my hobby project www.fanffair.com. The goal was to identify the top posts based on statistical analysis of the Like, Share and Comment data associated with each post, and to share those top posts on the fanffair Facebook Page. That in turn led to a second hobby project of mine, www.statspanda.com. In this blog post, I am going to share my notes, learnings and observations as I sifted through various charting libraries, ranging from Google Charts, morris.js, raphael.js, xCharts, nvd3, Flot charts, dygraphs, Rickshaw and Highcharts to D3.js.

All of the above are charting libraries that you either download or use as a hosted library, invoking their JavaScript methods to create a chart or graph in your web application. Some of them are pretty straightforward to use, while libraries like D3.js require some programming skill and knowledge of HTML, JavaScript etc. But unlike the charting libraries, D3.js gives you all the power and flexibility for creating exotic visualizations. You can also pick from the visualizations created and curated by Mike Bostock.

If you want to avoid both of these options, you can go for online services that take your data and generate the charts and graphs online. There are many such services, and some of them are really big. Many of them are categorized as data analytics companies, as they deal a lot with the data layer. I have provided a list of such services/companies in a later part of this post. In fact, statspanda.com qualifies for this category.

There is a fourth category, where companies sell ready-to-use dashboard software. These products come with pre-built charts and graphs but are not limited to charting. You need to customize, implement and host them on your own.

To start with, I will cover various chart types and some basics around their usage. I will also briefly touch upon the technologies behind these charting libraries. Then I will move on to talk very briefly about some of the libraries I have explored closely. This will be followed by a list of companies that sell charts and graphs as a service or as dashboard software.


Popular Chart Types (#1)


Line Chart
Spline Chart
Bar Chart
Pie Chart
Doughnut Chart
Area Chart
Bubble Chart
Scatter Plot
Bullet Chart
Gauge Chart
Combination Chart
Creative Visualization

Sample Charts generated using StatsPanda.com



Factors Behind Making a Choice (#2)

There are a host of factors that may influence your choice of a particular type of chart or visualization. Here are a few factors that I could compile:

Size of Data points - this is probably the most important factor behind making a choice. For example, a Pie Chart may not work very well if you have more than, say, 30 data points, but a Line Chart may still be a good choice. Similarly, an Area Chart or Line Chart can represent a very large set of data. However, remember that the number of data points also indicates whether you are looking at low-level or high-level data: millions of raw records may translate into a couple of data points once they are aggregated.

Types of Data Points - Next, check whether you are dealing with only one type of data or whether there are multiple different types of data that need to be plotted. Combination charts are often used if the data types are different or if different sets of data need to be represented. Example: using a combination of a Bar Chart and a Line Chart to represent the revenue and profit numbers of one company, as opposed to using multiple Line Charts to represent the revenue numbers of different companies.

Relationship of Data Points - If multiple data types are being presented in a Combination Chart, then whether the different types of data are related or not also influences the choice of chart type.

Complexity Of Relationship - How the data sets are related matters in the selection of a chart. Complex relationships, like many-to-many relations or multiple relations between two data points, are not easy to represent in regular charts and graphs and call for creative visualization.

Static vs Animation vs Flow Representation - Most charts and visualizations capture a snapshot or static state of the data. In certain cases you may need to show a transition from one snapshot of the data to another (i.e. animation) or capture the states that the data passes through (i.e. data flow). A good example of the latter is the Sankey Chart.
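The aggregation point made under "Size of Data points" can be sketched in a few lines of Python. This is only an illustration: the (month, likes) record layout and the function name aggregate_by_key are assumptions made up for the example, not something taken from fanffair or StatsPanda.

```python
from collections import defaultdict


def aggregate_by_key(records):
    """Sum the values per key, collapsing many raw rows into a few points."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    # Sort by key so the points come out in a stable order for plotting.
    return sorted(totals.items())


# Hypothetical raw feed: one (month, likes) row per post.
raw_rows = [
    ("2015-08", 12),
    ("2015-08", 30),
    ("2015-09", 7),
    ("2015-09", 5),
    ("2015-10", 40),
]

points = aggregate_by_key(raw_rows)
print(points)
```

Five raw rows become three points here; the same idea is what turns millions of raw records into something a bar or line chart can carry.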

A general rule of thumb for the number of data points per chart type, as per a paper from IBM:
  • Pie chart: 3-10
  • Bar chart: fewer than 50
  • Line chart: fewer than 500
  • Bubble plot: fewer than 500
  • Scatter plot: fewer than 10,000
  • Creative Visualization comes in handy for more than 10,000 data points
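The rule of thumb above can be turned into a small lookup. The sketch below is one possible reading of the list, not something from the IBM paper itself: the function name suggest_chart_types is hypothetical, and because the list leaves exactly 10,000 points unassigned, the code treats that case as "creative visualization" (an assumption).

```python
def suggest_chart_types(num_points):
    """Return the chart types whose rule-of-thumb range covers num_points."""
    if num_points < 0:
        raise ValueError("num_points must not be negative")

    suggestions = []
    if 3 <= num_points <= 10:
        suggestions.append("Pie chart")
    if num_points < 50:
        suggestions.append("Bar chart")
    if num_points < 500:
        suggestions.extend(["Line chart", "Bubble plot"])
    if num_points < 10000:
        suggestions.append("Scatter plot")
    else:
        # Assumption: 10,000 and above falls back to creative visualization.
        suggestions.append("Creative visualization")
    return suggestions


print(suggest_chart_types(5))
print(suggest_chart_types(200))
print(suggest_chart_types(50000))
```

The thresholds are guidelines rather than hard limits, so a result like this is best used to narrow the options before looking at the other factors listed earlier.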

Rendering Engines (#3)

SVG - Scalable Vector Graphics (SVG) is an XML-based vector image format for two-dimensional graphics with support for interactivity and animation. The SVG specification is an open standard developed by the World Wide Web Consortium (W3C) since 1999, and it is widely supported. Traditionally, Microsoft Internet Explorer has been a laggard: native SVG support arrived only with IE 9. Some of the popular libraries like morris.js, D3.js and others built on top of D3.js use SVG.


Canvas (HTML5) - The canvas element was introduced as part of the HTML5 spec and lets JavaScript draw onto a bitmap surface. Examples: Flot, Chart.js, jqPlot and other JavaScript-based visualisation libraries.

VML - Vector Markup Language (VML) is an XML-based file format for two-dimensional vector graphics. Developed and promoted by Microsoft, it is mainly relevant for older versions of Internet Explorer that lack SVG support.
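To make the "XML-based" part of SVG concrete, here is a minimal Python sketch that builds the markup for a tiny bar chart as a plain string. It is not how any of the libraries in this post work internally; the function name bar_chart_svg and its layout parameters are hypothetical, and the only SVG features used are the svg root element and rect.

```python
def bar_chart_svg(values, bar_width=20, gap=5, height=100):
    """Return SVG markup with one <rect> per value, scaled to the tallest bar."""
    if not values or min(values) < 0 or max(values) <= 0:
        raise ValueError("values must be non-negative with at least one positive entry")

    peak = max(values)
    width = len(values) * (bar_width + gap) - gap
    rects = []
    for index, value in enumerate(values):
        bar_height = round(height * value / peak, 2)
        x = index * (bar_width + gap)
        # The SVG y axis points downwards, so bars are anchored to the bottom edge.
        y = round(height - bar_height, 2)
        rects.append(
            f'<rect x="{x}" y="{y}" width="{bar_width}" height="{bar_height}"/>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(rects)
        + "</svg>"
    )


svg_markup = bar_chart_svg([1, 2, 4])
print(svg_markup)
```

Saving the printed markup to a .svg file and opening it in a browser should show three bars. Because each bar is an element in the document, it can be styled with CSS or given event handlers, which is a large part of why SVG suits interactive charts.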

Chart Performance (#4)


Check out the chart performance comparison at http://jsperf.com/charts-comparison-d3-js-kendo-highcharts-echart-flot-gr



HTML5 Based Libraries (#5)


chartjs.org - Offers six chart types. HTML5 Canvas based, responsive. One of the best options for the chart types it supports.


canvasjs.com - Pretty good & rich collection of Charts. Free for non-commercial use.


D3.js & D3.js Based Libraries (#6)

d3.js is the best and gives the highest amount of flexibility, but that comes at the cost of added complexity in terms of the programming skills needed to develop a chart or graph. If you want to avoid that, have a look at the D3.js-based libraries in this section, which are pretty straightforward to use.

NVD3 - A d3.js based charts library with good options. Some of its charts are unique when compared to other generally available chart libraries, e.g. Scatter / Bubble Chart, Stacked Area Chart. Reads data from CSV and other text formats.

C3.js - Another D3.js based charting library. It has a pretty comprehensive set of charts, is simple and easy to use, provides good documentation and is available under The MIT License (MIT). C3.js has a pretty active Google Group, so it is easy to get support.

Rickshaw - Built on top of d3.js. Very neat library for creating time series graphs, line charts, bar charts etc. It is free, open source and available under the MIT License.

http://d3pie.org - nice pie charts, offers good options. Available under The MIT License (MIT)

xCharts - D3.js based library. Uses HTML, CSS, SVG. The default charts have a polished look but very limited options. It was developed by https://www.tenxer.com/ and has been made free with no strings attached.

dimplejs - it's crazy, it's very good. Reads data from CSV and other text formats. It's an open source project by http://align-alytics.com/

dc.js - It's amazing. Just look at this - http://dc-js.github.io/dc.js/ . It's available under the Apache License, Version 2.0.
A few cool examples - http://dc-js.github.io/dc.js/examples/cust.html

http://d3plus.org - Built on top of d3.js, simple to use. It has a limited set of examples/APIs but some of them are pretty good, e.g. Geo Map, Tree Map, Simple Network.

https://github.com/mbostock/d3/wiki/Gallery - d3 example library in one page.

http://bl.ocks.org/mbostock - Amazing Charts and Visualization examples by Mike Bostock, the creator of D3.js and many other libraries used for rich visualization.

http://bost.ocks.org/mike/ - Another repository by Mike Bostock.



Pure JavaScript (#7)

Flot Charts - Flot is a pure JavaScript charting library.


Google Charts - It has a very rich set of charts & graphs and is simple to use. You may like to look at the AngularJS Google Chart Tools directive as well.


jqPlot - Pure Javascript charting library.  Offers good set of options. Look and feel is not very polished. 

morris.js - raphael.js based library. Simple to use. Cool fluid look and feel. Limited options, most of the common scenarios are covered. Uses HTML, CSS, SVG

dygraphs - has good options for time series graphs, but the look and feel is not very polished. However, dygraphs can handle huge datasets running into millions of data points. It takes .txt and .csv files as input.
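Since dygraphs and several of the D3.js based libraries read CSV text, it helps to see what preparing such input can look like. The Python sketch below assumes a layout with a header row and the date in the first column; check the documentation of the library you pick for the exact format it expects. The function name to_timeseries_csv and the sample rows are hypothetical.

```python
import csv
import io


def to_timeseries_csv(rows, series_names):
    """Serialise (date, value, ...) rows as CSV text, sorted by date."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Date"] + list(series_names))
    for date, *values in sorted(rows):
        if len(values) != len(series_names):
            raise ValueError(f"row for {date} does not match the series names")
        writer.writerow([date] + values)
    return buffer.getvalue()


# Hypothetical daily counts, deliberately out of order.
sample_rows = [
    ("2015-10-02", 7, 3),
    ("2015-10-01", 5, 2),
]

csv_text = to_timeseries_csv(sample_rows, ["likes", "shares"])
print(csv_text)
```

Sorting by an ISO-style date string keeps the rows in chronological order, which time series charts generally rely on.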


Mini Charting Libraries (#8)


Peity is a simple jQuery plugin that converts an element's content into a simple mini pie, line or bar chart.

jQuery Sparkline generates sparklines (small inline charts) directly in the browser using data supplied either inline in the HTML, or via javascript.
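The idea of a sparkline, a chart small enough to sit inline with text, can be shown without a browser at all. The Python sketch below maps numbers onto Unicode block characters; it is a stand-in for illustration only and does not reflect how Peity or jQuery Sparkline draw their charts. The function name text_sparkline is hypothetical.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def text_sparkline(values):
    """Map each value to one of eight block characters, lowest to highest."""
    if not values:
        return ""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # A flat series has no variation to show, so use the lowest block.
        return BLOCKS[0] * len(values)
    last_index = len(BLOCKS) - 1
    return "".join(
        BLOCKS[int((value - lowest) * last_index // span)] for value in values
    )


print(text_sparkline([0, 1, 2, 3, 4, 5, 6, 7]))
print(text_sparkline([1, 5, 9]))
```

Scaling against the minimum and maximum of the series, rather than against zero, is a design choice that makes small variations visible, at the cost of hiding the absolute level.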


Commercial Offerings (#9)

This section covers a list of commercial chart and graph libraries.

HighCharts - Offers a very rich set of options. Probably one of the best libraries. Available free for non-commercial use, under a Creative Commons license.

Canvas JS - HTML5 JavaScript charting library with a simple API; it claims 10x better performance compared to SVG/Flash based charts. Charts are responsive and can run across devices including iPhone, Android, desktops, etc. Offers a good set of charts and graphs.

ZingChart - Javascript Charting library. Offers a rich set of options. Visually appealing and good for large data sets. 

JSChart - Javascript Charts. Limited options.

Ember Charts - Open source. Offers limited popular options of Charts, Tables etc.


Charts & Graphs as Service (#10)

statspanda.com : Offers creation and hosting of Charts, Graphs & Dashboards as Service. Purely REST Api driven, also provides a REST API Console.

tableau.com : Grand Daddy of Visualisation. Offers Desktop and Cloud based services as well as Back-end data integration. Has a huge array of very creative Charts, Graphs and Dashboards.

datawrapper.de : Creation of embeddable Charts, Maps. Primarily used by publishing, news and media companies. 

chartio.com : Provides a data integration layer with various sources like CSV, Amazon RedShift or Stripe etc. It then converts the data into intuitive Visualisation such as Charts, Graphs etc.

chartblocks.com : Basic chart building tool. It reads the data from a spreadsheet and gives you tools to customise the chart. The charts can be shared on popular social media or embedded as an iFrame.

infogr.am : Create charts and infographics and publish them easily, primarily used by the news and media companies. 

jaspersoft.com : Create Charts, Graphs, Dashboard and embed or publish. Good integration with Amazon AWS (RDS, Redshift) . Owned by TIBCO. Focuses on Apps and On-prem Applications for embedding the Charts, Graphs and Dashboards created/hosted on their platform.

Amazon QuickSight : Works with AWS dataset to create visualisation. Offers good set of Graphs, Tables and other Visualisation. Amazon QuickSight uses a new, Super-fast, Parallel, In-memory Calculation Engine (“SPICE”) to perform advanced calculations and render visualizations rapidly. 

domo.com : Offers good set of connectors. Dashboards are designed for specific roles and for different industry verticals. Rich set of chart, graphs and other visualization are offered. Provides good integration with Amazon. 

gooddata.com : Primary positioning as Business Intelligence platform for real time analytics. Offers good set of charts, graphs and dashboard options. Has good set of Connectors.

keen.io : Keen IO is an API platform that lets developers collect and study custom events at a massive scale and converts them to visualisation. It's designed in a way so that users can embed the visualisation like Charts, Dashboards in their apps, websites. All popular Charts and Graphs are available for the Dashboards. 

chartbeat.com : Focused on the Advertising and Publishing industries. Provides insights using Charts, Graphs and other visualisation.

vida.io : Create attractive visualisation, embed and use them, can create Dashboard. 

plot.ly : Plotly is the data visualization and collaboration platform for engineers and data scientists. Provides integration with Python, Excel, MATLAB & R. Offering available in three different formats - Cloud, On Prem, and Desktop Tool. 

zoomdata.com : Focused on Visualisation for Big Data, can handle large volumes of data. Provides integration with Hadoop, NoSQL, ElasticSearch, Solr and Spark.

datahero.com : Provides connectors to a large array of Cloud Based data sources like Box, Dropbox, Stripe, Hubspot, MailChimp, Google Drive, Google Analytics, MixPanel etc. You can then convert the data to Charts, Graphs and other visualisation. You can create one dashboard composed of multiple Charts and Graphs from different Datasources. Has focused offerings for various industry verticals.

getdataseed.com : Provides data exploration tool sets that can import data from an Excel Spreadsheet and fetch it from Dropbox, Google Drive or any public url. Can convert the data into Charts, Graphs, Maps and other Visualisation. Provides REST Api integration.

anaplan.com : Excel Spreadsheet on Cloud with ability to create crisp Charts, Graphs, Dashboards and other visualisation on the fly. Excel Spreadsheet like data is editable. Has focused offerings around Finance, Sales, Operations and HR.

collabion.com : Creates Charts, Graphs and Dashboards from SharePoint Data. Needs to be downloaded and installed, and requires Server Licenses.

www.domo.com : Offers good options of Charts, Graphs and other visualisation. Provides wide range of connectors for various datasources and Apps ranging from Excel, Google sheets, Box, Facebook, Marketo, Salesforce etc. Solutions are tailored for various roles, industry verticals and operations.

www.klipfolio.com : Offers creation of company Dashboards composed of cool Charts, Graphs and other visualisation. Provides a wide range of connectors to all major datasources and various options for data infusion. 

Microsoft Power BI : Available as Desktop Application and mobile Apps for iOS, Android and Windows Phones. Wide range of Charts, Graphs can be glued to a Dashboard quickly, supports drag and drop features. Provides wide range of Datasource connectors. Supports natural language query inside a Dashboard.

RJMetrics.com : Offers creation of Charts, Graphs and Dashboards on Cloud. Has good Data Integration with popular Datasources. RJMetrics Pipeline transfers the data to Redshift. Good for large volumes of Data - syncs anything from less than 5 million rows to upwards of 500 million rows.

http://kilometer.io : SaaS Analytics Tool

https://chartmogul.com : Subscription Analytics

gramener.com : They are into creative visualization. Provides visualization of the data from various business operations like Sales, Marketing, HR etc.

Finally, I still feel the blog looks like a work in progress. In fact, I would like to spend more time taking a deeper look into all the companies listed under #10. However, I hope you found the post useful even in its current shape and form.