So you want to use a BI tool to visualize your scientific data
Why use dashboards for scientific data?
You might want visualizations because you want to become a “data-driven organization”, or because your scientists want to spend less time making PowerPoint presentations.
I’m going to assume in this post that:
- You are somehow storing your scientific data in a central location that is queryable, like a database, or even just some Excel sheets.
- These data are being updated on a regular basis, so a static visualization does not meet your needs.
- Your needs do not require drawing chemical structures. I know, I know, you like drawing chemical structures. But if you can get by without them, your software options explode, so consider it!
You might have even set up some basic “reporting” on your data: many labs monitor fridge temperatures with cloud services, or have simple web browser views into a table of results from each experiment. But answering a question like “how is this assay performing over time?” or “how much sequencing data do we generate each month?” or “how does the assay vary by cell type?” typically means scientists export the data and use their Excel/PowerPoint magic to filter, pivot, and chart it. The charts are presented to management and then forgotten about until a month later, when the question is asked again. Scientists, I know this isn’t fun for you, and there is a better way! Spend the time once to create a dashboard that pulls the latest data into the charts you like, and then sit back and watch it update when new data appears in the datastore.
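To make the “how much sequencing data do we generate each month?” question concrete, here is a minimal sketch in Python of the kind of query a dashboard re-runs for you. Everything in it is an assumption for illustration: the in-memory SQLite database stands in for your central datastore, and the table name `sequencing_runs`, its columns, and the sample rows are all made up.

```python
import sqlite3

# Stand-in for the central datastore. The table name, column names, and
# rows below are hypothetical, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequencing_runs (run_date TEXT, cell_type TEXT, gigabases REAL)"
)
conn.executemany(
    "INSERT INTO sequencing_runs VALUES (?, ?, ?)",
    [
        ("2023-01-05", "HEK293", 12.5),
        ("2023-01-19", "HeLa", 8.0),
        ("2023-02-02", "HEK293", 15.0),
    ],
)

# The "pivot" a scientist would otherwise redo by hand every month:
# total sequencing output grouped by calendar month.
MONTHLY_QUERY = """
    SELECT strftime('%Y-%m', run_date) AS month,
           SUM(gigabases) AS total_gigabases
    FROM sequencing_runs
    GROUP BY month
    ORDER BY month
"""


def monthly_totals(connection):
    """Return (month, total_gigabases) rows from whatever data exists now."""
    return connection.execute(MONTHLY_QUERY).fetchall()


print(monthly_totals(conn))

# A new run lands in the datastore; the same query picks it up with no
# re-export and no re-pivoting.
conn.execute(
    "INSERT INTO sequencing_runs VALUES (?, ?, ?)",
    ("2023-02-20", "HeLa", 5.0),
)
print(monthly_totals(conn))
```

The point is not the SQL itself but that the question is written down once and re-evaluated against the live data, which is roughly what a BI dashboard does each time someone opens it.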
Who should build your visualization dashboards?
We could discuss the spectrum of options here, depending on how much of this work will fall on the scientists vs the software developers. But I will instead take an opinionated stance: scientists should build (or at least maintain) their own visualizations.
Yes, your software developers can create beautiful custom visualizations using free open source software packages. And you are already paying the developers, so why pay extra for third-party software? The thing is, these visualizations require both up-front and ongoing effort from the developers to keep the dashboards running. In my experience, scientists are typically happy with v1.0 of dashboards like this (because they helped design it). However, by the time the scientists need changes, those developers have probably moved on to other priorities. If there is even a half-day turnaround on a desired feature, the scientist ends up turning back to Excel/PowerPoint because they need the plot to look a bit different for their presentation.
You could have an informaticist or software developer create the first version of a dashboard, because this is where the largest amount of technical work is done. But if you go this route, make sure the scientists are involved in the process so they see how data is being pulled and selected, in case they need to make changes.
Which BI tool to use?
I’ll assume I’ve convinced you that scientists should make their own visualizations. The next question is how to get them to make visualizations that are:
- visually consistent
- available to everyone, not “siloed”
- updated automatically with new data
Let’s use a tool designed for “business intelligence”! These requirements are not unique to science: every sales organization has similar needs. BI tools all allow for flexible charting, dashboard creation, and dashboard sharing. You could literally just google “Business Intelligence software” and find something pretty decent (assuming, again, you are not planning to draw chemical structures). However, I’ll recommend one specifically because I’ve used it and it worked well: Tableau.
The pricing for Tableau makes the most sense if you plan to have just one or a few “dashboard designer” scientists, with the majority of the organization “viewing” or “consuming” those dashboards. Luckily, this fits our desire for consistent visualizations across the organization. You don’t really want each scientist building their own dashboard. Instead, consider putting one scientist in charge of the dashboard for each major data collection. You will pay for the server and those few designer licenses, but the “viewers” who visit the dashboard URLs are free and unlimited*. These viewers can still click to hide/show and filter different groups of data, or move axes, but they can’t change the underlying data or the overall dashboard layout, fonts, or colors. The scientist does need to understand how to query and select data before they can use the drag-and-drop dashboard designer interface, but as discussed earlier, most scientists are whizzes with Excel, so this should not be a problem.
And if you really like chemical structures, as an advanced move you can even have small structure images pop up over your dots when the user hovers over them! Just don’t ask to be able to draw them :)
* unlimited by the license; the server bandwidth still limits the number of simultaneous viewers