Let's say you're a Data Scientist. Well, maybe not a data scientist. I mean, those online data analysis courses were definitely worth it, and you've made it this far without being quizzed on Bayesian linear regression. So maybe you're an analyst or something, but whatever: you use Tableau, so you must be a Scientist™.
I've admitted a few times in the past to having purchased a personal Tableau Server license in my more ignorant years (aka a few months ago). While BI tools are great for understanding preexisting data, they don't allow us to go much further. This is entirely by design. Sure, you can clean and slice your data and put it into a cute iFrame dashboard, but Tableau makes one thing explicitly clear in its product design choices: your data is with them, and it's not going anywhere else. Today we're going to take a step towards changing that.
Proprietary Product Design: Crimes Against Customers
Tableau has explicit hierarchies for information, but let's start with workbooks. Workbooks are basically spreadsheets, or in other words, collections of SQL query outputs against a data source (or multiple data sources) via a clean UI. The resulting tabular data is referred to as views. An Excel user might equate these to "sheets", but a SQL user understands that these function more like a materialized view of sorts. One would think the tables we create (from our own data) inherently belong to us, but they don't. Not until you get clever.
I realize Tableau may be at the top of the market for its niche, so the things I'm claiming may seem a little farfetched. Why am I so convinced that Tableau wants to lock your data? Stay with me here, and let me count the ways.
Common Courtesy API Visibility
Common knowledge suggests that visible APIs attract development talent. The more intelligent people are exposed to your product, the more likely they are to contribute. What happens if we check out the API responses in our browser when viewing a worksheet on Tableau Server?
While this level of unnecessary paranoia on Tableau's part is distasteful, let's not forget that we're dealing with a product archaic enough to prefer Windows Server support over Linux. The narrative begins to make sense.
Postgres Hide and Seek
Tableau Server is running a Postgres database; really nothing magical happening here. Well, other than the fact that the database has been renamed, protected, and obfuscated in a way that even the server owner would struggle with. The default commands to interact with PostgreSQL are hidden from server admins altogether.
What if we do a search?
$ locate postgresql
/etc/postgresql-common
/etc/postgresql-common/user_clusters
/opt/tableau-postgresql-odbc_9.5.3_amd64.deb
/opt/tableau/tableau_driver/postgresql-odbc
/opt/tableau/tableau_driver/postgresql-odbc/psqlodbcw.so
/opt/tableau/tableau_server/packages/bin.20181.18.0510.1418/repo-jars/postgresql-9.4.1209.jar
/opt/tableau/tableau_server/packages/clientfileservice.20181.18.0510.1418/postgresql-9.4.1208.jar
/opt/tableau/tableau_server/packages/lib.20181.18.0510.1418/postgresql-9.4.1208.jar
/opt/tableau/tableau_server/packages/lib.20181.18.0510.1418/postgresql-9.4.1209.jar
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/_int.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/adminpack.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/ascii_and_mic.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/auth_delay.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/auto_explain.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/autoinc.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/btree_gin.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/btree_gist.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/chkpass.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/citext.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/cube.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/cyrillic_and_mic.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/dblink.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/dict_int.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/dict_snowball.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/dict_xsyn.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/earthdistance.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/euc2004_sjis2004.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/euc_cn_and_mic.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/euc_jp_and_sjis.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/euc_kr_and_mic.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/euc_tw_and_big5.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/file_fdw.so
/opt/tableau/tableau_server/packages/pgsql.20181.18.0510.1418/lib/postgresql/fuzzystrmatch.so
...And so forth. There are over a thousand results. Postgres is definitely up and running; Tableau just hates you. Unfortunately for Tableau, this drove me to hate them back.
Enter TSM: The Linux Tableau CLI
On Linux exclusively, TSM is intended to be your one tool to configure Tableau Server. It's a fine tool, but it just so happens to omit critical information and capabilities that somebody who owns their data might want to know. At first glance, it seems innocent and helpful:
Command | Explanation |
---|---|
tsm configuration [parameters] | -- Set customization for Tableau Server. |
tsm customize [parameters] | -- Set customization for Tableau Server. |
tsm data-access [parameters] | -- Category of commands related to data-access. |
tsm help [category] | -- Help for tsm commands |
tsm initialize [parameters] | -- Initialize Tableau Server |
tsm jobs [parameters] | -- Category of commands related to async jobs. |
tsm licenses [parameters] | -- Category of commands related to licensing. |
tsm login [parameters] | -- Sign in to the TSM agent |
tsm logout | -- Sign out from the TSM agent |
tsm maintenance [parameters] | -- Category of commands related to maintenance. |
tsm pending-changes [parameters] | -- Category of commands for pending changes. |
tsm register [parameters] | -- Register the product |
tsm reset [parameters] | -- Clears the initial admin user so you can enter a new one. Once reset is completed you will need to use the tabcmd initialuser command to create a new initial user before remote users can sign in again |
tsm restart [parameters] | -- Restart Tableau Server |
tsm security [parameters] | -- Category of commands related to security configuration |
tsm settings [parameters] | -- Category of commands related to configuration and topology settings |
tsm sites [parameters] | -- Category of commands related to site import and export |
tsm start [parameters] | -- Start Tableau Server |
tsm status [parameters] | -- View Tableau Server status |
tsm stop [parameters] | -- Stop Tableau Server |
tsm topology [parameters] | -- Category of commands related to server topology |
tsm user-identity-store [parameters] | -- Category of commands related to user-identity-store |
tsm version | -- Displays version information. |
The Red Herring
Tableau owns Google results, period. Any search query containing the word "Tableau" is dominated by pages of content Tableau would prefer you abide by, and one of these things is the creation of a readonly user to access the Postgres database. The catch here is that the readonly user can't read all tables: there are certain tables reserved specifically for a Postgres tableau "Superuser", which is utterly and entirely undocumented on Linux. For all I know, I may be the first to publish an article of this sort, but let's hope not.
First, let's see which users exist on Postgres using TSM:
$ tsm data-access repository-access list
User Access
Tableau true
Readonly true
There's that Readonly user we talked about: feel free to play around with that user to create meaningless insights if you so please. On the other hand, we have a Tableau user, which happens to be a Postgres superuser. If you don't feel comfortable accessing Superuser privileges, I suggest you leave now. This is Hackers And Slackers, and we don't fuck around, especially when software to the tune of a thousand dollars hides our data from us.
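If you'd rather check that listing from a script than eyeball it, here is a minimal Python sketch. It assumes the output keeps the two-column shape shown above (a "User Access" header followed by one user per line), and `parse_repository_access` is a hypothetical helper name of mine, not anything Tableau ships.

```python
def parse_repository_access(output):
    """Map each repository user to whether remote access is enabled.

    Assumes the two-column layout shown in the listing above.
    """
    access = {}
    for line in output.splitlines():
        parts = line.split()
        # Skip blank lines and the "User Access" header row.
        if len(parts) != 2 or parts == ["User", "Access"]:
            continue
        user, flag = parts
        access[user] = flag.lower() == "true"
    return access


# The sample text is the listing from this article, pasted verbatim.
sample = """User Access
Tableau true
Readonly true"""

print(parse_repository_access(sample))
# {'Tableau': True, 'Readonly': True}
```

In practice you would feed it whatever `tsm data-access repository-access list` prints on your own server.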
Operation Shock and Awe
There's a little command called tsm configuration which lets you set some cute variables for your server. The documentation is here, but there's just one piece missing, and it's the one we need.
Tableau may be our Postgres Superuser, but what would its password possibly be? This isn't documented anywhere. Consider this my gift to you:
$ tsm configuration get -k pgsql.adminpassword
145v756270d3467bv3140af5f01v5c7e4976bcee
Could it be? Did Tableau intentionally prevent users from accessing PostgreSQL directly from the command line and hide an undocumented password? Yes, it did all of those things. It's time to fuck shit up.
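If you want that value in a script instead of your terminal scrollback, something like the following Python sketch should do. It only wraps the exact `tsm configuration get -k` call shown above. The `tsm_config_get` name and the injectable `run` parameter are my own additions, there so the function can be exercised without a Tableau Server handy, and it assumes `tsm` is on your PATH and you are already logged in to it.

```python
import subprocess


def tsm_config_get(key, run=subprocess.run):
    """Return the value printed by `tsm configuration get -k <key>`.

    `run` defaults to subprocess.run; it is a parameter only so a
    stand-in can be swapped in on machines without tsm installed.
    """
    result = run(
        ["tsm", "configuration", "get", "-k", key],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.strip()


# On the server itself (not executed here):
#   password = tsm_config_get("pgsql.adminpassword")
```

Treat the returned string like any other superuser credential: keep it out of logs and version control.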
Claim Ownership
We've made it this far. The bullet is in the chamber. Go ahead and take what is rightfully yours.
$ tsm data-access repository-access enable --repository-username Tableau --repository-password 145v756270d3467bv3140af5f01v5c7e4976bcee
Just make sure port 8060 is open on your VPC and you're in. Considering that there are zero search results for accomplishing this on Linux, it looks like it's just you and me now. One of us will likely go mad with power and turn on the other. That is the way of the Sith. Welcome.
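Once access is enabled, any standard Postgres client should be able to connect. As a rough sketch, here is a small Python helper that assembles a libpq-style connection URI, which `psql` and most Postgres drivers accept. The port (8060) and database name (workgroup) come from this article. The `repository_uri` function, the `tableau.example.com` host, and the password in the example are placeholders of mine. Pass the role name exactly as your own server reports it.

```python
from urllib.parse import quote


def repository_uri(host, user, password, port=8060, dbname="workgroup"):
    """Build a postgresql:// URI for the Tableau repository.

    User, password, and database name are percent-encoded so that
    characters such as '@' or '/' cannot break the URI apart.
    """
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote(user, safe=""),
        quote(password, safe=""),
        host,
        port,
        quote(dbname, safe=""),
    )


# Placeholder host and password, purely for illustration.
print(repository_uri("tableau.example.com", "Tableau", "p@ss/word"))
# postgresql://Tableau:p%40ss%2Fword@tableau.example.com:8060/workgroup
```

You can hand the resulting string straight to `psql` as its connection argument, or to whichever Postgres driver you prefer.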
Moving on Up
Feel free to cruise the workgroup database for now and wreak havoc. As fun as this has been, I have another trick up my sleeve. You've spent a lot of time building Worksheets and views; what if you could programmatically sync to an external database and autogenerate a schema for these views, updated on a scheduler, to source data for products you're building?
That sounds a lot like what a useful product would do. Stick around, and next time we'll be beating Tableau down for everything it's worth.