We recently moved to a new database at work: Snowflake. This of course meant changing a bunch of connections, or in many cases adding new connections to the new database.
I wanted to communicate with the database using R. We've tended to use the RODBC package, which worked pretty well until I tried including it directly in Shiny applications, where it gave an atrocious seg fault. Very sad. Apparently it's a known bug, and a weird one, related to call-stack depth. Oh well. There's a solution posted that did not work for me, but the update in the link says to use the odbc package instead. To be honest, I had meant to start using that anyway, since it's faster and follows more conventional practices, so this was a good opportunity.
I'm currently putting together a small package to do some routine operations for us on Snowflake (which hopefully will be open-sourced very soon, assuming it's useful for other people). Currently it uses our RODBC connections, which was fine for our purposes up until this point.
OK, no big deal, I'll just generalise it a bit to use both. So I start plugging away, changing connections to be of the form DBI::dbConnect(odbc::odbc(), dsn = 'snowflake') and making all functions a bit more general, using whatever methods are appropriate for the connection used.
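For reference, that dsn = 'snowflake' has to match an entry in an odbc.ini file that unixODBC can find. A minimal, hypothetical entry might look like the following; the driver name and all values are placeholders, not our actual configuration:

```ini
; Hypothetical ~/.odbc.ini entry; every value here is a placeholder.
[snowflake]
Driver    = Snowflake            ; must match a driver registered in odbcinst.ini
Server    = <account>.snowflakecomputing.com
Database  = <database>
Warehouse = <warehouse>
```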
I'm testing all this on my Mac, and it's all working fine in my R session. Good times. I then open a clean R session and run devtools::test(), to disaster: all tests involving the new connection failed. Hmm. I of course test the connection myself, and get a nice message:
Error: nanodbc/nanodbc.cpp:950: HY000: [Snowflake][ODBC] (11560) Unable to locate SQLGetPrivateProfileString function.
Not cool. Wait, did I do something to break my system ODBC connections? I check with RODBC::odbcConnect(dsn = 'snowflake'), and that still works. OK, so it's something specific to odbc? I run the odbc connection again, and this time it succeeds. Ugh. Fresh R session, try again, same result: odbc fails unless RODBC is run first. In fact any RODBC connection solves it, and any odbc connection fails before and succeeds after, so it's not directly related to Snowflake. It seems Impala gives some people the same issue. So something is messed up behind the scenes.
Googling the error message led me to this thread, where a user points out that the issue could be a misspecified ODBCInstLib variable in the product-specific .ini files, which in Redshift's case live in the folder /opt/amazon/redshift/lib/ on my Mac. In Snowflake's case, it's /opt/snowflake/snowflakeodbc/lib/universal/. For Snowflake, the .ini file (actually named simba.snowflake.ini) indeed has the variable ODBCInstLib set to just libodbcinst.dylib, which I guess isn't known by my computer. So I simply fix it... by hardcoding it to my actual libodbcinst.dylib location (something like /usr/local/Cellar/unixodbc/<version>/lib/libodbcinst.dylib, found with find / -name "libodbcinst.dylib"). That is it: the connections now run just fine.
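In shell terms, the fix amounts to finding the real library and rewriting that one line of the .ini. A sketch, under the assumption that you pass in the paths your own find command reported (the helper name and both example paths in the comments are mine, and you may need sudo to write under /opt):

```shell
# Rewrite the ODBCInstLib line of a Simba driver .ini so it holds the
# full path to libodbcinst.dylib instead of a bare file name.
# patch_odbcinstlib is a hypothetical helper, not part of any tool.
patch_odbcinstlib() {
  ini="$1"  # e.g. /opt/snowflake/snowflakeodbc/lib/universal/simba.snowflake.ini
  lib="$2"  # e.g. the path printed by: find / -name "libodbcinst.dylib"
  # Replace the whole ODBCInstLib=... line, leaving the rest of the file as-is.
  sed "s|^ODBCInstLib=.*|ODBCInstLib=${lib}|" "$ini" > "$ini.tmp" &&
    mv "$ini.tmp" "$ini"
}
```

Using sed to rewrite just that line keeps the rest of the driver settings untouched, which matters since Simba packs several other options into the same file.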
What's kind of maddening is that I couldn't pin down what changed in the environment after running the RODBC connection that made the odbc connections work. The only difference I could find was that SF_OCSP_RESPONSE_CACHE_SERVER_URL was set in R's environment, but setting it manually did not fix the issue, so it wasn't that. Someone with more knowledge of C or C++ could maybe pin it down.
Some people seemed to have some idea, but as someone new to this ODBC stuff, I didn't find such posts super insightful. Note that the Simba .ini file says your DYLD_LIBRARY_PATH should be set. So why would it not be set? Apple's System Integrity Protection! I guess having to do a bit of work when installing new drivers is probably worth the security trade-off, but it never feels that way until it REALLY feels that way.
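A quick way to see whether the variable actually survives into a child process (the path here is just an example):

```shell
# Export a DYLD path and echo it from a child shell. On Linux this prints
# the path; on macOS, SIP strips DYLD_* variables when the child is a
# protected binary like /bin/sh, so the same line can print "unset".
DYLD_LIBRARY_PATH=/usr/local/lib sh -c 'echo "${DYLD_LIBRARY_PATH:-unset}"'
```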