I have been working on a project that involves using data from a SQL Server database in a Spark program.
I found one blog post that covered this in good detail, but I still had to spend an hour or so working out the rest. I thought it might be useful to write down every step I had to take to get this working.
I am using the following:
Microsoft SQL Server 2008 R2 (this is in a data center and I connect to it remotely)
Apache Spark 1.4.0
Mac OS X 10.10.4
Java version 1.8
If you want to load specific data with a query instead of loading the entire table, you have to give the query an alias (as "Alias") as part of the SQL. Until I added the alias, I kept getting a SQL exception about the "where" clause being incorrect.
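A minimal sketch of the alias trick, assuming the query is handed to Spark's JDBC data source through its "dbtable" option. As far as I understand it, Spark splices that value into a statement of its own (something like SELECT * FROM <dbtable> WHERE 1=0) to discover the schema, so a query has to be wrapped in parentheses and given an alias to stand in for a table name. That would also explain why the error points at the "where" clause. The table, column, host, and credential values below are hypothetical, as is the as_dbtable helper; the driver class is Microsoft's JDBC driver, which may not be the one you use.

```python
def as_dbtable(query, alias):
    """Wrap a SELECT statement so it can be used where a table name is expected."""
    # A trailing semicolon is not valid inside a subquery, so drop it.
    inner = query.strip().rstrip(";").strip()
    return "({0}) AS {1}".format(inner, alias)


# Hypothetical query and connection details, for illustration only.
query = "SELECT id, name FROM customers WHERE country = 'US'"
jdbc_options = {
    "url": "jdbc:sqlserver://dbhost:1433;databaseName=mydb;user=me;password=secret",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "dbtable": as_dbtable(query, "us_customers"),
}

print(jdbc_options["dbtable"])

# With a live SQLContext and the JDBC driver jar on the classpath,
# the options would be passed to the reader roughly like this:
# df = sqlContext.read.format("jdbc").options(**jdbc_options).load()
```

The helper only builds the string; the actual load is left commented out because it needs a running Spark context and a reachable database.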
P.S. This post was written in a hurry. If you need more details, leave a comment and I will try to add them.